Language Reference

Canonical grammar reference for KynML. Every block, keyword, layer type, activation, loss, optimizer, scheduler, metric, and export format — with exact syntax, allowed values, defaults, and examples.


Syntax Fundamentals

Indentation and Structure

KynML is indentation-sensitive with a two-level structure:

<block-header>:       ← column 1, ends with ':'
    <key> = <value>   ← exactly 4 spaces indent
  • Block headers must start at column 1 (zero leading spaces).
  • Body lines must be indented by exactly 4 spaces. Tabs are not allowed.
  • Comments begin with # and are stripped before parsing. Comments may appear at any indentation level.
  • Empty lines are ignored.
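
The indentation rules above can be sketched as a small line classifier. This is only an illustration, not the KynML parser: classify_line is a hypothetical helper, its comment stripping is naive (it would also cut a # that appears inside a quoted string), and it ignores composition import statements.

```python
def classify_line(raw: str) -> tuple:
    """Classify one source line as 'blank', 'header', or 'body'.

    Hypothetical helper for illustration; not part of KynML.
    """
    # Naive comment stripping: does not protect '#' inside quoted strings.
    line = raw.split("#", 1)[0].rstrip()
    if not line:
        return ("blank", "")
    leading = line[: len(line) - len(line.lstrip())]
    if "\t" in leading:
        raise SyntaxError("tabs are not allowed for indentation")
    indent = len(leading)
    if indent == 0:
        # Block headers start at column 1 and end with ':'.
        if not line.endswith(":"):
            raise SyntaxError("block header must end with ':'")
        return ("header", line[:-1])
    if indent != 4:
        raise SyntaxError(f"body lines need exactly 4 spaces, got {indent}")
    return ("body", line.strip())
```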

Value Types

| Type | Syntax | Example |
| --- | --- | --- |
| Integer | Bare digits, optional leading - | 64, -1 |
| Float | Digits with a . | 0.001, 0.8 |
| Boolean | true or false (lowercase) | true, false |
| String | Double-quoted | "data/train.csv" |
| Identifier | Bare word (letters, digits, _) | relu, adam, auto |
| Function call | name(key=value, ...) | adam(lr=0.001) |
| List | [item, item, ...] | [mae, rmse] |

Values in lists and function-call arguments follow the same parsing rules recursively.
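
To make the recursive rule concrete, here is a minimal recursive-descent sketch in Python that follows the value table. Everything in it (tokenize, parse, the ("id", name) tuple for identifiers, the dict shape for calls) is an assumption made for illustration; the real parser's data structures are not described in this reference.

```python
import re

# Token patterns follow the value table; floats are tried before integers.
_TOKEN = re.compile(
    r'\s*(?:(?P<string>"[^"]*")'
    r'|(?P<float>-?(?:\d+\.\d*|\.\d+))'
    r'|(?P<int>-?\d+)'
    r'|(?P<name>[A-Za-z][A-Za-z0-9_]*)'
    r'|(?P<punct>[\[\](),=]))'
)


def tokenize(text):
    """Split a value expression into (kind, text) pairs."""
    text, pos, out = text.rstrip(), 0, []
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        out.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return out


def _parse_items(tokens, i, closer, call=False):
    """Parse comma-separated values up to `closer`; key=value only in calls."""
    args, kwargs = [], {}
    while tokens[i] != ("punct", closer):
        if call and tokens[i][0] == "name" and tokens[i + 1] == ("punct", "="):
            kwargs[tokens[i][1]], i = _parse_value(tokens, i + 2)
        else:
            value, i = _parse_value(tokens, i)
            args.append(value)
        if tokens[i] == ("punct", ","):
            i += 1
        elif tokens[i] != ("punct", closer):
            raise ValueError(f"expected ',' or {closer!r}")
    return (args, kwargs), i + 1


def _parse_value(tokens, i):
    kind, text = tokens[i]
    if kind == "int":
        return int(text), i + 1
    if kind == "float":
        return float(text), i + 1
    if kind == "string":
        return text[1:-1], i + 1
    if kind == "name":
        if text in ("true", "false"):
            return text == "true", i + 1
        if i + 1 < len(tokens) and tokens[i + 1] == ("punct", "("):
            (args, kwargs), i = _parse_items(tokens, i + 2, ")", call=True)
            return {"call": text, "args": args, "kwargs": kwargs}, i
        return ("id", text), i + 1  # bare identifier
    if (kind, text) == ("punct", "["):
        (items, _), i = _parse_items(tokens, i + 1, "]")
        return items, i
    raise ValueError(f"unexpected token {text!r}")


def parse(text):
    """Parse one complete value; list items and call arguments recurse."""
    tokens = tokenize(text)
    value, i = _parse_value(tokens, 0)
    if i != len(tokens):
        raise ValueError("trailing tokens after value")
    return value
```

Malformed input that ends early surfaces as IndexError in this sketch; a production parser would report a proper syntax error instead.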

Block Types

There are two categories:

Named blocks — take a user-defined identifier after the keyword:

dataset MyData:
    ...

model MyNet:
    ...

Simple blocks — no name, at most one per program:

train:
    ...

evaluate:
    ...

export:
    ...

A valid program must include at least one dataset, at least one model, and exactly one train block. evaluate and export are optional.
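
The block-count rule can be expressed as a short check. The sketch below assumes the program has already been reduced to a list of block keywords in source order; check_block_counts and its error messages are hypothetical and do not reflect the validator's actual output.

```python
from collections import Counter


def check_block_counts(keywords):
    """Return a list of rule violations for the given block keywords.

    `keywords` is e.g. ['dataset', 'model', 'train', 'evaluate'].
    Hypothetical helper; messages are illustrative only.
    """
    counts = Counter(keywords)
    errors = []
    if counts["dataset"] < 1:
        errors.append("at least one dataset block is required")
    if counts["model"] < 1:
        errors.append("at least one model block is required")
    if counts["train"] != 1:
        errors.append(f"exactly one train block is required, found {counts['train']}")
    # evaluate and export are optional simple blocks: at most one each.
    for simple in ("evaluate", "export"):
        if counts[simple] > 1:
            errors.append(f"at most one {simple} block is allowed")
    return errors
```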


dataset Block

Declares a data source, split, and loading parameters. Multiple dataset blocks are allowed; each must have a unique name. The train block references one by name.

Syntax

dataset <Name>:
    source   = csv("<path>")
    target   = "<column>"
    split    = <float>
    normalize = <bool>
    shuffle  = <bool>
    num_workers = <int>
    pin_memory  = <bool>
    prefetch    = <int>

Options

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| source | function call | required | Data source. Currently only csv("<path>") is accepted by the semantic validator. See Datasets and Connectors for HuggingFace and object-store connectors (runtime-level, not spec-level). |
| target | string | required | Name of the target/label column in the CSV. |
| split | float | 0.8 | Train fraction. Must be strictly between 0 and 1. The remainder becomes the test split. |
| normalize | bool | false | If true, apply sklearn.preprocessing.StandardScaler to feature columns (fit on train, transform test). |
| shuffle | bool | true | Whether to shuffle data before splitting and during training DataLoader iteration. |
| num_workers | int | 0 | DataLoader(num_workers=...). 0 means load in the main process. Increase for CPU-heavy preprocessing. Must be >= 0. |
| pin_memory | bool | false | DataLoader(pin_memory=...). Speeds up host-to-GPU transfer when using CUDA. |
| prefetch | int | (none) | DataLoader(prefetch_factor=...). Number of batches loaded in advance per worker. Only meaningful when num_workers > 0. |
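
As a rough picture of how the loader options might translate into DataLoader keyword arguments, consider the sketch below. The function dataloader_kwargs is hypothetical. Dropping prefetch when num_workers is 0 is an assumption, made because recent PyTorch versions raise ValueError if prefetch_factor is set without worker processes; the generated script may handle this case differently.

```python
def dataloader_kwargs(options: dict, batch: int) -> dict:
    """Map dataset-block options to keyword arguments for a training DataLoader.

    Hypothetical helper; defaults mirror the options table.
    """
    num_workers = options.get("num_workers", 0)
    if num_workers < 0:
        raise ValueError("num_workers must be >= 0")
    kwargs = {
        "batch_size": batch,
        "shuffle": options.get("shuffle", True),
        "num_workers": num_workers,
        "pin_memory": options.get("pin_memory", False),
    }
    prefetch = options.get("prefetch")
    # Assumption: prefetch is silently ignored without worker processes.
    if prefetch is not None and num_workers > 0:
        kwargs["prefetch_factor"] = prefetch
    return kwargs
```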

Source Function

source = csv("<relative-or-absolute-path>")

The path is resolved relative to the current working directory first; if not found there, relative to the directory containing the .kyn file. The generated script embeds the resolved absolute path at compile time.
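
The lookup order described above (working directory first, then the .kyn file's directory) could look roughly like this. resolve_source is a hypothetical name, and raising FileNotFoundError when both lookups fail is an assumption; the compiler's actual error type is not stated here.

```python
from pathlib import Path
from typing import Optional


def resolve_source(path: str, kyn_file: str, cwd: Optional[str] = None) -> Path:
    """Return the absolute path of a CSV source, trying cwd before the .kyn directory.

    Hypothetical helper for illustration.
    """
    base_cwd = Path(cwd) if cwd is not None else Path.cwd()
    candidates = [
        base_cwd / path,                         # 1. relative to the working directory
        Path(kyn_file).resolve().parent / path,  # 2. relative to the .kyn file
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate.resolve()
    raise FileNotFoundError(f"source not found: {path}")
```

An absolute path passes through unchanged, because joining a directory with an absolute path yields the absolute path.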

Example

dataset HouseData:
    source = csv("data/housing.csv")
    target = "price"
    split = 0.8
    normalize = true
    shuffle = true
    num_workers = 4
    pin_memory = true
    prefetch = 2

model Block

Declares the neural network architecture as a sequence of layers. Multiple model blocks are allowed; the train block references one by name. The generated model is an nn.Sequential wrapped in a named nn.Module subclass.

Syntax

model <Name>:
    input <N>
    dense <N> <activation>
    dropout <P>
    batchnorm [<N>]

Layers are written one per line in order. The first layer must be input. At least one dense layer is required.

Layer Types

input <N>

Declares the input dimension. Must appear exactly once, as the first layer.

input 10    # 10 input features
  • N — positive integer, number of input features.

dense <N> <activation>

A fully-connected linear layer followed by the named activation. The activation token is always required; the linear keyword adds no activation module.

dense 64 relu
dense 1  linear
  • N — positive integer, number of output units.
  • activation — one of the supported activations. Required.

dropout <P>

Randomly zeros elements of the input tensor with probability P during training (nn.Dropout).

dropout 0.3
  • P — float in [0, 1). 0.0 is valid (no-op). 1.0 is rejected.

batchnorm [<N>]

Applies nn.BatchNorm1d. The feature size (N) is optional; if omitted, it is auto-inferred from the preceding input or dense layer's output width.

batchnorm        # infer num_features from preceding layer
batchnorm 64     # explicit num_features
  • N — optional positive integer. When provided, it must match the preceding layer's output width; a mismatch raises KynMLShapeError at compile time.
  • If neither a preceding layer nor an explicit N is available, KynMLShapeError is raised at compile time.

See Shape-Inference for the full inference rules.
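
A simplified view of the width tracking behind batchnorm inference is sketched below. The layer tuples, infer_widths, and the locally defined KynMLShapeError are stand-ins for illustration; the actual IR pass is documented in Shape-Inference and may differ.

```python
class KynMLShapeError(Exception):
    """Local stand-in for the compiler's shape error type."""


def infer_widths(layers):
    """Return (kind, width) per layer, resolving batchnorm's feature size.

    `layers` holds tuples such as ('input', 15), ('dense', 64, 'relu'),
    ('dropout', 0.3), ('batchnorm', None) or ('batchnorm', 64).
    """
    width = None
    resolved = []
    for layer in layers:
        kind = layer[0]
        if kind in ("input", "dense"):
            width = layer[1]  # these layers set the current width
        elif kind == "batchnorm":
            explicit = layer[1]
            if width is None and explicit is None:
                raise KynMLShapeError("batchnorm has no preceding layer and no explicit N")
            if width is not None and explicit is not None and explicit != width:
                raise KynMLShapeError(
                    f"batchnorm {explicit} does not match preceding width {width}"
                )
            if width is None:
                width = explicit
        # dropout leaves the width unchanged
        resolved.append((kind, width))
    return resolved
```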

Activations

Activations are written as the third token on a dense line.

| Keyword | PyTorch equivalent | Notes |
| --- | --- | --- |
| relu | nn.ReLU() | Default choice for hidden layers |
| leaky_relu | nn.LeakyReLU() | Avoids dying-ReLU with default negative slope |
| gelu | nn.GELU() | Common in transformer-style FFN layers |
| sigmoid | nn.Sigmoid() | Output layer for binary classification |
| tanh | nn.Tanh() | Output in (-1, 1) |
| softmax | nn.Softmax(dim=1) | Probability output for multiclass. Avoid as the final activation with cross_entropy, which applies log-softmax internally; use linear there |
| log_softmax | nn.LogSoftmax(dim=1) | Log-probability output; use with nll loss |
| linear | (no activation added) | Raw linear output; use for regression and for logits fed to cross_entropy |

Example

model ChurnNet:
    input 15
    dense 64 relu
    batchnorm
    dropout 0.3
    dense 32 leaky_relu
    dropout 0.2
    dense 1 sigmoid
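
To show how a layer list such as ChurnNet could map onto nn.Sequential entries, the sketch below emits the constructor calls as strings, using the activation table above. emit_layers and the tuple encoding are hypothetical; the text of the real generated script is not shown in this reference and may be laid out differently.

```python
# Activation keyword -> PyTorch constructor, as listed in the activation table.
ACTIVATIONS = {
    "relu": "nn.ReLU()",
    "leaky_relu": "nn.LeakyReLU()",
    "gelu": "nn.GELU()",
    "sigmoid": "nn.Sigmoid()",
    "tanh": "nn.Tanh()",
    "softmax": "nn.Softmax(dim=1)",
    "log_softmax": "nn.LogSoftmax(dim=1)",
    "linear": None,  # no activation module is added
}


def emit_layers(layers):
    """Return the nn.Sequential entries (as source strings) for a model body."""
    lines, width = [], None
    for layer in layers:
        kind = layer[0]
        if kind == "input":
            width = layer[1]
        elif kind == "dense":
            lines.append(f"nn.Linear({width}, {layer[1]})")
            width = layer[1]
            activation = ACTIVATIONS[layer[2]]
            if activation is not None:
                lines.append(activation)
        elif kind == "dropout":
            lines.append(f"nn.Dropout({layer[1]})")
        elif kind == "batchnorm":
            # Feature size is taken from the preceding layer's width.
            lines.append(f"nn.BatchNorm1d({width})")
    return lines
```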

train Block

Configures the training loop. Exactly one train block per program. All options listed below are recognised; model, data, loss, optimizer, epochs, and batch are required.

Syntax

train:
    model     = <model-name>
    data      = <dataset-name>
    loss      = <loss>
    optimizer = <optimizer-call>
    epochs    = <int>
    batch     = <int>
    device    = <device>
    precision = <precision>
    compile   = <bool>
    scheduler = <scheduler-call>
    early_stop = early_stop(<kwargs>)
    checkpoint = checkpoint(<kwargs>)
    seed       = <int>
    deterministic = <bool>

Required Options

| Key | Type | Description |
| --- | --- | --- |
| model | identifier | Name of a model block defined in this program. |
| data | identifier | Name of a dataset block defined in this program. |
| loss | identifier | Loss function keyword. See Losses. |
| optimizer | function call | Optimizer with hyperparameters. See Optimizers. |
| epochs | int | Number of training epochs. Must be > 0. |
| batch | int | Mini-batch size. Must be > 0. |

Optional Options

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| device | identifier | auto | Compute device. One of auto, cpu, cuda. auto selects CUDA if available, else CPU. |
| precision | identifier | fp32 | Floating-point precision. One of fp32, fp16, bf16. fp16/bf16 enable PyTorch AMP. See Speed Guide. |
| compile | bool | false | If true, wraps the model with torch.compile() after instantiation (requires PyTorch 2.0+). |
| scheduler | function call | (none) | Learning-rate scheduler. See Schedulers. |
| early_stop | function call | (none) | Early stopping callback. See Early Stopping. |
| checkpoint | function call | (none) | Checkpoint callback. See Checkpointing. |
| seed | int | (none) | RNG seed applied before training (covers random, numpy, torch, and CUDA). When set, train_test_split also uses this seed for reproducible splits. |
| deterministic | bool | false | When true, additionally enables torch.use_deterministic_algorithms(True) and torch.backends.cudnn.deterministic = True. Only meaningful when seed is also set. |

Losses

| Keyword | PyTorch class | Use case |
| --- | --- | --- |
| mse | nn.MSELoss() | Regression |
| mae | nn.L1Loss() | Regression (same as l1) |
| l1 | nn.L1Loss() | Regression (same as mae) |
| huber | nn.HuberLoss() | Regression, robust to outliers |
| bce | nn.BCELoss() | Binary classification (sigmoid output) |
| cross_entropy | nn.CrossEntropyLoss() | Multiclass classification; expects long targets |
| nll | nn.NLLLoss() | Multiclass; use with log_softmax output |

Shape rules (enforced at compile time by the IR inference pass):

| Loss | Output units required | Target dtype | Error if violated |
| --- | --- | --- | --- |
| bce | exactly 1 | float32 | KynMLShapeError |
| cross_entropy | ≥ 2 | int64 (torch.long) | KynMLShapeError |
| nll | ≥ 2 | int64 (torch.long) | KynMLShapeError |
| mse, mae, l1, huber | any | float32 | (none) |

Activation warnings (not errors): cross_entropy applies log-softmax internally. Using a final softmax or log_softmax activation with cross_entropy produces a compiler warning but is not rejected. Use linear activation with cross_entropy to avoid double-applying softmax.

These checks run during compile_to_ir, not during validate. See Shape-Inference.
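
The loss rules and the activation warning can be summarised in one function. This is a sketch under assumptions: check_loss, its signature, and the local KynMLShapeError class are illustrative, and the real checks operate on the IR rather than on plain arguments.

```python
import warnings


class KynMLShapeError(Exception):
    """Local stand-in for the compiler's shape error type."""


def check_loss(loss: str, out_units: int, final_activation: str) -> str:
    """Validate the output width for `loss` and return the target dtype name."""
    if loss == "bce" and out_units != 1:
        raise KynMLShapeError("bce requires exactly 1 output unit")
    if loss in ("cross_entropy", "nll") and out_units < 2:
        raise KynMLShapeError(f"{loss} requires at least 2 output units")
    if loss == "cross_entropy" and final_activation in ("softmax", "log_softmax"):
        # A warning, not an error: cross_entropy applies log-softmax itself.
        warnings.warn(
            f"final {final_activation} with cross_entropy double-applies softmax; "
            "use linear instead"
        )
    return "int64" if loss in ("cross_entropy", "nll") else "float32"
```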

Optimizers

Optimizers are specified as function calls with keyword arguments.

adam(lr=<float>)

optimizer = adam(lr=0.001)
  • lr — learning rate. Default: 0.001.
  • Generates: optim.Adam(model.parameters(), lr=<lr>)

adamw(lr=<float>, weight_decay=<float>)

optimizer = adamw(lr=0.001, weight_decay=0.01)
  • lr — learning rate. Default: 0.001.
  • weight_decay — decoupled weight-decay coefficient (AdamW applies the decay directly to the weights rather than as an L2 penalty on the loss). Default: 0.01.
  • Generates: optim.AdamW(model.parameters(), lr=<lr>, weight_decay=<weight_decay>)

sgd(lr=<float>, momentum=<float>)

optimizer = sgd(lr=0.01)
optimizer = sgd(lr=0.01, momentum=0.9)
  • lr — learning rate. Default: 0.01.
  • momentum — optional. If omitted, no momentum term is added.
  • Generates: optim.SGD(model.parameters(), lr=<lr>[, momentum=<momentum>])

rmsprop(lr=<float>, momentum=<float>)

optimizer = rmsprop(lr=0.001, momentum=0.0)
  • lr — learning rate. Default: 0.01.
  • momentum — Default: 0.
  • Generates: optim.RMSprop(model.parameters(), lr=<lr>, momentum=<momentum>)
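
The defaults and "Generates" lines for the four optimizers can be combined into one rendering function. render_optimizer and the OPTIMIZERS table are hypothetical names, and rejecting unknown keyword arguments is an assumption; the sketch only reproduces the strings listed above.

```python
# Optimizer keyword -> (PyTorch class, defaults always rendered).
OPTIMIZERS = {
    "adam": ("optim.Adam", {"lr": 0.001}),
    "adamw": ("optim.AdamW", {"lr": 0.001, "weight_decay": 0.01}),
    "sgd": ("optim.SGD", {"lr": 0.01}),  # momentum only when given
    "rmsprop": ("optim.RMSprop", {"lr": 0.01, "momentum": 0}),
}


def render_optimizer(name: str, **kwargs) -> str:
    """Return the optimizer constructor call as a source string."""
    cls, defaults = OPTIMIZERS[name]
    allowed = set(defaults) | ({"momentum"} if name == "sgd" else set())
    unknown = set(kwargs) - allowed
    if unknown:
        raise ValueError(f"unknown argument(s) for {name}: {sorted(unknown)}")
    merged = {**defaults, **kwargs}  # user values override defaults
    args = ", ".join(f"{key}={value}" for key, value in merged.items())
    return f"{cls}(model.parameters(), {args})"
```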

Schedulers

Schedulers are specified as function calls. scheduler.step() is called once per epoch after the loss update.

step(step_size=<int>, gamma=<float>)

scheduler = step(step_size=10, gamma=0.1)

Decays LR by gamma every step_size epochs.

  • step_size — default 10.
  • gamma — default 0.1.
  • Generates: StepLR(optimizer, step_size=<step_size>, gamma=<gamma>)

cosine(t_max=<int>)

scheduler = cosine(t_max=50)

Cosine annealing over t_max epochs.

  • t_max — default 10.
  • Generates: CosineAnnealingLR(optimizer, T_max=<t_max>)

onecycle(max_lr=<float>)

scheduler = onecycle(max_lr=0.01)

OneCycleLR policy. epochs is taken from the train block.

  • max_lr — default 0.01.
  • Generates: OneCycleLR(optimizer, max_lr=<max_lr>, epochs=EPOCHS, steps_per_epoch=1)
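
For intuition about the step and cosine schedules, the learning rate after a given number of scheduler.step() calls can be computed in closed form. These helper functions are illustrative only; they follow the formulas documented for PyTorch's StepLR and CosineAnnealingLR (with eta_min at its default of 0, and epoch not exceeding t_max for cosine). onecycle is not covered.

```python
import math


def step_lr(base_lr: float, epoch: int, step_size: int = 10, gamma: float = 0.1) -> float:
    """LR after `epoch` scheduler steps: multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_lr(base_lr: float, epoch: int, t_max: int = 10, eta_min: float = 0.0) -> float:
    """LR after `epoch` scheduler steps under cosine annealing (epoch <= t_max)."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2
```

With step(step_size=10, gamma=0.1) and a base rate of 0.1, epochs 0 to 9 train at 0.1 and epoch 10 drops to roughly 0.01. With cosine(t_max=50), the rate starts at the base value, passes through about half of it at epoch 25, and reaches 0 at epoch 50.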

Early Stopping

early_stop = early_stop(patience=5, mode="min")

Monitors the training loss each epoch. If it does not improve for patience consecutive epochs, training stops early and a message is printed.

| Kwarg | Type | Default | Description |
| --- | --- | --- | --- |
| patience | int | 5 | Number of epochs without improvement before stopping. Must be > 0. |
| mode | string | "min" | "min" stops if loss stops decreasing. "max" stops if loss stops increasing. |
| metric | identifier | (none) | If provided, must be one of mae, mse, rmse, accuracy. Validated but not yet wired to evaluation metrics; currently tracks epoch_loss. |
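
The patience logic can be sketched as a small counter class. EarlyStopper is a hypothetical name, and treating only a strict improvement as progress is an assumption; the generated training loop may compare values differently.

```python
class EarlyStopper:
    """Track a monitored value and signal when training should stop."""

    def __init__(self, patience: int = 5, mode: str = "min"):
        if patience <= 0:
            raise ValueError("patience must be > 0")
        if mode not in ("min", "max"):
            raise ValueError('mode must be "min" or "max"')
        self.patience = patience
        self.mode = mode
        self.best = None
        self.bad_epochs = 0

    def update(self, value: float) -> bool:
        """Record this epoch's value; return True if training should stop."""
        if self.best is None:
            improved = True
        elif self.mode == "min":
            improved = value < self.best
        else:
            improved = value > self.best
        if improved:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Called once per epoch with epoch_loss, update returns True on the first epoch at which patience consecutive epochs have passed without a new best value.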

Checkpointing

checkpoint = checkpoint(every_n=5, path="checkpoints/ckpt.pt", async_save=false)

Saves model and optimizer state dicts to disk every every_n epochs. On the next train run with the same config, the checkpoint is loaded and training resumes from the saved epoch.

| Kwarg | Type | Default | Description |
| --- | --- | --- | --- |
| every_n | int | 1 | Save interval in epochs. Must be > 0. |
| path | string | "checkpoints/ckpt.pt" | Path for the checkpoint file. Parent directories are created automatically. |
| async_save | bool | false | If true, checkpoint writes are dispatched to a daemon thread, avoiding blocking the training loop. |
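
The sketch below illustrates the save interval, automatic parent-directory creation, and the daemon-thread option. It uses pickle as a stand-in for torch.save so that it runs without PyTorch. Both function names, the 1-based epoch numbering, and the write-then-rename step are assumptions and are not taken from the generated code.

```python
import os
import pickle
import threading


def should_save(epoch: int, every_n: int) -> bool:
    """True on every every_n-th epoch (assumes epochs are numbered from 1)."""
    return epoch % every_n == 0


def save_checkpoint(state: dict, path: str, async_save: bool = False):
    """Write `state` to `path`; return the writer thread when async_save is true."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create parent directories

    def _write():
        tmp = path + ".tmp"
        with open(tmp, "wb") as fh:
            pickle.dump(state, fh)  # stand-in for torch.save(state, fh)
        os.replace(tmp, path)  # assumption: rename so readers never see a partial file

    if async_save:
        thread = threading.Thread(target=_write, daemon=True)
        thread.start()
        return thread
    _write()
    return None
```

When saving asynchronously, the state passed in should be a snapshot (for example, copied tensors), since training continues to mutate the live model while the thread writes.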

Full Example

train:
    model = HousePriceModel
    data  = HouseData
    loss  = mse
    optimizer  = adamw(lr=0.0005, weight_decay=0.01)
    epochs     = 100
    batch      = 64
    device     = auto
    precision  = fp16
    compile    = true
    scheduler  = cosine(t_max=100)
    early_stop = early_stop(patience=10, mode="min")
    checkpoint = checkpoint(every_n=10, path="ckpts/run.pt", async_save=true)

evaluate Block

Computes metrics on the held-out test split after training. Optional.

Syntax

evaluate:
    metrics = [<metric>, ...]

Metrics

| Keyword | Description | Regression | Multiclass |
| --- | --- | --- | --- |
| mae | Mean Absolute Error | yes | yes (on class indices) |
| mse | Mean Squared Error | yes | yes (on class indices) |
| rmse | Root Mean Squared Error | yes | yes (on class indices) |
| accuracy | Fraction of correct predictions | yes (threshold 0.5) | yes (argmax) |

Any metric not in this set raises a KynMLSemanticError at validation time.
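
For reference, the four metrics can be written in plain Python as below. This is a sketch over lists of numbers, not the tensor code the generated script uses. Treating a score exactly equal to the threshold as the positive class is an assumption.

```python
import math


def mae(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def mse(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def rmse(pred, target):
    """Root mean squared error."""
    return math.sqrt(mse(pred, target))


def accuracy_threshold(scores, target, threshold=0.5):
    """Single-output accuracy: a score at or above the threshold counts as class 1."""
    hits = sum((1 if s >= threshold else 0) == t for s, t in zip(scores, target))
    return hits / len(target)


def accuracy_argmax(rows, target):
    """Multiclass accuracy: the predicted class is the index of the largest score."""
    hits = sum(
        max(range(len(row)), key=row.__getitem__) == t for row, t in zip(rows, target)
    )
    return hits / len(target)
```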

Example

evaluate:
    metrics = [mae, rmse, accuracy]

export Block

Serialises the trained model to disk. Optional.

Syntax

export:
    format = <format>
    path   = "<output-path>"
    input_shape = [<int>, ...]   # required for onnx
    opset  = <int>               # onnx only

Options

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| format | identifier | required | Export format. One of torch, torchscript, onnx. |
| path | string | required | Output file path, resolved relative to cwd at compile time. |
| input_shape | list of ints | (none) | Required when format = onnx. Shape of the dummy input tensor, e.g. [1, 10]. |
| opset | int | 17 | ONNX opset version. Only used when format = onnx. |
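
The option rules in the table can be checked with a few lines of Python. validate_export is a hypothetical helper and raises ValueError for simplicity; the error type the compiler actually raises for export problems is not stated in this reference.

```python
EXPORT_FORMATS = ("torch", "torchscript", "onnx")


def validate_export(options: dict) -> dict:
    """Check export-block options and fill in the opset default for onnx."""
    fmt = options.get("format")
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {EXPORT_FORMATS}, got {fmt!r}")
    if "path" not in options:
        raise ValueError("path is required")
    resolved = dict(options)
    if fmt == "onnx":
        shape = options.get("input_shape")
        if not shape or not all(isinstance(d, int) and d > 0 for d in shape):
            raise ValueError("input_shape (list of positive ints) is required for onnx")
        resolved.setdefault("opset", 17)  # default from the options table
    return resolved
```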

Export Formats

torch

Saves model.state_dict() using torch.save. Smallest file; requires the model class to reconstruct.

export:
    format = torch
    path = "models/net.pt"

Generates: torch.save(model.state_dict(), path)

torchscript

Compiles the model to TorchScript with torch.jit.script (scripting, not tracing) and serialises it. Self-contained; no Python class needed at inference time.

export:
    format = torchscript
    path = "models/net_scripted.pt"

Generates: torch.jit.script(model).save(path)

onnx

Exports to the Open Neural Network Exchange format. input_shape is mandatory. opset controls the ONNX opset version (default 17).

export:
    format = onnx
    path = "models/net.onnx"
    input_shape = [1, 10]
    opset = 17

Generates: torch.onnx.export(model, dummy, path, opset_version=opset, input_names=["input"], output_names=["output"])


Complete Program Example

# Multiclass iris classifier — all features demonstrated

dataset Iris:
    source      = csv("data/iris.csv")
    target      = "species"
    split       = 0.8
    normalize   = true
    shuffle     = true
    num_workers = 2
    pin_memory  = true
    prefetch    = 4

model IrisNet:
    input 4
    dense 32 relu
    batchnorm
    dropout 0.2
    dense 16 gelu
    dense 3 linear    # raw logits; cross_entropy applies log-softmax internally

train:
    model      = IrisNet
    data       = Iris
    loss       = cross_entropy
    optimizer  = adamw(lr=0.001, weight_decay=0.01)
    epochs     = 50
    batch      = 16
    device     = auto
    precision  = fp16
    compile    = true
    scheduler  = cosine(t_max=50)
    early_stop = early_stop(patience=8, mode="min")
    checkpoint = checkpoint(every_n=5, path="ckpts/iris.pt")

evaluate:
    metrics = [accuracy]

export:
    format      = onnx
    path        = "models/iris.onnx"
    input_shape = [1, 4]
    opset       = 17

Composition

Three opt-in features extend the base language for reuse and experimentation. Programs that use none of them are unaffected. Full details and runnable examples are in Composition.

import statements

Pull dataset and model blocks from another .kyn file. Must appear at the top of the file, before any block.

import "shared/base.kyn"
  • Only dataset and model blocks are imported. train, evaluate, export, params, and sweep blocks in the imported file are ignored.
  • Paths are resolved relative to the importing file.
  • Circular imports and duplicate block names raise errors.
  • Requires source_path to be provided when calling compile_to_ir.

params block

Declare named hyperparameters with defaults. Reference them with $name anywhere a value is expected.

params:
    lr     = 0.001
    hidden = 64
    epochs = 20

model M:
    input 4
    dense $hidden relu
    dense 1 linear

train:
    model     = M
    data      = D
    loss      = mse
    optimizer = adam(lr=$lr)
    epochs    = $epochs
    batch     = 32
  • Values may be integers, floats, strings, booleans, or lists.
  • $name references are substituted before semantic validation and codegen.
  • An undefined $name (no params block and no CLI override) raises KynMLSemanticError.
  • CLI overrides: kynml compile model.kyn -o out.py --param lr=0.01 --param hidden=128
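
A text-level sketch of $name substitution with CLI-style overrides is shown below. The function name, the regex approach, and the rendering of values back to KynML syntax are assumptions; the real substitute_params step may work on parsed values rather than on source text. The sketch also raises KeyError where KynML raises KynMLSemanticError, and it would replace a $name that appears inside a quoted string.

```python
import re

_PARAM_REF = re.compile(r"\$([A-Za-z][A-Za-z0-9_]*)")


def _render(value):
    """Render a Python value using KynML literal syntax."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return f'"{value}"'
    if isinstance(value, list):
        return "[" + ", ".join(_render(v) for v in value) + "]"
    return str(value)


def substitute_params_sketch(source: str, params: dict, overrides: dict = None) -> str:
    """Replace every $name in `source`; overrides win over params-block defaults."""
    values = {**params, **(overrides or {})}

    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"undefined parameter ${name}")
        return _render(values[name])

    return _PARAM_REF.sub(replace, source)
```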

sweep block

Map parameter names to lists of values. kynml sweep expands the Cartesian product into one script per combination.

params:
    lr     = 0.001
    hidden = 32

sweep:
    lr     = [0.001, 0.01]
    hidden = [32, 64]
  • Each axis value must be a non-empty list. A bare scalar raises KynMLParseError.
  • expand_sweep returns the full Cartesian product: 2 × 2 = 4 combinations above.
  • kynml sweep model.kyn generates per-combo scripts and writes sweep_results.json.
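
The Cartesian expansion itself is a one-liner with itertools.product. The function below is a sketch with a hypothetical name and return shape (a list of dicts, one per combination); the signature of the real expand_sweep is not documented here.

```python
from itertools import product


def expand_sweep_sketch(axes: dict) -> list:
    """Return one {param: value} dict per combination of the sweep axes."""
    for name, values in axes.items():
        if not isinstance(values, list) or not values:
            raise ValueError(f"sweep axis {name!r} must be a non-empty list")
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]
```

For the sweep block above, this yields four combinations, with the last axis varying fastest: (0.001, 32), (0.001, 64), (0.01, 32), (0.01, 64).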

Grammar Summary

program        ::= import_stmt* params_block? sweep_block?
                   (dataset_block | model_block)* train_block evaluate_block? export_block?

import_stmt    ::= "import" STRING NEWLINE
params_block   ::= "params" ":" NEWLINE (INDENT NAME "=" value NEWLINE)+
sweep_block    ::= "sweep"  ":" NEWLINE (INDENT NAME "=" list NEWLINE)+

dataset_block  ::= "dataset" NAME ":" NEWLINE body
model_block    ::= "model"   NAME ":" NEWLINE model_body
train_block    ::= "train"       ":" NEWLINE body
evaluate_block ::= "evaluate"    ":" NEWLINE body
export_block   ::= "export"      ":" NEWLINE body

body           ::= (INDENT assignment NEWLINE)+
model_body     ::= (INDENT model_layer NEWLINE)+
assignment     ::= NAME "=" value

value          ::= bool | integer | float | string | identifier | param_ref
                 | function_call | list
bool           ::= "true" | "false"
integer        ::= "-"? DIGIT+
float          ::= "-"? (DIGIT+ "." DIGIT* | DIGIT* "." DIGIT+)
string         ::= '"' chars '"'
identifier     ::= LETTER (LETTER | DIGIT | "_")*
param_ref      ::= "$" LETTER (LETTER | DIGIT | "_")*   -- composition only
function_call  ::= NAME "(" (arg ("," arg)*)? ")"
arg            ::= value | NAME "=" value
list           ::= "[" (value ("," value)*)? "]"

model_layer    ::= input_layer | dense_layer | dropout_layer | batchnorm_layer
input_layer    ::= "input"    INTEGER
dense_layer    ::= "dense"    INTEGER activation
dropout_layer  ::= "dropout"  FLOAT
batchnorm_layer::= "batchnorm" INTEGER?

activation     ::= "relu" | "leaky_relu" | "gelu" | "sigmoid" | "tanh"
                 | "softmax" | "log_softmax" | "linear"

Model body lines are parsed differently from assignment bodies: input, dense, dropout, and batchnorm are positional tokens, not key = value assignments.

Composition grammar notes:
  • import_stmt lines appear before any block header.
  • params_block and sweep_block conventionally appear before dataset / model blocks. This ordering is a convention only; the parser accepts any order at the top level.
  • param_ref ($name) is valid anywhere a value is expected. It is replaced by substitute_params before semantic validation.
  • sweep_block axis values must be lists; bare scalars raise KynMLParseError.