Language Reference
Canonical grammar reference for KynML. Every block, keyword, layer type, activation, loss, optimizer, scheduler, metric, and export format — with exact syntax, allowed values, defaults, and examples.
Syntax Fundamentals
Indentation and Structure
KynML is indentation-sensitive with a two-level structure:
<block-header>:          ← column 1, ends with ':'
    <key> = <value>      ← exactly 4 spaces indent
- Block headers must start at column 1 (zero leading spaces).
- Body lines must be indented by exactly 4 spaces. Tabs are not allowed.
- Comments begin with # and are stripped before parsing. Comments may appear at any indentation level.
- Empty lines are ignored.
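The rules above can be checked one line at a time. The following is a minimal Python sketch, not the KynML parser itself: the function name classify_line and its return labels are hypothetical, it covers only the two-level block structure (not import statements), and it assumes # never appears inside a quoted string.

```python
def classify_line(raw: str) -> str:
    """Classify one source line as 'blank', 'header', 'body', or 'error'."""
    # Naive comment stripping: assumes '#' never occurs inside a string value.
    line = raw.split("#", 1)[0].rstrip()
    if not line.strip():
        return "blank"  # empty and comment-only lines are ignored
    leading = line[: len(line) - len(line.lstrip())]
    if "\t" in leading:
        return "error"  # tabs are not allowed for indentation
    indent = len(leading)
    if indent == 0:
        # Block headers start at column 1 and end with ':'
        return "header" if line.endswith(":") else "error"
    if indent == 4:
        return "body"  # body lines use exactly 4 spaces
    return "error"  # any other indent width is rejected
```

For instance, classify_line("train:") yields "header", while a body line indented by two spaces or by a tab yields "error".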
Value Types
| Type | Syntax | Example |
|---|---|---|
| Integer | Bare digits, optional leading - | 64, -1 |
| Float | Digits with a . | 0.001, 0.8 |
| Boolean | true or false (lowercase) | true, false |
| String | Double-quoted | "data/train.csv" |
| Identifier | Bare word (letters, digits, _) | relu, adam, auto |
| Function call | name(key=value, ...) | adam(lr=0.001) |
| List | [item, item, ...] | [mae, rmse] |
Values in lists and function-call arguments follow the same parsing rules recursively.
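As an illustration of the scalar rows in the table, the sketch below classifies a single token with regular expressions. It is an assumption-laden approximation rather than the real tokenizer: parse_scalar is a hypothetical name, lists and function calls are left out, and string escapes are not handled.

```python
import re


def parse_scalar(token: str):
    """Return (type_name, python_value) for one scalar KynML token."""
    # Booleans are checked first because they would also match as identifiers.
    if token in ("true", "false"):
        return "bool", token == "true"
    if re.fullmatch(r"-?\d+", token):
        return "integer", int(token)
    if re.fullmatch(r"-?(\d+\.\d*|\d*\.\d+)", token):
        return "float", float(token)
    if re.fullmatch(r'"[^"]*"', token):
        return "string", token[1:-1]  # drop the surrounding quotes
    if re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", token):
        return "identifier", token
    raise ValueError(f"not a scalar value: {token!r}")
```

A full value parser would call a routine like this for each item inside [ ... ] and for each argument inside name( ... ), which is what "recursively" refers to above.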
Block Types
There are two categories:
Named blocks — take a user-defined identifier after the keyword:
dataset MyData:
    ...
model MyNet:
    ...
Simple blocks — no name, at most one per program:
train:
    ...
evaluate:
    ...
export:
    ...
A valid program must include at least one dataset, at least one model, and exactly one train block. evaluate and export are optional.
dataset Block
Declares a data source, split, and loading parameters. Multiple dataset blocks are allowed; each must have a unique name. The train block references one by name.
Syntax
dataset <Name>:
    source = csv("<path>")
    target = "<column>"
    split = <float>
    normalize = <bool>
    shuffle = <bool>
    num_workers = <int>
    pin_memory = <bool>
    prefetch = <int>
Options
| Key | Type | Default | Description |
|---|---|---|---|
| source | function call | required | Data source. Currently only csv("<path>") is accepted by the semantic validator. See Datasets and Connectors for HuggingFace and object-store connectors (runtime-level, not spec-level). |
| target | string | required | Name of the target/label column in the CSV. |
| split | float | 0.8 | Train fraction. Must be strictly between 0 and 1. The remainder becomes the test split. |
| normalize | bool | false | If true, apply sklearn.preprocessing.StandardScaler to feature columns (fit on train, transform test). |
| shuffle | bool | true | Whether to shuffle data before splitting and during training DataLoader iteration. |
| num_workers | int | 0 | DataLoader(num_workers=...). 0 means load in the main process. Increase for CPU-heavy preprocessing. Must be >= 0. |
| pin_memory | bool | false | DataLoader(pin_memory=...). Speeds up host-to-GPU transfer when using CUDA. |
| prefetch | int | (none) | DataLoader(prefetch_factor=...). Number of batches loaded in advance per worker. Only meaningful when num_workers > 0. |
Source Function
source = csv("<relative-or-absolute-path>")
The path is resolved relative to the current working directory first; if not found there, relative to the directory containing the .kyn file. The generated script embeds the resolved absolute path at compile time.
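The lookup order described above can be sketched with pathlib. This is an illustration under stated assumptions, not the compiler's code: resolve_source is a hypothetical name, and the sketch returns the second candidate without checking that it exists, since the behaviour for a file missing from both locations is not specified here.

```python
from pathlib import Path


def resolve_source(path: str, kyn_file: str) -> Path:
    """Resolve a csv() path: working directory first, then the .kyn file's directory."""
    candidate = Path(path)
    if candidate.is_absolute():
        return candidate
    from_cwd = Path.cwd() / candidate
    if from_cwd.exists():
        return from_cwd.resolve()
    # Fall back to the directory that contains the .kyn file.
    return (Path(kyn_file).resolve().parent / candidate).resolve()
```

Because the resolved absolute path is embedded at compile time, moving the CSV afterwards would require recompiling.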
Example
dataset HouseData:
    source = csv("data/housing.csv")
    target = "price"
    split = 0.8
    normalize = true
    shuffle = true
    num_workers = 4
    pin_memory = true
    prefetch = 2
model Block
Declares the neural network architecture as a sequence of layers. Multiple model blocks are allowed; the train block references one by name. The generated model is an nn.Sequential wrapped in a named nn.Module subclass.
Syntax
model <Name>:
    input <N>
    dense <N> <activation>
    dropout <P>
    batchnorm [<N>]
Layers are written one per line in order. The first layer must be input. At least one dense layer is required.
Layer Types
input <N>
Declares the input dimension. Must appear exactly once, as the first layer.
input 10 # 10 input features
- N: positive integer, number of input features.
dense <N> <activation>
A fully connected linear layer followed by the named activation. The activation token is always required; linear adds no activation module.
dense 64 relu
dense 1 linear
- N: positive integer, number of output units.
- activation: one of the supported activations. Required.
dropout <P>
Randomly zeros elements of the input tensor with probability P during training (nn.Dropout).
dropout 0.3
- P: float in [0, 1). 0.0 is valid (no-op). 1.0 is rejected.
batchnorm [<N>]
Applies nn.BatchNorm1d. The feature size (N) is optional; if omitted, it is auto-inferred from the preceding input or dense layer's output width.
batchnorm # infer num_features from preceding layer
batchnorm 64 # explicit num_features
- N: optional positive integer. When provided, it must match the preceding layer's output width; a mismatch raises KynMLShapeError at compile time.
- If neither a preceding layer nor an explicit N is available, KynMLShapeError is raised at compile time.
See Shape-Inference for the full inference rules.
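The batchnorm rules can be illustrated with a small pass over a layer list. This is a sketch under assumptions, not the compiler's inference pass: the tuple encoding of layers and the function name infer_batchnorm_features are hypothetical, and KynMLShapeError is redefined locally as a stand-in.

```python
class KynMLShapeError(Exception):
    """Local stand-in for the compiler's shape error."""


def infer_batchnorm_features(layers):
    """Return a copy of layers with every batchnorm width filled in.

    Each layer is a tuple: ("input", n), ("dense", n, activation),
    ("dropout", p), or ("batchnorm", n_or_None).
    """
    width = None  # output width of the most recent input/dense layer
    resolved = []
    for layer in layers:
        kind = layer[0]
        if kind in ("input", "dense"):
            width = layer[1]
        elif kind == "batchnorm":
            explicit = layer[1]
            if explicit is None and width is None:
                raise KynMLShapeError("batchnorm has no width to infer from")
            if explicit is not None and width is not None and explicit != width:
                raise KynMLShapeError(
                    f"batchnorm {explicit} does not match preceding width {width}"
                )
            layer = ("batchnorm", width if explicit is None else explicit)
        resolved.append(layer)
    return resolved
```

Dropout leaves the width unchanged, so a batchnorm placed after a dropout still inherits the width of the last dense layer in this sketch.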
Activations
Activations are written as the third token on a dense line.
| Keyword | PyTorch equivalent | Notes |
|---|---|---|
| relu | nn.ReLU() | Default choice for hidden layers |
| leaky_relu | nn.LeakyReLU() | Avoids dying ReLU; uses PyTorch's default negative slope (0.01) |
| gelu | nn.GELU() | Common in transformer-style FFN layers |
| sigmoid | nn.Sigmoid() | Output layer for binary classification |
| tanh | nn.Tanh() | Output in (-1, 1) |
| softmax | nn.Softmax(dim=1) | Probability output for multiclass. Pairing it with cross_entropy triggers a compiler warning, because that loss applies log-softmax internally |
| log_softmax | nn.LogSoftmax(dim=1) | Log-probability output; use with nll loss |
| linear | (no activation added) | Raw linear output; use for regression and as the final layer with cross_entropy |
Example
model ChurnNet:
    input 15
    dense 64 relu
    batchnorm
    dropout 0.3
    dense 32 leaky_relu
    dropout 0.2
    dense 1 sigmoid
train Block
Configures the training loop. Exactly one train block per program. All options listed below are recognised; model, data, loss, optimizer, epochs, and batch are required.
Syntax
train:
    model = <model-name>
    data = <dataset-name>
    loss = <loss>
    optimizer = <optimizer-call>
    epochs = <int>
    batch = <int>
    device = <device>
    precision = <precision>
    compile = <bool>
    scheduler = <scheduler-call>
    early_stop = early_stop(<kwargs>)
    checkpoint = checkpoint(<kwargs>)
    seed = <int>
    deterministic = <bool>
Required Options
| Key | Type | Description |
|---|---|---|
| model | identifier | Name of a model block defined in this program. |
| data | identifier | Name of a dataset block defined in this program. |
| loss | identifier | Loss function keyword. See Losses. |
| optimizer | function call | Optimizer with hyperparameters. See Optimizers. |
| epochs | int | Number of training epochs. Must be > 0. |
| batch | int | Mini-batch size. Must be > 0. |
Optional Options
| Key | Type | Default | Description |
|---|---|---|---|
| device | identifier | auto | Compute device. One of auto, cpu, cuda. auto selects CUDA if available, else CPU. |
| precision | identifier | fp32 | Floating-point precision. One of fp32, fp16, bf16. fp16/bf16 enable PyTorch AMP. See Speed Guide. |
| compile | bool | false | If true, wraps the model with torch.compile() after instantiation (requires PyTorch 2.0+). |
| scheduler | function call | (none) | Learning-rate scheduler. See Schedulers. |
| early_stop | function call | (none) | Early stopping callback. See Early Stopping. |
| checkpoint | function call | (none) | Checkpoint callback. See Checkpointing. |
| seed | int | (none) | RNG seed applied before training (covers random, numpy, torch, and CUDA). When set, train_test_split also uses this seed for reproducible splits. |
| deterministic | bool | false | When true, additionally enables torch.use_deterministic_algorithms(True) and torch.backends.cudnn.deterministic = True. Only meaningful when seed is also set. |
Losses
| Keyword | PyTorch class | Use case |
|---|---|---|
| mse | nn.MSELoss() | Regression |
| mae | nn.L1Loss() | Regression (same as l1) |
| l1 | nn.L1Loss() | Regression (same as mae) |
| huber | nn.HuberLoss() | Regression, robust to outliers |
| bce | nn.BCELoss() | Binary classification (sigmoid output) |
| cross_entropy | nn.CrossEntropyLoss() | Multiclass classification; expects long targets |
| nll | nn.NLLLoss() | Multiclass; use with log_softmax output |
Shape rules (enforced at compile time by the IR inference pass):
| Loss | Output units required | Target dtype | Error if violated |
|---|---|---|---|
| bce | exactly 1 | float32 | KynMLShapeError |
| cross_entropy | ≥ 2 | int64 (torch.long) | KynMLShapeError |
| nll | ≥ 2 | int64 (torch.long) | KynMLShapeError |
| mse, mae, l1, huber | any | float32 | — |
Activation warnings (not errors): cross_entropy applies log-softmax internally. Using a final softmax or log_softmax activation with cross_entropy produces a compiler warning but is not rejected. Use linear activation with cross_entropy to avoid double-applying softmax.
These checks run during compile_to_ir, not during validate. See Shape-Inference.
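The shape-rule table translates directly into a lookup. The sketch below is illustrative only: check_loss_shape is a hypothetical name, KynMLShapeError is a local stand-in for the compiler's error, and the function returns the expected target dtype as a plain string.

```python
class KynMLShapeError(Exception):
    """Local stand-in for the compiler's shape error."""


def check_loss_shape(loss: str, output_units: int) -> str:
    """Return the target dtype a loss expects, or raise on a unit mismatch."""
    if loss == "bce":
        if output_units != 1:
            raise KynMLShapeError("bce requires exactly 1 output unit")
        return "float32"
    if loss in ("cross_entropy", "nll"):
        if output_units < 2:
            raise KynMLShapeError(f"{loss} requires at least 2 output units")
        return "int64"
    if loss in ("mse", "mae", "l1", "huber"):
        return "float32"  # regression losses accept any output width
    raise ValueError(f"unknown loss: {loss}")
```

The activation warning for softmax combined with cross_entropy is a separate, non-fatal check and is not modelled here.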
Optimizers
Optimizers are specified as function calls with keyword arguments.
adam(lr=<float>)
optimizer = adam(lr=0.001)
- lr: learning rate. Default: 0.001.
- Generates: optim.Adam(model.parameters(), lr=<lr>)
adamw(lr=<float>, weight_decay=<float>)
optimizer = adamw(lr=0.001, weight_decay=0.01)
- lr: learning rate. Default: 0.001.
- weight_decay: decoupled weight-decay coefficient (applied directly to the weights rather than added to the loss as an L2 penalty). Default: 0.01.
- Generates: optim.AdamW(model.parameters(), lr=<lr>, weight_decay=<weight_decay>)
sgd(lr=<float>, momentum=<float>)
optimizer = sgd(lr=0.01)
optimizer = sgd(lr=0.01, momentum=0.9)
- lr: learning rate. Default: 0.01.
- momentum: optional. If omitted, no momentum term is added.
- Generates: optim.SGD(model.parameters(), lr=<lr>[, momentum=<momentum>])
rmsprop(lr=<float>, momentum=<float>)
optimizer = rmsprop(lr=0.001, momentum=0.0)
- lr: learning rate. Default: 0.01.
- momentum: Default: 0.
- Generates: optim.RMSprop(model.parameters(), lr=<lr>, momentum=<momentum>)
Schedulers
Schedulers are specified as function calls. scheduler.step() is called once per epoch after the loss update.
step(step_size=<int>, gamma=<float>)
scheduler = step(step_size=10, gamma=0.1)
Decays LR by gamma every step_size epochs.
- step_size: default 10.
- gamma: default 0.1.
- Generates: StepLR(optimizer, step_size=<step_size>, gamma=<gamma>)
cosine(t_max=<int>)
scheduler = cosine(t_max=50)
Cosine annealing over t_max epochs.
- t_max: default 10.
- Generates: CosineAnnealingLR(optimizer, T_max=<t_max>)
onecycle(max_lr=<float>)
scheduler = onecycle(max_lr=0.01)
OneCycleLR policy. epochs is taken from the train block.
- max_lr: default 0.01.
- Generates: OneCycleLR(optimizer, max_lr=<max_lr>, epochs=EPOCHS, steps_per_epoch=1)
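To see what the step and cosine schedulers do to the learning rate, their closed forms can be evaluated without PyTorch. This sketch assumes one scheduler step per epoch, a minimum learning rate of 0 for cosine annealing (PyTorch's default eta_min), and epochs in the range 0 to t_max; step_lr and cosine_lr are hypothetical helper names.

```python
import math


def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate after `epoch` scheduler steps under step decay."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_lr(base_lr, epoch, t_max=10, eta_min=0.0):
    """Learning rate after `epoch` scheduler steps under cosine annealing."""
    cosine = (1 + math.cos(math.pi * epoch / t_max)) / 2
    return eta_min + (base_lr - eta_min) * cosine


if __name__ == "__main__":
    for epoch in (0, 5, 10, 20):
        print(epoch, step_lr(0.1, epoch), cosine_lr(0.1, min(epoch, 10)))
```

With step_size=10 and gamma=0.1, the rate drops by a factor of ten at epochs 10 and 20; with cosine annealing it falls smoothly from the base rate to eta_min at epoch t_max.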
Early Stopping
early_stop = early_stop(patience=5, mode="min")
Monitors training loss each epoch. If no improvement for patience consecutive epochs, training stops early with a printed message.
| Kwarg | Type | Default | Description |
|---|---|---|---|
| patience | int | 5 | Number of epochs without improvement before stopping. Must be > 0. |
| mode | string | "min" | "min" stops if loss stops decreasing. "max" stops if loss stops increasing. |
| metric | identifier | (none) | If provided, must be one of mae, mse, rmse, accuracy. Validated but not yet wired to evaluation metrics; currently tracks epoch_loss. |
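The patience logic can be sketched as a pure function over a list of per-epoch losses. This is an illustration, not the generated training loop: epochs_until_stop is a hypothetical name, and it assumes that any strict improvement over the best value so far resets the counter (no minimum delta).

```python
def epochs_until_stop(losses, patience=5, mode="min"):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = None
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(losses, start=1):
        if best is None:
            improved = True
        elif mode == "min":
            improved = loss < best
        else:
            improved = loss > best
        if improved:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None
```

For the loss sequence 1.0, 0.8, 0.9, 0.85, 0.81 with patience=3, the best value is reached at epoch 2 and the three following epochs fail to beat it, so the function returns 5.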
Checkpointing
checkpoint = checkpoint(every_n=5, path="checkpoints/ckpt.pt", async_save=false)
Saves model and optimizer state dicts to disk every every_n epochs. On the next train run with the same config, the checkpoint is loaded and training resumes from the saved epoch.
| Kwarg | Type | Default | Description |
|---|---|---|---|
| every_n | int | 1 | Save interval in epochs. Must be > 0. |
| path | string | "checkpoints/ckpt.pt" | Path for the checkpoint file. Parent directories are created automatically. |
| async_save | bool | false | If true, checkpoint writes are dispatched to a daemon thread, avoiding blocking the training loop. |
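The async_save behaviour can be approximated with the standard library. The sketch below is an assumption-heavy stand-in: it uses pickle in place of torch.save so that it runs without PyTorch, save_checkpoint is a hypothetical name, and it does not snapshot the state before handing it to the thread.

```python
import pickle
import threading
from pathlib import Path


def save_checkpoint(state: dict, path: str, async_save: bool = False):
    """Write state to path; return the writer thread if async, else None."""
    out_path = Path(path)
    out_path.parent.mkdir(parents=True, exist_ok=True)  # create parent directories

    def write():
        with open(out_path, "wb") as fh:
            pickle.dump(state, fh)  # stand-in for torch.save

    if not async_save:
        write()
        return None
    worker = threading.Thread(target=write, daemon=True)
    worker.start()
    return worker
```

Daemon threads are abandoned when the interpreter exits, so a caller using a pattern like this would typically join the last writer thread before the script ends.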
Full Example
train:
    model = HousePriceModel
    data = HouseData
    loss = mse
    optimizer = adamw(lr=0.0005, weight_decay=0.01)
    epochs = 100
    batch = 64
    device = auto
    precision = fp16
    compile = true
    scheduler = cosine(t_max=100)
    early_stop = early_stop(patience=10, mode="min")
    checkpoint = checkpoint(every_n=10, path="ckpts/run.pt", async_save=true)
evaluate Block
Computes metrics on the held-out test split after training. Optional.
Syntax
evaluate:
    metrics = [<metric>, ...]
Metrics
| Keyword | Description | Regression | Multiclass |
|---|---|---|---|
| mae | Mean Absolute Error | yes | yes (on class indices) |
| mse | Mean Squared Error | yes | yes (on class indices) |
| rmse | Root Mean Squared Error | yes | yes (on class indices) |
| accuracy | Fraction of correct predictions | yes (threshold 0.5) | yes (argmax) |
Any metric not in this set raises a KynMLSemanticError at validation time.
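For reference, the four metrics reduce to the following formulas. This is a plain-Python sketch rather than the generated evaluation code; the accuracy helper assumes predictions have already been converted to labels (by thresholding at 0.5 or taking the argmax).

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of predictions equal to the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```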
Example
evaluate:
    metrics = [mae, rmse, accuracy]
export Block
Serialises the trained model to disk. Optional.
Syntax
export:
    format = <format>
    path = "<output-path>"
    input_shape = [<int>, ...]   # required for onnx
    opset = <int>                # onnx only
Options
| Key | Type | Default | Description |
|---|---|---|---|
| format | identifier | required | Export format. One of torch, torchscript, onnx. |
| path | string | required | Output file path, resolved relative to cwd at compile time. |
| input_shape | list of ints | (none) | Required when format = onnx. Shape of the dummy input tensor, e.g. [1, 10]. |
| opset | int | 17 | ONNX opset version. Only used when format = onnx. |
Export Formats
torch
Saves model.state_dict() using torch.save. Smallest file; requires the model class to reconstruct.
export:
    format = torch
    path = "models/net.pt"
Generates: torch.save(model.state_dict(), path)
torchscript
Compiles the model with torch.jit.script (scripting, not tracing) and serialises the result. Self-contained; no Python class needed at inference time.
export:
    format = torchscript
    path = "models/net_scripted.pt"
Generates: torch.jit.script(model).save(path)
onnx
Exports to the Open Neural Network Exchange format. input_shape is mandatory. opset controls the ONNX opset version (default 17).
export:
    format = onnx
    path = "models/net.onnx"
    input_shape = [1, 10]
    opset = 17
Generates: torch.onnx.export(model, dummy, path, opset_version=opset, input_names=["input"], output_names=["output"])
Complete Program Example
# Multiclass iris classifier: most features demonstrated
dataset Iris:
    source = csv("data/iris.csv")
    target = "species"
    split = 0.8
    normalize = true
    shuffle = true
    num_workers = 2
    pin_memory = true
    prefetch = 4

model IrisNet:
    input 4
    dense 32 relu
    batchnorm
    dropout 0.2
    dense 16 gelu
    dense 3 linear    # raw logits; cross_entropy applies log-softmax internally

train:
    model = IrisNet
    data = Iris
    loss = cross_entropy
    optimizer = adamw(lr=0.001, weight_decay=0.01)
    epochs = 50
    batch = 16
    device = auto
    precision = fp16
    compile = true
    scheduler = cosine(t_max=50)
    early_stop = early_stop(patience=8, mode="min")
    checkpoint = checkpoint(every_n=5, path="ckpts/iris.pt")

evaluate:
    metrics = [accuracy]

export:
    format = onnx
    path = "models/iris.onnx"
    input_shape = [1, 4]
    opset = 17
Composition
Three opt-in features extend the base language for reuse and experimentation. Programs that use none of them are unaffected. Full details and runnable examples are in Composition.
import statements
Pull dataset and model blocks from another .kyn file. Must appear at the top of the file, before any block.
import "shared/base.kyn"
- Only dataset and model blocks are imported. train, evaluate, export, params, and sweep blocks in the imported file are ignored.
- Paths are resolved relative to the importing file.
- Circular imports and duplicate block names raise errors.
- Requires source_path to be provided when calling compile_to_ir.
params block
Declare named hyperparameters with defaults. Reference them with $name anywhere a value is expected.
params:
    lr = 0.001
    hidden = 64
    epochs = 20
model M:
    input 4
    dense $hidden relu
    dense 1 linear
train:
    model = M
    data = D
    loss = mse
    optimizer = adam(lr=$lr)
    epochs = $epochs
    batch = 32
- Values may be integers, floats, strings, booleans, or lists.
- $name references are substituted before semantic validation and codegen.
- An undefined $name (no params block and no CLI override) raises KynMLSemanticError.
- CLI overrides:
    kynml compile model.kyn -o out.py --param lr=0.01 --param hidden=128
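The substitution step can be pictured as a textual replacement of $name tokens, with CLI overrides taking precedence over params defaults. This is only a sketch: the real substitute_params may operate on parsed values rather than text, the function name substitute is hypothetical, it raises KeyError instead of KynMLSemanticError, and it is only exercised with numeric values here (Python's str(True) would not produce a valid KynML boolean).

```python
import re

# Matches $name where name follows the identifier rule (letter first).
PARAM_REF = re.compile(r"\$([A-Za-z][A-Za-z0-9_]*)")


def substitute(text, params, overrides=None):
    """Replace every $name in text using overrides first, then params."""
    merged = {**params, **(overrides or {})}

    def replace(match):
        name = match.group(1)
        if name not in merged:
            raise KeyError(f"undefined parameter: ${name}")
        return str(merged[name])

    return PARAM_REF.sub(replace, text)
```

With params {"lr": 0.001} and an override {"lr": 0.01}, the line optimizer = adam(lr=$lr) becomes optimizer = adam(lr=0.01).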
sweep block
Map parameter names to lists of values. kynml sweep expands the Cartesian product into one script per combination.
params:
    lr = 0.001
    hidden = 32
sweep:
    lr = [0.001, 0.01]
    hidden = [32, 64]
- Each axis value must be a non-empty list. A bare scalar raises KynMLParseError.
- expand_sweep returns the full Cartesian product: 2 × 2 = 4 combinations above.
- kynml sweep model.kyn generates per-combo scripts and writes sweep_results.json.
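The Cartesian expansion can be reproduced with itertools.product. The helper below is a hypothetical illustration named expand_axes, not the library's expand_sweep; its return shape (a list of dicts) and its use of ValueError are assumptions.

```python
from itertools import product


def expand_axes(axes: dict) -> list:
    """Return one {name: value} dict per combination of sweep axis values."""
    for name, values in axes.items():
        if not isinstance(values, list) or not values:
            raise ValueError(f"sweep axis {name!r} must be a non-empty list")
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


if __name__ == "__main__":
    for combo in expand_axes({"lr": [0.001, 0.01], "hidden": [32, 64]}):
        print(combo)
```

Two axes with two values each give four combinations; adding a third axis with three values would give twelve.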
Grammar Summary
program         ::= import_stmt* params_block? sweep_block?
                    (dataset_block | model_block)* train_block evaluate_block? export_block?
import_stmt     ::= "import" STRING NEWLINE
params_block    ::= "params" ":" NEWLINE (INDENT NAME "=" value NEWLINE)+
sweep_block     ::= "sweep" ":" NEWLINE (INDENT NAME "=" list NEWLINE)+
dataset_block   ::= "dataset" NAME ":" NEWLINE body
model_block     ::= "model" NAME ":" NEWLINE model_body
train_block     ::= "train" ":" NEWLINE body
evaluate_block  ::= "evaluate" ":" NEWLINE body
export_block    ::= "export" ":" NEWLINE body
body            ::= (INDENT assignment NEWLINE)+
model_body      ::= (INDENT model_layer NEWLINE)+
assignment      ::= NAME "=" value
value           ::= bool | integer | float | string | identifier | param_ref
                  | function_call | list
bool            ::= "true" | "false"
integer         ::= "-"? DIGIT+
float           ::= "-"? (DIGIT+ "." DIGIT* | DIGIT* "." DIGIT+)
string          ::= '"' chars '"'
identifier      ::= LETTER (LETTER | DIGIT | "_")*
param_ref       ::= "$" LETTER (LETTER | DIGIT | "_")*   -- composition only
function_call   ::= NAME "(" [arg ("," arg)*] ")"
arg             ::= value | NAME "=" value
list            ::= "[" [value ("," value)*] "]"
model_layer     ::= input_layer | dense_layer | dropout_layer | batchnorm_layer
input_layer     ::= "input" INTEGER
dense_layer     ::= "dense" INTEGER activation
dropout_layer   ::= "dropout" FLOAT
batchnorm_layer ::= "batchnorm" INTEGER?
activation      ::= "relu" | "leaky_relu" | "gelu" | "sigmoid" | "tanh"
                  | "softmax" | "log_softmax" | "linear"
Model body lines are parsed differently from assignment bodies — input, dense, dropout, and batchnorm are positional tokens, not key = value assignments.
Composition grammar notes:
- import_stmt lines appear before any block header.
- params_block and sweep_block appear before dataset / model blocks (enforced by convention; the parser accepts any order at the top level).
- param_ref ($name) is valid anywhere a value is expected. It is replaced by substitute_params before semantic validation.
- sweep_block axis values must be lists; bare scalars raise KynMLParseError.