Roadmap

What has shipped and what is next. Reconciled against docs/roadmap.md.

Shipped — Phase 0/1 (v0.3)

Typed IR, shape inference, composition (params/import/sweep), reproducibility, formatter, and LSP diagnostics.

Typed IR + Shape Inference

New kynml/ir/ package: types.py (TypeShape, DType), nodes.py (IRModule, IRGraph, IROp hierarchy), builder.py (AST → IR lowering), infer.py (shape/type inference pass), passes.py (pass registry)
in_features on LinearOp and num_features on BatchNorm1dOp are auto-inferred — users never specify them
Shape mismatches raise KynMLShapeError with source-located messages at compile/train time
Loss/output agreement enforced: bce requires exactly 1 output unit; cross_entropy/nll require ≥2 output units
Activation footgun detection: pairing cross_entropy with a softmax or log_softmax activation emits a warning (not an error)
compile_to_ir() in pipeline.py is the canonical front door for backends and tooling
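
The inference and loss/output rules above can be illustrated with a minimal standalone sketch. It does not use the kynml/ir/ package: the Linear dataclass, infer_shapes(), check_loss(), and ShapeError are hypothetical stand-ins invented for this example, and the real pass may be structured differently.

```python
from dataclasses import dataclass
from typing import List, Optional


class ShapeError(Exception):
    """Hypothetical stand-in for KynMLShapeError."""


@dataclass
class Linear:
    out_features: int
    in_features: Optional[int] = None  # left unset by the user, filled in below


def infer_shapes(input_width: int, layers: List[Linear]) -> int:
    """Propagate the feature width through the layers; return the output width."""
    width = input_width
    for layer in layers:
        layer.in_features = width  # inferred from the previous layer's output
        width = layer.out_features
    return width


def check_loss(loss: str, output_units: int) -> None:
    """Enforce the loss/output agreement rules listed above."""
    if loss == "bce" and output_units != 1:
        raise ShapeError(f"bce requires exactly 1 output unit, got {output_units}")
    if loss in ("cross_entropy", "nll") and output_units < 2:
        raise ShapeError(f"{loss} requires at least 2 output units, got {output_units}")


layers = [Linear(16), Linear(3)]
out_units = infer_shapes(8, layers)
check_loss("cross_entropy", out_units)  # passes: 3 output units
```

Running inference before the loss check means the check only ever sees the final, fully resolved output width.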

Backend Seam

kynml/codegen/base.py: Backend ABC + EmitContext + get_backend()
kynml/codegen/pytorch_backend.py: PyTorchBackend reads exclusively from IRModule (post-inference); never touches the AST
kynml/codegen/pytorch.py is now a compatibility shim with unchanged public API
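
As a rough picture of what a backend seam of this kind looks like, the sketch below pairs an abstract base class with a name-based lookup. Only the names Backend and get_backend come from the list above; their signatures, the register() decorator, the dict-shaped IR, and EchoBackend are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class Backend(ABC):
    """Assumed interface: turn an IR module into generated source text."""

    name: str

    @abstractmethod
    def emit(self, ir_module: dict) -> str:
        ...


_REGISTRY: Dict[str, Type[Backend]] = {}


def register(cls: Type[Backend]) -> Type[Backend]:
    """Hypothetical decorator that records a backend class under its name."""
    _REGISTRY[cls.name] = cls
    return cls


def get_backend(name: str) -> Backend:
    """Look up a backend by name and instantiate it."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(
            f"unknown backend {name!r}; available: {sorted(_REGISTRY)}"
        ) from None


@register
class EchoBackend(Backend):
    """Toy backend that lists the ops it was given, one comment per op."""

    name = "echo"

    def emit(self, ir_module: dict) -> str:
        return "\n".join(f"# op: {op}" for op in ir_module["ops"])


source = get_backend("echo").emit({"ops": ["linear", "relu"]})
```

Because the backend only receives the IR, a second backend can be added without touching the parser or the AST.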

Composition

params: block — named values, referenced via $name inside any block
import "other.kyn" — merges dataset and model blocks; cycle detection; duplicate-name errors
sweep: block — grid axes; expand_sweep() returns Cartesian product; kynml sweep generates per-combo scripts and a sweep orchestrator that writes sweep_results.json
CLI --param key=value (repeatable) overrides params block defaults at compile/run time
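
The grid expansion and the key=value override can be sketched with the standard library alone. This is not the expand_sweep() implementation: expand_grid() and apply_overrides() are hypothetical helpers, and the sketch assumes overrides stay strings, which may not match how KynML coerces values.

```python
from itertools import product
from typing import Any, Dict, List


def expand_grid(axes: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per point in the Cartesian product of the axes."""
    names = list(axes)
    return [
        dict(zip(names, combo))
        for combo in product(*(axes[name] for name in names))
    ]


def apply_overrides(defaults: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Merge repeatable key=value strings over a dict of default params."""
    merged = dict(defaults)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key] = value  # values are kept as strings in this sketch
    return merged


combos = expand_grid({"lr": [0.01, 0.001], "hidden": [32, 64]})
params = apply_overrides({"lr": "0.01", "epochs": "10"}, ["lr=0.1"])
```

Two axes with two values each produce four combinations, which is the number of per-combo scripts a grid of that shape would need.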

Reproducibility

seed and deterministic fields on train: block
Generated scripts emit CONFIG_HASH (sha256 of the .kyn source), _set_seed(), and always write run_manifest.json
kynml/repro/manifest.py: RunManifest, config_hash(), data_hash(), write_manifest()
kynml/lock.py: create_lock(), check_lock(), LockMismatchError
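
A minimal sketch of the hashing and manifest idea, using only the standard library. The function names config_hash() and write_manifest() appear in the list above, but the signatures, the manifest fields, and set_seed() shown here are assumptions; the generated scripts presumably seed PyTorch as well, which this sketch omits.

```python
import hashlib
import json
import random
from pathlib import Path


def config_hash(source: str) -> str:
    """SHA-256 hex digest of the .kyn source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def set_seed(seed: int) -> None:
    """Seed Python's RNG; a real training script would also seed NumPy and torch."""
    random.seed(seed)


def write_manifest(path: Path, source: str, seed: int) -> dict:
    """Write a small JSON manifest next to the run and return its contents."""
    manifest = {"config_hash": config_hash(source), "seed": seed}
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8")
    return manifest
```

Hashing the source text rather than the parsed program means any edit, including whitespace, produces a different hash; whether that is the intended granularity is a design choice the list above does not state.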

Formatter

kynml fmt <file> [--write] [--check] — idempotent canonical form
python -m kynml.format <file> — module entry point
from kynml.format.formatter import format_source, format_file

LSP Diagnostics

kynml lsp — starts LSP server over stdio (requires pip install 'kynml[lsp]')
python -m kynml.lsp — module entry point
from kynml.lsp.diagnostics import diagnose — pure diagnostics API, no pygls required

New CLI subcommands

kynml sweep — expand + run all parameter combinations
kynml compare — render the checked-in competitive map as table, Markdown, or JSON
kynml fmt — format source
kynml lsp — start LSP server

Shipped — Phase 1 (v0.1 → v0.2)

The core compiler and all major CONNECT integrations are live.

Compiler core

Handwritten indentation-aware parser for all five block types: dataset, model, train, evaluate, export
Typed frozen-dataclass AST (ast_nodes.py)
Semantic validator with difflib typo suggestions
PyTorch codegen: complete, self-contained training scripts; the generated output never imports kynml
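
The typo suggestions mentioned above can be approximated with difflib.get_close_matches from the standard library. The helper names, the message format, and the 0.6 cutoff below are assumptions for illustration, not the validator's actual code.

```python
import difflib
from typing import List, Optional

# Loss names taken from the Training section of this roadmap.
KNOWN_LOSSES = ["mse", "bce", "cross_entropy", "huber", "l1", "mae", "nll"]


def suggest(word: str, known: List[str]) -> Optional[str]:
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(word, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


def unknown_name_message(kind: str, word: str, known: List[str]) -> str:
    """Build an error message, adding a hint when a close match exists."""
    hint = suggest(word, known)
    base = f"unknown {kind} {word!r}"
    return f"{base}; did you mean {hint!r}?" if hint else base


message = unknown_name_message("loss", "cross_entrpy", KNOWN_LOSSES)
```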

Dataset

CSV sources via csv("path")
normalize, shuffle, split, num_workers, pin_memory, prefetch
Multiclass vs regression detected from loss; targets cast to long or float32 accordingly

Model layers

input N, dense N <act>, dropout P, batchnorm [N]
Activations: relu, leaky_relu, gelu, sigmoid, tanh, softmax, log_softmax, linear

Training

Losses: mse, bce, cross_entropy, huber, l1 / mae, nll
Optimizers: adam(lr), adamw(lr, weight_decay), sgd(lr, momentum), rmsprop(lr, momentum)
Schedulers: step(step_size, gamma), cosine(t_max), onecycle(max_lr)
Early stopping: early_stop(patience, mode)
Checkpointing: checkpoint(every_n, path, async_save) with resume on restart
Mixed precision: precision = fp16 | bf16 via torch.amp.autocast + GradScaler
Graph compilation: compile = true → torch.compile(model)
Device: auto, cpu, cuda
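
For early_stop(patience, mode), one common interpretation is a counter of consecutive epochs without improvement. The sketch below assumes mode is "min" or "max" and that any strict improvement resets the counter; the EarlyStopper class is hypothetical and the generated scripts may implement this differently.

```python
from typing import Optional


class EarlyStopper:
    """Stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int, mode: str = "min") -> None:
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.mode = mode
        self.best: Optional[float] = None
        self.bad_epochs = 0

    def step(self, value: float) -> bool:
        """Record one epoch's metric; return True when training should stop."""
        improved = (
            self.best is None
            or (self.mode == "min" and value < self.best)
            or (self.mode == "max" and value > self.best)
        )
        if improved:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopper(patience=2, mode="min")
decisions = [stopper.step(loss) for loss in [1.0, 0.9, 0.95, 0.93]]
```

In this run the loss improves twice, then fails to beat 0.9 for two epochs in a row, so the fourth call signals a stop.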

Evaluate

Metrics: mae, mse, rmse, accuracy
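
For reference, the four metrics have the standard definitions sketched below in plain Python. This is not the code the evaluate block generates, which presumably operates on tensors; the function signatures are assumptions.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```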

Export

torch (state dict .pt), onnx (configurable opset, default 17), torchscript

CLI

validate, ast, compile, train

CONNECT — Serving

kynml.serving.generate_service(program, model_path, out_dir) emits FastAPI app + Dockerfile
Optional extra: pip install 'kynml[serving]'

CONNECT — MCP

python -m kynml.mcp.server — stdio MCP server
Tools: kynml_validate, kynml_compile, kynml_train, kynml_predict
Optional extra: pip install 'kynml[mcp]'

CONNECT — Integrations

kynml.integrations.huggingface.load_hf_dataset()
kynml.integrations.objectstore.load_remote()
Optional extras: pip install 'kynml[hf]' / 'kynml[objectstore]'

Site

Static site at kynml.kynetra.dev on Cloudflare Pages
Competitive map section comparing KynML's compiler layer against Lightning, Keras, Ludwig, and ZenML

Phase 2 — Image and Text

Image datasets via PIL/torchvision loaders
CNN layers: conv2d, maxpool, flatten
Text datasets: tokenizer integration
Transformer blocks: attention, feedforward
Richer parser diagnostics with exact source spans
Dataset schema validation before training starts
First-class huggingface(...) and objectstore(...) source keywords in the .kyn syntax

Phase 3 — Production Tooling

Model registry with versioning
Experiment tracking integration (MLflow, W&B)
Generated script templates with stable extension hooks for post-processing
KynML package registry for community models and layer presets

Phase 4 — Scale

JAX backend
MLIR backend
Distributed training (DDP)
Cloud training job orchestration
Cloudflare-native training job dispatch (Workers AI, GPU Workers)

Phase 5 — Platform

KynML Studio — visual .kyn editor
AutoML layer — hyperparameter search
SaaS deployment and hosted training
Interactive compiler playground in the browser (Cloudflare Worker; compilation does not require PyTorch)
API generation from .kyn specs (FastAPI scaffolding beyond the current serving module)