Roadmap

What has shipped, what is next. Reconciled against docs/roadmap.md.

Shipped — Phase 0/1 (v0.3)

Typed IR, shape inference, composition (params/import/sweep), reproducibility, formatter, and LSP diagnostics.

Typed IR + Shape Inference

  • New kynml/ir/ package: types.py (TypeShape, DType), nodes.py (IRModule, IRGraph, IROp hierarchy), builder.py (AST → IR lowering), infer.py (shape/type inference pass), passes.py (pass registry)
  • in_features on LinearOp and num_features on BatchNorm1dOp are auto-inferred — users never specify them
  • Shape mismatches raise KynMLShapeError with source-located messages at compile/train time
  • Loss/output agreement enforced: bce requires exactly 1 output unit; cross_entropy/nll require ≥2 output units
  • Activation footgun detection: cross_entropy + softmax or log_softmax emits a warning (not an error)
  • compile_to_ir() in pipeline.py is the canonical front door for backends and tooling
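
The inference and agreement rules above can be illustrated with a small standalone sketch. It does not use the real kynml.ir classes: Dense, ShapeError, infer_in_features, and check_loss_agreement are hypothetical names chosen for illustration, and the source locations that KynMLShapeError carries are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


class ShapeError(Exception):
    """Stand-in for KynMLShapeError (hypothetical; no source location here)."""


@dataclass
class Dense:
    out_features: int
    in_features: Optional[int] = None  # filled in by the inference pass


def infer_in_features(input_size: int, layers: List[Dense]) -> List[Dense]:
    """Propagate the feature width forward so callers never set in_features."""
    width = input_size
    for layer in layers:
        if layer.in_features is not None and layer.in_features != width:
            raise ShapeError(
                f"expected {width} input features, got {layer.in_features}"
            )
        layer.in_features = width
        width = layer.out_features
    return layers


def check_loss_agreement(loss: str, out_units: int) -> None:
    """Mirror the listed rules: bce needs 1 unit, cross_entropy/nll need >= 2."""
    if loss == "bce" and out_units != 1:
        raise ShapeError(f"bce requires exactly 1 output unit, got {out_units}")
    if loss in ("cross_entropy", "nll") and out_units < 2:
        raise ShapeError(f"{loss} requires at least 2 output units, got {out_units}")


# input 4 -> dense 16 -> dense 3, trained with cross_entropy
layers = infer_in_features(4, [Dense(16), Dense(3)])
check_loss_agreement("cross_entropy", layers[-1].out_features)
```

Running both checks before any script is emitted is what lets a mismatch surface at compile time rather than partway through training.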

Backend Seam

  • kynml/codegen/base.py: Backend ABC + EmitContext + get_backend()
  • kynml/codegen/pytorch_backend.py: PyTorchBackend reads exclusively from IRModule (post-inference); never touches the AST
  • kynml/codegen/pytorch.py is now a compatibility shim with unchanged public API
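
One way to picture the seam is an abstract base class plus a name-based lookup. The sketch below reuses the names Backend, EmitContext, and get_backend from the list above, but the method signatures, the EmitContext field, the register decorator, and the dict-based IR are all assumptions made for illustration, not the actual kynml.codegen API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Type


@dataclass
class EmitContext:
    """Hypothetical context; the real EmitContext fields are not listed here."""
    source_path: str


class Backend(ABC):
    @abstractmethod
    def emit(self, ir_module: dict, ctx: EmitContext) -> str:
        """Turn an already-inferred IR module into target source code."""


_REGISTRY: Dict[str, Type[Backend]] = {}


def register(name: str):
    """Class decorator that records a backend under a lookup name."""
    def wrap(cls: Type[Backend]) -> Type[Backend]:
        _REGISTRY[name] = cls
        return cls
    return wrap


def get_backend(name: str) -> Backend:
    try:
        return _REGISTRY[name]()
    except KeyError:
        known = sorted(_REGISTRY)
        raise ValueError(f"unknown backend {name!r}; known: {known}") from None


@register("toy")
class ToyBackend(Backend):
    def emit(self, ir_module: dict, ctx: EmitContext) -> str:
        # Reads only the IR dict, never an AST, matching the seam described above.
        lines = [f"# generated from {ctx.source_path}"]
        lines += [f"# op: {op}" for op in ir_module["ops"]]
        return "\n".join(lines)


script = get_backend("toy").emit({"ops": ["linear", "relu"]}, EmitContext("model.kyn"))
```

Keeping the backend's only input at the IR level is what would let a second backend be added without touching the parser or validator.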

Composition

  • params: block — named values, referenced via $name inside any block
  • import "other.kyn" — merges dataset and model blocks; cycle detection; duplicate-name errors
  • sweep: block — grid axes; expand_sweep() returns Cartesian product; kynml sweep generates per-combo scripts and a sweep orchestrator that writes sweep_results.json
  • CLI --param key=value (repeatable) overrides params block defaults at compile/run time
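
The grid expansion and the override behaviour can be sketched with the standard library alone. expand_grid and apply_overrides are hypothetical helpers, not the real expand_sweep() or CLI parser; in particular, keeping override values as strings is an assumption of this sketch.

```python
import itertools
from typing import Any, Dict, List


def expand_grid(axes: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per combination, in Cartesian-product order of the axes."""
    names = list(axes)
    value_lists = [axes[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def apply_overrides(defaults: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply repeated key=value strings on top of the params defaults."""
    merged = dict(defaults)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key] = value  # value stays a string in this sketch
    return merged


# Two axes with two values each give four combinations.
combos = expand_grid({"lr": [0.01, 0.001], "hidden": [32, 64]})

# A later override replaces the default for the same key.
params = apply_overrides({"lr": "0.01", "epochs": "10"}, ["lr=0.1"])
```

The number of combinations is the product of the axis lengths, so a sweep with several axes grows quickly.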

Reproducibility

  • seed and deterministic fields on train: block
  • Generated scripts emit CONFIG_HASH (sha256 of the .kyn source), _set_seed(), and always write run_manifest.json
  • kynml/repro/manifest.py: RunManifest, config_hash(), data_hash(), write_manifest()
  • kynml/lock.py: create_lock(), check_lock(), LockMismatchError
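
A minimal sketch of the hashing and seeding idea, using only the standard library. The function names and the two manifest fields are hypothetical; the real RunManifest, config_hash(), and _set_seed() may record and seed more than is shown here (a generated PyTorch script would presumably also seed torch).

```python
import hashlib
import json
import random


def config_hash_sketch(source_text: str) -> str:
    """SHA-256 hex digest of the .kyn source text, encoded as UTF-8."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def set_seed_sketch(seed: int) -> None:
    """Seed Python's RNG only; other libraries are out of scope for this sketch."""
    random.seed(seed)


def manifest_sketch(source_text: str, seed: int) -> str:
    """Serialize a minimal manifest with stable key order."""
    payload = {"config_hash": config_hash_sketch(source_text), "seed": seed}
    return json.dumps(payload, sort_keys=True)


source = "train:\n  seed = 7\n"
digest = config_hash_sketch(source)

set_seed_sketch(7)
first_draw = random.random()
set_seed_sketch(7)
second_draw = random.random()  # same seed, same draw
```

Because the hash is taken over the source text, any edit to the .kyn file, including whitespace, yields a different digest.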

Formatter

  • kynml fmt <file> [--write] [--check] — idempotent canonical form
  • python -m kynml.format <file> — module entry point
  • from kynml.format.formatter import format_source, format_file
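
Idempotence means that formatting already-formatted text changes nothing. The toy formatter below is not the KynML formatter; format_text and needs_formatting are hypothetical, and the canonical form here (trimmed trailing whitespace, collapsed blank lines, one final newline) is an assumption chosen only to demonstrate the property and a --check style comparison.

```python
def format_text(source: str) -> str:
    """Toy canonical form: trim line ends, collapse blank runs, end with one newline."""
    out = []
    for line in source.splitlines():
        line = line.rstrip()
        if line == "" and (not out or out[-1] == ""):
            continue  # drop leading blank lines and repeated blank lines
        out.append(line)
    while out and out[-1] == "":
        out.pop()  # drop trailing blank lines
    return ("\n".join(out) + "\n") if out else ""


def needs_formatting(source: str) -> bool:
    """What a check mode would report: True when formatting would change the text."""
    return format_text(source) != source


messy = "model:   \n\n\n  dense 16 relu  \n\n"
clean = format_text(messy)
```

A check mode built this way can run in CI without writing files: it only compares the formatter's output to the input.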

LSP Diagnostics

  • kynml lsp — starts LSP server over stdio (requires pip install 'kynml[lsp]')
  • python -m kynml.lsp — module entry point
  • from kynml.lsp.diagnostics import diagnose — pure diagnostics API, no pygls required

New CLI subcommands

  • kynml sweep — expand + run all parameter combinations
  • kynml compare — render the checked-in competitive map as table, Markdown, or JSON
  • kynml fmt — format source
  • kynml lsp — start LSP server

Shipped — Phase 1 (v0.1 → v0.2)

The core compiler and all major CONNECT integrations are live.

Compiler core

  • Handwritten indentation-aware parser for all five block types: dataset, model, train, evaluate, export
  • Typed frozen-dataclass AST (ast_nodes.py)
  • Semantic validator with difflib typo suggestions
  • PyTorch codegen: complete self-contained training scripts, no import kynml in the output
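
Typo suggestions of this kind can be built on difflib.get_close_matches from the standard library. The sketch below is not the validator's actual code: suggest and unknown_name_message are hypothetical helpers, and the 0.6 cutoff is simply difflib's default.

```python
import difflib
from typing import List, Optional

KNOWN_ACTIVATIONS = [
    "relu", "leaky_relu", "gelu", "sigmoid",
    "tanh", "softmax", "log_softmax", "linear",
]


def suggest(name: str, known: List[str]) -> Optional[str]:
    """Return the closest known name, or None when nothing is similar enough."""
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


def unknown_name_message(kind: str, name: str, known: List[str]) -> str:
    """Build an error message, adding a hint only when a close match exists."""
    message = f"unknown {kind} {name!r}"
    hint = suggest(name, known)
    if hint is not None:
        message += f"; did you mean {hint!r}?"
    return message


msg = unknown_name_message("activation", "rellu", KNOWN_ACTIVATIONS)
```

Returning no hint for names that are not close to anything avoids misleading suggestions for entirely unrelated input.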

Dataset

  • CSV sources via csv("path")
  • normalize, shuffle, split, num_workers, pin_memory, prefetch
  • Multiclass vs regression detected from loss; targets cast to long or float32 accordingly

Model layers

  • input N, dense N <act>, dropout P, batchnorm [N]
  • Activations: relu, leaky_relu, gelu, sigmoid, tanh, softmax, log_softmax, linear

Training

  • Losses: mse, bce, cross_entropy, huber, l1 / mae, nll
  • Optimizers: adam(lr), adamw(lr, weight_decay), sgd(lr, momentum), rmsprop(lr, momentum)
  • Schedulers: step(step_size, gamma), cosine(t_max), onecycle(max_lr)
  • Early stopping: early_stop(patience, mode)
  • Checkpointing: checkpoint(every_n, path, async_save) with resume on restart
  • Mixed precision: precision = fp16 | bf16 via torch.amp.autocast + GradScaler
  • Graph compilation: compile = true → torch.compile(model)
  • Device: auto, cpu, cuda
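
As an illustration of the early_stop(patience, mode) option, here is a framework-free sketch of patience-based early stopping. The EarlyStopper class is hypothetical and is not the code KynML generates; the mode values "min" and "max" are an assumption about what the option accepts.

```python
from typing import Optional


class EarlyStopper:
    """Stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int, mode: str = "min") -> None:
        if mode not in ("min", "max"):
            raise ValueError(f"mode must be 'min' or 'max', got {mode!r}")
        self.patience = patience
        self.mode = mode
        self.best: Optional[float] = None
        self.bad_epochs = 0

    def step(self, metric: float) -> bool:
        """Record one epoch's metric; return True when training should stop."""
        if self.best is None:
            improved = True
        elif self.mode == "min":
            improved = metric < self.best
        else:
            improved = metric > self.best
        if improved:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Validation loss improves twice, then stalls for two epochs.
stopper = EarlyStopper(patience=2, mode="min")
losses = [1.0, 0.9, 0.95, 0.97, 0.5]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

With patience 2, the loop stops at the second non-improving epoch and never sees the later, better value, which is the usual trade-off of a small patience.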

Evaluate

  • Metrics: mae, mse, rmse, accuracy

Export

  • torch (state dict .pt), onnx (configurable opset, default 17), torchscript

CLI

  • validate, ast, compile, train

CONNECT — Serving

  • kynml.serving.generate_service(program, model_path, out_dir) emits FastAPI app + Dockerfile
  • Optional extra: pip install 'kynml[serving]'

CONNECT — MCP

  • python -m kynml.mcp.server — stdio MCP server
  • Tools: kynml_validate, kynml_compile, kynml_train, kynml_predict
  • Optional extra: pip install 'kynml[mcp]'

CONNECT — Integrations

  • kynml.integrations.huggingface.load_hf_dataset()
  • kynml.integrations.objectstore.load_remote()
  • Optional extras: pip install 'kynml[hf]' / 'kynml[objectstore]'

Site

  • Static site at kynml.kynetra.dev on Cloudflare Pages
  • Competitive map section comparing KynML's compiler layer against Lightning, Keras, Ludwig, and ZenML

Phase 2 — Image and Text

  • Image datasets via PIL/torchvision loaders
  • CNN layers: conv2d, maxpool, flatten
  • Text datasets: tokenizer integration
  • Transformer blocks: attention, feedforward
  • Richer parser diagnostics with exact source spans
  • Dataset schema validation before training starts
  • First-class huggingface(...) and objectstore(...) source keywords in the .kyn syntax

Phase 3 — Production Tooling

  • Model registry with versioning
  • Experiment tracking integration (MLflow, W&B)
  • Generated script templates with stable extension hooks for post-processing
  • KynML package registry for community models and layer presets

Phase 4 — Scale

  • JAX backend
  • MLIR backend
  • Distributed training (DDP)
  • Cloud training job orchestration
  • Cloudflare-native training job dispatch (Workers AI, GPU Workers)

Phase 5 — Platform

  • KynML Studio — visual .kyn editor
  • AutoML layer — hyperparameter search
  • SaaS deployment and hosted training
  • Interactive compiler playground in the browser (Cloudflare Worker, compile-time PyTorch-free)
  • API generation from .kyn specs (FastAPI scaffolding beyond the current serving module)
