Tooling

Developer tools shipped with KynML: the canonical formatter (kynml fmt) and the diagnostics / Language Server Protocol integration (kynml lsp).


Formatter

The formatter parses .kyn source into the typed AST and re-emits it in canonical form. Because it round-trips through the parser, it validates syntax as a side effect and normalises equivalent representations. It works on the AST, not on raw text.

Canonical rules

  • 4-space indentation for block bodies.
  • Exactly one blank line between top-level blocks.
  • No trailing whitespace; single trailing newline.
  • Canonical key order per block type (e.g. dataset: source, target, split, normalize, …; train: model, data, loss, optimizer, epochs, batch, …).
  • Normalised value spacing: dense 64 relu, optimizer = adam(lr=0.001).
  • $name parameter references survive the round trip unchanged.
  • Idempotent: format_source(format_source(x)) == format_source(x).
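The idempotence rule lends itself to a property-style test. The sketch below is a generic helper that accepts any string-to-string formatter. `toy_format` is a hypothetical stand-in that only enforces the whitespace rules, so the example runs without KynML installed; in a real test suite one would presumably pass `format_source` instead.

```python
from typing import Callable


def is_idempotent(fmt: Callable[[str], str], text: str) -> bool:
    """Return True when formatting already-formatted output changes nothing."""
    once = fmt(text)
    return fmt(once) == once


def toy_format(text: str) -> str:
    """Hypothetical stand-in, NOT the KynML formatter.

    Strips trailing whitespace and enforces a single trailing newline.
    """
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop trailing empty lines so exactly one newline ends the output.
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines) + "\n"


samples = ["train:   \n    epochs=20  \n\n\n", "", "model M:\n    input 4"]
assert all(is_idempotent(toy_format, s) for s in samples)
```

Checking `fmt(fmt(x)) == fmt(x)` rather than `fmt(x) == x` matters: the input is allowed to change on the first pass, but never on the second.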

Default-valued fields are omitted from the output (e.g. split = 0.8 is dropped because 0.8 is the default). Non-default fields are always emitted.
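To make the interaction between key ordering and default omission concrete, here is a minimal sketch. `KEY_ORDER`, `DEFAULTS`, and `emit_fields` are hypothetical names, and values are compared as plain strings for simplicity; the real formatter works on the typed AST and its internal tables are not shown in this document.

```python
# Hypothetical tables for illustration only.
KEY_ORDER = ["source", "target", "split", "normalize"]  # documented dataset key prefix
DEFAULTS = {"split": "0.8"}  # split = 0.8 is the documented default


def emit_fields(fields: dict[str, str]) -> list[str]:
    """Emit `key = value` lines in canonical order, skipping default values."""
    known = [k for k in KEY_ORDER if k in fields]
    # Assumption: keys missing from the order table keep their input order at the end.
    extra = [k for k in fields if k not in KEY_ORDER]
    return [
        f"    {key} = {fields[key]}"
        for key in known + extra
        if DEFAULTS.get(key) != fields[key]
    ]


block = {"split": "0.8", "target": '"y"', "source": 'csv("data/x.csv")'}
print("\n".join(emit_fields(block)))
```

With the input above, `split` is dropped and `source` is emitted before `target`, regardless of the order in which the fields were written.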

CLI

# Print canonical form to stdout
kynml fmt specs/model.kyn

# Overwrite in place
kynml fmt specs/model.kyn --write

# CI gate: exit 1 if the file would change
kynml fmt specs/model.kyn --check

Full flag reference in CLI-Reference § fmt.
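For a CI gate over several files, the --check exit status can be collected from a small wrapper script. This is a sketch, not part of KynML: `files_needing_format` and `report` are hypothetical helpers, and treating any exit status other than 0 or 1 as a hard failure is an assumption, since only "exit 1 if the file would change" is documented here.

```python
import subprocess
import sys
from typing import Sequence


def files_needing_format(
    paths: Sequence[str],
    command: Sequence[str] = ("kynml", "fmt"),
) -> list[str]:
    """Return the paths for which `<command> <path> --check` exits with status 1."""
    stale = []
    for path in paths:
        result = subprocess.run(
            [*command, path, "--check"], capture_output=True, text=True
        )
        if result.returncode == 1:
            stale.append(path)
        elif result.returncode != 0:
            # Assumption: other non-zero statuses are failures, not "would change".
            raise RuntimeError(
                f"{path}: formatter exited with {result.returncode}: "
                f"{result.stderr.strip()}"
            )
    return stale


def report(stale: Sequence[str]) -> int:
    """Print the stale files to stderr and return a CI-style exit status."""
    for path in stale:
        print(f"would reformat: {path}", file=sys.stderr)
    return 1 if stale else 0
```

A CI job could then call `sys.exit(report(files_needing_format(paths)))` with whatever list of .kyn files the project tracks.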

Module entrypoint

# Equivalent to kynml fmt (without --check)
python -m kynml.format specs/model.kyn
python -m kynml.format specs/model.kyn --write

Python API

from kynml.format.formatter import format_source, format_file

# Format a string
canonical: str = format_source(raw_kyn_text)
canonical: str = format_source(raw_kyn_text, source_name="specs/model.kyn")

# Format a file; write=True overwrites it
canonical: str = format_file("specs/model.kyn")
canonical: str = format_file("specs/model.kyn", write=True)

format_source raises KynMLParseError on invalid input (same exception the parser raises). It never raises for semantically invalid but syntactically correct source — the formatter only requires a valid parse tree.
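When reviewing what the formatter would change, a unified diff is often more useful than the full canonical text. The sketch below builds one with the standard library. `format_diff` is a hypothetical helper that accepts any formatter callable, and `spacing_only_format` is a stand-in that only normalises spacing around the first `=` so the example runs without KynML; passing `format_source` instead should work the same way, with parse errors propagating to the caller.

```python
import difflib
from typing import Callable


def format_diff(text: str, fmt: Callable[[str], str], name: str = "<input>") -> str:
    """Return a unified diff from `text` to `fmt(text)`; empty if already canonical."""
    formatted = fmt(text)
    diff = difflib.unified_diff(
        text.splitlines(keepends=True),
        formatted.splitlines(keepends=True),
        fromfile=name,
        tofile=f"{name} (formatted)",
    )
    return "".join(diff)


def spacing_only_format(text: str) -> str:
    """Hypothetical stand-in, NOT the KynML formatter: `key=value` -> `key = value`."""
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        out.append(f"{key.rstrip()} = {value.lstrip()}" if sep else line)
    return "\n".join(out) + "\n"


print(format_diff("train:\n    epochs=20\n", spacing_only_format, name="model.kyn"))
```

An empty return value plays the same role as a zero exit status from --check: nothing would change.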

Example

Before formatting:

dataset  D:
    source=csv("data/x.csv")
    target="y"
    split=0.8
    shuffle=true

model M:
  input 4
  dense 64  relu
  dense 1 linear
train:
    model=M
    data=D
    loss=mse
    optimizer=adam(lr=0.001)
    epochs=20
    batch=32

After kynml fmt:

dataset D:
    source = csv("data/x.csv")
    target = "y"

model M:
    input 4
    dense 64 relu
    dense 1 linear

train:
    model = M
    data = D
    loss = mse
    optimizer = adam(lr=0.001)
    epochs = 20
    batch = 32

split and shuffle were at their defaults and are dropped. Indentation and spacing are normalised.


Diagnostics and LSP

diagnose() — pure diagnostics function

diagnose runs the full pipeline (parse → semantic validation → IR lowering → shape inference) on raw source text and returns a list of structured diagnostic dicts. It never raises; all errors are caught and converted.

from pathlib import Path

from kynml.lsp.diagnostics import diagnose

# Path.read_text closes the file handle once the text has been read
source = Path("specs/model.kyn").read_text()
diags = diagnose(source, source_name="specs/model.kyn")

for d in diags:
    print(f"{d['line']}:{d['col']}: [{d['severity']}] {d['message']}  ({d['code']})")

diagnose does not require pygls. It is always importable.

Diagnostic structure

Each entry is a TypedDict:

from typing import TypedDict

class Diagnostic(TypedDict):
    line:     int   # 1-based line number (best-effort from error message)
    col:      int   # 1-based column; 0 when unknown
    end_line: int   # same as line when no span available
    end_col:  int   # 0 when unknown
    severity: str   # "error" | "warning"
    message:  str   # human-readable description
    code:     str   # "parse" | "semantic" | "shape" | "warn"

code values

code     | Stage                                                             | severity
---------|-------------------------------------------------------------------|----------
parse    | Parser (syntax error)                                             | "error"
semantic | Semantic validation                                               | "error"
shape    | IR shape inference (dimension mismatch, loss/output mismatch)     | "error"
warn     | IR inference warnings (e.g. cross_entropy + softmax redundancy)   | "warning"

Example output

For a spec with a shape mismatch:

[
  {
    "line": 1,
    "col": 0,
    "end_line": 1,
    "end_col": 0,
    "severity": "error",
    "code": "shape",
    "message": "Loss 'bce' requires exactly 1 output unit; model 'M' has 3."
  }
]

For a valid spec with a redundant activation warning:

[
  {
    "line": 1,
    "col": 0,
    "end_line": 1,
    "end_col": 0,
    "severity": "warning",
    "code": "warn",
    "message": "cross_entropy loss with softmax/log_softmax output: softmax is applied internally by cross_entropy; consider using 'linear' activation on the final layer."
  }
]

An empty list means the source is clean.
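Because the result is a plain list of dicts, gating a build on it takes only a few lines. The following sketch re-declares the documented structure locally so it runs standalone; `summarise`, `exit_status`, and the `warnings_as_errors` option are hypothetical and not part of KynML.

```python
from collections import Counter
from typing import TypedDict


class Diagnostic(TypedDict):
    # Local copy of the documented structure, so the sketch has no KynML dependency.
    line: int
    col: int
    end_line: int
    end_col: int
    severity: str
    message: str
    code: str


def summarise(diags: list[Diagnostic]) -> dict[str, int]:
    """Count diagnostics per `code` value."""
    return dict(Counter(d["code"] for d in diags))


def exit_status(diags: list[Diagnostic], warnings_as_errors: bool = False) -> int:
    """Return 1 when the diagnostics should fail a build, else 0."""
    blocking = {"error", "warning"} if warnings_as_errors else {"error"}
    return int(any(d["severity"] in blocking for d in diags))


sample: list[Diagnostic] = [
    {"line": 1, "col": 0, "end_line": 1, "end_col": 0, "severity": "warning",
     "code": "warn", "message": "redundant softmax"},
]
print(summarise(sample), exit_status(sample), exit_status(sample, warnings_as_errors=True))
```

An empty list yields an empty summary and status 0, which matches the "clean source" case.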

Pipeline stages in diagnose

diagnose stops at the first blocking stage:

  1. Parse: if this fails, diagnose returns immediately (there is no AST to continue with).
  2. Semantic: validate_program on the AST.
  3. IR + shape inference: lower_program, then run_passes.
  4. Warnings: non-fatal messages from IRModule.warnings are appended with severity="warning".

Results are sorted by line before return.
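The control flow of the stages above can be sketched as follows. This is an illustration, not the actual implementation: the stage functions are toy stand-ins, each stage is assumed to report at most one error by raising, and warnings are assumed to be collected only when every stage succeeds.

```python
from typing import Any, Callable


def make_diag(message: str, code: str, severity: str = "error", line: int = 1) -> dict:
    """Build a dict shaped like the documented Diagnostic, with unknown columns."""
    return {"line": line, "col": 0, "end_line": line, "end_col": 0,
            "severity": severity, "message": message, "code": code}


def run_stages(
    source: str,
    stages: list[tuple[str, Callable[[Any], Any]]],
    warnings_of: Callable[[Any], list[str]],
) -> list[dict]:
    """Run stages in order, stop at the first failure, and never raise."""
    diags = []
    value: Any = source
    for code, stage in stages:
        try:
            value = stage(value)
        except Exception as exc:  # convert every failure into a diagnostic
            diags.append(make_diag(str(exc), code))
            break
    else:
        # Reached only when no stage failed.
        diags.extend(make_diag(w, "warn", "warning") for w in warnings_of(value))
    return sorted(diags, key=lambda d: d["line"])


# Toy stand-ins for the parser, validate_program, and lower_program/run_passes.
def parse(src: str) -> list[str]:
    if not src.strip():
        raise ValueError("empty source")
    return src.split()


def validate(tokens: list[str]) -> list[str]:
    if "train:" not in tokens:
        raise ValueError("missing train block")
    return tokens


def lower(tokens: list[str]) -> dict:
    return {"warnings": ["redundant softmax"] if "softmax" in tokens else []}


STAGES = [("parse", parse), ("semantic", validate), ("shape", lower)]
print(run_stages("model M:", STAGES, lambda ir: ir["warnings"]))
```

The `for ... else` construct keeps the "stop at the first blocking stage" rule explicit: the warning step is skipped whenever the loop exits through `break`.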

Language Server (LSP)

The LSP server wraps diagnose for editor integration. It requires pygls:

pip install 'kynml[lsp]'

Start the server:

kynml lsp
# or
python -m kynml.lsp

The server communicates over stdio and publishes diagnostics on textDocument/didOpen, textDocument/didChange, and textDocument/didSave. LSP positions are zero-based, so the 1-based values from diagnose are converted before publishing (the LSP line number is d["line"] - 1).
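The index conversion can be sketched as a plain dict-to-dict mapping. `to_lsp` is a hypothetical helper, not the server's actual code. The severity numbers come from the LSP specification (1 = Error, 2 = Warning); mapping an unknown column (0) to character 0 and the "kynml" source label are assumptions.

```python
from typing import Any

# DiagnosticSeverity values defined by the LSP specification.
LSP_SEVERITY = {"error": 1, "warning": 2}


def to_lsp(d: dict[str, Any], source: str = "kynml") -> dict[str, Any]:
    """Convert a 1-based diagnose() entry into a 0-based LSP Diagnostic dict."""

    def zero_based(value: int) -> int:
        # Assumption: 0 means "unknown", so clamp it to the start of the line.
        return max(value - 1, 0)

    return {
        "range": {
            "start": {"line": zero_based(d["line"]), "character": zero_based(d["col"])},
            "end": {"line": zero_based(d["end_line"]), "character": zero_based(d["end_col"])},
        },
        "severity": LSP_SEVERITY[d["severity"]],
        "code": d["code"],
        "source": source,
        "message": d["message"],
    }


entry = {"line": 3, "col": 5, "end_line": 3, "end_col": 9,
         "severity": "error", "code": "shape", "message": "dimension mismatch"}
print(to_lsp(entry)["range"])
```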

Full CLI reference in CLI-Reference § lsp.

Neovim (via vim.lsp.start)

vim.api.nvim_create_autocmd("FileType", {
  pattern = "kyn",
  callback = function()
    vim.lsp.start({
      name    = "kynml",
      cmd     = { "kynml", "lsp" },
      root_dir = vim.fn.getcwd(),
    })
  end,
})

VS Code (generic LSP client, e.g. vscode-glspc)

Point the extension at command kynml lsp with argument --stdio.

Helix (languages.toml)

[[language]]
name = "kyn"
language-servers = ["kynml-lsp"]

[language-server.kynml-lsp]
command = "kynml"
args    = ["lsp"]