Tutorial: Binary Classification — Customer Churn

End-to-end walkthrough: predict a binary label (0/1) with BCE loss and sigmoid output.

What you will build

A model that ingests a churn CSV (15 features, binary churn target), trains with binary cross-entropy, evaluates accuracy on the held-out split, and saves a .pt state dict.

Prerequisites

pip install kynml

1. Data

The bundled example is data/churn.csv — 15 numeric features and a churn column with values 0 or 1:

f1,f2,f3,...,f15,churn
1,35,2,...,0,0
0,58,1,...,1,1
...

KynML treats the target as float32 for BCE loss. If your target column contains strings, convert it to 0/1 integers first.
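
If the churn column holds strings, a small preprocessing pass can rewrite it before KynML reads the file. The following is a minimal sketch using only the standard library. The label vocabulary (yes/no, true/false) and the function name encode_target are assumptions for illustration and are not part of KynML.

```python
import csv
import io

# Hypothetical label vocabulary; adjust to whatever your CSV actually contains.
POSITIVE = {"yes", "true", "1"}
NEGATIVE = {"no", "false", "0"}


def encode_target(csv_text: str, target: str = "churn") -> str:
    """Return csv_text with the target column rewritten as 0/1 integers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        label = row[target].strip().lower()
        if label in POSITIVE:
            row[target] = 1
        elif label in NEGATIVE:
            row[target] = 0
        else:
            # Fail loudly rather than guess at an unknown label.
            raise ValueError(f"unexpected label {row[target]!r}")
        writer.writerow(row)
    return out.getvalue()


sample = "f1,f2,churn\n1,35,No\n0,58,Yes\n"
print(encode_target(sample))
```

Raising on unknown labels is deliberate: silently mapping them to 0 would hide exactly the kind of labelling problem that later shows up as a stalled loss.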

2. Write the spec

Save as churn.kyn:

dataset CustomerChurn:
    source = csv("data/churn.csv")
    target = "churn"
    split = 0.75
    normalize = true

model ChurnModel:
    input 15
    dense 64 relu
    dense 32 relu
    dense 1 sigmoid

train:
    model = ChurnModel
    data = CustomerChurn
    loss = bce
    optimizer = adam(lr=0.001)
    epochs = 30
    batch = 64
    device = auto

evaluate:
    metrics = [accuracy]

export:
    format = torch
    path = "models/churn_model.pt"

Key decisions

Sigmoid output — the final dense 1 sigmoid layer produces a probability in (0, 1). Combined with loss = bce (nn.BCELoss()), this is the standard binary classification setup.
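
To make the sigmoid and BCE pairing concrete, here is a plain-Python sketch of the two formulas. It mirrors the mean-reduced loss that nn.BCELoss() computes by default, but it is an illustration only: it is not numerically hardened and it is not the code KynML generates.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce(probs: list[float], targets: list[float]) -> float:
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch."""
    total = 0.0
    for p, y in zip(probs, targets):
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(probs)


# Three raw scores pushed through the sigmoid, scored against three labels.
probs = [sigmoid(z) for z in (2.0, -1.0, 0.0)]
print(probs)
print(bce(probs, [1.0, 0.0, 1.0]))
```

Confident, correct predictions contribute a loss near zero, while a prediction of exactly 0.5 contributes ln(2) whatever the label is.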

split = 0.75 — 75 % train, 25 % test. Adjust based on dataset size.
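
As a rough picture of what a 0.75 split means for row counts, the sketch below shuffles and cuts a list of rows. Whether KynML shuffles before splitting, and how it seeds that shuffle, is not stated here, so the shuffle and the seed parameter are assumptions.

```python
import random


def split_rows(rows: list, fraction: float = 0.75, seed: int = 0) -> tuple[list, list]:
    """Shuffle a copy of rows, then cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # assumed: shuffle before cutting
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = split_rows(list(range(1000)))
print(len(train), len(test))  # 750 250
```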

normalize = true — recommended when features span different scales (age, tenure, spend amount in the same model).
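
The effect of normalisation can be sketched with z-score standardisation, one common choice. The exact method behind normalize = true is not described in this tutorial, so treat the formula below as an assumption. The sample age and spend values are made up.

```python
import statistics


def standardize(column: list[float]) -> list[float]:
    """Rescale one feature column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0.0:
        # A constant column carries no signal; return zeros instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Made-up columns on very different scales.
age = [35.0, 58.0, 41.0, 26.0]
spend = [1200.0, 150.0, 980.0, 40.0]
print(standardize(age))
print(standardize(spend))
```

After rescaling, both columns occupy a comparable range, so neither dominates the first dense layer simply because of its units.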

3. Extended spec: regularisation and scheduling

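For production use, add dropout, batch normalisation, early stopping, and a learning rate scheduler. The spec below also switches to AdamW with weight decay, saves a checkpoint every 10 epochs, and sets the data-loading options num_workers and pin_memory: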

dataset CustomerChurn:
    source = csv("data/churn.csv")
    target = "churn"
    split = 0.75
    normalize = true
    num_workers = 4
    pin_memory = true

model ChurnModel:
    input 15
    dense 128 relu
    batchnorm
    dropout 0.3
    dense 64 relu
    dropout 0.2
    dense 1 sigmoid

train:
    model = ChurnModel
    data = CustomerChurn
    loss = bce
    optimizer = adamw(lr=0.0005, weight_decay=0.01)
    epochs = 60
    batch = 64
    device = auto
    scheduler = cosine(t_max=60)
    early_stop = early_stop(patience=10)
    checkpoint = checkpoint(every_n=10, path="checkpoints/churn.pt")

evaluate:
    metrics = [accuracy]

export:
    format = torch
    path = "models/churn_model.pt"

batchnorm — placed after a dense layer, it inherits that layer's output width automatically. You can also specify it explicitly: batchnorm 128.

dropout 0.3 — nn.Dropout(p=0.3). Valid range: [0, 1).
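
The sketch below illustrates training-time (inverted) dropout in plain Python: each value is zeroed with probability p and the survivors are scaled by 1 / (1 - p), which is the scaling nn.Dropout applies during training. The range check mirrors the [0, 1) rule above. Linking that rule to the division by (1 - p) is one plausible reading, not something this tutorial states.

```python
import random


def dropout(values: list[float], p: float, rng: random.Random) -> list[float]:
    """Zero each value with probability p and scale the survivors by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("dropout rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - p)  # would divide by zero at p == 1
    return [0.0 if rng.random() < p else v * keep_scale for v in values]


rng = random.Random(0)
print(dropout([1.0] * 10, p=0.3, rng=rng))
```

The scaling keeps the expected value of each activation unchanged, so the layer can be switched off at evaluation time without rescaling the weights.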

early_stop = early_stop(patience=10) — halts training if training loss does not improve for 10 consecutive epochs.
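
The patience rule can be sketched as a counter that resets whenever the loss improves. The class name and method below are hypothetical. KynML's own implementation may differ, for example by requiring a minimum improvement.

```python
class EarlyStopper:
    """Track the best loss seen so far and count epochs without improvement."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def should_stop(self, loss: float) -> bool:
        if loss < self.best:
            self.best = loss
            self.stale_epochs = 0
        else:
            # A NaN loss also lands here, because any comparison with NaN is False.
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


stopper = EarlyStopper(patience=3)
losses = [0.69, 0.60, 0.55, 0.56, 0.57, 0.58, 0.50]
for epoch, loss in enumerate(losses, start=1):
    if stopper.should_stop(loss):
        print(f"stopping after epoch {epoch}")  # stops after epoch 6
        break
```

In this run the best loss is reached at epoch 3, and the three non-improving epochs that follow exhaust the patience budget before the later improvement at epoch 7 is ever seen.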

scheduler = cosine(t_max=60) — CosineAnnealingLR over 60 epochs, matching epochs.
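
For intuition about the schedule's shape, the closed-form cosine annealing curve can be evaluated directly. The sketch assumes the minimum learning rate is 0, which is the default eta_min of PyTorch's CosineAnnealingLR. The function name is hypothetical.

```python
import math


def cosine_lr(epoch: int, base_lr: float = 0.0005, t_max: int = 60, eta_min: float = 0.0) -> float:
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))


for epoch in (0, 15, 30, 45, 60):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

The rate starts at the optimizer's lr, passes through half of it at the midpoint, and reaches the minimum at epoch 60. If you change epochs, changing t_max to match keeps the curve ending where training ends.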

4. Run it

# Validate first (these commands assume kynml is installed in a virtualenv at .venv)
.venv/bin/python -m kynml.cli validate churn.kyn

# Train
.venv/bin/python -m kynml.cli train churn.kyn

Expected output for the basic spec from section 2 (exact loss and accuracy values may vary between runs):

Epoch 1/30 - loss: 0.6921
Epoch 2/30 - loss: 0.6814
...
Epoch 30/30 - loss: 0.4231
accuracy: 0.8350
Saved model to /path/to/models/churn_model.pt

5. What the generated PyTorch looks like

For binary classification the generated code matches the regression path, including the float32 target handling, with two key differences: the loss is nn.BCELoss() and the model ends with nn.Sigmoid():

# Targets stay float32 (same as regression) — BCE requires float targets
y = target.astype("float32").to_numpy()
if y.ndim == 1:
    y = y.reshape(-1, 1)

# Loss
def build_criterion() -> nn.Module:
    return nn.BCELoss()

# Optimizer (AdamW variant, as configured in the extended spec from section 3)
def build_optimizer(model: nn.Module) -> optim.Optimizer:
    return optim.AdamW(model.parameters(), lr=0.0005, weight_decay=0.01)

# Model (layer sizes from the basic spec in section 2); note the sigmoid at the end
class ChurnModel(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(15, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Sigmoid(),
        )

Accuracy is computed by thresholding the sigmoid outputs at 0.5 and comparing the result with the float targets:

values["accuracy"] = float(
    np.mean((predictions >= 0.5).astype(np.float32) == targets)
)
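
A small worked example of the same thresholding rule, written in plain Python so it runs without NumPy. The prediction and label values are made up.

```python
def accuracy(predictions: list[float], targets: list[float], threshold: float = 0.5) -> float:
    """Fraction of predictions whose thresholded class equals the target."""
    hits = sum(
        1 for p, y in zip(predictions, targets) if (1.0 if p >= threshold else 0.0) == y
    )
    return hits / len(targets)


# Made-up sigmoid outputs and their true labels.
preds = [0.91, 0.12, 0.50, 0.49]
labels = [1.0, 0.0, 0.0, 1.0]
print(accuracy(preds, labels))  # 0.5
```

Because the comparison is >=, a prediction of exactly 0.5 counts as class 1, which is why the third row above is scored as a miss.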

6. Serving predictions

After training, generate a FastAPI inference service from Python:

from kynml.parser import parse_file
from kynml.semantic import validate_program
from kynml.serving.generator import generate_service

program = parse_file("churn.kyn")
validate_program(program)
paths = generate_service(program, model_path="models/churn_model.pt", out_dir="service/")

This writes service/app.py, service/requirements.txt, and service/Dockerfile. Run with:

pip install 'kynml[serving]'
cd service && uvicorn app:app --host 0.0.0.0 --port 8000

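Then call it. The request below lists only three features for brevity; a model declared with input 15 presumably needs all of f1 through f15 in the features object: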

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"features": {"f1": 1.0, "f2": 35.0, "f3": 2.0}}'
# {"prediction": [0.1234]}
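
The same request can be sent from Python with the standard library. This is a sketch under a few assumptions: the feature keys follow the CSV header (f1 through f15, with f4 to f14 inferred from the elided header), the service expects every feature in each request, and the helper names build_payload and predict are hypothetical.

```python
import json
import urllib.request

# Assumed column names: f1 ... f15, following the pattern of the CSV header.
FEATURE_NAMES = [f"f{i}" for i in range(1, 16)]


def build_payload(values: list[float]) -> bytes:
    """Pair 15 feature values with their column names and encode as a JSON body."""
    if len(values) != len(FEATURE_NAMES):
        raise ValueError(f"expected {len(FEATURE_NAMES)} values, got {len(values)}")
    body = {"features": dict(zip(FEATURE_NAMES, values))}
    return json.dumps(body).encode("utf-8")


def predict(values: list[float], url: str = "http://localhost:8000/predict") -> dict:
    """POST one row to the service and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(values),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the service running (the trailing zeros are placeholder values):
# result = predict([1.0, 35.0, 2.0] + [0.0] * 12)
```

Checking the feature count on the client side turns a malformed request into an immediate local error instead of a server-side failure.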

7. Next steps

Export as ONNX or TorchScript for edge deployment: Export Formats
Speed flags for large datasets: Speed Guide
Load data from S3 or HuggingFace: Datasets and Connectors
Multiclass classification (>2 classes): Tutorial: Multiclass

Troubleshooting

Loss stays near 0.693 — this is ln(2), the BCE loss of a model predicting ~0.5 for everything. The model has not converged. Try: lower lr, more epochs, or check that your labels are 0/1 (not 1/2 or strings).
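
Two quick checks help interpret a stalled loss. The sketch below computes the BCE of the best constant predictor for a given label list, which equals ln(2) only when the classes are balanced, and it reports any label values other than 0 and 1. Both helper names are hypothetical.

```python
import math


def constant_baseline_loss(labels: list[int]) -> float:
    """BCE of the best constant predictor, which always outputs the positive rate."""
    rate = sum(labels) / len(labels)
    if rate in (0.0, 1.0):
        return 0.0  # only one class present; a constant predictor is already perfect
    return -(rate * math.log(rate) + (1.0 - rate) * math.log(1.0 - rate))


def unexpected_labels(labels: list) -> set:
    """Return any label values other than 0 and 1."""
    return {y for y in labels if y not in (0, 1)}


print(round(math.log(2), 3))                 # 0.693
print(constant_baseline_loss([0, 1, 0, 1]))  # balanced classes: equals ln(2)
print(constant_baseline_loss([0, 0, 0, 1]))  # imbalanced classes: below ln(2)
print(unexpected_labels([1, 2, 1, 2]))       # {2}
```

On an imbalanced dataset, a loss that plateaus at the constant baseline rather than at 0.693 points to the same problem: the model is predicting the base rate and ignoring the features.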

dropout rate must be in [0, 1) — dropout probability is exclusive of 1. dropout 1.0 is rejected at validation time.

Early stopping fires immediately — default mode for early_stop is min (lower loss is better). This is correct for BCE. If you see it stop at epoch 1, your loss function may be returning NaN — check for missing values in your CSV.
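
A quick scan for missing or non-numeric cells can confirm or rule out the NaN explanation before retraining. The sketch uses only the standard library. The function name is hypothetical and the inline sample stands in for your real file.

```python
import csv
import io
import math


def rows_with_missing(csv_text: str) -> list[int]:
    """Return 1-based data-row numbers that contain an empty, non-numeric, or NaN cell."""
    bad = []
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    for number, row in enumerate(reader, start=1):
        for cell in row:
            try:
                value = float(cell)
            except ValueError:
                bad.append(number)  # empty string or non-numeric text
                break
            if math.isnan(value):
                bad.append(number)  # the literal text "nan" parses as a float NaN
                break
    return bad


sample = "f1,f2,churn\n1,35,0\n0,,1\n1,nan,0\n"
print(rows_with_missing(sample))  # [2, 3]
```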