Tag: ml

6 entries tagged "ml" — 6 posts, 0 links.

Posts

Apr 22, 2026 — 8 min — Platform & AI

Building an NPS Classifier You Can Actually Act On

A scikit-learn NPS ordinal classifier with SMOTE, probability calibration, utility-based thresholding, and PSI drift checks. The parts that make it useful to the retention team, not just accurate on a dashboard.

Outcome: Shipped a calibrated multiclass NPS model with a utility-driven operating threshold and a PSI-based drift loop, giving the retention team a per-customer detractor probability they can act on and a rule for when to retrain.

ml nps classification calibration drift evaluation

Jan 15, 2026 — 11 min — Platform & AI

When 0.3 Does Not Mean 30 Percent

How imbalanced classifiers can keep a strong AUC while producing probabilities that break thresholds, alerts, and cost-sensitive decisions in production.

Outcome: Defined a production calibration gate that logs Brier score, ECE, reliability diagrams, cost-sensitive thresholds, run metadata, and promotion criteria for imbalanced classifiers.

ml calibration classification evaluation probability reliability

Jan 12, 2026 — 5 min — Platform & AI

Compliant GCP Platform Playbook for Analytics and ML

A sanitized GCP platform case study where compliance, analytics delivery, and ML feature access had to be designed as one release path instead of three disconnected workstreams.

Outcome: Reduced governed dataset onboarding from weeks to days in the sanitized pattern while preserving auditability, cost visibility, and promotion rules for analytics and ML use cases.

gcp bigquery governance analytics ml

Oct 19, 2025 — 12 min — Platform & AI

When the Model Should Say It Doesn't Know: Conformal Prediction Sets with MAPIE

How to add coverage-guaranteed prediction sets, temperature scaling calibration, and risk-coverage curves to a classifier using MAPIE — the pieces that make uncertainty quantification operationally useful rather than decorative.

Outcome: Added coverage-guaranteed prediction sets and operational abstention gates to a classification pipeline, cutting acted-upon error rate without retraining the model.

ml conformal-prediction calibration uncertainty mapie selective-prediction

Oct 3, 2025 — 7 min — Platform & AI

The Three-Run Lab: How I Triage Slow PyTorch Training

A repeatable triage routine — the three-run baseline, DataLoader diagnosis, five profiler signatures, and a copy-paste scaffold — for finding where training time actually goes before touching the model.

Outcome: Identified and resolved training bottlenecks in under an hour by running the three-run baseline and reading profiler signatures before changing any model code.

pytorch ml training performance profiling debugging

Sep 29, 2025 — 7 min — Platform & AI

PyTorch Training Throughput: The Patterns That Actually Move the Number

torch.compile, mixed precision, gradient accumulation, DDP vs FSDP, and the profiler — the five levers I reach for before rethinking the model architecture.

Outcome: Cut training wall-clock time and GPU memory pressure by applying compile, AMP, and accumulation patterns in sequence before ever touching model architecture.

pytorch ml training performance gpu distributed-training

All tags