whoami

Saurabh Ghatnekar

AI Engineer — Systems, Data & Alignment

I work across the LLM stack — the systems that make training and inference fast, the data that makes models good, and the post-training that makes them useful. I write deep dives on how frontier models actually get built.

open to select collaborations

focus areas

systems/

Making training and inference fast and cheap: getting the most out of every GPU.
- Kernels
- Parallelism
- Quantization
- Activation checkpointing
- CPU offloading
- Inference

data/

The quiet determinant of model quality: what goes in, what gets cut, and how it is measured.
- Evaluation
- Curation
- Transformation
- Filtering
- Deduplication
- Mixing

alignment/

Turning base models into useful, reliable ones — and knowing when it worked.
- Supervised fine-tuning
- Reinforcement learning
- Preference data
- Synthetic data
- Verifiers

latest writing

2026-07-03 alignment/

SFT, DPO, or RLHF? Choosing the Right Post-Training Recipe

When supervised fine-tuning is enough, when preference optimization pays off, and where verifiable rewards fit: a practical decision guide.

2026-06-25 data/

Data Curation for LLMs: Filtering, Deduplication, and Mixing in Practice

A practical walkthrough of the LLM data pipeline: quality filtering, exact deduplication, near-duplicate detection with MinHash, decontamination, and mixture weights.

2026-06-12 systems/

How to Fit Large Language Models on Small GPUs

Where GPU memory actually goes during LLM training, and how activation checkpointing, quantization, 8-bit optimizers, and CPU offloading win it back.

all posts →

The Stack Trace