whoami
Saurabh Ghatnekar
AI Engineer — Systems, Data & Alignment
I work across the LLM stack — the systems that make training and inference fast, the data that makes models good, and the post-training that makes them useful. I write deep dives on how frontier models actually get built.
open to select collaborations
focus areas
- systems/
Making training and inference fast and cheap: getting the most out of every GPU.
- Kernels
- Parallelism
- Quantization
- Activation checkpointing
- CPU offloading
- Inference
- data/
The quiet determinant of model quality: what goes in, what gets cut, and how it is measured.
- Evaluation
- Curation
- Transformation
- Filtering
- Deduplication
- Mixing
- alignment/
Turning base models into useful, reliable ones — and knowing when it worked.
- Supervised fine-tuning
- Reinforcement learning
- Preference data
- Synthetic data
- Verifiers
latest writing
- SFT, DPO, or RLHF? Choosing the Right Post-Training Recipe
When supervised fine-tuning is enough, when preference optimization pays off, and where verifiable rewards fit — a practical decision guide.
- Data Curation for LLMs: Filtering, Deduplication, and Mixing in Practice
A practical walkthrough of the LLM data pipeline — quality filtering, exact and near deduplication with MinHash, decontamination, and mixture weights.
- How to Fit Large Language Models on Small GPUs
Where GPU memory actually goes during LLM training, and how activation checkpointing, quantization, 8-bit optimizers, and CPU offloading win it back.