writing
Deep dives on the LLM stack
Long-form, code-level articles on systems, data, and alignment. Written to be the reference I wish had existed when I was learning this material.
-
SFT, DPO, or RLHF? Choosing the Right Post-Training Recipe
When supervised fine-tuning is enough, when preference optimization pays off, and where verifiable rewards fit — a practical decision guide.
-
Data Curation for LLMs: Filtering, Deduplication, and Mixing in Practice
A practical walkthrough of the LLM data pipeline: quality filtering, exact deduplication, near-duplicate detection with MinHash, decontamination, and mixture weights.
-
How to Fit Large Language Models on Small GPUs
Where GPU memory actually goes during LLM training, and how activation checkpointing, quantization, 8-bit optimizers, and CPU offloading win it back.