writing

Deep dives on the LLM stack

Long-form, code-level articles on systems, data, and alignment. Written to be the reference I wish had existed when I was learning this material.

2026-07-03 alignment/

SFT, DPO, or RLHF? Choosing the Right Post-Training Recipe

When supervised fine-tuning is enough, when preference optimization pays off, and where verifiable rewards fit: a practical decision guide.

2026-06-25 data/

Data Curation for LLMs: Filtering, Deduplication, and Mixing in Practice

A practical walkthrough of the LLM data pipeline: quality filtering, exact deduplication and MinHash-based near-deduplication, decontamination, and mixture weights.

2026-06-12 systems/

How to Fit Large Language Models on Small GPUs

Where GPU memory actually goes during LLM training, and how activation checkpointing, quantization, 8-bit optimizers, and CPU offloading win it back.