Awesome Agentic Evaluation: A Curated Guide to Benchmarking AI Agents
A walkthrough of the landscape of agentic evaluation: benchmarks, tooling, design patterns, and best practices for measuring how well AI agents actually work.