Section 7.1

Thu, 01 Jan 2026 00:00:00 +0000

Section 7.2

Thu, 01 Jan 2026 00:00:00 +0000

Karpathy autoresearch

Thu, 01 Jan 2026 00:00:00 +0000

Karpathy Autoresearch Explained

Introduction

This lesson introduces autoresearch as a practical workflow for letting an AI coding agent run experiments without waiting for a human to choose every next step. The basic pattern is simple: define the goal, freeze the evaluator, let the agent propose a code change, run the experiment, keep the change only if the metric improves, and repeat. The public examples make the idea concrete. Single-GPU overnight runs on an H100 improved val_bpb (validation bits per byte, where lower is better) from 0.997900 to 0.969686 over 126 experiments. Those findings, made on smaller depth-12 models, later transferred to larger depth-24 nanochat runs, cutting the “time to GPT-2” leaderboard entry from 2.02 hours to 1.80 hours, with a later entry at 1.65 hours. The rest of this section turns that workflow into a tutorial: first the naming and intuition, then the loop, comparisons, implementations, strengths, limitations, and a practical recipe for building a similar system.
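To make the propose, run, keep-if-better pattern concrete before the detailed walkthrough, here is a minimal sketch in Python. It is a toy under stated assumptions and not the actual autoresearch code. The names `evaluate`, `propose`, and `autoresearch_loop` are hypothetical, the "agent" is replaced by a random perturbation of two made-up settings (`lr` and `wd`), and the "experiment" is a cheap quadratic function standing in for a real training run that reports a lower-is-better metric such as val_bpb.

```python
import random
from typing import Dict, List, Tuple

Config = Dict[str, float]


def evaluate(config: Config) -> float:
    """Frozen evaluator (toy assumption): lower is better, like val_bpb.

    A real run would train a model and report a validation metric; this
    quadratic only stands in for that so the loop can execute quickly.
    """
    return (config["lr"] - 0.3) ** 2 + (config["wd"] - 0.1) ** 2


def propose(config: Config, rng: random.Random) -> Config:
    """Hypothetical stand-in for the agent: nudge one setting at random."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] += rng.uniform(-0.1, 0.1)
    return candidate


def autoresearch_loop(
    start: Config, n_experiments: int, seed: int = 0
) -> Tuple[Config, List[float]]:
    """Run the loop: propose, evaluate, keep the change only if it improves."""
    rng = random.Random(seed)
    best = dict(start)
    best_score = evaluate(best)
    history = [best_score]  # best score seen after each experiment
    for _ in range(n_experiments):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        if score < best_score:  # strict improvement required
            best, best_score = candidate, score
        # Otherwise the candidate is discarded and the baseline is unchanged.
        history.append(best_score)
    return best, history


if __name__ == "__main__":
    best, history = autoresearch_loop({"lr": 1.0, "wd": 1.0}, n_experiments=126)
    print(f"start={history[0]:.6f} final={history[-1]:.6f} config={best}")
```

The design choice worth noticing is that `evaluate` never changes while the loop runs. Only the candidate changes, so every accepted step is comparable with the ones before it and the recorded best score can never get worse.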

Agentic Systems on Sange Mehrab
