is plan mode worth it?

we tested 3 modes (one-shot, plan+resume, plan+clear) across 22 tasks on real codebases like vLLM, bun, T3 Code, llama.cpp, unsloth, diffusers, transformers.js, and AI SDK. plan mode cost more and took longer without improving accuracy.

accuracy

same accuracy

one-shot 87% vs plan mode 87%

cost per task

+122% pricier

one-shot $1.19 vs plan mode $2.65

time per task

+79% slower

one-shot 6.3 min vs plan mode 11.2 min

results at a glance

metric	one-shot	plan + resume	plan + clear
score	87%	87%	87%
cost/task	$1.19	$2.64 (+121%)	$2.65 (+123%)
duration	6.3m	10.2m (+62%)	12.3m (+96%)
turns	32	47 (+47%)	58 (+82%)
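
the percentage deltas above are relative changes against the one-shot baseline. a minimal sketch of that arithmetic, using the rounded figures shown on this page (the helper name `pct_change` is ours, not from the benchmark). recomputing from rounded inputs can land a point away from the displayed deltas, which were presumably computed from unrounded data.

```python
def pct_change(baseline: float, value: float) -> float:
    """Relative change of `value` versus `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Rounded per-task averages copied from the summary above.
one_shot = {"cost": 1.19, "minutes": 6.3, "turns": 32}
plan_modes = {
    "plan + resume": {"cost": 2.64, "minutes": 10.2, "turns": 47},
    "plan + clear": {"cost": 2.65, "minutes": 12.3, "turns": 58},
}

for name, figures in plan_modes.items():
    for metric, baseline in one_shot.items():
        delta = pct_change(baseline, figures[metric])
        print(f"{name} {metric}: {delta:+.1f}% vs one-shot")
```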

all results

22 tasks, 5 runs per mode in Claude Code using sonnet 4.6 + opus 4.6

		one-shot			plan+resume			plan+clear
Project	Task	Score	Cost	Time	Score	Cost	Time	Score	Cost	Time
uv	add 'uv cache stats' subcommand	95	$1.47	8.0m	97	$2.99	12.1m	97	$2.87	13.1m
bun	add Bun.INI namespace API	95	$3.01	13.9m	86	$4.78	15.6m	95	$5.46	20.0m
diffusers	add cosine annealing noise scheduler	62	$1.50	7.8m	68	$2.89	11.4m	68	$2.99	14.2m
t3code	add custom theme system	81	$2.22	13.2m	77	$4.12	16.5m	76	$3.44	18.7m
openclaw	add fictional PulseBoard channel	75	$2.79	14.9m	74	$7.91	27.6m	75	$7.41	31.1m
transformers.js	add image-text-to-text pipeline	99	$0.67	4.5m	92	$3.06	11.7m	93	$3.05	13.6m
sglang	add JSON path constraint	95	$2.26	8.8m	94	$4.60	15.9m	91	$4.55	19.2m
generic	add JWT authentication system	80	$0.42	2.4m	79	$0.87	4.8m	85	$1.00	6.5m
llama.cpp	add Mistral Instruct v3 chat template	56	$0.84	5.8m	66	$2.44	11.6m	64	$3.39	18.9m
ollama	add model rename API endpoint and CLI command	100	$0.68	4.0m	99	$1.84	7.6m	100	$1.64	8.6m
ai	add new AI image provider	97	$1.53	7.2m	99	$2.89	10.8m	96	$2.85	12.1m
unsloth	add ONNX LoRA adapter export	94	$0.74	4.8m	89	$2.09	8.1m	94	$2.17	10.2m
generic	add pagination to users API	100	$0.10	57s	100	$0.29	2.2m	100	$0.28	2.1m
fastapi	add rate limiting middleware	41	$0.64	4.6m	59	$1.95	9.9m	50	$1.78	10.6m
prisma	add REST API client generator	83	$1.74	10.4m	93	$3.87	16.0m	95	$4.04	20.4m
generic	add search functionality to CLI	90	$0.22	1.6m	90	$0.53	2.8m	90	$0.49	2.8m
ui	add stepper component	95	$1.54	10.3m	90	$3.04	12.8m	78	$3.25	18.5m
vllm	add Top-A sampling strategy	91	$2.90	8.4m	83	$5.45	12.9m	86	$5.32	14.8m
generic	extract service layer from route handlers	83	$0.35	2.4m	77	$1.02	5.5m	77	$1.10	6.4m
generic	fix off-by-one error in loop	100	$0.11	38s	100	$0.21	1.3m	100	$0.23	2.1m
generic	implement full-featured event emitter	100	$0.22	1.3m	100	$0.42	2.4m	100	$0.41	2.7m
generic	optimize slow data processing pipeline	100	$0.27	1.9m	100	$0.79	4.4m	100	$0.67	3.9m
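
as a sanity check, the headline 87% can be re-derived from the per-task scores in the table. a small sketch, assuming the headline is a plain unweighted mean over the 22 tasks (the page does not state the aggregation method). task labels are shortened and the helper names are ours.

```python
from statistics import mean

# (task, one-shot, plan+resume, plan+clear) scores copied from the table above.
SCORES = [
    ("uv cache stats", 95, 97, 97),
    ("bun INI API", 95, 86, 95),
    ("diffusers scheduler", 62, 68, 68),
    ("t3code themes", 81, 77, 76),
    ("openclaw PulseBoard", 75, 74, 75),
    ("transformers.js pipeline", 99, 92, 93),
    ("sglang JSON path", 95, 94, 91),
    ("generic JWT auth", 80, 79, 85),
    ("llama.cpp chat template", 56, 66, 64),
    ("ollama model rename", 100, 99, 100),
    ("ai image provider", 97, 99, 96),
    ("unsloth ONNX export", 94, 89, 94),
    ("generic pagination", 100, 100, 100),
    ("fastapi rate limiting", 41, 59, 50),
    ("prisma REST client", 83, 93, 95),
    ("generic CLI search", 90, 90, 90),
    ("ui stepper", 95, 90, 78),
    ("vllm Top-A sampling", 91, 83, 86),
    ("generic service layer", 83, 77, 77),
    ("generic off-by-one", 100, 100, 100),
    ("generic event emitter", 100, 100, 100),
    ("generic slow pipeline", 100, 100, 100),
]

MODES = ("one-shot", "plan+resume", "plan+clear")


def mean_scores(rows):
    """Unweighted average score per mode across all tasks."""
    return {mode: mean(row[i + 1] for row in rows) for i, mode in enumerate(MODES)}


def head_to_head(rows, mode_index):
    """Count tasks where the given mode beat, tied, or lost to one-shot."""
    wins = sum(1 for row in rows if row[mode_index] > row[1])
    ties = sum(1 for row in rows if row[mode_index] == row[1])
    return wins, ties, len(rows) - wins - ties


for mode, avg in mean_scores(SCORES).items():
    print(f"{mode}: {avg:.1f}")
print("plan+resume vs one-shot (wins, ties, losses):", head_to_head(SCORES, 2))
print("plan+clear vs one-shot (wins, ties, losses):", head_to_head(SCORES, 3))
```

the averages hide a lot of per-task spread: planning helps on some tasks and hurts on others, and the two effects roughly cancel.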

how each mode works

each mode gets the same task. the difference is how we prompt Claude Code and whether it plans before executing.

one-shot

just the task prompt. Claude reads, edits, and tests in a single session with no planning guidance.

Prompt → Execute → Done

plan + resume

two phases in the same session. Claude first plans with read-only tools, so it can read files but not edit them, then resumes with full permissions to execute. context is preserved.

Prompt → Plan (read-only) → Resume → Execute → Done

plan + clear

plan is saved to PLAN.md, then a brand new session reads it and executes. context is completely cleared between planning and execution.

Prompt → Plan (read-only) → Save PLAN.md → Clear context → New session → Execute → Done
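
the practical difference between the three modes is what the executing agent can see. a toy sketch of that, not the benchmark harness: `Agent` is a hypothetical stand-in for one Claude Code call that takes the message history plus a can-edit flag, and the prompt strings are made up. whether the fresh session in plan + clear also receives the original task prompt is not stated here, so this sketch passes only the plan.

```python
from typing import Callable, List

# Hypothetical stand-in for a single agent call: (history, can_edit) -> reply.
Agent = Callable[[List[str], bool], str]


def run_one_shot(agent: Agent, task: str) -> List[str]:
    """One session, full permissions, no planning step."""
    history = [task]
    history.append(agent(history, True))
    return history


def run_plan_resume(agent: Agent, task: str) -> List[str]:
    """Plan read-only, then execute in the same session (context kept)."""
    history = [task]
    history.append(agent(history, False))  # planning phase, no edits
    history.append("resume with full permissions and execute the plan")
    history.append(agent(history, True))   # execution sees the whole history
    return history


def run_plan_clear(agent: Agent, task: str) -> List[str]:
    """Plan read-only, keep only the plan text, execute in a new session."""
    plan_md = agent([task], False)         # stands in for the saved PLAN.md
    fresh = [f"execute the plan in PLAN.md:\n{plan_md}"]  # assumption: plan only
    fresh.append(agent(fresh, True))       # execution sees just the plan
    return fresh


def echo_agent(history: List[str], can_edit: bool) -> str:
    """Fake agent that reports how much context it was given."""
    phase = "execute" if can_edit else "plan"
    return f"{phase} after seeing {len(history)} message(s)"


for runner in (run_one_shot, run_plan_resume, run_plan_clear):
    print(runner.__name__, "->", runner(echo_agent, "add pagination")[-1])
```

with the fake agent, the execution step in plan + resume sees three prior messages while the one in plan + clear sees a single message, which mirrors the "context is preserved" versus "context is completely cleared" distinction.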