Claude Code Cost Cheat Sheet
Only the flags that affect token spend. Set them under the env key in settings.json, or export them in your shell.
Snapshot from v2.1.83 (14 Apr 2026). Flags are internal, so verify against your version with
strings "$(which claude)" | grep -oE 'CLAUDE_CODE_[A-Z_]+' | sort -u.
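The check above can be extended to compare a settings file against the installed binary. What follows is a minimal sketch rather than an official tool: `list_flags` and `missing_flags` are hypothetical helper names, and the approach assumes flag names appear as plain text in whatever file `which claude` resolves to. It uses `grep -a` instead of `strings` so it also works when that file is a script.

```shell
# list_flags FILE: print the unique flag-like names embedded in FILE.
list_flags() {
  grep -aoE '(CLAUDE_CODE|ANTHROPIC)_[A-Z0-9_]+' "$1" | sort -u
}

# missing_flags SETTINGS BINARY: print names used in SETTINGS that do not
# appear as whole words anywhere in BINARY (likely renamed or removed).
missing_flags() {
  for flag in $(list_flags "$1"); do
    grep -aqwF -- "$flag" "$2" || printf '%s\n' "$flag"
  done
}

# Example (not run here):
#   missing_flags ~/.claude/settings.json "$(which claude)"
```

A name that is printed is probably stale for your version. A name that is not printed only proves the string exists in the binary, not that it still behaves as described here.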
Big Wins
These target the largest token pools: thinking, system prompt size, and model selection.
| Flag | Value | What it does |
|---|---|---|
| CLAUDE_CODE_DISABLE_THINKING | 1 | Eliminates all extended thinking tokens. The single biggest cost lever on complex tasks. |
| CLAUDE_CODE_EFFORT_LEVEL | low / medium | Shrinks the thinking budget without fully disabling it. max is Opus-only. Overrides user settings. |
| CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING | 1 | Switches from dynamic per-task thinking budgets to fixed ones. More predictable spend. |
| CLAUDE_CODE_DISABLE_CLAUDE_MDS | 1 | Drops all CLAUDE.md files from the system prompt. Nuclear option: big savings if your CLAUDE.md files are large. |
| CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS | 1 | Strips git status, branch info, recent commits, and the git workflow guide from the system prompt. |
| CLAUDE_CODE_SUBAGENT_MODEL | sonnet / haiku | Routes all subagent (Agent tool) calls to a cheaper model. |
| ANTHROPIC_SMALL_FAST_MODEL | e.g. claude-haiku-4-5-20251001 | Cheaper model for internal lightweight tasks: hook evaluation, classification, utility ops. |
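Before committing any of these to settings.json, a flag can be trialled for a single session. A small sketch, assuming a POSIX shell; `try_flag` is a hypothetical helper and not part of Claude Code:

```shell
# try_flag NAME=VALUE CMD [ARGS...]: run CMD with one extra environment
# variable set, leaving the calling shell's environment unchanged.
try_flag() {
  assignment=$1
  shift
  env "$assignment" "$@"
}

# Example (not run here):
#   try_flag CLAUDE_CODE_DISABLE_THINKING=1 claude
```

Because the variable only exists for that one process, closing the session undoes the change, which makes it easy to compare token usage with and without a flag.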
Moderate Savings
Context window management, file reads, and feature toggles.
| Flag | Value | What it does |
|---|---|---|
| CLAUDE_CODE_SM_COMPACT | 1 | Enables Session Memory compaction, which auto-summarises conversation history (10k–40k token window). Major savings on long sessions. |
| CLAUDE_CODE_AUTO_COMPACT_WINDOW | integer (tokens) | Overrides the compaction window size. Lower = earlier compaction = less context carried forward. |
| CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS | integer (default ~25000) | Caps tokens per file read. Lowering this forces targeted reads instead of loading whole files. |
| CLAUDE_CODE_MAX_OUTPUT_TOKENS | integer | Hard cap on output tokens per response. |
| CLAUDE_CODE_DISABLE_AUTO_MEMORY | 1 | Stops auto-memory reads/writes. Removes memory content from the context window. |
| CLAUDE_CODE_DISABLE_ATTACHMENTS | 1 | Disables file attachment generation (references, diagnostics). Reduces per-turn context. |
| CLAUDE_CODE_DISABLE_ADVISOR_TOOL | 1 | Removes the Advisor tool schema from the system prompt. One fewer tool definition. |
| CLAUDE_CODE_DISABLE_BACKGROUND_TASKS | 1 | Prevents background tasks from making additional API calls. |
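When choosing a value for CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS, it can help to see which files in a project would exceed a given cap. The sketch below is only a rough guide: it assumes about 4 bytes per token, which is a crude heuristic and not the tokenizer Claude Code actually uses, and `approx_tokens` and `over_cap` are hypothetical helper names.

```shell
# approx_tokens FILE: rough token estimate, ASSUMING ~4 bytes per token.
approx_tokens() {
  bytes=$(wc -c < "$1")
  echo $(( (bytes + 3) / 4 ))
}

# over_cap CAP FILE...: print "estimate<TAB>path" for each file whose
# estimate exceeds CAP tokens.
over_cap() {
  cap=$1
  shift
  for f in "$@"; do
    est=$(approx_tokens "$f")
    [ "$est" -gt "$cap" ] && printf '%s\t%s\n' "$est" "$f"
  done
  return 0
}

# Example (not run here):
#   over_cap 10000 src/*.ts
```

Files that show up in the output are the ones a lower cap would turn into partial reads, so they are worth checking before tightening the limit.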
Minor Savings
Small per-session or per-turn reductions that add up over time.
| Flag | Value | What it does |
|---|---|---|
| CLAUDE_CODE_BRIEF | 1 | Encourages terser responses across all output. |
| CLAUDE_CODE_SKIP_PROMPT_HISTORY | 1 | Skips loading previous prompt history at session start. |
| CLAUDE_CODE_MCP_INSTR_DELTA | 1 | Sends only MCP instruction additions/removals instead of full sets. Saves tokens when MCP servers change. |
| CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS | 1 | Disables beta features that may carry token overhead. |
| CLAUDE_CODE_MAX_RETRIES | integer (default varies) | Fewer retries = fewer wasted tokens on re-sent requests after failures. |
Model Selection (cost, not tokens)
These don't change token counts but directly affect price per token.
| Flag | Value | What it does |
|---|---|---|
| ANTHROPIC_MODEL | model ID | Override the primary model entirely. Cheaper model = lower cost. |
| ANTHROPIC_DEFAULT_SONNET_MODEL | model ID | Override which model maps to the "sonnet" tier. |
| ANTHROPIC_DEFAULT_HAIKU_MODEL | model ID | Override which model maps to the "haiku" tier. |
| ANTHROPIC_DEFAULT_OPUS_MODEL | model ID | Override which model maps to the "opus" tier. |
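The effect of switching models is simple arithmetic: same token counts, different price per token. A minimal sketch follows; `cost_usd` is a hypothetical helper, and the prices in the example are placeholders, not real rates, so substitute the current per-million-token prices for your models. It also ignores cache read/write pricing.

```shell
# cost_usd INPUT_TOKENS OUTPUT_TOKENS IN_PRICE OUT_PRICE
# Prices are USD per million tokens. Prints the estimated cost in USD.
cost_usd() {
  LC_ALL=C awk -v i="$1" -v o="$2" -v pin="$3" -v pout="$4" \
    'BEGIN { printf "%.2f\n", (i * pin + o * pout) / 1000000 }'
}

# Example with PLACEHOLDER prices (not real rates):
#   cost_usd 2000000 500000 3 15    # 2M input, 500k output tokens
#   cost_usd 2000000 500000 1 5     # same usage on a cheaper tier
```

Running both lines with your real prices shows how much of a session's cost comes from model choice alone, before any token-reducing flag is applied.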
Ready-to-Paste Config
A sensible cost-optimised settings.json. Keeps Opus as the main model but trims context aggressively and offloads lightweight work to cheaper models.
    {
      "effortLevel": "medium",
      "env": {
        "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
        "CLAUDE_CODE_DISABLE_AUTO_MEMORY": "1",
        "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1",
        "CLAUDE_CODE_SUBAGENT_MODEL": "sonnet",
        "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "10000",
        "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "16000",
        "CLAUDE_CODE_BRIEF": "1",
        "ANTHROPIC_SMALL_FAST_MODEL": "claude-haiku-4-5-20251001"
      }
    }
Aggressive variant: use these values for maximum savings, at the cost of capability.
    {
      "effortLevel": "low",
      "env": {
        "CLAUDE_CODE_DISABLE_THINKING": "1",
        "CLAUDE_CODE_DISABLE_CLAUDE_MDS": "1",
        "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1",
        "CLAUDE_CODE_DISABLE_AUTO_MEMORY": "1",
        "CLAUDE_CODE_DISABLE_ATTACHMENTS": "1",
        "CLAUDE_CODE_DISABLE_ADVISOR_TOOL": "1",
        "CLAUDE_CODE_DISABLE_BACKGROUND_TASKS": "1",
        "CLAUDE_CODE_SUBAGENT_MODEL": "haiku",
        "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "5000",
        "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
        "CLAUDE_CODE_BRIEF": "1",
        "CLAUDE_CODE_SKIP_PROMPT_HISTORY": "1",
        "ANTHROPIC_SMALL_FAST_MODEL": "claude-haiku-4-5-20251001"
      }
    }
Applying Changes
You can edit settings.json directly, or use /update-config inside Claude Code. The command accepts questions about your config as well as change requests.
/update-config set effort level to low
/update-config add CLAUDE_CODE_BRIEF=1 to my env vars
/update-config use sonnet for subagents
/update-config disable auto memory and git instructions
/update-config {"env": {...}}
Three scopes: ~/.claude/settings.json (global), .claude/settings.json (project), .claude/settings.local.json (local override, gitignored).
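For a per-machine experiment that should not reach teammates, the local scope is the natural place. The sketch below writes a one-flag override there; `write_local_override` is a hypothetical helper, the chosen flag is just an example, and the assumption that the local file takes precedence over the project and global files is taken from the "local override" label above rather than verified here.

```shell
# write_local_override DIR: create DIR/.claude/settings.local.json with a
# single env override, refusing to clobber an existing file.
write_local_override() {
  dir=$1
  file="$dir/.claude/settings.local.json"
  mkdir -p "$dir/.claude"
  if [ -e "$file" ]; then
    echo "refusing to overwrite $file" >&2
    return 1
  fi
  cat > "$file" <<'EOF'
{
  "env": {
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}
EOF
}

# Example (not run here), from the project root:
#   write_local_override .
```

If a local file already exists, the helper stops rather than overwriting it; in that case edit the existing file by hand and add the key under its env object.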