Claude Code Token Cheat Sheet

Every env flag that affects token spend, grouped by impact.

14 April 2026

Claude Code Cost Cheat Sheet

Only the flags that affect token spend. Set via env in settings.json or export in your shell.
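For the shell route, a minimal sketch of trying flags for one session before committing them to settings.json. The two flags and the 10000 value are taken from the tables and config later in this sheet; nothing here is specific to Claude Code beyond the variable names.

```shell
# Export for the rest of this shell session; every child process inherits these.
# (For a single run, the prefix form also works: CLAUDE_CODE_BRIEF=1 claude)
export CLAUDE_CODE_BRIEF=1
export CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS=10000

# List the cost-related variables a child process would see.
env | grep -E '^(CLAUDE_CODE|ANTHROPIC)_' | sort

# Undo one of them for this session.
unset CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS
```

Exports vanish when the shell closes, which makes them a low-risk way to test a flag's effect before writing it into a settings file.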

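Snapshot from v2.1.83 (14 Apr 2026). Flags are internal, so verify them against your version with strings $(which claude) | grep -oE 'CLAUDE_CODE_[A-Z_]+'. That pattern only lists the CLAUDE_CODE_ names; the ANTHROPIC_ variables need their own search.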
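The version check can be wrapped into a small reusable test, sketched below. The function names list_flags and has_flag are made up for this example. The sketch also departs from the one-liner in two ways: it widens the pattern to include ANTHROPIC_ names, and it uses grep -a in place of strings so it does not depend on binutils being installed.

```shell
# list_flags FILE: print each distinct flag-like name found in FILE, one per line.
# grep -a treats a binary as text; -o prints only the matching part.
list_flags() {
  grep -aoE '(CLAUDE_CODE|ANTHROPIC)_[A-Z_]+' "$1" | sort -u
}

# has_flag FILE NAME: exit 0 only if NAME appears in FILE as a complete name.
has_flag() {
  list_flags "$1" | grep -qxF -- "$2"
}

# Example (assumes claude is on PATH): report cheat-sheet flags missing from your build.
# for f in CLAUDE_CODE_BRIEF CLAUDE_CODE_SM_COMPACT; do
#   has_flag "$(which claude)" "$f" || echo "not found: $f"
# done
```

A name being present in the binary only shows that the string exists in that build; it does not confirm what the flag does.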


Big Wins

These target the largest token pools: thinking, system prompt size, and model selection.

| Flag | Value | What it does |
| --- | --- | --- |
| CLAUDE_CODE_DISABLE_THINKING | 1 | Eliminates all extended thinking tokens. The single biggest cost lever on complex tasks. |
| CLAUDE_CODE_EFFORT_LEVEL | low / medium | Shrinks the thinking budget without fully disabling it. max is Opus-only. Overrides user settings. |
| CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING | 1 | Switches from dynamic per-task thinking budgets to fixed ones. More predictable spend. |
| CLAUDE_CODE_DISABLE_CLAUDE_MDS | 1 | Drops all CLAUDE.md files from the system prompt. Nuclear option: big savings if your CLAUDE.md files are large. |
| CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS | 1 | Strips git status, branch info, recent commits, and the git workflow guide from the system prompt. |
| CLAUDE_CODE_SUBAGENT_MODEL | sonnet / haiku | Routes all subagent (Agent tool) calls to a cheaper model. |
| ANTHROPIC_SMALL_FAST_MODEL | e.g. claude-haiku-4-5-20251001 | Cheaper model for internal lightweight tasks: hook evaluation, classification, utility ops. |

Moderate Savings

Context window management, file reads, and feature toggles.

| Flag | Value | What it does |
| --- | --- | --- |
| CLAUDE_CODE_SM_COMPACT | 1 | Enables Session Memory compaction, which auto-summarises conversation history (10k–40k token window). Major savings on long sessions. |
| CLAUDE_CODE_AUTO_COMPACT_WINDOW | integer (tokens) | Overrides the compaction window size. Lower = earlier compaction = less context carried forward. |
| CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS | integer (default ~25000) | Caps tokens per file read. Lowering this forces targeted reads instead of loading whole files. |
| CLAUDE_CODE_MAX_OUTPUT_TOKENS | integer | Hard cap on output tokens per response. |
| CLAUDE_CODE_DISABLE_AUTO_MEMORY | 1 | Stops auto-memory reads/writes. Removes memory content from the context window. |
| CLAUDE_CODE_DISABLE_ATTACHMENTS | 1 | Disables file attachment generation (references, diagnostics). Reduces per-turn context. |
| CLAUDE_CODE_DISABLE_ADVISOR_TOOL | 1 | Removes the Advisor tool schema from the system prompt. One fewer tool definition. |
| CLAUDE_CODE_DISABLE_BACKGROUND_TASKS | 1 | Prevents background tasks from making additional API calls. |

Minor Savings

Small per-session or per-turn reductions that add up over time.

| Flag | Value | What it does |
| --- | --- | --- |
| CLAUDE_CODE_BRIEF | 1 | Encourages terser responses across all output. |
| CLAUDE_CODE_SKIP_PROMPT_HISTORY | 1 | Skips loading previous prompt history at session start. |
| CLAUDE_CODE_MCP_INSTR_DELTA | 1 | Sends only MCP instruction additions/removals instead of full sets. Saves tokens when MCP servers change. |
| CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS | 1 | Disables beta features that may carry token overhead. |
| CLAUDE_CODE_MAX_RETRIES | integer (default varies) | Fewer retries = fewer wasted tokens on re-sent requests after failures. |

Model Selection (cost, not tokens)

These don't change token counts but directly affect price per token.

| Flag | Value | What it does |
| --- | --- | --- |
| ANTHROPIC_MODEL | model ID | Override the primary model entirely. Cheaper model = lower cost. |
| ANTHROPIC_DEFAULT_SONNET_MODEL | model ID | Override which model maps to the "sonnet" tier. |
| ANTHROPIC_DEFAULT_HAIKU_MODEL | model ID | Override which model maps to the "haiku" tier. |
| ANTHROPIC_DEFAULT_OPUS_MODEL | model ID | Override which model maps to the "opus" tier. |
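To see why model choice matters even when token counts stay fixed, here is a rough arithmetic sketch. The helper name cost_usd is hypothetical, and the per-million-token prices are placeholders chosen only to make the sums easy to follow; substitute the current published rates for the models you actually use.

```shell
# cost_usd INPUT_TOKENS OUTPUT_TOKENS INPUT_PRICE OUTPUT_PRICE
# Prices are in USD per million tokens. Prints the total in USD.
cost_usd() {
  awk -v it="$1" -v ot="$2" -v ip="$3" -v op="$4" \
    'BEGIN { printf "%.2f\n", (it * ip + ot * op) / 1000000 }'
}

# Same workload (2M input tokens, 200k output tokens) at two placeholder price points.
cost_usd 2000000 200000 15 75   # prints 45.00
cost_usd 2000000 200000 3 15    # prints 9.00
```

The token counts are identical in both calls; only the price per token changes, which is the effect the flags in this section have.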

Ready-to-Paste Config

A sensible cost-optimised settings.json. It leaves the main model alone (so Opus stays if that is what you run), trims context aggressively, and offloads lightweight work to cheaper models.

{
  "effortLevel": "medium",
  "env": {
    "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
    "CLAUDE_CODE_DISABLE_AUTO_MEMORY": "1",
    "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1",
    "CLAUDE_CODE_SUBAGENT_MODEL": "sonnet",
    "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "10000",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "16000",
    "CLAUDE_CODE_BRIEF": "1",
    "ANTHROPIC_SMALL_FAST_MODEL": "claude-haiku-4-5-20251001"
  }
}

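Aggressive variant, for maximum savings at the cost of capability. It is a complete config in its own right and repeats some keys from the one above with stricter values, so use it as a replacement: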

{
  "effortLevel": "low",
  "env": {
    "CLAUDE_CODE_DISABLE_THINKING": "1",
    "CLAUDE_CODE_DISABLE_CLAUDE_MDS": "1",
    "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1",
    "CLAUDE_CODE_DISABLE_AUTO_MEMORY": "1",
    "CLAUDE_CODE_DISABLE_ATTACHMENTS": "1",
    "CLAUDE_CODE_DISABLE_ADVISOR_TOOL": "1",
    "CLAUDE_CODE_DISABLE_BACKGROUND_TASKS": "1",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku",
    "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "5000",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
    "CLAUDE_CODE_BRIEF": "1",
    "CLAUDE_CODE_SKIP_PROMPT_HISTORY": "1",
    "ANTHROPIC_SMALL_FAST_MODEL": "claude-haiku-4-5-20251001"
  }
}

Applying Changes

You can edit settings.json directly or use /update-config inside Claude Code. The command can also be used to discuss your config, not only to change it.

/update-config set effort level to low
/update-config add CLAUDE_CODE_BRIEF=1 to my env vars
/update-config use sonnet for subagents
/update-config disable auto memory and git instructions
/update-config {"env": {...}}

Three scopes: ~/.claude/settings.json (global), .claude/settings.json (project), .claude/settings.local.json (local override, gitignored).
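When a flag does not seem to take effect, it helps to see which of the three files mention it. The sketch below does that with a plain text search; the function name where_set is made up for this example, and it does not parse JSON, so a key that appears anywhere in a file counts as a hit.

```shell
# where_set NAME: print every settings file (of the three scopes) that mentions NAME.
# Project paths are relative to the current directory, so run it from the project root.
where_set() {
  for f in "$HOME/.claude/settings.json" .claude/settings.json .claude/settings.local.json; do
    if [ -f "$f" ]; then
      # -l prints the file name on a match; -F matches the quoted key literally.
      grep -lF -- "\"$1\"" "$f" || true
    fi
  done
}

# Example: where_set CLAUDE_CODE_BRIEF
```

Files are listed from the global scope to the local override, matching the order in the line above.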