effort-levels

Purpose

This document describes the effort parameter in Claude Opus 4.6: how the effort levels (low, medium, high, max) differ and how each one affects behavior, performance, and token usage.

Overview

The effort parameter allows developers to control how eager Claude is about spending tokens when responding to requests. This enables trading off between response thoroughness and token efficiency with a single model.

Supported Models:

Claude Opus 4.6
Claude Opus 4.5

Key Point: For Claude Opus 4.6, effort replaces budget_tokens as the recommended way to control thinking depth when using adaptive thinking (thinking: {type: "adaptive"}).

How Effort Works

By default, Claude uses high effort—spending as many tokens as needed for excellent results. You can:

Raise the effort level to max for absolute highest capability
Lower it to be more conservative with token usage, optimizing for speed and cost while accepting some reduction in capability

Important: Effort is a behavioral signal, not a strict token budget. At lower effort levels, Claude will still think on sufficiently difficult problems—it will just think less than it would at higher effort levels for the same problem.

Effort Levels Comparison

Level	Description	Typical Use Case	Token Usage
max	Absolute maximum capability with no constraints on token spending (Opus 4.6 only)	Tasks requiring the deepest possible reasoning and most thorough analysis	Highest
high (default)	High capability. Equivalent to not setting the parameter	Complex reasoning, difficult coding problems, agentic tasks	High
medium	Balanced approach with moderate token savings	Agentic tasks requiring balance of speed, cost, and performance	Moderate
low	Most efficient. Significant token savings with some capability reduction	Simpler tasks needing best speed and lowest costs, such as subagents	Lowest

Note: Requesting max effort on models other than Opus 4.6 will return an error.
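
Because max is rejected on other models, a caller may want to catch the mismatch before sending a request. The following is a minimal sketch of such a client-side check, not part of the SDK. The names validate_effort, EFFORT_LEVELS, and MODELS_SUPPORTING_MAX are hypothetical, and the only model ID used is the claude-opus-4-6 string from this document's API example.

```python
# Hypothetical client-side guard; the API remains the source of truth.
EFFORT_LEVELS = ("low", "medium", "high", "max")

# Assumption: the model ID string used in this document's API example.
MODELS_SUPPORTING_MAX = {"claude-opus-4-6"}


def validate_effort(model: str, effort: str) -> None:
    """Raise ValueError for an unknown level or for max on an unsupported model."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(
            f"unknown effort level {effort!r}; expected one of {EFFORT_LEVELS}"
        )
    if effort == "max" and model not in MODELS_SUPPORTING_MAX:
        raise ValueError(f"effort 'max' is not available on model {model!r}")
```

Calling validate_effort(model, effort) just before building a request turns a rejected API call into an immediate local error.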

Behavioral Differences by Effort Level

Token Spending

The effort parameter affects all tokens in the response, including:

Text responses and explanations
Tool calls and function arguments
Extended thinking (when enabled)

Thinking Behavior

At high and max effort:

Almost always thinks deeply
Provides detailed reasoning
Explores multiple approaches

At medium effort:

Takes a balanced approach to thinking
May skip some redundant analysis

At low effort:

May skip thinking for simpler problems
Focuses on efficiency
Gives direct answers with minimal elaboration

Tool Use Behavior

Lower effort levels (low, medium) tend to:

Combine multiple operations into fewer tool calls
Make fewer overall tool calls
Proceed directly to action without preamble
Use terse confirmation messages after completion

Higher effort levels (high, max) may:

Make more tool calls
Explain the plan before taking action
Provide detailed summaries of changes
Include more comprehensive code comments

Performance Impact

Speed

Low effort: Fastest responses due to fewer tokens generated
Medium effort: Moderate speed with balanced quality
High effort (default): Slower responses due to thorough analysis
Max effort: Slowest responses with deepest reasoning

Cost

Token usage directly impacts cost:

Low effort: Significant cost savings (fewest tokens)
Medium effort: Moderate cost savings
High effort: Standard cost (baseline)
Max effort: Highest cost (most tokens)

Quality/Capability

Low effort: Some capability reduction, suitable for simple tasks
Medium effort: Solid performance with good quality
High effort: Excellent results, Claude’s best standard work
Max effort: Absolute highest capability and most thorough analysis

When to Use Each Level

Use Max Effort When:

You need the absolute highest capability with no constraints
Tasks require the most thorough reasoning and deepest analysis
Quality is paramount regardless of cost or latency
You are using Opus 4.6 (max is not available on other models)

Use High Effort (Default) When:

You need Claude’s best work
Complex reasoning or nuanced analysis required
Difficult coding problems
Agentic tasks where quality is top priority
Any task where thoroughness matters most

Use Medium Effort When:

You want a balanced option
You need solid performance without the full token expenditure of high effort
Agentic tasks requiring balance of speed, cost, and performance
Good middle ground for most production use cases

Use Low Effort When:

Optimizing for speed (fewer tokens = faster responses)
Optimizing for cost
Simple classification tasks
Quick lookups
High-volume use cases where marginal quality improvements don’t justify additional latency or spend
Subagent tasks where efficiency is critical
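
The guidance above can be encoded as a small lookup table. This is only a sketch: the task category names and the choose_effort helper are hypothetical labels chosen for illustration, loosely following the lists above, and real workloads should be evaluated before a mapping like this is fixed.

```python
# Hypothetical task categories mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "classification": "low",
    "quick_lookup": "low",
    "subagent": "low",
    "balanced_agentic": "medium",
    "complex_reasoning": "high",
    "difficult_coding": "high",
    "deepest_analysis": "max",  # only valid on Opus 4.6
}


def choose_effort(task_kind: str, default: str = "high") -> str:
    """Return the effort level for a task kind, falling back to the default."""
    return EFFORT_BY_TASK.get(task_kind, default)
```

Unknown task kinds fall back to high, which matches the behavior of not setting the parameter at all.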

Usage with Adaptive Thinking

Claude Opus 4.6 uses adaptive thinking (thinking: {type: "adaptive"}), where effort is the recommended control for thinking depth.

Key points:

budget_tokens is deprecated on Opus 4.6 (still accepted but will be removed)
At high and max effort: Claude almost always thinks deeply
At lower effort levels: May skip thinking for simpler problems
Effort can be used with or without thinking enabled
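
To illustrate how the two settings sit side by side, the sketch below assembles request arguments with effort in output_config and, optionally, thinking set to adaptive. The build_request helper and its parameter names are hypothetical. The thinking and output_config shapes are taken from this document, and the sketch only builds a dictionary without calling the API.

```python
from typing import Any, Dict


def build_request(
    prompt: str,
    effort: str = "high",
    adaptive_thinking: bool = True,
    model: str = "claude-opus-4-6",
    max_tokens: int = 4096,
) -> Dict[str, Any]:
    """Build keyword arguments for a Messages API call (hypothetical helper)."""
    request: Dict[str, Any] = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
        # Effort applies whether or not thinking is enabled.
        "output_config": {"effort": effort},
    }
    if adaptive_thinking:
        # With adaptive thinking, effort is the recommended depth control.
        request["thinking"] = {"type": "adaptive"}
    return request
```

Assuming an SDK version that accepts these parameters, the result could be passed as client.messages.create(**build_request("...", effort="medium")).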

API Usage Example

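# Python: send one request at medium effort.
import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": "Analyze the trade-offs between microservices and monolithic architectures"
    }],
    # effort is set inside output_config, not as a top-level argument.
    output_config={
        "effort": "medium"
    }
)

# Thinking is not enabled here, so the first content block is expected to be text.
print(response.content[0].text)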

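// TypeScript: the same request at medium effort.
import Anthropic from '@anthropic-ai/sdk';

// Reads the API key from the ANTHROPIC_API_KEY environment variable.
const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 4096,
  messages: [{
    role: "user",
    content: "Analyze the trade-offs between microservices and monolithic architectures"
  }],
  // effort is set inside output_config, not as a top-level field.
  output_config: {
    effort: "medium"
  }
});

// Thinking is not enabled here, so the first content block is expected to be text.
console.log(response.content[0].text);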

Best Practices

Start with high: Begin at the default, then move to a lower effort level where trading some performance for token efficiency is acceptable
Use low for speed-sensitive or simple tasks: When latency matters or tasks are straightforward, low effort can significantly reduce response times and costs
Test your use case: The impact of effort levels varies by task type. Evaluate performance on your specific use cases before deploying
Consider dynamic effort: Adjust effort based on task complexity. Simple queries may warrant low effort while agentic coding and complex reasoning benefit from high effort
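
One way to test a use case is to run the same prompt at several effort levels and compare the output token counts. The sketch below is a hypothetical harness, not an official tool: compare_efforts and the run callback are names invented here, and run is assumed to return the response text together with its output token count.

```python
from typing import Callable, Dict, Sequence, Tuple

# Assumed callback shape: (prompt, effort) -> (response_text, output_tokens).
RunFn = Callable[[str, str], Tuple[str, int]]


def compare_efforts(
    prompt: str,
    run: RunFn,
    levels: Sequence[str] = ("low", "medium", "high"),
) -> Dict[str, int]:
    """Run one prompt at each effort level and collect output token counts."""
    usage: Dict[str, int] = {}
    for level in levels:
        _text, output_tokens = run(prompt, level)
        usage[level] = output_tokens
    return usage
```

In practice, run could wrap client.messages.create and read response.usage.output_tokens. Token counts alone do not measure quality, so the returned text should be scored against task-specific criteria as well.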

Common Misconceptions

“Effort is a strict token budget”: No, it’s a behavioral signal. Claude will still think when necessary at lower effort levels, just less than at higher levels.
“Setting effort requires thinking enabled”: No, effort affects all tokens (text, tool calls, thinking) and works with or without thinking enabled.
“Max effort is just slightly better than high”: No, max effort removes constraints on token spending and can be significantly more thorough than high effort.

Troubleshooting

If Opus 4.6 is “overthinking” simpler tasks:

Lower the effort parameter from the default (high) to medium
This reduces cost and latency while maintaining good quality

If responses are too brief or missing details:

Increase effort level (from low to medium or high)
Consider whether the task complexity warrants higher effort

If costs are too high:

Evaluate whether all tasks need high effort
Consider using medium for balanced tasks
Use low for simple, high-volume operations
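
The adjustments above amount to moving one step along the effort ladder. The following sketch shows one way to express that. The adjust_effort helper and the symptom names are hypothetical, and stepping up to max is off by default because max is only available on Opus 4.6.

```python
LADDER = ["low", "medium", "high", "max"]


def adjust_effort(current: str, symptom: str, allow_max: bool = False) -> str:
    """Move one step along the effort ladder in response to a symptom.

    symptom is one of the hypothetical labels "overthinking", "too_costly"
    (step down) or "too_brief" (step up).
    """
    i = LADDER.index(current)
    if symptom in ("overthinking", "too_costly"):
        i = max(i - 1, 0)
    elif symptom == "too_brief":
        top = LADDER.index("max") if allow_max else LADDER.index("high")
        # Never step down when the current level is already above the cap.
        i = max(i, min(i + 1, top))
    else:
        raise ValueError(f"unknown symptom: {symptom!r}")
    return LADDER[i]
```

For example, adjust_effort("high", "overthinking") returns "medium", matching the first troubleshooting case.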