effort-levels
Purpose
This document provides comprehensive information about the effort parameter in Claude Opus 4.6, explaining the differences between effort levels (low, medium, high, max) and their impact on behavior, performance, and token usage.
Overview
The effort parameter allows developers to control how eager Claude is about spending tokens when responding to requests. This enables trading off between response thoroughness and token efficiency with a single model.
Supported Models:
- Claude Opus 4.6
- Claude Opus 4.5
Key Point: For Claude Opus 4.6, effort replaces budget_tokens as the recommended way to control thinking depth when using adaptive thinking (thinking: {type: "adaptive"}).
How Effort Works
By default, Claude uses high effort—spending as many tokens as needed for excellent results. You can:
- Raise the effort level to max for absolute highest capability
- Lower it to be more conservative with token usage, optimizing for speed and cost while accepting some reduction in capability
Important: Effort is a behavioral signal, not a strict token budget. At lower effort levels, Claude will still think on sufficiently difficult problems—it will just think less than it would at higher effort levels for the same problem.
Effort Levels Comparison
| Level | Description | Typical Use Case | Token Usage |
|---|---|---|---|
| max | Absolute maximum capability with no constraints on token spending (Opus 4.6 only) | Tasks requiring the deepest possible reasoning and most thorough analysis | Highest |
| high (default) | High capability. Equivalent to not setting the parameter | Complex reasoning, difficult coding problems, agentic tasks | High |
| medium | Balanced approach with moderate token savings | Agentic tasks requiring balance of speed, cost, and performance | Moderate |
| low | Most efficient. Significant token savings with some capability reduction | Simpler tasks needing best speed and lowest costs, such as subagents | Lowest |
Note: Requesting max effort on models other than Opus 4.6 will return an error.
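A client-side check can catch this case before a request is sent. The sketch below is only an illustration: validate_effort is a hypothetical helper, and matching the model by the claude-opus-4-6 prefix (the model ID used in this document's API example) is an assumption about how model IDs are named.

```python
VALID_EFFORT_LEVELS = ("low", "medium", "high", "max")

# Assumption: Opus 4.6 model IDs start with this prefix.
MAX_EFFORT_MODEL_PREFIX = "claude-opus-4-6"


def validate_effort(model: str, effort: str) -> str:
    """Return effort unchanged if it is allowed for model, else raise ValueError."""
    if effort not in VALID_EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    if effort == "max" and not model.startswith(MAX_EFFORT_MODEL_PREFIX):
        raise ValueError(
            f"effort 'max' is only supported on Opus 4.6, got model {model!r}"
        )
    return effort
```

Raising locally gives a clearer message than waiting for the API error, at the cost of keeping the prefix in sync with the models you use.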
Behavioral Differences by Effort Level
Token Spending
The effort parameter affects all tokens in the response, including:
- Text responses and explanations
- Tool calls and function arguments
- Extended thinking (when enabled)
Thinking Behavior
At high and max effort:
- Claude will almost always think deeply
- Provides detailed reasoning
- Explores multiple approaches
At medium effort:
- Balanced thinking approach
- May skip some redundant analysis
At low effort:
- May skip thinking for simpler problems
- Focuses on efficiency
- Direct answers with minimal elaboration
Tool Use Behavior
Lower effort levels (low, medium) tend to:
- Combine multiple operations into fewer tool calls
- Make fewer overall tool calls
- Proceed directly to action without preamble
- Use terse confirmation messages after completion
Higher effort levels (high, max) may:
- Make more tool calls
- Explain the plan before taking action
- Provide detailed summaries of changes
- Include more comprehensive code comments
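One way to observe these differences in your own workload is to count tool calls in responses collected at each effort level. The sketch below assumes the response content has been converted to plain dictionaries with a type field, where tool calls appear as tool_use blocks. count_tool_calls is a hypothetical helper, and the example content is hand-written, not real model output.

```python
from typing import Any


def count_tool_calls(content: list[dict[str, Any]]) -> int:
    """Count the tool_use blocks in one response's content list."""
    return sum(1 for block in content if block.get("type") == "tool_use")


# Hand-written example content for illustration only.
example_content = [
    {"type": "text", "text": "Reading the file first."},
    {"type": "tool_use", "id": "toolu_example_1", "name": "read_file",
     "input": {"path": "app.py"}},
    {"type": "tool_use", "id": "toolu_example_2", "name": "run_tests",
     "input": {}},
]

print(count_tool_calls(example_content))  # prints 2
```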
Performance Impact
Speed
- Low effort: Fastest responses due to fewer tokens generated
- Medium effort: Moderate speed with balanced quality
- High effort (default): Slower responses due to thorough analysis
- Max effort: Slowest responses with deepest reasoning
Cost
Token usage directly impacts cost:
- Low effort: Significant cost savings (fewest tokens)
- Medium effort: Moderate cost savings
- High effort: Standard cost (baseline)
- Max effort: Highest cost (most tokens)
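To put numbers on these differences for your own workload, you can compare output-token counts measured at each level against the high baseline. The sketch below is a hypothetical helper. The counts in the example are placeholders chosen for easy arithmetic, not measurements. In practice they would come from your own runs, for example from the usage.output_tokens field of each response.

```python
def relative_output_tokens(
    measured: dict[str, int], baseline: str = "high"
) -> dict[str, float]:
    """Express each level's output-token count as a fraction of the baseline."""
    base = measured[baseline]
    if base <= 0:
        raise ValueError("baseline token count must be positive")
    return {level: tokens / base for level, tokens in measured.items()}


# Placeholder counts, NOT real measurements.
placeholder_counts = {"low": 500, "medium": 1000, "high": 2000}

print(relative_output_tokens(placeholder_counts))
# {'low': 0.25, 'medium': 0.5, 'high': 1.0}
```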
Quality/Capability
- Low effort: Some capability reduction, suitable for simple tasks
- Medium effort: Solid performance with good quality
- High effort: Excellent results, Claude’s best standard work
- Max effort: Absolute highest capability and most thorough analysis
When to Use Each Level
Use Max Effort When:
- You need the absolute highest capability with no constraints
- Tasks require the most thorough reasoning and deepest analysis
- Quality is paramount regardless of cost or latency
- You are using Opus 4.6 (max is not available on other models)
Use High Effort (Default) When:
- You need Claude’s best work
- Complex reasoning or nuanced analysis required
- Difficult coding problems
- Agentic tasks where quality is top priority
- Any task where thoroughness matters most
Use Medium Effort When:
- You want a balanced option
- Need solid performance without full token expenditure of high effort
- Agentic tasks requiring balance of speed, cost, and performance
- Good middle ground for most production use cases
Use Low Effort When:
- Optimizing for speed (fewer tokens = faster responses)
- Optimizing for cost
- Simple classification tasks
- Quick lookups
- High-volume use cases where marginal quality improvements don’t justify additional latency or spend
- Subagent tasks where efficiency is critical
Usage with Adaptive Thinking
Claude Opus 4.6 uses adaptive thinking (thinking: {type: "adaptive"}), where effort is the recommended control for thinking depth.
Key points:
- budget_tokens is deprecated on Opus 4.6 (still accepted but will be removed)
- At high and max effort: Claude almost always thinks deeply
- At lower effort levels: May skip thinking for simpler problems
- Effort can be used with or without thinking enabled
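The points above can be sketched as a small request builder that sets effort and optionally turns on adaptive thinking. build_request is a hypothetical helper. The thinking and output_config shapes are the ones shown in this document, and the function only builds the keyword arguments, so it runs without network access.

```python
from typing import Any


def build_request(
    prompt: str, effort: str = "high", adaptive_thinking: bool = True
) -> dict[str, Any]:
    """Build keyword arguments for client.messages.create()."""
    request: dict[str, Any] = {
        "model": "claude-opus-4-6",
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
        "output_config": {"effort": effort},
    }
    if adaptive_thinking:
        # Effort works with or without thinking, so this key is optional.
        request["thinking"] = {"type": "adaptive"}
    return request


# With a client created as in the API Usage Example section:
# response = client.messages.create(**build_request("...", effort="medium"))
```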
API Usage Example
Python:

    import anthropic

    client = anthropic.Anthropic()

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": "Analyze the trade-offs between microservices and monolithic architectures"
        }],
        # Effort is set through output_config; omit it to use the default (high).
        output_config={
            "effort": "medium"
        }
    )

    print(response.content[0].text)

TypeScript:

    import Anthropic from '@anthropic-ai/sdk';

    const client = new Anthropic();

    const response = await client.messages.create({
      model: "claude-opus-4-6",
      max_tokens: 4096,
      messages: [{
        role: "user",
        content: "Analyze the trade-offs between microservices and monolithic architectures"
      }],
      // Effort is set through output_config; omit it to use the default (high).
      output_config: {
        effort: "medium"
      }
    });

    console.log(response.content[0].text);

Best Practices
- Start with high: Begin at the default, then move to lower effort levels when you want to trade off some performance for token efficiency
- Use low for speed-sensitive or simple tasks: When latency matters or tasks are straightforward, low effort can significantly reduce response times and costs
- Test your use case: The impact of effort levels varies by task type. Evaluate performance on your specific use cases before deploying
- Consider dynamic effort: Adjust effort based on task complexity. Simple queries may warrant low effort while agentic coding and complex reasoning benefit from high effort
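Dynamic effort can be as simple as a lookup from task category to effort level. The sketch below is one possible arrangement, not a prescribed one: the category names are hypothetical, and the mapping follows the guidance in this document (low for simple or high-volume work, medium for balanced agentic tasks, high for complex reasoning and agentic coding).

```python
# Hypothetical task categories mapped to effort levels.
EFFORT_BY_TASK = {
    "classification": "low",
    "lookup": "low",
    "subagent": "low",
    "agentic_balanced": "medium",
    "complex_reasoning": "high",
    "agentic_coding": "high",
}


def choose_effort(task_type: str, default: str = "high") -> str:
    """Pick an effort level for a task, falling back to the default level."""
    return EFFORT_BY_TASK.get(task_type, default)


print(choose_effort("classification"))  # prints low
print(choose_effort("unknown_task"))    # prints high
```

Falling back to high for unrecognized categories mirrors the API default, so an unmapped task behaves as if effort had not been set.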
Common Misconceptions
- “Effort is a strict token budget”: No, it’s a behavioral signal. Claude will still think when necessary at lower effort levels, just less than at higher levels.
- “Setting effort requires thinking enabled”: No, effort affects all tokens (text, tool calls, thinking) and works with or without thinking enabled.
- “Max effort is just slightly better than high”: No, max effort provides the absolute highest capability with no constraints on token spending, and can be significantly more thorough than high effort.
Troubleshooting
If Opus 4.6 is “overthinking” simpler tasks:
- Adjust the effort parameter from the default high to medium
- This reduces cost and latency while maintaining good quality
If responses are too brief or missing details:
- Increase the effort level (from low to medium or high)
- Consider if the task complexity warrants higher effort
If costs are too high:
- Evaluate if all tasks need high effort
- Consider using medium for balanced tasks
- Use low for simple, high-volume operations
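The adjustments above all move effort one step up or down an ordered scale. The sketch below shows that idea with a hypothetical step_effort helper. It leaves max out of the scale unless allow_max is set, since max is only available on Opus 4.6.

```python
EFFORT_ORDER = ["low", "medium", "high", "max"]


def step_effort(current: str, direction: int, allow_max: bool = False) -> str:
    """Move one level up (+1) or down (-1), staying within the allowed range."""
    if direction not in (-1, 1):
        raise ValueError("direction must be +1 or -1")
    levels = EFFORT_ORDER if allow_max else EFFORT_ORDER[:-1]
    if current not in levels:
        raise ValueError(f"effort level not allowed here: {current!r}")
    index = levels.index(current) + direction
    index = max(0, min(index, len(levels) - 1))  # clamp at both ends
    return levels[index]


print(step_effort("high", -1))  # prints medium (overthinking simple tasks)
print(step_effort("low", +1))   # prints medium (responses too brief)
```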