Purpose

This document describes the effort parameter in Claude Opus 4.6: how the effort levels (low, medium, high, max) differ and how each one affects behavior, performance, and token usage.

Overview

The effort parameter allows developers to control how eager Claude is about spending tokens when responding to requests. This enables trading off between response thoroughness and token efficiency with a single model.

Supported Models:

  • Claude Opus 4.6
  • Claude Opus 4.5

Key Point: For Claude Opus 4.6, effort replaces budget_tokens as the recommended way to control thinking depth when using adaptive thinking (thinking: {type: "adaptive"}).

How Effort Works

By default, Claude uses high effort—spending as many tokens as needed for excellent results. You can:

  • Raise the effort level to max for absolute highest capability
  • Lower it to be more conservative with token usage, optimizing for speed and cost while accepting some reduction in capability

Important: Effort is a behavioral signal, not a strict token budget. At lower effort levels, Claude will still think on sufficiently difficult problems—it will just think less than it would at higher effort levels for the same problem.

Effort Levels Comparison

Level | Description | Typical Use Case | Token Usage
max | Absolute maximum capability with no constraints on token spending (Opus 4.6 only) | Tasks requiring the deepest possible reasoning and most thorough analysis | Highest
high (default) | High capability. Equivalent to not setting the parameter | Complex reasoning, difficult coding problems, agentic tasks | High
medium | Balanced approach with moderate token savings | Agentic tasks requiring balance of speed, cost, and performance | Moderate
low | Most efficient. Significant token savings with some capability reduction | Simpler tasks needing best speed and lowest costs, such as subagents | Lowest

Note: Requesting max effort on models other than Opus 4.6 will return an error.
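
A client-side check can catch an unsupported level before the request is sent. The following is a minimal sketch, not part of any SDK: validate_effort and MODELS_SUPPORTING_MAX are hypothetical names, and the only model ID taken from this document is "claude-opus-4-6".

```python
# Hypothetical helper for illustration; not an SDK function.
EFFORT_LEVELS = ("low", "medium", "high", "max")

# Assumption: only this model ID accepts "max" (per the note above).
MODELS_SUPPORTING_MAX = {"claude-opus-4-6"}


def validate_effort(model: str, effort: str) -> str:
    """Return effort unchanged if it is valid for the model, else raise."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    if effort == "max" and model not in MODELS_SUPPORTING_MAX:
        raise ValueError(f"max effort is not supported on {model}")
    return effort
```

Calling validate_effort(model, effort) before building the request turns an API error into a local ValueError, which is usually easier to handle in calling code.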

Behavioral Differences by Effort Level

Token Spending

The effort parameter affects all tokens in the response, including:

  • Text responses and explanations
  • Tool calls and function arguments
  • Extended thinking (when enabled)

Thinking Behavior

At high and max effort:

  • Claude almost always thinks deeply
  • It provides detailed reasoning
  • It explores multiple approaches

At medium effort:

  • Balanced thinking approach
  • May skip some redundant analysis

At low effort:

  • May skip thinking for simpler problems
  • Focuses on efficiency
  • Direct answers with minimal elaboration

Tool Use Behavior

Lower effort levels (low, medium) tend to:

  • Combine multiple operations into fewer tool calls
  • Make fewer overall tool calls
  • Proceed directly to action without preamble
  • Use terse confirmation messages after completion

Higher effort levels (high, max) may:

  • Make more tool calls
  • Explain the plan before taking action
  • Provide detailed summaries of changes
  • Include more comprehensive code comments

Performance Impact

Speed

  • Low effort: Fastest responses due to fewer tokens generated
  • Medium effort: Moderate speed with balanced quality
  • High effort (default): Slower responses due to thorough analysis
  • Max effort: Slowest responses with deepest reasoning

Cost

Token usage directly impacts cost:

  • Low effort: Significant cost savings (fewest tokens)
  • Medium effort: Moderate cost savings
  • High effort: Standard cost (baseline)
  • Max effort: Highest cost (most tokens)
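
One way to quantify these savings is to compare output token counts for the same prompt at each effort level. The sketch below is illustrative only: savings_vs_baseline is a hypothetical helper, and the token counts are made-up numbers rather than measurements. In practice the counts could be read from the usage field of each API response.

```python
def savings_vs_baseline(
    output_tokens: dict[str, int], baseline: str = "high"
) -> dict[str, float]:
    """Fractional output-token savings of each level relative to the baseline.

    Positive values mean fewer tokens than the baseline; negative means more.
    """
    base = output_tokens[baseline]
    if base <= 0:
        raise ValueError("baseline token count must be positive")
    return {level: 1 - count / base for level, count in output_tokens.items()}


# Illustrative numbers only, not measured results.
example_counts = {"low": 400, "medium": 700, "high": 1000, "max": 1600}
example_savings = savings_vs_baseline(example_counts)
```

Because effort is a behavioral signal rather than a budget, the ratios will vary from prompt to prompt, so averaging over a representative sample is likely more informative than a single run.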

Quality/Capability

  • Low effort: Some capability reduction, suitable for simple tasks
  • Medium effort: Solid performance with good quality
  • High effort: Excellent results, Claude’s best standard work
  • Max effort: Absolute highest capability and most thorough analysis

When to Use Each Level

Use Max Effort When:

  • You need the absolute highest capability with no constraints
  • Tasks require the most thorough reasoning and deepest analysis
  • Quality is paramount regardless of cost or latency
  • You are running on Opus 4.6 (max is not available on other models)

Use High Effort (Default) When:

  • You need Claude’s best work
  • Complex reasoning or nuanced analysis required
  • Difficult coding problems
  • Agentic tasks where quality is top priority
  • Any task where thoroughness matters most

Use Medium Effort When:

  • You want a balanced option
  • You need solid performance without the full token expenditure of high effort
  • Agentic tasks requiring balance of speed, cost, and performance
  • Good middle ground for most production use cases

Use Low Effort When:

  • Optimizing for speed (fewer tokens = faster responses)
  • Optimizing for cost
  • Simple classification tasks
  • Quick lookups
  • High-volume use cases where marginal quality improvements don’t justify additional latency or spend
  • Subagent tasks where efficiency is critical

Usage with Adaptive Thinking

Claude Opus 4.6 uses adaptive thinking (thinking: {type: "adaptive"}), where effort is the recommended control for thinking depth.

Key points:

  • budget_tokens is deprecated on Opus 4.6 (still accepted but will be removed)
  • At high and max effort: Claude almost always thinks deeply
  • At lower effort levels: May skip thinking for simpler problems
  • Effort can be used with or without thinking enabled
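
The points above can be sketched as a small request builder that sets effort in every case and adds adaptive thinking only when requested. This is an assumption-laden illustration: build_request is a hypothetical helper, while the parameter names (output_config, effort, thinking) are the ones used elsewhere in this document.

```python
def build_request(
    prompt: str, effort: str = "high", adaptive_thinking: bool = True
) -> dict:
    """Assemble keyword arguments for a messages.create() call."""
    params = {
        "model": "claude-opus-4-6",
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
        # Effort applies with or without thinking enabled.
        "output_config": {"effort": effort},
    }
    if adaptive_thinking:
        # With adaptive thinking, effort (not budget_tokens) steers depth.
        params["thinking"] = {"type": "adaptive"}
    return params
```

The resulting dictionary could then be passed as client.messages.create(**build_request("...", effort="medium")). Keeping the builder separate from the network call also makes the configuration easy to unit test.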

API Usage Example

Python:

    import anthropic

    # The client reads the API key from the ANTHROPIC_API_KEY environment variable.
    client = anthropic.Anthropic()

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": "Analyze the trade-offs between microservices and monolithic architectures"
        }],
        # Effort is set under output_config.
        output_config={
            "effort": "medium"
        }
    )

    print(response.content[0].text)

TypeScript:

    import Anthropic from '@anthropic-ai/sdk';

    const client = new Anthropic();

    const response = await client.messages.create({
      model: "claude-opus-4-6",
      max_tokens: 4096,
      messages: [{
        role: "user",
        content: "Analyze the trade-offs between microservices and monolithic architectures"
      }],
      // Effort is set under output_config.
      output_config: {
        effort: "medium"
      }
    });

    console.log(response.content[0].text);

Best Practices

  1. Start with high: Begin at the default level, then move to a lower effort level where you are willing to trade some performance for token efficiency
  2. Use low for speed-sensitive or simple tasks: When latency matters or tasks are straightforward, low effort can significantly reduce response times and costs
  3. Test your use case: The impact of effort levels varies by task type. Evaluate performance on your specific use cases before deploying
  4. Consider dynamic effort: Adjust effort based on task complexity. Simple queries may warrant low effort while agentic coding and complex reasoning benefit from high effort
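
Dynamic effort selection can be as simple as a lookup from task category to effort level. The sketch below is one possible shape, not a prescribed method: the category names and the choose_effort helper are hypothetical, and the mapping follows the guidance in "When to Use Each Level".

```python
# Hypothetical task categories; adapt them to your own workload.
EFFORT_BY_TASK = {
    "classification": "low",
    "lookup": "low",
    "subagent": "low",
    "balanced_agentic": "medium",
    "complex_reasoning": "high",
    "agentic_coding": "high",
}


def choose_effort(task_type: str, default: str = "high") -> str:
    """Pick an effort level for a task category, falling back to the default."""
    return EFFORT_BY_TASK.get(task_type, default)
```

Falling back to high for unknown categories mirrors the API default, so an unclassified task behaves as if the parameter were never set.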

Common Misconceptions

  1. “Effort is a strict token budget”: No, it’s a behavioral signal. Claude will still think when necessary at lower effort levels, just less than at higher levels.

  2. “Setting effort requires thinking enabled”: No, effort affects all tokens (text, tool calls, thinking) and works with or without thinking enabled.

  3. “Max effort is just slightly better than high”: Not necessarily. Max effort removes constraints on token spending, so its output can be significantly more thorough than at high effort.

Troubleshooting

If Opus 4.6 is “overthinking” simpler tasks:

  • Lower the effort parameter from the default high to medium
  • This reduces cost and latency while maintaining good quality

If responses are too brief or missing details:

  • Increase effort level (from low to medium or high)
  • Consider if the task complexity warrants higher effort

If costs are too high:

  • Evaluate whether all tasks need high effort
  • Consider using medium for balanced tasks
  • Use low for simple, high-volume operations
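
The adjustments above amount to moving one step up or down an ordered list of levels. A minimal sketch follows, assuming a hypothetical step_effort helper; how "too brief" or "overthinking" is detected is left to the caller.

```python
# Effort levels ordered from least to most token spending.
LADDER = ["low", "medium", "high", "max"]


def step_effort(current: str, direction: int, allow_max: bool = False) -> str:
    """Move one step up (direction > 0) or down (otherwise), clamped at the ends.

    allow_max should be True only for models that accept max effort.
    """
    top = len(LADDER) - 1 if allow_max else LADDER.index("high")
    idx = LADDER.index(current) + (1 if direction > 0 else -1)
    return LADDER[max(0, min(idx, top))]
```

For example, a task that comes back too brief at low could be retried at step_effort("low", +1), while a task that overthinks at high could be moved to step_effort("high", -1).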
