Best Practices for Using Claude Opus 4.7 with Claude Code

The Quick Start (If You Just Want the Best Settings)

Already using Claude Code and want to skip the explanation? Here’s the short version.

Set your model to claude-opus-4-7. The default effort level in Claude Code is now xhigh — leave it there for most coding work. Set thinking.type to "adaptive" in your API calls so the model decides when to reason deeply and when to skip it. When running xhigh or max effort, set max_tokens to at least 64,000 so the model has room to think across tool calls and subagent loops. If you’re running autonomous agents, use task_budget (minimum 20,000 tokens) to cap total spend per agentic loop.

That’s the 80/20. Read on for why these settings matter and when to change them.
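
Put together, those settings look something like the request body below. This is a sketch only: the field names (effort, thinking, task_budget) follow this article’s description and are assumptions, not a confirmed API schema.

```python
# Sketch of the quick-start settings as a raw request payload.
# Field names (effort, thinking, task_budget) are taken from the
# article's description — assumptions, not a confirmed wire format.

def quick_start_payload(prompt: str) -> dict:
    """Build a request body using the article's recommended defaults."""
    return {
        "model": "claude-opus-4-7",
        "effort": "xhigh",                 # Claude Code default for Opus 4.7
        "thinking": {"type": "adaptive"},  # model decides when to reason deeply
        "max_tokens": 64_000,              # headroom for thinking + tool calls
        "task_budget": 20_000,             # advisory cap per agentic loop (the minimum)
        "messages": [{"role": "user", "content": prompt}],
    }
```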

What Actually Changed in Opus 4.7

Three things matter for Claude Code users.

First, the effort system got a new level. The scale used to run from low through medium, high, and max. Opus 4.7 adds xhigh — “extra high” — slotted between high and max. This isn’t just a label. Under xhigh, the model triggers its deep thinking mode more frequently, proactively reflects on intermediate results, backtracks when tool call paths fail, and initiates more exploratory tool calls. It’s the reasoning depth of max without the full token cost.

Second, adaptive thinking replaced the old fixed-thinking mode. Previously, extended thinking was either on or off. Now you can set thinking.type to "adaptive", and the model evaluates each request’s complexity before deciding whether and how much to think. Simple file renames get instant responses. Architecture decisions get deep reasoning. The model allocates its own cognitive budget per step.

Third, Opus 4.7 respects effort levels more strictly than 4.6. At low and medium effort, the model now scopes its work to exactly what you asked rather than going above and beyond. This sounds minor. In practice, it means your effort setting actually controls behavior now — it’s not a suggestion the model sometimes ignores.

How Effort Levels Work Now

Five tiers. Each one meaningfully different.

Low effort does exactly what’s asked and nothing more. Useful for simple file operations, quick lookups, formatting changes. The model won’t explore, won’t suggest alternatives, won’t refactor adjacent code. On Opus 4.7, low means low — the model takes it literally.

Medium effort balances token usage against intelligence. Good for cost-sensitive batch operations: running linters, applying known patterns across files, generating boilerplate. You’ll save 40-60% on tokens compared to xhigh, but the model won’t catch subtle bugs or suggest architectural improvements.

High effort was the previous default for most configurations. Still solid for complex reasoning, difficult coding problems, and tasks where quality matters more than speed. On Opus 4.6 and Sonnet 4.6, this remains the default. Think of it as the balanced option — for concurrent sessions, budget-conscious teams, and tasks that need good-but-not-maximum intelligence.

Xhigh effort is the new default for Opus 4.7 in Claude Code, and Anthropic’s recommended setting for most coding and agentic work. The model triggers adaptive thinking’s deep mode more frequently at this level. It will proactively reflect on what’s working, backtrack when a tool call path leads nowhere, and launch more exploratory calls. For designing APIs, migrating legacy code, reviewing large codebases — this is where you want to be.

Max effort removes all constraints. The most thorough reasoning, the deepest analysis, the highest token consumption. Anthropic’s own documentation warns that max can be “prone to overthinking” and may show “diminishing returns from increased token usage.” Use it when your evals show measurable improvement over xhigh. For most coding tasks, xhigh gets you 95% of the way there at a fraction of the cost.

The /effort command in Claude Code now opens an interactive slider when called without arguments. Arrow keys to navigate, Enter to confirm. You can also toggle effort mid-task — start at xhigh for the hard architecture work, drop to medium for the repetitive file changes.
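
One way to operationalize the tiers above is a simple task-to-effort mapping. Illustrative only: the tier names come from this article, while the task categories in this sketch are my own assumption.

```python
# Illustrative helper mapping task types to the effort tiers described
# above. Tier names follow the article; the task categories are this
# sketch's own (hypothetical) taxonomy.

EFFORT_BY_TASK = {
    "rename": "low",           # simple file ops: do exactly what's asked
    "lint_batch": "medium",    # cost-sensitive batch work
    "feature_work": "high",    # balanced quality vs spend
    "architecture": "xhigh",   # deep multi-step reasoning
    "formal_audit": "max",     # only when evals justify the extra cost
}

def pick_effort(task_type: str) -> str:
    # Fall back to xhigh, the article's recommended default for Opus 4.7.
    return EFFORT_BY_TASK.get(task_type, "xhigh")
```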

Adaptive Thinking: When the Model Thinks for Itself

This is the biggest conceptual shift. Before Opus 4.7, extended thinking was binary: you turned it on, and every request got the full reasoning treatment. Expensive. Slow. Often unnecessary.

Adaptive thinking makes reasoning optional at each step. Set thinking.type to "adaptive" in your API request, and the model decides per-interaction whether to activate deep reasoning. The decision depends on the complexity it detects in your request. Ask it to rename a variable across a file? No thinking needed. Direct response. Ask it to debug a race condition in concurrent database writes? Full reasoning mode activates.

The effort parameter controls how eagerly the model activates thinking. At xhigh, thinking kicks in frequently — the model assumes most coding tasks benefit from reflection. At medium, it’s more selective. At low, thinking rarely activates at all.

When to use adaptive vs. fixed thinking: Adaptive is the right choice for almost all Claude Code work. The model is good at judging complexity. Fixed thinking (setting thinking.type to "enabled") still makes sense in two situations: when every single request genuinely needs deep reasoning (formal verification, security audits), or when you need deterministic behavior and don’t want the model making its own decisions about when to think.

One practical note: if you have large or complex system prompts, the model might activate thinking more often than you’d like — it interprets prompt complexity as task complexity. Anthropic’s prompting guide recommends adding explicit guidance to steer thinking behavior. Something as simple as “respond directly for straightforward requests” in your system prompt can reduce unnecessary thinking activations.
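
The adaptive-vs-fixed choice can be sketched as a small config helper. The "adaptive" and "enabled" values follow this article; treat the exact field names as assumptions rather than a confirmed schema.

```python
# Sketch contrasting adaptive and fixed thinking configs, per the
# guidance above. Field names and values follow the article's
# description — assumptions, not a verified API schema.

def thinking_config(always_think: bool) -> dict:
    """Fixed thinking for audit-style work, adaptive for everything else."""
    if always_think:
        return {"type": "enabled"}   # every request gets deep reasoning
    return {"type": "adaptive"}      # model judges complexity per step

# Steering line for large system prompts, so the model doesn't mistake
# prompt complexity for task complexity (wording suggested above).
STEERING_HINT = "Respond directly for straightforward requests."
```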

New Defaults in Claude Code and Why They Changed

The default effort for Opus 4.7 in Claude Code is xhigh across all plans and providers. For Opus 4.6 and Sonnet 4.6, the default remains high (or medium on Pro and Max plans).

Why xhigh? Anthropic’s benchmarks show Opus 4.7 with xhigh effort delivers its biggest gains on advanced software engineering tasks — the kind Claude Code users actually do. The gap between high and xhigh is most pronounced on multi-step problems: debugging across files, refactoring with dependency tracking, architectural decisions that ripple through a codebase.

The max_tokens recommendation also changed. When running xhigh or max effort, Anthropic suggests starting at 64,000 tokens and tuning from there. This gives the model room to think and act across subagent spawns and tool call chains. Too-small token limits at high effort levels mean the model cuts reasoning short exactly when it shouldn’t.

Task budgets entered public beta alongside Opus 4.7. These are advisory caps on full agentic loops — thinking, tool calls, tool results, final output. The model sees a running countdown and self-paces, prioritizing work and finishing gracefully as the budget gets consumed. Minimum value is 20,000 tokens. Use task_budget for overall moderation and max_tokens as the per-request hard limit.
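
Those two numbers (the 64k max_tokens starting point and the 20k task_budget floor) are easy to encode as a sanity check. A minimal sketch: the thresholds come from this article, and the helper itself is illustrative, not an SDK feature.

```python
# Minimal validation sketch of the recommended limits described above.
# Thresholds come from the article; this helper is illustrative, not
# part of any real SDK.

MIN_TASK_BUDGET = 20_000         # stated beta minimum
RECOMMENDED_MAX_TOKENS = 64_000  # starting point for xhigh/max effort

def check_limits(effort: str, max_tokens: int, task_budget: int = 0) -> list:
    """Return warnings for settings that contradict the recommended defaults.

    task_budget of 0 means "not set" (budgets are optional/advisory).
    """
    warnings = []
    if effort in ("xhigh", "max") and max_tokens < RECOMMENDED_MAX_TOKENS:
        warnings.append("max_tokens below 64k may cut reasoning short")
    if task_budget and task_budget < MIN_TASK_BUDGET:
        warnings.append("task_budget below the 20,000-token minimum")
    return warnings
```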

Opus 4.7 vs. Sonnet: When to Use Which

This isn’t a contest. They’re different tools for different jobs.

Use Opus 4.7 for work that used to need your close supervision: multi-file debugging sessions, legacy code migration, architecture design, complex refactoring where one wrong move cascades. Opus 4.7 verifies its own outputs before reporting back. It handles long-running autonomous tasks with the kind of consistency that lets you walk away and check back later.

Use Sonnet 4.6 for high-volume, well-defined tasks: generating tests from specs, applying lint fixes, writing documentation, filling in boilerplate, handling straightforward code generation where the pattern is clear. Sonnet is faster, cheaper, and doesn’t carry the reasoning overhead that Opus brings.

Cost math: Opus 4.7 is priced at $5 per million input tokens and $25 per million output tokens. Same as Opus 4.6. Sonnet costs significantly less. If you’re running 50 coding sessions a day and half of them are routine, putting those routine tasks on Sonnet instead of Opus cuts your bill substantially without quality loss.

A practical split many teams use: Opus 4.7 at xhigh for the first pass on any complex task (architecture, initial implementation, debugging), then Sonnet for iteration, testing, and cleanup.
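
The cost math above works out as follows at the quoted Opus 4.7 list pricing ($5/M input, $25/M output). Sonnet’s exact price isn’t stated here, so only the Opus side is computed; the session sizes in the usage note are illustrative.

```python
# Worked cost example at the Opus 4.7 list pricing quoted above.
# Session token counts used below are illustrative assumptions.

OPUS_INPUT_PER_M = 5.00    # dollars per million input tokens
OPUS_OUTPUT_PER_M = 25.00  # dollars per million output tokens

def opus_session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one session at Opus 4.7 list pricing."""
    return (input_tokens / 1_000_000) * OPUS_INPUT_PER_M \
         + (output_tokens / 1_000_000) * OPUS_OUTPUT_PER_M
```

For instance, a session consuming 200k input and 40k output tokens comes to about $2.00, so fifty such sessions a day is roughly $100 — which is where routing the routine half to Sonnet starts to matter.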

Real Workflow Examples

Debugging a production issue across microservices. Set effort to xhigh. Let adaptive thinking do its job. The model will trace call chains across files, check error handling paths, and verify its fix against the original failure mode before suggesting changes. Opus 4.7’s improved instruction following means it won’t wander off into unrelated refactoring unless you ask.

Code review on a large PR. Xhigh effort. The model reads the full diff, checks for subtle bugs, evaluates architectural implications, and flags inconsistencies. At high effort, it catches surface issues. At xhigh, it catches the interaction effects — the kind of bugs that only show up when you think about how changes in file A affect the behavior of file B.

Refactoring a legacy module. Start at max effort for the initial analysis — understanding the existing code, mapping dependencies, identifying safe extraction points. Drop to xhigh for the actual refactoring work. Drop to medium for updating tests and documentation.

Scaffolding a new feature. Medium or high effort is usually enough. The patterns are known, the structure is defined. Save your xhigh and max tokens for the parts that actually require deep thinking.

Common Mistakes

Running everything at max. Max effort’s diminishing returns are real. Unless your evals prove otherwise, xhigh outperforms max on a cost-per-quality basis for nearly all coding tasks. Max triggers overthinking — the model second-guesses correct solutions, explores dead-end alternatives, and burns tokens on reasoning that doesn’t improve the output.

Setting max_tokens too low at high effort levels. If the model runs out of tokens mid-thought at xhigh, the output quality degrades sharply. 64k is the starting point. Tune up from there for complex tasks.

Ignoring the effort slider between tasks. Effort isn’t a set-and-forget setting. Toggle it. The /effort slider in Claude Code exists for this reason. Architecture review at xhigh, then boilerplate generation at medium, then back up for the tricky integration work.

Not using task budgets for autonomous work. Without a budget, a long-running agent can consume tokens unchecked. Task budgets give the model a countdown it can pace against. Set them.

Expecting identical subagent behavior. Opus 4.7 spawns fewer subagents by default than 4.6 did. If your workflow depends on parallel subagent execution, add explicit prompting guidance about when subagents are desirable. The behavior is steerable — but you have to steer it.

Performance and Cost Considerations

Adaptive thinking at xhigh uses roughly 20-30% fewer tokens than fixed thinking at the same effort level, because the model skips reasoning on steps that don’t benefit from it. Over a full day of Claude Code usage, that adds up.

Opus 4.7 maintains the same pricing as 4.6: $5 per million input tokens, $25 per million output. The cost efficiency gains come from smarter token allocation, not lower prices.

Agent teams — where multiple Claude instances work on a task simultaneously — use roughly 3-4x the tokens of a single sequential session doing the same work. Faster, yes. But budget accordingly. The new task budgets feature helps here: set a per-team-member budget so no single agent runs away with your token allocation.

For teams watching costs closely: the combination of adaptive thinking + effort toggling + task budgets gives you three independent levers to control spend without sacrificing output quality. Use all three.
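
The two multipliers above can be turned into back-of-envelope estimates. The midpoints used here (25% savings, 3.5x team overhead) are this sketch’s own assumptions within the ranges the article quotes.

```python
# Back-of-envelope estimates from the ranges quoted above: adaptive
# thinking saves roughly 20-30% vs fixed thinking, and agent teams use
# roughly 3-4x a single session. The midpoint defaults (0.25, 3.5) are
# assumptions, not measured figures.

def adaptive_token_estimate(fixed_thinking_tokens: int,
                            savings_rate: float = 0.25) -> int:
    """Estimated tokens with adaptive thinking, given fixed-thinking usage."""
    return int(fixed_thinking_tokens * (1 - savings_rate))

def team_token_estimate(single_session_tokens: int,
                        multiplier: float = 3.5) -> int:
    """Estimated total tokens for an agent team doing the same work."""
    return int(single_session_tokens * multiplier)
```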


Sources

Anthropic · Claude API Docs — Effort · Claude API Docs — Adaptive Thinking · Claude Code Docs — Model Configuration · Claude API Docs — Prompting Best Practices

This article is AI-generated.