Claude Opus 4.7 Just Dropped — Here’s What’s New

Claude Opus 4.7: Better coding, 3x vision & new security features

April 17, 2026
8 min read

The Distinction That Matters Before Anything Else

When Anthropic released Claude Mythos Preview on April 7, 2026, the announcement came with an important qualifier: it was not generally available. Access was restricted to 12 core partner organizations and roughly 40 additional companies as part of Project Glasswing, the company's controlled cybersecurity initiative. For the vast majority of developers and enterprise teams, Mythos was news they could read about but not use.

Claude Opus 4.7, released on April 16, 2026, is the answer to that gap. It is Anthropic's most capable model available to anyone today, accessible across all Claude products, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Pricing is unchanged from Opus 4.6 at $5 per million input tokens and $25 per million output tokens.

The model scores 87.6 percent on SWE-bench Verified and 64.3 percent on SWE-bench Pro. Both numbers put it ahead of GPT-5.4 and Gemini 3.1 Pro on the benchmarks that most directly map to real software engineering work. It trails Claude Mythos Preview on every reported benchmark, which is expected and intentional. What Anthropic has built with Opus 4.7 is a generally deployable model with meaningful advances on the tasks developers run in production, plus a new security strategy designed to pave the way toward eventually releasing Mythos-class capabilities more broadly.

What Is New in Opus 4.7

Coding performance. The headline numbers are real and significant. SWE-bench Verified climbed from 80.8 percent on Opus 4.6 to 87.6 percent, a 6.8 point gain in a single release. SWE-bench Pro moved from 53.4 percent to 64.3 percent. On CursorBench, which measures coding assistance quality in the editor environment most developers actually use, the score jumped from 58 percent to 70 percent, a 12 point improvement and the largest single-benchmark gain in the release. Rakuten reported 3x more production tasks resolved compared to Opus 4.6, with double-digit gains in code quality and test quality scores. CodeRabbit measured over 10 percent recall improvement.

Anthropic's own description of what changed at the capability level is precise: the model handles complex, long-running tasks with more rigor and consistency, pays closer attention to instructions, and devises ways to verify its own outputs before reporting back. The self-verification behavior is a genuine shift. Where Opus 4.6 would complete a task and return the result, Opus 4.7 checks its work first.

Vision at 3x the resolution. Maximum image resolution increased from 1,568 pixels on the long edge, roughly 1.15 megapixels, to 2,576 pixels, roughly 3.75 megapixels. This is more than three times the pixel capacity of previous Claude models, and the practical implications are significant for specific workflows.

Computer use and screenshot understanding benefit directly because the model's pixel coordinates now map one-to-one with actual screen pixels, removing the scale-factor math that was previously required. Document analysis improves because smaller text, fine diagram details, and handwritten notes that were previously illegible or unreliable become readable. Visual reasoning on CharXiv jumped from 68.7 percent to 82.1 percent, both measured without tools. The OSWorld-Verified computer use benchmark improved from 72.7 percent to 78.0 percent, ahead of GPT-5.4 at 75.0 percent.

Higher-resolution images consume more tokens. Teams passing images where fine detail is not the primary concern can downsample before sending to keep costs flat.
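For teams taking that downsampling route, the dimension math is simply capping the long edge at the previous 1,568-pixel limit while preserving aspect ratio. A minimal sketch (the function name is illustrative; the actual resize would be done with whatever image library you already use):

```python
def downsample_dims(width: int, height: int, max_long_edge: int = 1568) -> tuple:
    """Return (width, height) scaled so the longest edge is at most
    max_long_edge, preserving aspect ratio. 1568 was the long-edge cap
    for previous Claude models, so images at or below it cost the same
    number of tokens as before."""
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height  # already within the old limit; no resize needed
    scale = max_long_edge / long_edge
    return round(width * scale), round(height * scale)
```

Passing the result to your image library's resize call before upload keeps token consumption at Opus 4.6 levels for images where fine detail does not matter.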

The xhigh effort level. Opus 4.6 exposed four effort levels: low, medium, high, and max. Opus 4.7 inserts a new xhigh level between high and max. The purpose is to give developers finer control over the reasoning depth versus latency tradeoff on hard problems, without committing to the full cost and latency of max effort.

For most coding and agentic workflows, Anthropic recommends starting with high or xhigh. Claude Code has been updated to default to xhigh for all plans starting from release day. As Hex's CTO observed in early testing, low-effort Opus 4.7 is roughly equivalent to medium-effort Opus 4.6, meaning the efficiency improvement at lower effort levels effectively reduces cost for teams that do not need maximum reasoning depth.
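In practice, selecting an effort level amounts to one extra field on the request. The sketch below is illustrative only: the `effort` field name and request shape are assumptions based on this article's description, not a confirmed API schema.

```python
# Illustrative request builder. The "effort" field name is an assumption
# from the article's description of the feature, not a documented schema.
VALID_EFFORT = ("low", "medium", "high", "xhigh", "max")

def build_request(prompt: str, effort: str = "xhigh") -> dict:
    """Assemble a chat request payload with an explicit effort level."""
    if effort not in VALID_EFFORT:
        raise ValueError(f"effort must be one of {VALID_EFFORT}")
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 4096,
        "effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }
```

Defaulting to xhigh mirrors Anthropic's recommendation for coding and agentic workloads; dropping to low or medium is the lever for latency-sensitive or cost-sensitive paths.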

Task budgets in public beta. Task budgets allow developers to cap and prioritize token spend across longer agentic runs. Combined with the xhigh effort level, they give teams a practical mechanism for saying "think hard on this, but stop at N tokens." For production agentic systems where uncapped token spend is a cost risk, task budgets are a meaningful addition to the control surface.
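The beta feature itself is server-side and its exact interface is not documented here, but the bookkeeping it replaces looks roughly like this client-side sketch, which tracks cumulative token spend across an agentic loop and signals when the cap is hit (all names are illustrative):

```python
class TaskBudget:
    """Client-side token budget for a multi-step agentic run. Illustrative
    only: the beta task-budget feature described in the release is enforced
    by the API and may behave differently."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Charge one model call against the budget."""
        self.used += input_tokens + output_tokens

    @property
    def remaining(self) -> int:
        return max(0, self.max_tokens - self.used)

    def exhausted(self) -> bool:
        return self.used >= self.max_tokens
```

An agent loop would check `exhausted()` before each step and wind down gracefully, rather than discovering an uncapped bill after the fact.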

File system memory for multi-session work. The model is better at using file system-based memory to retain important notes across long, multi-session workflows. In agentic contexts where tasks span multiple sessions, Opus 4.7 uses stored memory to reduce the up-front context required when resuming work, rather than starting from scratch each time.

The /ultrareview command in Claude Code. A new slash command runs a dedicated review session that reads through changes and flags what a careful reviewer would catch. Anthropic is offering three free ultrareviews at launch for Pro and Max users.

The Cybersecurity Strategy Built Into the Release

Opus 4.7 is not merely a performance upgrade. It carries specific policy weight tied to the Mythos situation.

Anthropic stated directly that it "experimented with efforts to differentially reduce Opus 4.7's cyber capabilities during training." The model's cybersecurity capabilities are intentionally less advanced than Mythos Preview. It ships with automated safeguards that detect and block requests indicating prohibited or high-risk cybersecurity uses. Security professionals who need the model for legitimate cybersecurity work can apply through a formal Cyber Verification Program.

The explicit framing from Anthropic is that what the company learns from real-world deployment of these safeguards on Opus 4.7 will inform its eventual goal of a broad release of Mythos-class models. Opus 4.7 is the testing ground for the controls that would make Mythos safe to deploy at scale. The release is both a model upgrade and a live experiment in responsible AI deployment infrastructure.

Benchmarks Against the Competition

Opus 4.7 leads among generally available models on the benchmarks most relevant to software engineering and agentic work. SWE-bench Pro at 64.3 percent beats GPT-5.4 at 57.7 percent and Gemini 3.1 Pro at 54.2 percent. On MCP-Atlas for scaled tool use, Opus 4.7 reaches 77.3 percent, ahead of GPT-5.4 at 68.1 percent and Gemini at 73.9 percent. On GPQA Diamond for graduate-level reasoning, all three models have converged near 94 percent, effectively saturating that benchmark.

GPT-5.4 leads on BrowseComp at 89.3 percent versus Opus 4.7's 79.3 percent and on research-intensive web workflows. Gemini 3.1 Pro is significantly cheaper at $2 per million input tokens and $12 per million output tokens, with a larger 2M token context window, making it a strong option for cost-sensitive workloads where peak coding performance is not the primary requirement.

For coding, long-running agentic work, computer use, and enterprise knowledge work, the benchmarks support Opus 4.7 as the current leader among deployable options.

What Teams Upgrading From Opus 4.6 Need to Know

The migration is not fully drop-in. Two changes affect cost calculations, and one changes behavior in ways that require prompt review.

Updated tokenizer. The same input maps to approximately 1.0 to 1.35 times as many tokens, depending on content type. Anthropic has adjusted rate limits upward to partially offset this, but teams running high-volume workloads should benchmark their specific content before switching production systems.
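To translate the tokenizer change into dollars, multiply a measured expansion ratio through the unchanged prices. A rough estimator, where the ratio and volumes are numbers you benchmark on your own content (output volume is shown at the old level; deeper thinking at high effort would raise it further):

```python
def projected_monthly_cost(input_tokens: int, output_tokens: int,
                           token_ratio: float = 1.2) -> float:
    """Rough post-migration monthly cost in dollars. Applies a measured
    tokenizer expansion ratio (the release notes cite roughly 1.0-1.35x)
    to input volume; prices are the unchanged $5 / $25 per million tokens."""
    input_cost = input_tokens * token_ratio * 5.00 / 1_000_000
    output_cost = output_tokens * 25.00 / 1_000_000
    return input_cost + output_cost
```

For a workload of 100M input and 10M output tokens a month, a 1.2x ratio moves input cost from $500 to $600, a 20 percent increase that rate-limit adjustments do not refund.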

Deeper thinking at higher effort levels. Opus 4.7 produces more output tokens at xhigh and max effort levels than Opus 4.6 did at equivalent settings, because it thinks more on hard problems. The quality improvement is real. So is the cost increase. Task budgets are the mechanism for managing this.

Stricter instruction following. Opus 4.7 takes instructions more literally than Opus 4.6 did. Prompts that relied on the older model's loose interpretation or willingness to infer intent from vague phrasing may produce different results. Anthropic explicitly recommends reviewing and retesting prompts when migrating. CLAUDE.md files tuned for Opus 4.6 should be audited before switching.

Where to Access It

Claude Opus 4.7 is available now across all Claude products on claude.ai, the Claude API using model ID claude-opus-4-7, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. It is rolling out on GitHub Copilot for Pro+, Business, and Enterprise users with a 7.5x premium request multiplier at promotional pricing through April 30, replacing Opus 4.5 and Opus 4.6 in the model picker.

Pricing is $5 per million input tokens and $25 per million output tokens, unchanged from Opus 4.6.

If you are building agentic AI workflows, enterprise knowledge work tools, or software engineering products and want guidance on integrating Claude Opus 4.7 into production systems, evaluating the right model tier for your workload, or designing systems that account for token budget management, please reach out to MonkDA. We work with development teams building AI-powered products at every stage.

Ready to take your idea to market?

Let's talk about how MonkDA can turn your vision into a powerful digital product.