Anthropic's Claude Opus 4.8 Hits GitHub Copilot at a Steep 15x Cost

GitHub

May 28, 2026

2 min read

DEVELOPMENT

code_generation

LLMS

long_context vision_language

Claude Opus 4.8 is now generally available inside GitHub Copilot. Anthropic's latest flagship model brings meaningful gains in agentic coding, large-codebase navigation, and self-reported code reliability, and it lands at the same API price as Opus 4.7. The catch: it carries a 15x premium request multiplier in Copilot, making it the most expensive model in the picker by a wide margin.
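To make the 15x figure concrete, here is a minimal sketch of the arithmetic. It assumes the multiplier works by deducting that many units from a plan's premium request allowance for every prompt sent to the model. The allowance of 300 and the function names are hypothetical, chosen only for illustration; check your own plan's terms for the real numbers.

```python
from math import floor

# From the article: Opus 4.8 carries a 15x premium request multiplier in Copilot.
OPUS_48_MULTIPLIER = 15

# Hypothetical monthly allowance, used only to illustrate the arithmetic.
HYPOTHETICAL_ALLOWANCE = 300


def premium_requests_used(prompts: int, multiplier: float) -> float:
    """Premium requests consumed by a number of prompts at a given multiplier."""
    return prompts * multiplier


def prompts_within_allowance(allowance: float, multiplier: float) -> int:
    """Whole prompts that fit inside an allowance at a given multiplier."""
    return floor(allowance / multiplier)


used = premium_requests_used(20, OPUS_48_MULTIPLIER)
affordable = prompts_within_allowance(HYPOTHETICAL_ALLOWANCE, OPUS_48_MULTIPLIER)

print(f"20 prompts consume {used} premium requests")
print(f"An allowance of {HYPOTHETICAL_ALLOWANCE} covers {affordable} prompts")
```

Under these assumptions, the same allowance that covers 300 prompts on a 1x model covers only 20 on Opus 4.8.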

What actually changed under the hood

Opus 4.8 builds on Opus 4.7 with improvements across benchmarks, and Anthropic describes it as a more effective collaborator. The headline numbers back that up. On SWE-bench Verified, the model scores 88.6%, up from 87.6% on Opus 4.7 and 80.8% on Opus 4.6. On the harder SWE-bench Pro benchmark, it hits 69.2%, up from 64.3%. SWE-bench Pro tests whether a model can autonomously resolve real GitHub issues, which makes it a strong proxy for practical coding ability.
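The gains quoted above are easier to compare as percentage-point differences. The sketch below does that subtraction using only the scores given in the article. The dictionary layout and the `point_gain` helper are illustrative, and the 64.3% SWE-bench Pro baseline is labeled "previous" because the article does not say which model it belongs to.

```python
# Scores as reported in the article, in percent.
scores = {
    "SWE-bench Verified": {"Opus 4.8": 88.6, "Opus 4.7": 87.6, "Opus 4.6": 80.8},
    # The article does not name the model behind the 64.3% baseline.
    "SWE-bench Pro": {"Opus 4.8": 69.2, "previous": 64.3},
}


def point_gain(new: float, old: float) -> float:
    """Percentage-point difference, rounded to hide floating-point noise."""
    return round(new - old, 1)


# For each benchmark, compare Opus 4.8 against every other listed score.
gains = {
    bench: {
        baseline: point_gain(results["Opus 4.8"], value)
        for baseline, value in results.items()
        if baseline != "Opus 4.8"
    }
    for bench, results in scores.items()
}

for bench, by_baseline in gains.items():
    for baseline, gain in by_baseline.items():
        print(f"{bench}: +{gain} points over {baseline}")
```

Read this way, the step from Opus 4.7 on SWE-bench Verified is one point, while the SWE-bench Pro gain is several times larger.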

Opus 4.8 is also the strongest computer-use and browser-agent model Anthropic has tested, scoring 84% on Online-Mind2Web, a meaningful jump over both Opus 4.7 and GPT-5.5. That said, Terminal-Bench 2.1 for agentic terminal coding still belongs to GPT-5.5 at 78.2%, with Opus 4.8 coming in at 74.6%.

The reliability story is the real headline

Raw benchmark scores are only part of the picture. The more interesting shift is in what Anthropic calls "honesty": the model's tendency to flag its own mistakes rather than silently ship broken code into your pipeline.

Anthropic's evaluations found the model to be around four times less likely than its predecessor to leave flaws in its own code unremarked.
Early testers found Opus 4.8 sharper in judgment when performing agentic tasks, more likely to flag uncertainty, and less likely to make unsupported claims.
