Google's Gemini 3.5 Flash Beats Pro While Antigravity 2.0 Kills the CLI

Google for Developers

May 26, 2026

2 min read

AGENTS

agent_frameworks code_agents

DEVELOPMENT

code_agents

Google I/O 2026 was not a product showcase. It was a platform declaration. Every major announcement pointed at the same target: turning AI from a tool you query into infrastructure that acts on your behalf. Google is no longer building a chatbot. It's building an agent platform. And at the center of that shift are three developer-facing announcements that landed generally available the same day they were announced.

A Flash model that beats the old Pro

Gemini 3.5 Flash is Google's biggest Flash-tier model launch ever, and it shipped generally available the same day it was announced. As of the keynote, it is the default model in the Gemini app and AI Mode in Google Search worldwide. That alone would be notable. What makes it genuinely surprising is the benchmark story.

Gemini 3.5 Flash is a strange release: a Flash-tier model that beats Gemini 3.1 Pro across coding, reasoning, and multimodal benchmarks. That has never happened before in Google's model lineup. Flash models have historically been the cheaper, faster, lower-quality option. For most workloads, that tradeoff is now largely gone.

The numbers back it up. Gemini 3.5 Flash outperforms Gemini 3.1 Pro on key benchmarks, scoring 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA, and 83.6% on MCP Atlas, while leading in multimodal understanding with 84.2% on CharXiv. It is also fast: Google says 3.5 Flash delivers four times the output token generation speed of competing frontier models.

There is one area where it still trails. It falls behind Gemini 3.1 Pro on long-context tasks (MRCR v2 at 128k) and Humanity's Last Exam, where raw knowledge depth matters more than agentic capability. For most production workloads, though, that's a narrow tradeoff.

3.5 Flash balances performance and speed, which makes it a good fit for long-horizon agentic tasks, often at less than half the cost of comparable models. The model generates 289 output tokens per second and is priced at $1.50 per million input tokens and $9 per million output tokens.
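To make the pricing and speed figures concrete, here is a minimal back-of-the-envelope sketch. The rates are the ones quoted above; the function name `estimate_request` and the workload size (200,000 input tokens, 20,000 output tokens) are made up for illustration, and the time estimate assumes the quoted throughput holds for the whole response and ignores latency before the first token.

```python
# Rates quoted in the article; everything else here is illustrative.
INPUT_USD_PER_MILLION = 1.50    # price per million input tokens
OUTPUT_USD_PER_MILLION = 9.00   # price per million output tokens
OUTPUT_TOKENS_PER_SECOND = 289  # quoted output generation speed


def estimate_request(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cost in USD, output generation time in seconds)."""
    cost = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
    # Assumes a constant generation rate; real throughput will vary.
    seconds = output_tokens / OUTPUT_TOKENS_PER_SECOND
    return cost, seconds


# Hypothetical agentic task: a large context in, a long answer out.
cost, seconds = estimate_request(input_tokens=200_000, output_tokens=20_000)
print(f"cost: ${cost:.2f}, generation time: {seconds:.1f} s")
```

For this hypothetical workload the estimate comes to roughly $0.48 and a little over a minute of generation time. Output tokens cost six times as much as input tokens, so long responses weigh more heavily on the bill than long prompts.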

The stable model ID to use in the API is gemini-3.5-flash, with no preview suffix. It replaces the gemini-3-flash-preview identifier used during the preview window. There's also a breaking API change to watch: the thinking_budget integer parameter has been replaced with a string enum called thinking_level, with values minimal
