
Pydantic AI v2 is out, and it ships with one new idea that reshapes how you build agents: the capability. The inner loop of an agent is settled by now: call the model, run a tool, feed the result back. The real leverage is in the layer around it, which covers the hooks that rewrite what the model sees mid-run, context management, steering, and loading the right tools just in time. v2 turns that whole layer into a single unit you compose.
After seven betas, Pydantic AI v2 is now stable. The team shipped Pydantic AI v1 last September and has put out more than a hundred releases since then, without once breaking user code. v2 is the first major version bump, and it comes with a deliberate architectural shift.
One primitive to rule the loop
The capability lets you build agents from composable units that bundle tools, hooks, instructions, and model settings into reusable pieces. Think of it as the plugin format for agent behavior. Instead of scattering your memory system, guardrails, or coding toolkit across separate config objects and decorators, you package them as a single capability and attach it to an agent.
Here is what that looks like in practice:

    from pydantic_ai import Agent
    from pydantic_ai.capabilities import Capability, Thinking, ToolSearch, WebSearch
    from pydantic_ai.mcp import MCPToolset
    from pydantic_ai_harness import CodeMode

    agent = Agent(
        'anthropic:claude-opus-4-7',
        instructions='Research thoroughly and cite your sources.',
        capabilities=[
            Thinking(effort='high'),  # extended thinking, unified across providers
            CodeMode(),  # replaces N tool calls with one sandboxed run_code call
            WebSearch(),  # native where the provider supports it, local fallback otherwise
            ToolSearch(),  # discover tools on demand instead of listing hundreds upfront
            Capability(
                id='github',
                description='Look up GitHub issues, pull requests, and code.',
                instructions='Use the GitHub tools when a question is about a repository.',
                toolset=MCPToolset('https://mcp.example.com/github'),
                defer_loading=True,  # stays out of the prompt until the model loads it on demand
            ),
        ],
    )

The defer_loading=True flag is worth calling out specifically. The capability is why so much has landed lately -- on-demand loading so a deferred capability stays out of the prompt until the model needs it, a pending message queue for steering a run mid-flight, and even durable execution. Because capabilities are serializable, an agent can be loaded from a spec file, and the surface is small enough that an LLM can write one.
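To make deferred loading and spec-file serialization concrete, here is a minimal toy model in plain Python. It does not use or reproduce the Pydantic AI API: ToyCapability, ToyAgent, catalog, load, to_spec, and from_spec are hypothetical names invented for illustration, and the assumption that a deferred capability is advertised only by its id and description is ours, not something the release confirms.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToyCapability:
    """Hypothetical stand-in for a capability; not the pydantic_ai class."""

    id: str
    description: str
    instructions: str = ""
    tools: list[str] = field(default_factory=list)  # tool names only, for illustration
    defer_loading: bool = False


@dataclass
class ToyAgent:
    """Hypothetical agent that assembles its prompt from capabilities."""

    capabilities: list[ToyCapability]
    loaded: set[str] = field(default_factory=set)

    def _active(self) -> list[ToyCapability]:
        # A capability is active if it is not deferred, or if it has been loaded.
        return [
            cap
            for cap in self.capabilities
            if not cap.defer_loading or cap.id in self.loaded
        ]

    def prompt(self) -> str:
        # Only active capabilities contribute instructions to the prompt.
        return "\n".join(cap.instructions for cap in self._active() if cap.instructions)

    def visible_tools(self) -> list[str]:
        # Only active capabilities expose their tools.
        return [tool for cap in self._active() for tool in cap.tools]

    def catalog(self) -> dict[str, str]:
        # Assumption: deferred capabilities are advertised by id and description only.
        return {
            cap.id: cap.description
            for cap in self.capabilities
            if cap.defer_loading and cap.id not in self.loaded
        }

    def load(self, capability_id: str) -> None:
        # Simulates the model requesting a deferred capability mid-run.
        if capability_id not in {cap.id for cap in self.capabilities}:
            raise KeyError(f"unknown capability: {capability_id}")
        self.loaded.add(capability_id)


def to_spec(cap: ToyCapability) -> str:
    # Serialize a capability to a JSON string, as a spec file might store it.
    return json.dumps(asdict(cap))


def from_spec(spec: str) -> ToyCapability:
    # Rebuild a capability from its JSON spec.
    return ToyCapability(**json.loads(spec))
```

In this sketch, a capability created with defer_loading=True contributes nothing to prompt() or visible_tools() until load() is called with its id, and from_spec(to_spec(cap)) returns an equal object. The real library may structure all of this differently.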

