
Fugu-Ultra, the flagship model from Tokyo-based Sakana AI, is now available on Vercel's AI Gateway. The listing is notable not just for the distribution deal, but for what Fugu-Ultra actually is: not a single model, but a learned orchestration system that dynamically assembles a team of frontier models to answer your query. Think of it as a general contractor that hires specialists instead of doing every job itself.
A model that hires other models
Sakana Fugu is a new product that delivers a full multi-agent orchestration system as a single foundation model. Fugu dynamically orchestrates the world's best models to tackle complex, multi-step tasks, accessible through a single model API. The key insight is that Fugu is a family of learned orchestrators that expose a multi-agent system through a single model interface. Given a user query, a Fugu model constructs an agentic scaffold over a pool of frontier LLM workers, deciding which workers to involve, what instructions or roles to assign, how intermediate outputs should be combined or verified, and when to synthesize the final answer. The user interacts with Fugu as if calling a single model, while internally the system can route, delegate, and coordinate across multiple specialized agents.
This is fundamentally different from a simple router or a hand-coded workflow. Fugu is not simply a routing rule written by an engineer. Sakana says Fugu itself is a language model trained to call models in an agent pool. Given a task, it decides which model to use, whether to delegate planning or execution, whether to ask another agent to check the answer, and how to combine the results.
Sakana Fugu comes in two models, Fugu and Fugu Ultra, both available through one OpenAI-compatible API. You can pick the model that fits your workload, or switch between them without changing your integration.
- Fugu: Balances strong performance with low latency, making it suitable for everyday interactive use and configurable deployment constraints.
- Fugu-Ultra: Prioritizes answer quality, using deeper orchestration over a larger worker pool at the cost of additional latency. It is optimized for performance, composing workflows of multiple agents per input, and is intended for the most complex tasks that benefit from combining multiple specializations.
Don't miss what's next in AI
Join 300,000+ engineers and researchers who get the signal, not the noise.
- Full access to in-depth AI research breakdowns
- Be the first to know what's trending before it hits mainstream
- Daily curated papers, repos, and industry moves

