Anthropic's Frontier Red Team just published the results of Project Fetch Phase Two, and the numbers are striking. Less than a year after Claude helped human teams program a quadruped robodog, the latest model did the same tasks entirely on its own, and did them roughly 20 times faster than the fastest human team from the original experiment. This is not a robotics paper. It's a capability benchmark, and what it reveals about the pace of AI progress is worth paying attention to.

A quick recap of Phase One

Back in August 2024, Anthropic ran an internal experiment: two teams of non-expert employees were given a Unitree Go2 quadruped robodog and asked to program it to fetch a beach ball. One team was randomly assigned to use Claude (Team Claude); the other had to rely only on the internet and their own ingenuity (Team Claude-less). Overall, Team Claude accomplished more tasks and completed them faster. On the tasks both teams finished, Team Claude succeeded in about half the time it took Team Claude-less.

Before running Phase Two, the team checked whether Opus 4.1 (the model powering Phase One) could handle the tasks entirely on its own. It could not. Like the human team without Claude, it got stuck on the very first step: connecting to the robot. That was less than a year ago.

What changed in Phase Two

For this autonomous update, researchers ran three trials of Opus 4.7 using adaptive thinking with effort set to maximum in Claude Code. Adaptive thinking is Anthropic's term for dynamic compute allocation: the model decides how much reasoning to apply based on task difficulty, spending more tokens on harder problems. The role of the human researcher was limited to plugging a laptop running Claude Code into the robodog, entering the initial prompt, approving commands, and approving the model to move to the next task.
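To make the human's limited role concrete, here is a minimal sketch of an approval gate: an agent proposes commands, and nothing runs until a human says yes. This is an illustration only, not Claude Code's implementation or the researchers' setup. The names `run_with_approval`, `ApprovalLog`, `approve`, and `execute` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class ApprovalLog:
    """Record of which proposed commands the human approved or rejected."""
    approved: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def run_with_approval(
    proposed_commands: Iterable[str],
    approve: Callable[[str], bool],
    execute: Callable[[str], None],
) -> ApprovalLog:
    """Run each proposed command only if the human approver accepts it.

    `approve` stands in for the human in the loop (hypothetical);
    `execute` stands in for whatever actually runs the command.
    """
    log = ApprovalLog()
    for cmd in proposed_commands:
        if approve(cmd):
            execute(cmd)
            log.approved.append(cmd)
        else:
            # Rejected commands are never executed, only recorded.
            log.rejected.append(cmd)
    return log
```

In an interactive session, `approve` could simply prompt at the terminal, for example `lambda cmd: input(f"Run {cmd!r}? [y/N] ").strip().lower() == "y"`. The point of the sketch is that the human decides whether each step runs but does not write or fix any of the commands.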
