Nous Research's Hermes Agent Browses the Web 60x Faster and 49x Cheaper

Nous Research

2H AGO

2 min read

AGENTS

agent_frameworks tool_use

INFRA

inference_optimization

2 hrs ago

AGENTS

agent_frameworks tool_use

INFRA

inference_optimization

2 min read

Hermes Agent, the open-source autonomous agent from Nous Research, just shipped a major overhaul to how it reads the web. The headline numbers are striking: up to 60x faster and 49x cheaper web extraction. But the real story is in the engineering decisions that got there.

The problem with reading the web at scale

When an agent browses the web, the naive approach is to fetch a page, dump the full HTML or markdown into the model's context window, and let it figure things out. That works fine for a short blog post. It falls apart fast when you hit a documentation site, a forum thread, or a news article with embedded comments , pages that can balloon to hundreds of thousands of characters.

The old behavior meant two things: slow round-trips because the agent was processing redundant content, and expensive LLM calls because every token in that bloated page costs money. If you're running an agent that browses dozens of pages per task, this compounds quickly.

What changed under the hood

The update targets two bottlenecks directly:

Cleaner scraping backends , providers like Firecrawl now pass clean, pre-processed content straight to the agent, cutting out redundant intermediate steps that were adding latency without adding value.
On-demand page paging , large pages are saved locally and served to the agent in chunks as needed, rather than being loaded into context all at once. Same quality, fraction of the cost.

Don't miss what's next in AI

Join 300,000+ engineers and researchers who get the signal, not the noise.

Full access to in-depth AI research breakdowns
Be the first to know what's trending before it hits mainstream
Daily curated papers, repos, and industry moves

Nous Research's Hermes Agent Browses the Web 60x Faster and 49x Cheaper

Takeaways

The problem with reading the web at scale

What changed under the hood

Don't miss what's next in AI