Perplexity is bringing what it calls hybrid agentic inference to its Personal Computer agent. At Computex 2026, the company unveiled what it describes as the first hybrid local-server inference orchestrator: software that decides autonomously, in real time and mid-task, which AI workloads stay on a local model on the user's device and which get routed to frontier models in the cloud.

CEO Aravind Srinivas demonstrated the system onstage alongside Intel CEO Lip-Bu Tan during Intel's keynote, using Perplexity's Personal Computer agent to process confidential deal materials. The feature, demoed on Intel Core Ultra Series 3 processors, is coming to Perplexity Computer in July and is currently exclusive to the Windows PC app.

A router that decides where your tokens go

The orchestrator works on a simple division of labor. A compact model runs locally on your device to determine when sensitive data should be kept local, while work that needs a frontier model's full capability runs on the server. Rather than requiring users to manually choose between local and cloud processing, the system makes those decisions automatically for each request.
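
To make that division of labor concrete, here is a minimal sketch of a per-step router in Python. It is an illustration, not Perplexity's implementation, which has not been published. Every name in it (`Route`, `Step`, `is_sensitive`, `route`, `plan`) is hypothetical, the regex patterns are a crude stand-in for the compact on-device model, and the `needs_frontier` flag is assumed to be supplied by some upstream planner. The sketch also assumes that sensitivity takes priority over capability and that anything not needing a frontier model stays local; the announcement does not spell out either rule.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


# Hypothetical patterns standing in for the compact on-device model's
# sensitivity judgment; a real system would likely use a learned classifier.
SENSITIVE_PATTERNS = [
    re.compile(r"\bconfidential\b", re.IGNORECASE),
    re.compile(r"\bterm sheet\b", re.IGNORECASE),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # numbers shaped like a US SSN
]


@dataclass
class Step:
    text: str
    # Assumed to be set by an upstream planner; not described in the article.
    needs_frontier: bool = False


def is_sensitive(text: str) -> bool:
    """Return True if any stand-in pattern matches the step's text."""
    return any(p.search(text) for p in SENSITIVE_PATTERNS)


def route(step: Step) -> Route:
    """Pick a destination for one step of a task."""
    # Assumption: sensitive data stays on the device even if the step
    # would benefit from a frontier model.
    if is_sensitive(step.text):
        return Route.LOCAL
    # Non-sensitive work that needs full capability goes to the server.
    if step.needs_frontier:
        return Route.CLOUD
    # Assumption: everything else defaults to the local model.
    return Route.LOCAL


def plan(steps):
    """Route each step independently, mirroring per-request decisions."""
    return [(s.text, route(s)) for s in steps]
```

Calling `plan` on a list of steps returns each step paired with its destination, so a single task can mix local and cloud work. The ordering of the checks in `route` is the main design choice: putting the sensitivity test first means a step is never sent to the cloud just because it is hard.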
