

EXO Labs, the team behind the open-source distributed inference framework that lets you chain together Macs and workstations into a local AI cluster, just teased its next big move: a comprehensive, free benchmarking website covering every major model, every quantization level, and every consumer hardware configuration at every price point. The site, benchmarks.exolabs.net, is live but currently shows only "Coming soon."
The question everyone keeps asking
The most common question EXO Labs gets is about benchmarking, along the lines of "How many tokens per second can I get if I connect 3 Macs and run Llama 70B?" Right now, answering that requires either running the experiment yourself or piecing together scattered community results. EXO wants to turn that into a one-stop lookup.
The plan is to launch a free benchmarking website with detailed comparisons of hardware configurations, helping users choose the best setup for running LLMs given their needs and budget. The tweet makes clear the ambition is broad: every model, every quant, every hardware setup, every price point.
What EXO actually is
EXO is an open-source framework for running large language models efficiently across mixed hardware setups. Rather than treating inference as a task bound to a single GPU or accelerator, EXO automatically spreads workloads across whatever devices you have, turning a cluster of desktops, laptops, workstations, servers, tablets, or even smartphones into a cooperative AI mesh.
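To make the idea of spreading a workload across mixed devices concrete, here is a minimal Python sketch. It assumes the model's layers are split into contiguous ranges in proportion to each device's available memory. This is an illustration of the general approach, not EXO's actual scheduler or API: the `Device` class, the `partition_layers` function, and the device names and memory figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str         # hypothetical label for a machine in the cluster
    memory_gb: float  # memory assumed to be available for model weights


def partition_layers(devices, num_layers):
    """Assign contiguous layer ranges to devices in proportion to memory.

    Returns a list of (device_name, start_layer, end_layer) tuples,
    where end_layer is exclusive.
    """
    total = sum(d.memory_gb for d in devices)
    if total <= 0:
        raise ValueError("devices must have positive total memory")

    plan = []
    start = 0
    cumulative = 0.0
    for i, device in enumerate(devices):
        cumulative += device.memory_gb
        if i == len(devices) - 1:
            # The last device takes whatever remains so every layer is assigned.
            end = num_layers
        else:
            end = round(num_layers * cumulative / total)
        plan.append((device.name, start, end))
        start = end
    return plan


if __name__ == "__main__":
    # Illustrative cluster: one larger machine and two smaller ones.
    cluster = [
        Device("mac-studio", 64),
        Device("macbook-a", 32),
        Device("macbook-b", 32),
    ]
    for name, lo, hi in partition_layers(cluster, num_layers=80):
        print(f"{name}: layers {lo} to {hi - 1}")
```

In this sketch, the machine with twice the memory ends up holding twice as many layers. Real throughput also depends on factors the sketch ignores, such as interconnect bandwidth and per-device compute, which is the kind of gap a benchmark lookup would fill.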
