#96

Inception

LLM startup competing in the fast-inference space. Inception builds diffusion-based language models (dLLMs) that generate tokens in parallel rather than one at a time, hitting 1,000+ tokens per second on standard NVIDIA GPUs. Its Mercury model family targets coding and reasoning workloads, with an OpenAI-compatible API.

Categories

BUSINESS

LLMS

Subcategories

LONG CONTEXT

Links

LAST 30 DAYS

Inception's Mercury 2 Hits 1,000 Tokens per Second, Now Chasing Enterprise

Inception

llms

May 26