North Mini Code is Cohere's first open-source model built specifically for developers, and the community response has been unusually fast. Within 48 hours of release, independent contributors had already shipped llama.cpp support, MLX support for Apple Silicon, community GGUF quants, and a CLI task manager built on top of the model , all before Cohere's own official tooling had fully landed.

A coding model that punches above its weight class

North Mini Code 1.0 is a 30 billion parameter Mixture-of-Experts model, launched under the Apache 2.0 license and freely available on Hugging Face. The key number is not 30B, though. The model routes each query to a small subset of specialized expert networks, with 30 billion parameters total but only 3 billion active at any given time, keeping inference costs dramatically lower than a dense 30B model would require.

North Mini Code is a sparse mixture-of-experts model with 128 experts, of which 8 activate per token. This is the same architectural trick that made Mixtral and Qwen3 so efficient , you get the capacity of a large model at the compute cost of a small one. The model supports a context length of 256K tokens and can generate outputs up to 64K tokens.

Built for agents, not just autocomplete

North Mini Code is the first model in Cohere's new family of models, and is specifically designed and trained for agentic software engineering tasks. The distinction matters. Most coding models are fine-tuned general-purpose models. This one was built from the ground up for multi-step agentic workflows where the model needs to call tools, read terminal output, and iterate.

It is built for agentic workflows, including understanding and orchestrating sub-agents, mapping systems architecture, and running code reviews. It has integrated tool-use capabilities and supports interleaved thinking, which Cohere says improves performance across multi-step agentic work.

The training pipeline reflects this focus. Cohere trained the model through two stages of supervised fine-tuning followed by reinforcement learning with verifiable rewards across more than 70,000 verifiable tasks spanning approximately 5,000 repositories, deduplicated against SWE-Bench. The RL stage (RLVR) rewarded the model for actually solving tasks, not just generating plausible-looking code.

Alpha Signal

Don't miss what's next in AI

Join 300,000+ engineers and researchers who get the signal, not the noise.

  • Full access to in-depth AI research breakdowns
  • Be the first to know what's trending before it hits mainstream
  • Daily curated papers, repos, and industry moves