Tencent's Hy-MT2 Beats Commercial Translation APIs at 440MB on Device

May 26, 2026

2 min read

OPEN_SOURCE

LLMS

small_models translation

Tencent's Hunyuan team just open-sourced Hy-MT2, a family of translation-specialized language models that punches well above its weight class. The lineup spans three sizes -- 1.8B, 7B, and a 30B Mixture-of-Experts model that activates only 3B parameters per inference -- and all three support 33 languages out of the box. Within days of launch, the 1.8B model reached #1 and the 30B-A3B reached #4 on Hugging Face's open-source trending leaderboard, with over 7,000 downloads over the same period.

The "Fast-Thinking" Bet

The core design philosophy here is what Tencent calls "fast-thinking" translation. LLMs used for translation typically take a "slow thinking" approach: they process the full semantics before generating any output. Hy-MT2 instead reacts like a professional human translator, cutting unnecessary reasoning overhead. The practical result: faster inference without sacrificing quality, which matters enormously for real-time applications like video subtitles, live meetings, or mobile apps.

The payoff is real. The 7B and 30B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, while the lightweight 1.8B model also surpasses mainstream commercial APIs from providers such as Microsoft and Doubao overall. That last point is particularly striking -- a sub-2B model beating cloud translation services is not something you see every day.

How It Was Built

The training pipeline is a three-stage process described in the Hy-MT2 technical paper. Each stage builds on the previous one:

MT-oriented Mid-training: Starting from a general Hunyuan pretraining model, the team continued training on approximately 1 trillion tokens of multilingual translation data, covering both monolingual corpora and parallel translation pairs across general, domain-specific, and real-world scenarios.
Family-Centric Post-training (FCPT): Rather than mixing all languages together, the team split training into language-family branches (Western European, East Asian, Middle Eastern, etc.). Each branch got its own specialized teacher model -- called a "Chimera Teacher" -- built by fusing outputs from multiple Hunyuan reference models. Each branch was then fine-tuned with GRPO (Group Relative Policy Optimization, a reinforcement learning algorithm), using a 5-dimensional quality reward that scores terminology, accuracy, linguistic conventions, style, and instruction-following. Finally, a cross-family distillation step merged all the family-specific experts back into a single unified model.
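
To make the GRPO step more concrete, here is a minimal sketch of how a 5-dimensional reward could feed a group-relative advantage. It is not Tencent's implementation: the equal-weight averaging of the five scores, the score range, and the names `combined_reward` and `group_relative_advantages` are all assumptions for illustration. The only part taken from GRPO itself is the idea of normalizing each sampled translation's reward against the other samples for the same source sentence.

```python
from statistics import mean, pstdev

# The five reward dimensions named in the article.
DIMENSIONS = (
    "terminology",
    "accuracy",
    "conventions",
    "style",
    "instruction_following",
)


def combined_reward(scores, weights=None):
    """Collapse per-dimension scores into one scalar via a weighted average.

    Assumption: equal weights by default. The article does not say how the
    five dimensions are actually combined.
    """
    if weights is None:
        weights = {d: 1.0 for d in DIMENSIONS}
    total = sum(weights[d] for d in DIMENSIONS)
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: center and scale each reward within its group.

    `rewards` holds one scalar per candidate translation sampled for the
    same source sentence. Population std is used here as a simple choice.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three hypothetical candidate translations of one source sentence,
# each scored on the five dimensions (made-up values in [0, 1]).
candidates = [
    {d: 0.9 for d in DIMENSIONS},
    {d: 0.5 for d in DIMENSIONS},
    {"terminology": 0.8, "accuracy": 0.6, "conventions": 0.7,
     "style": 0.7, "instruction_following": 0.7},
]

rewards = [combined_reward(c) for c in candidates]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f}  advantage={advantage:+.2f}")
```

Candidates that score above their group's average get a positive advantage and are reinforced, while below-average ones are pushed down, so no separate value network is needed to judge what counts as a "good" translation for a given sentence.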
