
Liquid AI just dropped two new open-weight models targeting the Japanese language market, and the headline number is hard to ignore: a 1.5B-parameter audio model that beats a 7.7B competitor in conversational benchmarks. The two releases are LFM2.5-Audio-1.5B-JP, the company's first Japanese speech-to-speech model, and LFM2.5-1.2B-JP-202606, an updated Japanese text model. Both are available now on Hugging Face.
One model, no pipeline glue
The audio model is the more technically interesting of the two. Most production voice systems are stitched together from three separate components: a speech recognizer (ASR) to transcribe the user, a language model to generate a response, and a text-to-speech engine (TTS) to speak it back. That pipeline adds latency at every seam and creates failure modes at each handoff.
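To make the latency argument concrete, here is a minimal sketch of why a cascaded pipeline is slow to respond: when each stage must wait for the previous one, the per-stage delays simply add up before the first audio is heard. The `Stage` class, the function name, and every latency figure below are hypothetical placeholders chosen for illustration, not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: float  # hypothetical placeholder value, not a measurement


def time_to_first_audio(stages):
    """Sum stage latencies, assuming a strictly sequential cascade
    where each stage starts only after the previous one finishes."""
    return sum(stage.latency_ms for stage in stages)


# Illustrative ASR -> LLM -> TTS cascade with made-up numbers.
cascade = [
    Stage("asr", 300.0),
    Stage("llm", 400.0),
    Stage("tts", 250.0),
]

total_ms = time_to_first_audio(cascade)
print(f"cascade time to first audio: {total_ms} ms")
```

Real pipelines often stream between stages to overlap some of this work, so the sum is a worst case under the stated assumption rather than a prediction.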
LFM2.5-Audio-1.5B-JP is an end-to-end multimodal speech and text language model that needs no separate ASR or TTS components. It is designed for low latency and real-time conversation, and it handles Japanese spoken dialogue with only 1.5 billion parameters.
The model has three parts: a pretrained LFM2.5 backbone, a FastConformer-based audio encoder that handles continuous audio input, and an RQ-transformer that generates discrete audio tokens, paired with a lightweight audio detokenizer that turns those tokens into output audio. The FastConformer encoder (115M parameters) is based on NVIDIA's Canary checkpoint, and audio output uses Kyutai's Mimi codec with 8 codebooks at 24 kHz.
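A rough back-of-the-envelope calculation shows what "8 codebooks at 24 kHz" implies for the number of audio tokens the model has to produce. The sample rate and codebook count come from the article; the 12.5 Hz frame rate and the 2048-entry codebook size are assumptions taken from Kyutai's published description of Mimi, and the function name is ours.

```python
import math


def codec_budget(sample_rate_hz, num_codebooks, frame_rate_hz, codebook_size):
    """Return the token and bit budget implied by a residual-codebook audio codec."""
    samples_per_frame = sample_rate_hz / frame_rate_hz
    tokens_per_second = frame_rate_hz * num_codebooks
    bits_per_token = math.log2(codebook_size)
    return {
        "samples_per_frame": samples_per_frame,
        "tokens_per_second": tokens_per_second,
        "bitrate_bps": tokens_per_second * bits_per_token,
    }


budget = codec_budget(
    sample_rate_hz=24_000,  # from the article
    num_codebooks=8,        # from the article
    frame_rate_hz=12.5,     # assumption: Mimi's published frame rate
    codebook_size=2048,     # assumption: entries per codebook
)
print(budget)
```

Under these assumptions each frame covers 1,920 audio samples (80 ms), and one second of speech costs 100 discrete tokens, or about 1.1 kbps.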
Two generation modes for different tasks
The audio model supports two distinct generation routines. Interleaved generation enables real-time speech-to-speech conversational chatbot capabilities where audio generation latency is key. Sequential generation is suited for non-conversational tasks such as ASR or TTS, and allows the model to switch generated modality on the fly.
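The difference between the two routines can be sketched as a question of output ordering. The toy code below is not the model's actual scheduler: the function names, the chunk sizes, and the token values are all hypothetical, and it only illustrates why alternating short runs of text and audio lets the first audio frame appear sooner than emitting one modality in full before the other.

```python
def sequential(text_tokens, audio_frames):
    """Emit every text token first, then every audio frame."""
    return [("text", t) for t in text_tokens] + [("audio", a) for a in audio_frames]


def interleaved(text_tokens, audio_frames, n_text, n_audio):
    """Alternate chunks of n_text text tokens and n_audio audio frames.

    Chunk sizes are hypothetical and must be positive.
    """
    out = []
    ti = ai = 0
    while ti < len(text_tokens) or ai < len(audio_frames):
        out.extend(("text", t) for t in text_tokens[ti:ti + n_text])
        ti += n_text
        out.extend(("audio", a) for a in audio_frames[ai:ai + n_audio])
        ai += n_audio
    return out


def first_audio_step(stream):
    """Index of the first audio item in the generated stream."""
    return next(i for i, (kind, _) in enumerate(stream) if kind == "audio")


text = ["t0", "t1", "t2", "t3", "t4", "t5"]
audio = ["a0", "a1", "a2", "a3", "a4", "a5"]

seq_stream = sequential(text, audio)
mix_stream = interleaved(text, audio, n_text=2, n_audio=2)

print(first_audio_step(seq_stream), first_audio_step(mix_stream))
```

In this toy setup both streams contain the same 12 items, but the first audio frame arrives at step 2 when interleaved and only at step 6 when sequential.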

