#78

Kyutai

Paris-based nonprofit AI research lab focused on open-weight speech and language models. Best known for Moshi, a full-duplex, speech-native dialogue model that processes audio directly rather than converting it to text first, and Pocket TTS, a 100M-parameter voice-cloning model that runs in real time on a CPU without a GPU.

Topics

AUDIO

OPEN_SOURCE

POST_TRAINING

Subtopics

MUSIC GENERATION

RL

ALIGNMENT

Links
