Blog
- Neural audio codecs: how to get audio into LLMs (2025-10-21): Why modeling audio is harder than text, and how to make it feasible with neural audio codecs.
- Kyutai TTS and Unmute now open-source (2025-07-03): Announcing the open-source release of Kyutai TTS and Unmute, with benchmarks and project details.
- Kyutai Speech-To-Text released as open-source (2025-06-19): Announcing the open-source release of Kyutai STT, the streaming speech-to-text model powering Unmute.
- Unmute: Make LLMs listen and speak (2025-05-22): Modular voice AI that empowers any text LLM with real-time speech-to-text and text-to-speech.
- Helium 1: a modular and multilingual LLM (2025-04-30): Announcing Helium 1: a 2B parameter modular and multilingual language model, open-sourced for reproducibility.
- MoshiVis: Teaching Moshi to Converse about Images (2025-03-21): An open-source Vision Speech Model with low-latency and natural conversation skills.
- Simultaneous, on-device, high fidelity speech-to-speech translation with Hibiki (2025-02-10): Announcing Hibiki: simultaneous, on-device, high fidelity speech-to-speech translation.
- Announcing Helium-1 Preview (2025-01-13): Preview release of Helium-1, a lightweight multilingual language model for edge and mobile devices.
- Moshi open-source release: run Moshi locally! (2024-09-18): Release announcement and technical details for Moshi, Helium, and Mimi.
- Meet Moshi, the first real-time voice AI (2024-07-03): Introducing Moshi, a real-time voice AI that brings expressive, spontaneous spoken interaction to machines.
- Hello Kyutai! (2023-11-17): Announcing Kyutai, a Paris-based non-profit AI research lab dedicated to open science.