Kyutai TTS and Unmute now open-source

July 03, 2025 | Author: Kyutai Team

It’s finally here! We’ve open-sourced both Unmute and Kyutai TTS, the text-to-speech model that gives Unmute its voice.

Kyutai Speech-To-Text released as open-source

June 19, 2025 | Author: Kyutai Team

Good news! Kyutai STT, the speech-to-text model powering Unmute, is now open-source! This is the first part of the Unmute release.

Unmute: Make LLMs listen and speak

May 22, 2025 | Author: Kyutai Team

Talk to Unmute, the most modular voice AI around. Empower any text LLM with voice, instantly, by wrapping it with our new speech-to-text and text-to-speech. Any personality, any voice. Interruptible, smart turn-taking. We’ll open-source everything within the next few weeks.

Helium 1: a modular and multilingual LLM

April 30, 2025 | Author: Kyutai Team

Today, we are thrilled to announce our latest text large language model called Helium 1 — a lightweight yet powerful model with 2 billion parameters, designed to set a new benchmark within its size category. Helium 1 achieves state-of-the-art performance among models of similar scale when evaluated across a diverse...

MoshiVis — Teaching Moshi to Converse about Images

March 21, 2025 | Author: Kyutai Team

With Moshi, we paved a new road towards natural speech-to-speech interactions with large language models. Today, we introduce MoshiVis, an open-source Vision Speech Model (VSM) with the same low-latency and natural conversation skills as Moshi, with the additional ability to discuss visual inputs. To do...

Simultaneous, on-device, high-fidelity speech-to-speech translation with Hibiki

February 10, 2025 | Author: Kyutai Team

As part of our ongoing effort to push the boundary for speech-to-speech models, we have released Hibiki, a model for simultaneous, on-device, high-fidelity speech-to-speech translation. We build on the same ideas and architecture underpinning our speech conversational agent Moshi, allowing for frugal training with in-house generated synthetic...

Announcing Helium-1 Preview

January 13, 2025 | Author: Kyutai Team

We are excited to release a preview of our new backbone language model called Helium-1. Like the chemical element it is named after, Helium-1 is a lightweight model with around 2B parameters. Our goal with this model is to enable the development of A.I. systems running on edge and mobile...

Moshi open-source release: run Moshi locally!

September 18, 2024 | Author: Kyutai Team

Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in PyTorch, Rust and MLX.

Meet Moshi, the first real-time voice AI

July 03, 2024 | Author: Kyutai Team

We don’t speak like we write, we don’t read like we listen. We all experience, every day, between humans, the fundamental differences between written communication and oral communication. Two complementary ways of using language. The former is often more precise, compact and efficient, but slower to produce and hardly interactive....

Hello Kyutai!

November 17, 2023 | Author: Kyutai Team

Today, we are six AI researchers jumping on stage, at Station F, to announce the start of a brand new scientific adventure named Kyutai (“sphere” in Japanese): Kyutai is a non-profit laboratory dedicated to open research in artificial intelligence, supported by Xavier Niel (Iliad), Rodolphe Saadé (CMA-CGM) and Eric Schmidt...