MoshiVis — Teaching Moshi to Converse about Images
With Moshi, we paved a new road towards natural speech-to-speech interactions with large language models. Today, we introduce MoshiVis, an open-source Vision Speech Model (VSM) with the same low-latency and natural conversation skills as Moshi, with the additional ability to discuss visual inputs. To do...
Simultaneous, on-device, high-fidelity speech-to-speech translation with Hibiki
As part of our ongoing effort to push the boundaries of speech-to-speech models, we have released Hibiki, a model for simultaneous, on-device, high-fidelity speech-to-speech translation. We build on the same ideas and architecture underpinning our speech conversational agent Moshi, allowing for frugal training with in-house generated synthetic...
Announcing Helium-1 Preview
We are excited to release a preview of our new backbone language model called Helium-1. Like the chemical element it is named after, Helium-1 is a lightweight model with around 2B parameters. Our goal with this model is to enable the development of A.I. systems running on edge and mobile...
Moshi open-source release: run Moshi locally!
Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in PyTorch, Rust and MLX.
Meet Moshi, the first real-time voice AI
We don’t speak like we write, we don’t read like we listen. We all experience, every day, between humans, the fundamental differences between written communication and oral communication. Two complementary ways of using language. The former is often more precise, compact and efficient, but slower to produce and hardly interactive....
Hello Kyutai!
Today, we are six AI researchers jumping on stage, at Station F, to announce the start of a brand new scientific adventure named Kyutai (“sphere” in Japanese): Kyutai is a non-profit laboratory dedicated to open research in artificial intelligence, supported by Xavier Niel (Iliad), Rodolphe Saadé (CMA-CGM) and Eric Schmidt...