Papers

This is a list of all scientific publications by Kyutai.

CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Moritz Böhle, Amélie Royer, Juliette Marrie, Edouard Grave, Patrick Pérez
2025

ARC-Encoder: learning compressed text representations for large language models

Hippolyte Pilchen, Edouard Grave, Patrick Pérez
2025

Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling

Neil Zeghidour, Eugene Kharitonov, Manu Orsini, Václav Volhejn, Gabriel de Marmiesse, Edouard Grave, Patrick Pérez, Laurent Mazaré, Alexandre Défossez
2025

Continuous Audio Language Models

Simon Rouard, Manu Orsini, Axel Roebel, Neil Zeghidour, Alexandre Défossez
2025

Aligning Spoken Dialogue Models from User Interactions

Anne Wu, Laurent Mazaré, Neil Zeghidour, Alexandre Défossez
ICML 2025

Vision-Speech Models: Teaching Speech Models to Converse about Images

Amélie Royer, Moritz Böhle, Gabriel de Marmiesse, Laurent Mazaré, Neil Zeghidour, Alexandre Défossez, Patrick Pérez
2025

High-Fidelity Simultaneous Speech-To-Speech Translation

Tom Labiausse, Laurent Mazaré, Edouard Grave, Alexandre Défossez, Neil Zeghidour
ICML 2025

Neutral Residues: Revisiting Adapters for Model Extension

Franck Signe Talla, Edouard Grave, Hervé Jégou
ICML 2025

Moshi: a speech-text foundation model for real-time dialogue

Alexandre Défossez, Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, Edouard Grave, Neil Zeghidour
2024