arXiv:2505.22759

FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

Published on May 28 · Submitted by spapi on May 30

Abstract

FAMA, an open-science family of speech foundation models, provides transparency and competitive performance by leveraging open-source training data and code.

AI-generated summary

The development of speech foundation models (SFMs) like Whisper and SeamlessM4T has significantly advanced the field of speech processing. However, their closed nature, with inaccessible training data and code, poses major challenges to reproducibility and fair evaluation. While other domains have made substantial progress toward open science by developing fully transparent models trained on open-source (OS) code and data, similar efforts in speech remain limited. To fill this gap, we introduce FAMA, the first family of open-science SFMs for English and Italian, trained on 150k+ hours of OS speech data. We also present a new dataset containing 16k hours of cleaned and pseudo-labeled speech for both languages. Results show that FAMA achieves performance competitive with existing SFMs while being up to 8 times faster. All artifacts, including code, datasets, and models, are released under OS-compliant licenses, promoting openness in speech technology research.

Community

Paper submitter

🚀 New tech report out! Meet FAMA, a new open-science speech foundation model family for both Automatic Speech Recognition (ASR) and Speech Translation (ST) in 🇬🇧 English and 🇮🇹 Italian.

🔗 The models are live and ready to try here on Hugging Face.
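
A minimal sketch of trying the models from Python, assuming the checkpoints follow standard Hub conventions; the model ID FBK-MT/fama-small and the trust_remote_code requirement are assumptions, so check the model cards for the exact usage:

```python
# Hedged sketch: load a FAMA checkpoint via the transformers ASR pipeline.
# "FBK-MT/fama-small" is a hypothetical model ID, not confirmed by this page;
# trust_remote_code=True is assumed because custom architectures often ship
# their own modeling code on the Hub.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="FBK-MT/fama-small",
    trust_remote_code=True,
)

# Transcribe a local audio file (16 kHz mono is typical for ASR models).
result = asr("sample_audio.wav")
print(result["text"])
```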

Models citing this paper 4

Collections including this paper 2