Update README.md
README.md CHANGED
@@ -12,4 +12,33 @@ metrics:
- wer
- cer
pipeline_tag: automatic-speech-recognition
---

**This is a CTC-based Automatic Speech Recognition system for French.**
This model is part of the SLU demo available here: [LINK TO THE DEMO GOES HERE]
It is based on the [mHuBERT-147](https://huggingface.co/utter-project/mHuBERT-147) speech foundation model.

* Training data: XX hours
* Normalization: Whisper normalization (see the sketch after this list)
* Performance:

# Table of Contents
1. Training Parameters
2. [ASR Model class](https://huggingface.co/naver/mHuBERT-147-ASR-fr#ASR-Model-class)
3. Running inference

## Training Parameters

The training parameters are available in `config.yaml`.
We downsample the Common Voice dataset to 70,000 utterances.

## ASR Model class

We use the `mHubertForCTC` class for our model, which is nearly identical to the existing `HubertForCTC` class.
The key difference is that we've added a few additional hidden layers at the end of the Transformer stack, just before the `lm_head`.
The code is available in [CTC_model.py](https://huggingface.co/naver/mHuBERT-147-ASR-fr/blob/main/CTC_model.py).

## Running inference

The run_asr.py file illustrates how to load the model for inference (**load_asr_model**) and how to produce a transcription for an audio file (**run_asr_inference**).
Please install the dependencies listed in requirements.txt to avoid incorrect model loading.
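A minimal usage sketch based on the description above; the exact signatures of **load_asr_model** and **run_asr_inference**, the model identifier, and the expected audio format are assumptions, so refer to run_asr.py for the reference code.

```python
# Sketch only: function signatures and audio format are assumptions;
# see run_asr.py in this repository for the reference usage.
from run_asr import load_asr_model, run_asr_inference

model = load_asr_model("naver/mHuBERT-147-ASR-fr")          # model ID/path assumed
transcript = run_asr_inference(model, "example_audio.wav")  # 16 kHz mono WAV assumed
print(transcript)
```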