Update README.md
README.md CHANGED
@@ -12,4 +12,33 @@ metrics:
- wer
- cer
pipeline_tag: automatic-speech-recognition
---

**This is a CTC-based Automatic Speech Recognition system for French.**
This model is part of the SLU demo available here: [LINK TO THE DEMO GOES HERE]
It is based on the [mHuBERT-147](https://huggingface.co/utter-project/mHuBERT-147) speech foundation model.

* Training data: XX hours
* Normalization: Whisper normalization (see the sketch after this list)
* Performance:

# Table of Contents
1. Training Parameters
2. [ASR Model class](https://huggingface.co/naver/mHuBERT-147-ASR-fr#ASR-Model-class)
3. Running inference

## Training Parameters

The training parameters are available in `config.yaml`.
We downsample the Common Voice dataset to 70,000 utterances.

## ASR Model class

We use the `mHubertForCTC` class for our model, which is nearly identical to the existing `HubertForCTC` class.
The key difference is that we've added a few additional hidden layers at the end of the Transformer stack, just before the `lm_head`.
The code is available in [CTC_model.py](https://huggingface.co/naver/mHuBERT-147-ASR-fr/blob/main/CTC_model.py).

## Running inference

The run_asr.py file illustrates how to load the model for inference (**load_asr_model**) and how to produce a transcription for an audio file (**run_asr_inference**).
Please install the dependencies listed in requirements.txt to avoid incorrect model loading.
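A minimal usage sketch based on the description above; the exact signatures of **load_asr_model** and **run_asr_inference**, the model identifier, and the expected audio format are assumptions, so refer to run_asr.py for the reference code.

```python
# Sketch only: function signatures and audio format are assumptions;
# see run_asr.py in this repository for the reference usage.
from run_asr import load_asr_model, run_asr_inference

model = load_asr_model("naver/mHuBERT-147-ASR-fr")          # model ID/path assumed
transcript = run_asr_inference(model, "example_audio.wav")  # 16 kHz mono WAV assumed
print(transcript)
```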