GogetaBlueMUI committed on
Commit
88e7210
·
verified ·
1 Parent(s): d45a1c0

End of training

Files changed (2)
  1. README.md +52 -45
  2. generation_config.json +1 -1
README.md CHANGED
@@ -1,78 +1,85 @@
  ---
  library_name: transformers
  license: apache-2.0
- base_model: GogetaBlueMUI/whisper-medium-ur-jalandhary
  tags:
- - automatic-speech-recognition
- - ASR
- - Urdu
- - Whisper
- - speech-to-text
  - generated_from_trainer
  datasets:
- - common_voice_11_0
  metrics:
  - wer
- inference: true
- widget:
- - example_title: "Test Urdu Audio"
-   src: "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/test.flac"
  model-index:
- - name: whisper-medium-ur
    results:
    - task:
        name: Automatic Speech Recognition
        type: automatic-speech-recognition
      dataset:
-       name: common_voice_11_0
-       type: common_voice_11_0
        config: ur
        split: test
-       args: ur
      metrics:
      - name: Wer
        type: wer
-       value: 0.2744
  ---

- # 🗣️ Whisper-Medium-Ur: Urdu Speech Recognition Model

- This model is a fine-tuned version of **[GogetaBlueMUI/whisper-medium-ur-jalandhary](https://huggingface.co/GogetaBlueMUI/whisper-medium-ur-jalandhary)** on the **Common Voice 11.0 Urdu dataset**. It is designed for **automatic speech recognition (ASR)** in Urdu and achieves the following results on the evaluation set:

- - **Loss:** 0.5375
- - **WER (Word Error Rate):** 27.44%
- - **CER (Character Error Rate):** 12.37%

- ---

- ## **📌 Model Description**
- - The model is based on **OpenAI's Whisper-Medium**.
- - It is fine-tuned specifically for **Urdu speech transcription**.
- - Works best on **clear audio recordings** with minimal background noise.

- ---

- ## **🛠️ Intended Uses & Limitations**
- ### ✅ **Intended Uses**
- - **Transcribing Urdu speech** into text.
- - **Generating subtitles** for Urdu videos.
- - **Building Urdu speech-to-text applications**.

- ### ❌ **Limitations**
- - May struggle with **noisy environments**.
- - May not perform well on **regional Urdu dialects**.
- - Limited **code-mixing** support (Urdu + English).

- ---

- ## **💻 Usage**
- You can use this model with the **Hugging Face Transformers pipeline**:

- ```python
- from transformers import pipeline
-
- pipe = pipeline("automatic-speech-recognition", model="GogetaBlueMUI/whisper-medium-ur")
-
- # Run inference on an audio file
- result = pipe("path/to/your_audio_file.wav")
- print(result["text"])
- ```
 
  ---
  library_name: transformers
+ language:
+ - ur
  license: apache-2.0
+ base_model: openai/whisper-medium
  tags:
  - generated_from_trainer
  datasets:
+ - fsicoli/common_voice_19_0
  metrics:
  - wer
  model-index:
+ - name: Whisper Medium Ur - Your Name
    results:
    - task:
        name: Automatic Speech Recognition
        type: automatic-speech-recognition
      dataset:
+       name: Common Voice 19.0
+       type: fsicoli/common_voice_19_0
        config: ur
        split: test
+       args: 'config: ur, split: test'
      metrics:
      - name: Wer
        type: wer
+       value: 27.349454082657914
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # Whisper Medium Ur - Your Name

+ This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the Common Voice 19.0 dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.3613
+ - Wer: 27.3495

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 16
+ - optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 40
+ - training_steps: 800
+ - mixed_precision_training: Native AMP
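As a cross-check on the list above, the total_train_batch_size of 16 follows from train_batch_size × gradient_accumulation_steps. A minimal sketch in plain Python (an illustrative dictionary, not the actual `Seq2SeqTrainingArguments` object used for training):

```python
# Plain-Python mirror of the hyperparameters listed above; the key names
# are illustrative, not the exact Trainer argument names.
hparams = {
    "learning_rate": 5e-6,
    "train_batch_size": 8,  # per-device batch size
    "eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "lr_scheduler_warmup_steps": 40,
    "training_steps": 800,
}

# Effective batch size per optimizer step = per-device batch * accumulation steps.
total_train_batch_size = (
    hparams["train_batch_size"] * hparams["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 16
```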
+
+ ### Training results

+ | Training Loss | Epoch  | Step | Validation Loss | Wer     |
+ |:-------------:|:------:|:----:|:---------------:|:-------:|
+ | 0.5093        | 0.2623 | 200  | 0.4290          | 29.3009 |
+ | 0.4283        | 0.5246 | 400  | 0.3918          | 29.4996 |
+ | 0.4435        | 0.7869 | 600  | 0.3705          | 27.1239 |
+ | 0.2939        | 1.0485 | 800  | 0.3613          | 27.3495 |
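The Wer column above is the word error rate, reported in percent: the word-level edit distance between reference and hypothesis divided by the number of reference words. A minimal pure-Python sketch of the metric (illustrative only; the run itself presumably used a library implementation such as the `evaluate` WER metric):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumes a non-empty reference. Multiply by 100 for the percent values
    shown in the table above.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits (substitutions/insertions/deletions) needed to turn
    # the first i reference words into the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat", "the cat sat down"))  # one insertion / 3 words ≈ 0.333
```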

+ ### Framework versions

+ - Transformers 4.49.0
+ - Pytorch 2.5.1+cu121
+ - Datasets 3.4.0
+ - Tokenizers 0.21.0
generation_config.json CHANGED
@@ -246,5 +246,5 @@
    "transcribe": 50359,
    "translate": 50358
  },
- "transformers_version": "4.47.0"
+ "transformers_version": "4.49.0"
  }