comodoro/wav2vec2-xls-r-300m-cs-250

@@ -23,10 +23,10 @@ model-index:
     metrics:
        - name: Test WER
          type: wer
-         value: 47.46
        - name: Test CER
-         type: cer
-         value: 10.88
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
@@ -35,8 +35,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the common_voice 8.0 dataset.
 It achieves the following results on the evaluation set:
-- WER: 0.47455377483706096
-- CER: 0.10877155235645618
 ## Model description
@@ -80,7 +81,10 @@ print("Reference:", test_dataset[:2]["sentence"])
 ## Evaluation
-The model can be evaluated using the attached `eval.py` script.
 ## Training and evaluation data
@@ -90,7 +94,8 @@ The Common Voice 8.0 `train` and `validation` datasets were used for training
 ### Training hyperparameters
-The following hyperparameters were used during training:
 - learning_rate: 7e-05
 - train_batch_size: 32
 - eval_batch_size: 8
@@ -103,6 +108,20 @@ The following hyperparameters were used during training:
 - num_epochs: 150
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Wer    | Cer    |
@@ -126,6 +145,17 @@ The following hyperparameters were used during training:
 | 0.0527        | 137.09 | 4250 | 0.6652          | 0.4749 | 0.1090 |
 | 0.0506        | 145.16 | 4500 | 0.6958          | 0.4846 | 0.1133 |
 ### Framework versions

     metrics:
        - name: Test WER
          type: wer
+         value: 16.1
        - name: Test CER
+         type: cer
+         value: 3.8
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the common_voice 8.0 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2327
+- Wer: 0.1608
+- Cer: 0.0376
 ## Model description
 ## Evaluation
+The model can be evaluated using the attached `eval.py` script:
+```
+python eval.py --model_id comodoro/wav2vec2-xls-r-300m-cs-cv8 --dataset mozilla-foundation/common_voice_8_0 --split test --config cs
+```
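The WER and CER figures in this card are word-level and character-level error rates. The sketch below shows how such metrics are commonly computed from edit distance. It is not the attached `eval.py`, whose text normalization and implementation are not shown here; the function names `edit_distance`, `wer`, and `cer` are illustrative assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Word error rate: total word-level edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    words = sum(len(r.split()) for r in references)
    return errors / words


def cer(references, hypotheses):
    """Character error rate: total character-level edits divided by total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return errors / chars


if __name__ == "__main__":
    # Hypothetical transcripts, used only to exercise the functions.
    refs = ["dobrý den všem"]
    hyps = ["dobrý den vsem"]
    print("WER:", wer(refs, hyps))
    print("CER:", cer(refs, hyps))
```

The metadata values appear to be the same quantities expressed as percentages and rounded to one decimal place, for example 0.1608 becomes 16.1.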
 ## Training and evaluation data
 ### Training hyperparameters
+The following hyperparameters were used during the first stage of training:
 - learning_rate: 7e-05
 - train_batch_size: 32
 - eval_batch_size: 8
 - num_epochs: 150
 - mixed_precision_training: Native AMP
+The following hyperparameters were used during the second stage of training:
+- learning_rate: 0.001
+- train_batch_size: 32
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 20
+- total_train_batch_size: 640
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 500
+- num_epochs: 50
+- mixed_precision_training: Native AMP
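Two of the second-stage settings are derived quantities, and a short sketch may make them easier to check. It assumes a single training device, so that `total_train_batch_size` is `train_batch_size` times `gradient_accumulation_steps`, and it models the `linear` scheduler as linear warmup followed by linear decay to zero. `TOTAL_STEPS` is a hypothetical value chosen for illustration; the card does not state the exact number of optimizer steps.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of samples contributing to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Values taken from the second-stage hyperparameter list.
    print(effective_batch_size(32, 20))  # single device assumed

    TOTAL_STEPS = 1550  # hypothetical, for illustration only
    for step in (0, 250, 500, 1000, 1500):
        lr = linear_schedule_lr(step, base_lr=0.001, warmup_steps=500,
                                total_steps=TOTAL_STEPS)
        print(step, lr)
```

With 500 warmup steps, the learning rate is still rising during the first two logged checkpoints (steps 250 and 500) and only decays afterwards.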
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Wer    | Cer    |
 | 0.0527        | 137.09 | 4250 | 0.6652          | 0.4749 | 0.1090 |
 | 0.0506        | 145.16 | 4500 | 0.6958          | 0.4846 | 0.1133 |
+Further fine-tuning with a slightly different architecture and a higher learning rate:
+| Training Loss | Epoch | Step | Validation Loss | Wer    | Cer    |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
+| 0.576         | 8.06  | 250  | 0.2411          | 0.2340 | 0.0502 |
+| 0.2564        | 16.13 | 500  | 0.2305          | 0.2097 | 0.0492 |
+| 0.2018        | 24.19 | 750  | 0.2371          | 0.2059 | 0.0494 |
+| 0.1549        | 32.25 | 1000 | 0.2298          | 0.1844 | 0.0435 |
+| 0.1224        | 40.32 | 1250 | 0.2288          | 0.1725 | 0.0407 |
+| 0.1004        | 48.38 | 1500 | 0.2327          | 0.1608 | 0.0376 |
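To put the two stages side by side, the relative error reduction can be computed from the evaluation-set figures reported before and after this change. This is a minimal sketch that assumes both sets of figures were measured on the same evaluation set; the helper name `relative_reduction` is an illustrative assumption.

```python
def relative_reduction(before, after):
    """Fraction by which an error rate dropped, relative to its earlier value."""
    return (before - after) / before


if __name__ == "__main__":
    # Evaluation-set figures reported in the card before and after the update.
    wer_before, wer_after = 0.47455377483706096, 0.1608
    cer_before, cer_after = 0.10877155235645618, 0.0376

    print(f"WER reduction: {relative_reduction(wer_before, wer_after):.1%}")
    print(f"CER reduction: {relative_reduction(cer_before, cer_after):.1%}")
```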
 ### Framework versions