---
license: apache-2.0
language: as
tags:
- audio
- automatic-speech-recognition
- speech
- xlsr-fine-tuning
- as
- robust-speech-event
datasets:
- common_voice
model-index:
- name: XLS-R-300M - Assamese
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 7
      type: mozilla-foundation/common_voice_7_0
      args: as
    metrics:
    - name: Test WER
      type: wer
      value: 72.64
    - name: Test CER
      type: cer
      value: 27.35
---
# wav2vec2-large-xls-r-300m-assamese
This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Assamese (`as`) subset of the mozilla-foundation/common_voice_7_0 dataset.
It achieves the following results on the evaluation set:
- WER: 0.7954545454545454
- CER: 0.32341269841269843
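For readers unfamiliar with these metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length. It is a minimal illustration, not the evaluation script used for this model. The function names and the example strings are hypothetical, and any text normalization applied before scoring is not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: one wrong word out of three, one wrong character out of eleven.
print(wer("the cat sat", "the cat sit"))
print(cer("the cat sat", "the cat sit"))
```

Under this definition, a WER of about 0.795 means the number of word-level edits needed is roughly 80% of the number of reference words.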
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
To compute the evaluation metrics on the Common Voice 7 test split, run:
```bash
cd wav2vec2-large-xls-r-300m-assamese; python eval.py --model_id ./ --dataset mozilla-foundation/common_voice_7_0 --config as --split test --log_outputs
```
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-4
- train_batch_size: 16
- eval_batch_size: 8
- seed: not given
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 400
- mixed_precision_training: Native AMP
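The relationship between some of these values can be sketched in plain Python. The snippet below shows how `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and how a linear schedule with warmup conventionally shapes the learning rate. It assumes a single training device (consistent with 16 × 2 = 32), and the `total_steps` value in the example is hypothetical, since the model card does not state the total number of optimizer steps.

```python
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500


def total_train_batch_size(per_device=TRAIN_BATCH_SIZE,
                           accum=GRAD_ACCUM_STEPS,
                           num_devices=1):
    """Effective batch size per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps=5000 is a hypothetical value chosen only for illustration.
print(total_train_batch_size())
for step in (0, 250, 500, 2750, 5000):
    print(step, linear_lr(step, total_steps=5000))
```

With these settings the learning rate peaks at 3e-4 once warmup ends at step 500 and decreases linearly afterwards.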
### Training results
| Training Loss | Epoch | Step | Validation Loss | WER      |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.584065      | NA    | 400  | 1.584065        | 0.915512 |
| 1.658865      | NA    | 800  | 1.658865        | 0.805096 |
| 1.882352      | NA    | 1200 | 1.882352        | 0.820742 |
| 1.881240      | NA    | 1600 | 1.881240        | 0.810907 |
| 2.159748      | NA    | 2000 | 2.159748        | 0.804202 |
| 1.992871      | NA    | 2400 | 1.992871        | 0.803308 |
| 2.201436      | NA    | 2800 | 2.201436        | 0.802861 |
| 2.165218      | NA    | 3200 | 2.165218        | 0.793920 |
| 2.253643      | NA    | 3600 | 2.253643        | 0.796603 |
| 2.265880      | NA    | 4000 | 2.265880        | 0.790344 |
| 2.293935      | NA    | 4400 | 2.293935        | 0.797050 |
| 2.288851      | NA    | 4800 | 2.288851        | 0.784086 |
### Framework versions
- Transformers 4.11.3
- Pytorch 1.10.0+cu113
- Datasets 1.13.3
- Tokenizers 0.10.3