Wave2Vec2-Bert2.0 - Kiran Pantha

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the OpenSLR54 dataset. It achieves the following results on the evaluation set:

  • Loss: 10.8771
  • WER (word error rate): 1.0005
  • CER (character error rate): 0.9690
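The card does not say which implementation produced these metrics. As a rough illustration of the standard definitions, the sketch below computes both error rates as an edit distance (substitutions, deletions, insertions) divided by the reference length. The helper names `edit_distance`, `wer`, and `cer` are hypothetical, and counting spaces as characters in `cer` is an assumption. Because insertions count as errors, either rate can exceed 1.0, which is consistent with the CER values above 1 in the training log.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.

    Assumption: spaces are counted as characters.
    """
    return edit_distance(reference, hypothesis) / len(reference)


# Toy strings, not taken from the dataset: a hypothesis that is longer
# than the reference pushes CER above 1.0.
print(wer("ab", "abcde"), cer("ab", "abcde"))
```

An empty hypothesis gives exactly 1.0 for both metrics, since every reference unit counts as a deletion.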

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: SGD (OptimizerNames.SGD), with no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 2
  • mixed_precision_training: Native AMP
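To make the schedule concrete, here is a minimal sketch of a linear learning-rate schedule with warmup, using the learning rate (5e-05) and warmup steps (500) listed above. The function name `linear_warmup_lr` is hypothetical, and `total_steps=3334` is an assumption estimated from the training log (about 1667 steps per epoch over 2 epochs); the card does not state the total.

```python
def linear_warmup_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=3334):
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps (total_steps is an assumed value).
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Print the rate at a few steps that also appear in the training log.
for step in (0, 300, 500, 1500, 3300):
    print(step, linear_warmup_lr(step))
```

Under these assumptions, the first logged evaluation (step 300) falls inside the warmup phase, so the model was still training below the peak learning rate at that point.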

Training results

Training Loss | Epoch  | Step | Validation Loss | WER    | CER
16.7814       | 0.1800 | 300  | 16.3800         | 1.0007 | 3.1059
16.2838       | 0.3599 | 600  | 15.8109         | 1.0005 | 2.9213
15.5569       | 0.5399 | 900  | 15.0093         | 1.0005 | 2.5754
15.0336       | 0.7199 | 1200 | 14.1309         | 1.0002 | 2.0061
13.9247       | 0.8998 | 1500 | 13.2986         | 1.0002 | 1.5023
13.1967       | 1.0798 | 1800 | 12.5663         | 1.0002 | 1.2076
12.4844       | 1.2597 | 2100 | 11.9662         | 1.0002 | 1.0769
11.8394       | 1.4397 | 2400 | 11.4978         | 1.0005 | 1.0134
11.4607       | 1.6197 | 2700 | 11.1599         | 1.0005 | 0.9855
11.2266       | 1.7996 | 3000 | 10.9534         | 1.0005 | 0.9733
11.0877       | 1.9796 | 3300 | 10.8771         | 1.0005 | 0.9690
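The epoch and step columns allow a rough estimate of the run's size. The sketch below divides step by epoch for a few logged rows to estimate steps per epoch, then multiplies by the train batch size of 8 to approximate the number of training examples. This assumes a single device and no gradient accumulation, neither of which the card states, so the example count is an estimate only. The names `steps_per_epoch`, `est_steps`, and `est_examples` are hypothetical.

```python
# (epoch, step) pairs copied from three rows of the training log.
log = [(0.1800, 300), (0.8998, 1500), (1.9796, 3300)]
train_batch_size = 8  # from the hyperparameter list


def steps_per_epoch(rows):
    """Average of step / epoch across the logged rows."""
    return sum(step / epoch for epoch, step in rows) / len(rows)


est_steps = round(steps_per_epoch(log))
# Assumption: one device, no gradient accumulation.
est_examples = est_steps * train_batch_size

print(est_steps, est_examples)
```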

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cxx11.abi
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size: 606M params (Safetensors, tensor type F32)
