bengali_qa_microsoft_model

This model is a fine-tuned version of microsoft/mdeberta-v3-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7354
  • Exact Match: 47.8571
  • F1 Score: 64.4066
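
The card does not say how Exact Match and F1 were computed. As a rough sketch, assuming SQuAD-style scoring (exact string match after normalization, and token-overlap F1), the two metrics could be calculated as below. The helper names `normalize`, `exact_match`, and `f1_score` are hypothetical, and whitespace tokenization is an assumption that may differ from the script actually used for this model.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumption: plain whitespace tokenization. Bengali has no letter case,
    # so no lowercasing step is applied here.
    return text.strip().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 when the normalized token sequences are identical, else 0.0.
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    # Token-level F1: harmonic mean of precision and recall over shared tokens.
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging these per-example scores over the evaluation set and multiplying by 100 would give percentages on the same scale as the figures reported above. A prediction that contains the reference plus extra tokens scores 0 on Exact Match but earns partial F1 credit, which is one plausible reason F1 sits well above Exact Match here.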

Model description

More information needed

Intended uses & limitations

More information needed
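
Until usage notes are filled in, the following is a minimal sketch of how the checkpoint might be queried. It assumes the model is published under the repository ID `Mediocre-Judge/bengali_qa_microsoft_model` and that it loads with the standard `question-answering` pipeline from Transformers; the wrapper `answer_question` and the sample question and context are hypothetical.

```python
MODEL_ID = "Mediocre-Judge/bengali_qa_microsoft_model"


def answer_question(question: str, context: str) -> dict:
    # Imported lazily so that defining this helper does not require
    # Transformers to be installed or the checkpoint to be downloaded.
    from transformers import pipeline

    # Assumption: the checkpoint works with the extractive QA pipeline.
    qa = pipeline("question-answering", model=MODEL_ID)
    # For a single question, the pipeline returns a dict with the keys
    # "answer", "score", "start", and "end".
    return qa(question=question, context=context)


# Example call (downloads the model on first use):
# result = answer_question(
#     question="বাংলাদেশের রাজধানী কোথায়?",
#     context="ঢাকা বাংলাদেশের রাজধানী এবং বৃহত্তম শহর।",
# )
# print(result["answer"], result["score"])
```

Building the pipeline inside the function keeps the sketch short; for repeated queries it would be cheaper to create the pipeline once and reuse it.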

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 50
  • mixed_precision_training: Native AMP
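
To show how these settings interact, here is a small sketch that derives the effective batch size and the learning-rate curve from the listed values. It assumes a single training device (consistent with 8 × 8 = 64) and the usual linear-warmup-then-cosine-decay shape; `lr_at` is a hypothetical helper, not a Transformers function.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8
TRAINING_STEPS = 50
WARMUP_RATIO = 0.1
NUM_DEVICES = 1  # Assumption: single device, which reproduces the listed 64.

# Examples consumed per optimizer step.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Number of warmup steps implied by the warmup ratio.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    # Linear warmup from 0 to the peak learning rate.
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 at the final step.
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("warmup_steps:", warmup_steps)
    for step in (0, 5, 25, 50):
        print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 2e-05 after 5 warmup steps and decays to zero at step 50.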

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score |
|---|---|---|---|---|---|
| 6.6257 | 0.0053 | 1 | 6.6659 | 0.1504 | 23.7019 |
| 6.6148 | 0.0107 | 2 | 6.5377 | 0.7519 | 36.0663 |
| 6.5217 | 0.0160 | 3 | 6.3226 | 1.9549 | 43.4507 |
| 6.1229 | 0.0214 | 4 | 5.6034 | 3.7594 | 56.8030 |
| 5.7015 | 0.0267 | 5 | 5.3006 | 5.7143 | 56.8976 |
| 5.3474 | 0.0321 | 6 | 5.1196 | 10.9023 | 56.6826 |
| 5.2375 | 0.0374 | 7 | 4.8802 | 16.9925 | 57.8356 |
| 5.067 | 0.0428 | 8 | 4.6180 | 19.8496 | 57.7216 |
| 4.7947 | 0.0481 | 9 | 4.3354 | 22.7820 | 58.3592 |
| 4.4271 | 0.0534 | 10 | 4.0533 | 26.0902 | 58.7345 |
| 4.3096 | 0.0588 | 11 | 3.7947 | 31.5789 | 60.0493 |
| 4.1219 | 0.0641 | 12 | 3.5726 | 35.2632 | 61.0031 |
| 3.9806 | 0.0695 | 13 | 3.4024 | 38.4962 | 62.6457 |
| 3.653 | 0.0748 | 14 | 3.2782 | 41.7293 | 63.9058 |
| 3.474 | 0.0802 | 15 | 3.1569 | 43.9850 | 65.0367 |
| 3.3639 | 0.0855 | 16 | 3.0200 | 45.4887 | 64.9433 |
| 3.2411 | 0.0908 | 17 | 2.8749 | 46.4662 | 64.8581 |
| 3.0711 | 0.0962 | 18 | 2.7349 | 47.8195 | 65.1471 |
| 3.0726 | 0.1015 | 19 | 2.6103 | 48.4962 | 64.6205 |
| 2.9879 | 0.1069 | 20 | 2.4996 | 49.2481 | 64.7963 |
| 2.9237 | 0.1122 | 21 | 2.3950 | 50.6767 | 65.0445 |
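
The Epoch column allows a back-of-the-envelope estimate of how much data the run covered. The sketch below uses the last logged row (step 21, epoch 0.1122) together with the listed total batch size of 64 and training_steps of 50. The results are approximations only, because the logged epoch values are rounded and the card does not state the dataset size.

```python
# Values taken from the card: last logged row and training hyperparameters.
TOTAL_TRAIN_BATCH_SIZE = 64
TRAINING_STEPS = 50
last_step, last_epoch = 21, 0.1122

# Optimizer steps needed to complete one full epoch (roughly 187).
steps_per_epoch = last_step / last_epoch

# Implied number of training examples (roughly 12,000). If long contexts
# were split into several features, this counts features, not questions.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Fraction of an epoch covered by all 50 configured steps (roughly 0.27).
epochs_at_end = TRAINING_STEPS / steps_per_epoch

if __name__ == "__main__":
    print(round(steps_per_epoch, 1))
    print(round(approx_train_examples))
    print(round(epochs_at_end, 3))
```

If these estimates hold, the configured 50 steps cover only about a quarter of one pass over the training data, so the scores reported here reflect a very short fine-tuning run.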

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size: 278M params (Safetensors, tensor type F32)

Model tree for Mediocre-Judge/bengali_qa_microsoft_model: fine-tuned from microsoft/mdeberta-v3-base.