Mediocre-Judge/bengali_qa_microsoft_model

+---
+library_name: transformers
+license: mit
+base_model: microsoft/mdeberta-v3-base
+tags:
+- generated_from_trainer
+model-index:
+- name: bengali_qa_microsoft_model
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# bengali_qa_microsoft_model
+This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.7354
+- Exact Match: 47.8571
+- F1 Score: 64.4066
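The card does not say which metric implementation produced the Exact Match and F1 values, which appear to be on a 0 to 100 scale. As a rough illustration only, the sketch below shows the usual SQuAD-style definitions: exact string match, and token-overlap F1 over whitespace tokens. The function names `exact_match` and `token_f1` are hypothetical, and the normalization used in the actual evaluation may differ.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the stripped strings are identical, else 0.0."""
    return float(prediction.strip() == reference.strip())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (whitespace tokens)."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# A partially correct span scores 0 on exact match but earns F1 credit.
em = exact_match("ঢাকা শহর", "ঢাকা")
f1 = token_f1("ঢাকা শহর", "ঢাকা")
print(f"EM={em * 100:.2f}  F1={f1 * 100:.2f}")
```

Partial credit of this kind is one plausible reason an F1 score can sit well above Exact Match, as it does in the results reported here.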
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 3407
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 64
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- training_steps: 50
+- mixed_precision_training: Native AMP
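The batch-size and schedule entries above are linked by simple arithmetic. The sketch below assumes a single device (the card does not state the device count) and the common shape of linear warmup followed by cosine decay to zero. The function name `lr_at_step` is hypothetical, and the Trainer's exact rounding or schedule details may differ.

```python
import math

LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8
TRAINING_STEPS = 50
WARMUP_RATIO = 0.1
NUM_DEVICES = 1  # assumption: the card does not state the device count

# Examples consumed per optimizer step.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Number of optimizer steps spent ramping the learning rate up.
warmup_steps = math.ceil(WARMUP_RATIO * TRAINING_STEPS)


def lr_at_step(step: int) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, warmup_steps, TRAINING_STEPS // 2, TRAINING_STEPS):
    print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the product 8 × 8 matches the listed total batch size of 64, and the learning rate peaks at the end of warmup before decaying over the remaining steps.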
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Exact Match | F1 Score |
+|:-------------:|:------:|:----:|:---------------:|:-----------:|:--------:|
+| 6.6257        | 0.0053 | 1    | 6.6659          | 0.1504      | 23.7019  |
+| 6.6148        | 0.0107 | 2    | 6.5377          | 0.7519      | 36.0663  |
+| 6.5217        | 0.0160 | 3    | 6.3226          | 1.9549      | 43.4507  |
+| 6.1229        | 0.0214 | 4    | 5.6034          | 3.7594      | 56.8030  |
+| 5.7015        | 0.0267 | 5    | 5.3006          | 5.7143      | 56.8976  |
+| 5.3474        | 0.0321 | 6    | 5.1196          | 10.9023     | 56.6826  |
+| 5.2375        | 0.0374 | 7    | 4.8802          | 16.9925     | 57.8356  |
+| 5.0670        | 0.0428 | 8    | 4.6180          | 19.8496     | 57.7216  |
+| 4.7947        | 0.0481 | 9    | 4.3354          | 22.7820     | 58.3592  |
+| 4.4271        | 0.0534 | 10   | 4.0533          | 26.0902     | 58.7345  |
+| 4.3096        | 0.0588 | 11   | 3.7947          | 31.5789     | 60.0493  |
+| 4.1219        | 0.0641 | 12   | 3.5726          | 35.2632     | 61.0031  |
+| 3.9806        | 0.0695 | 13   | 3.4024          | 38.4962     | 62.6457  |
+| 3.6530        | 0.0748 | 14   | 3.2782          | 41.7293     | 63.9058  |
+| 3.4740        | 0.0802 | 15   | 3.1569          | 43.9850     | 65.0367  |
+| 3.3639        | 0.0855 | 16   | 3.0200          | 45.4887     | 64.9433  |
+| 3.2411        | 0.0908 | 17   | 2.8749          | 46.4662     | 64.8581  |
+| 3.0711        | 0.0962 | 18   | 2.7349          | 47.8195     | 65.1471  |
+| 3.0726        | 0.1015 | 19   | 2.6103          | 48.4962     | 64.6205  |
+| 2.9879        | 0.1069 | 20   | 2.4996          | 49.2481     | 64.7963  |
+| 2.9237        | 0.1122 | 21   | 2.3950          | 50.6767     | 65.0445  |
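The Epoch column gives a rough handle on the size of the training set, which the card otherwise leaves unstated. The sketch below divides the step count by the epoch fraction for the last logged row and multiplies by the total batch size of 64. Treat the result as an estimate only: the epoch values are rounded to four decimals, and the final batch of an epoch may be partial.

```python
# (step, epoch) taken from the last row of the training results table.
LAST_STEP, LAST_EPOCH = 21, 0.1122
TOTAL_TRAIN_BATCH_SIZE = 64
TRAINING_STEPS = 50

# Optimizer steps needed to pass over the training set once.
steps_per_epoch = LAST_STEP / LAST_EPOCH

# Each optimizer step consumes one total batch of examples.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Fraction of one epoch covered by the configured number of steps.
epochs_at_final_step = TRAINING_STEPS / steps_per_epoch

print(f"~{steps_per_epoch:.0f} optimizer steps per epoch")
print(f"~{approx_train_examples:.0f} training examples (rough estimate)")
print(f"~{epochs_at_final_step:.2f} epochs after {TRAINING_STEPS} steps")
```

By this estimate, the configured 50 training steps cover well under one pass over the data.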
+### Framework versions
+- Transformers 4.46.3
+- Pytorch 2.4.0
+- Datasets 3.1.0
+- Tokenizers 0.20.3

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cd829b42b2585e044ee1a283216a25f40a564b76c213ef4e4d2ebcdb5292593c
+oid sha256:d171a094122ab8eef16df3cf34c6c7f7df360aed823d907fa033a43530f31eb6
 size 1112905680
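
In a Git LFS pointer, the `oid sha256:` field is the SHA-256 digest of the real file contents and `size` is its length in bytes, so a downloaded `model.safetensors` can be checked against the pointer. A minimal sketch follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is something the caller would supply.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so a file of about 1 GB is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents after this commit, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d171a094122ab8eef16df3cf34c6c7f7df360aed823d907fa033a43530f31eb6
size 1112905680
"""
pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer("model.safetensors", pointer)` would report whether the download corresponds to this commit. The unchanged size alongside a changed digest is consistent with updated weights for the same architecture.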