Mediocre-Judge/bengali_qa_model_AGGRO_roberta-base

Question Answering

Generated from Trainer


Mediocre-Judge committed on Dec 9, 2024

Commit cd63264 · verified · 1 Parent(s): eb19472

End of training

Files changed (1)

README.md +81 -0

README.md ADDED

	@@ -0,0 +1,81 @@

+---
+library_name: transformers
+license: mit
+base_model: FacebookAI/roberta-base
+tags:
+- generated_from_trainer
+model-index:
+- name: bengali_qa_model_AGGRO_roberta-base
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# bengali_qa_model_AGGRO_roberta-base
+This model is a fine-tuned version of [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.7563
+- Exact Match: 70.7143
+- F1 Score: 79.6578
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 3407
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 64
+- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- training_steps: 50
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Exact Match | F1 Score |
+|:-------------:|:------:|:----:|:---------------:|:-----------:|:--------:|
+| 5.9583        | 0.0053 | 1    | 5.9716          | 0.0         | 5.2517   |
+| 5.948         | 0.0107 | 2    | 5.9377          | 0.0         | 5.5327   |
+| 5.9426        | 0.0160 | 3    | 5.8689          | 0.0         | 6.7345   |
+| 5.8563        | 0.0214 | 4    | 5.7663          | 0.0         | 11.2969  |
+| 5.8071        | 0.0267 | 5    | 5.6429          | 0.1504      | 26.3481  |
+| 5.6804        | 0.0321 | 6    | 5.5006          | 12.3308     | 40.0080  |
+| 5.5603        | 0.0374 | 7    | 5.3531          | 42.1053     | 58.0172  |
+| 5.4208        | 0.0428 | 8    | 5.1925          | 52.4060     | 64.8440  |
+| 5.2257        | 0.0481 | 9    | 5.0174          | 57.5188     | 69.2798  |
+| 5.0287        | 0.0535 | 10   | 4.8260          | 60.7519     | 71.5328  |
+| 4.9646        | 0.0588 | 11   | 4.6193          | 62.7068     | 73.5651  |
+| 4.6784        | 0.0641 | 12   | 4.4020          | 63.7594     | 75.2683  |
+| 4.623         | 0.0695 | 13   | 4.1678          | 64.0602     | 75.7455  |
+| 4.3488        | 0.0748 | 14   | 3.9101          | 64.9624     | 76.7744  |
+| 4.059         | 0.0802 | 15   | 3.6405          | 66.8421     | 77.6939  |
+| 3.7381        | 0.0855 | 16   | 3.3842          | 68.8722     | 78.5505  |
+| 3.574         | 0.0909 | 17   | 3.1525          | 67.3684     | 77.7160  |
+| 3.4082        | 0.0962 | 18   | 2.9428          | 66.2406     | 77.6532  |
+| 3.2202        | 0.1016 | 19   | 2.7619          | 69.3985     | 78.5549  |
+### Framework versions
+- Transformers 4.46.3
+- Pytorch 2.4.0
+- Datasets 3.1.0
+- Tokenizers 0.20.3
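
The hyperparameters listed in the README imply a few derived quantities that the card does not state directly. The following is a minimal sketch in plain Python, not the Trainer's own code: it computes the effective batch size, the warmup length, and a learning rate that warms up linearly and then decays along a half cosine. The helper name `lr_at_step`, the rounding of the warmup length with `math.ceil`, and the exact decay shape are assumptions about how `lr_scheduler_type: cosine` with `lr_scheduler_warmup_ratio: 0.1` is typically realised. The steps-per-epoch figure is only an estimate inferred from the logged epoch fraction at step 19.

```python
import math

# Values copied from the "Training hyperparameters" list in the README.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1
TRAINING_STEPS = 50

# Examples consumed per optimizer step (agrees with total_train_batch_size: 64).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: warmup length is warmup ratio times total steps, rounded up.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at_step(step: int) -> float:
    """Hypothetical helper: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


# Estimate from the results table: step 19 is logged at epoch 0.1016.
steps_per_epoch = round(19 / 0.1016)
approx_train_examples = steps_per_epoch * effective_batch_size

if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    print("estimated steps per epoch:", steps_per_epoch)
    print("estimated training examples:", approx_train_examples)
    for step in (0, warmup_steps, 19, TRAINING_STEPS):
        print(f"lr at step {step}: {lr_at_step(step):.3e}")
```

Under these assumptions the learning rate peaks at step 5, and the 50 configured optimizer steps cover only about a quarter of one epoch (50 / 187 ≈ 0.27). That is consistent with the epoch column reaching just 0.1016 at step 19.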