Buseak/spell_corrector_mt5_01012024_v2_inbalanced_mistakes

Text2Text Generation

Generated from Trainer

Buseak committed on Jan 1, 2024

Commit acf73e2 · 1 Parent(s): b8747c1

End of training

Files changed (2)

README.md +75 -0
generation_config.json +6 -0

README.md ADDED

	@@ -0,0 +1,75 @@

+---
+license: apache-2.0
+base_model: Buseak/spell_corrector_mt5_01012024
+tags:
+- generated_from_trainer
+metrics:
+- bleu
+model-index:
+- name: spell_corrector_mt5_01012024_v2_inbalanced_mistakes
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# spell_corrector_mt5_01012024_v2_inbalanced_mistakes
+This model is a fine-tuned version of [Buseak/spell_corrector_mt5_01012024](https://huggingface.co/Buseak/spell_corrector_mt5_01012024) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 0.4841
+- Bleu: 35.9177
+- Gen Len: 15.7215
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 15
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
+| 1.1495        | 1.0   | 976   | 0.6422          | 32.0204 | 15.7943 |
+| 1.0698        | 2.0   | 1952  | 0.6209          | 32.5167 | 15.7969 |
+| 0.9987        | 3.0   | 2928  | 0.5965          | 33.0715 | 15.7826 |
+| 0.9755        | 4.0   | 3904  | 0.5770          | 33.4511 | 15.7809 |
+| 0.9490        | 5.0   | 4880  | 0.5583          | 34.2697 | 15.7524 |
+| 0.9232        | 6.0   | 5856  | 0.5379          | 34.6321 | 15.7416 |
+| 0.9036        | 7.0   | 6832  | 0.5254          | 34.9265 | 15.7377 |
+| 0.8923        | 8.0   | 7808  | 0.5141          | 35.1610 | 15.7364 |
+| 0.8771        | 9.0   | 8784  | 0.5077          | 35.3906 | 15.7275 |
+| 0.8675        | 10.0  | 9760  | 0.5017          | 35.5138 | 15.7251 |
+| 0.8517        | 11.0  | 10736 | 0.4940          | 35.7429 | 15.7199 |
+| 0.8623        | 12.0  | 11712 | 0.4915          | 35.7791 | 15.7264 |
+| 0.8569        | 13.0  | 12688 | 0.4881          | 35.8203 | 15.7232 |
+| 0.8544        | 14.0  | 13664 | 0.4860          | 35.9037 | 15.7224 |
+| 0.8657        | 15.0  | 14640 | 0.4841          | 35.9177 | 15.7215 |
+### Framework versions
+- Transformers 4.35.2
+- Pytorch 2.1.0+cu121
+- Datasets 2.16.1
+- Tokenizers 0.15.0
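
The step counts and hyperparameters in the README above allow a quick sanity check of the training setup. The sketch below is only an illustration: it assumes a single device, no gradient accumulation, no warmup, and a kept final partial batch, none of which the card states explicitly. The names `train_set_size_bounds` and `linear_lr` are hypothetical helpers, not part of the training code.

```python
# Values copied from the model card above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 15
STEPS_PER_EPOCH = 976   # step count logged at epoch 1.0
TOTAL_STEPS = 14640     # step count logged at epoch 15.0


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Range of training-set sizes that would give this many steps per epoch.

    Assumption: one optimizer step per batch, last partial batch not dropped.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under a linear schedule: linear warmup, then linear decay to 0.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # The logged totals are consistent: 976 steps per epoch over 15 epochs.
    assert STEPS_PER_EPOCH * NUM_EPOCHS == TOTAL_STEPS
    print(train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
    print(linear_lr(TOTAL_STEPS // 2))
```

Under these assumptions the training set would hold between 7,801 and 7,808 examples, and the learning rate would have decayed to roughly half its initial value by the midpoint of training.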

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.35.2"
+}
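
The three token IDs in `generation_config.json` follow the T5-family convention in which decoding starts from the pad token (ID 0) and stops at the end-of-sequence token (ID 1). As a rough illustration of what they mean for a generated sequence, the sketch below trims raw output IDs using only those values. `trim_generated_ids` and the sample IDs are hypothetical; in practice the tokenizer's `decode` method with `skip_special_tokens=True` handles this step.

```python
import json

# The configuration added in this commit, reproduced verbatim.
GENERATION_CONFIG = json.loads("""
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.35.2"
}
""")


def trim_generated_ids(ids, config=GENERATION_CONFIG):
    """Drop the decoder start token, cut at the first EOS, and remove padding."""
    start = config["decoder_start_token_id"]
    eos = config["eos_token_id"]
    pad = config["pad_token_id"]
    if ids and ids[0] == start:
        ids = ids[1:]                 # leading decoder start token
    if eos in ids:
        ids = ids[: ids.index(eos)]   # everything from the first EOS onward
    return [i for i in ids if i != pad]


# Made-up IDs: start token, three content tokens, EOS, then batch padding.
sample = [0, 259, 1042, 317, 1, 0, 0]
print(trim_generated_ids(sample))  # prints [259, 1042, 317]
```

Because `decoder_start_token_id` and `pad_token_id` share the value 0, the first position of every generated sequence is indistinguishable from padding, which is why the start token is removed by position rather than by value alone.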