End of training

Files changed (6)

README.md +98 -0
generation_config.json +7 -0
pytorch_model.bin +1 -1
special_tokens_map.json +5 -0
spiece.model +3 -0
tokenizer_config.json +13 -0

README.md ADDED

	@@ -0,0 +1,98 @@

+---
+license: apache-2.0
+base_model: google/mt5-base
+tags:
+- generated_from_trainer
+metrics:
+- rouge
+- sacrebleu
+model-index:
+- name: mT5-TextSimp-LT-BatchSize2-lr1e-4
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mT5-TextSimp-LT-BatchSize2-lr1e-4
+This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 0.0672
+- Rouge1: 0.7548
+- Rouge2: 0.5989
+- Rougel: 0.7509
+- Sacrebleu: 49.0373
+- Gen Len: 38.0501
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 500
+- num_epochs: 8
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Sacrebleu | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
+| 25.6783       | 0.24  | 200  | 16.0497         | 0.0109 | 0.0005 | 0.0107 | 0.0029    | 512.0   |
+| 1.9593        | 0.48  | 400  | 0.7780          | 0.014  | 0.0005 | 0.0136 | 0.0146    | 42.685  |
+| 0.2778        | 0.72  | 600  | 0.1429          | 0.4924 | 0.3128 | 0.4803 | 20.3057   | 38.0382 |
+| 0.1325        | 0.96  | 800  | 0.1039          | 0.6193 | 0.4369 | 0.6098 | 33.687    | 38.0501 |
+| 0.1702        | 1.2   | 1000 | 0.0958          | 0.6697 | 0.5016 | 0.6613 | 38.0391   | 38.0501 |
+| 0.13          | 1.44  | 1200 | 0.0880          | 0.6737 | 0.5051 | 0.6644 | 38.62     | 38.0501 |
+| 0.1086        | 1.67  | 1400 | 0.0839          | 0.6964 | 0.5326 | 0.6884 | 40.9056   | 38.0501 |
+| 0.0716        | 1.91  | 1600 | 0.0859          | 0.6933 | 0.5298 | 0.6862 | 40.7158   | 38.0501 |
+| 0.1135        | 2.15  | 1800 | 0.0820          | 0.7017 | 0.5366 | 0.6936 | 40.7484   | 38.0501 |
+| 0.0997        | 2.39  | 2000 | 0.0814          | 0.7011 | 0.5351 | 0.6945 | 41.1948   | 38.0501 |
+| 0.0996        | 2.63  | 2200 | 0.0774          | 0.7103 | 0.5522 | 0.7049 | 42.5756   | 38.0501 |
+| 1.1379        | 2.87  | 2400 | 0.0763          | 0.7211 | 0.5556 | 0.7152 | 43.2411   | 38.0501 |
+| 0.0594        | 3.11  | 2600 | 0.0776          | 0.7261 | 0.5647 | 0.7201 | 44.2205   | 38.0501 |
+| 0.0763        | 3.35  | 2800 | 0.0736          | 0.7309 | 0.5709 | 0.7251 | 45.2825   | 38.0501 |
+| 0.1641        | 3.59  | 3000 | 0.0722          | 0.7297 | 0.5685 | 0.7242 | 44.9001   | 38.0501 |
+| 0.1085        | 3.83  | 3200 | 0.0703          | 0.7377 | 0.5793 | 0.7319 | 45.7504   | 38.0501 |
+| 0.0573        | 4.07  | 3400 | 0.0719          | 0.7393 | 0.5796 | 0.7335 | 45.86     | 38.0501 |
+| 0.1149        | 4.31  | 3600 | 0.0705          | 0.7415 | 0.5787 | 0.7365 | 46.2652   | 38.0501 |
+| 0.0843        | 4.55  | 3800 | 0.0703          | 0.7385 | 0.5754 | 0.7326 | 46.5292   | 38.0501 |
+| 0.0658        | 4.78  | 4000 | 0.0705          | 0.7437 | 0.5855 | 0.7384 | 46.864    | 38.0501 |
+| 0.0676        | 5.02  | 4200 | 0.0694          | 0.7437 | 0.584  | 0.7384 | 47.1268   | 38.0501 |
+| 0.0657        | 5.26  | 4400 | 0.0711          | 0.7473 | 0.5913 | 0.7432 | 47.4413   | 38.0501 |
+| 0.0679        | 5.5   | 4600 | 0.0702          | 0.7496 | 0.5908 | 0.7446 | 47.8281   | 38.0501 |
+| 0.0664        | 5.74  | 4800 | 0.0671          | 0.7511 | 0.5929 | 0.7463 | 47.7693   | 38.0501 |
+| 0.0446        | 5.98  | 5000 | 0.0685          | 0.7533 | 0.5932 | 0.7478 | 48.032    | 38.0501 |
+| 0.0732        | 6.22  | 5200 | 0.0678          | 0.7523 | 0.5948 | 0.7472 | 48.3467   | 38.0501 |
+| 0.0706        | 6.46  | 5400 | 0.0672          | 0.755  | 0.5983 | 0.7507 | 48.6158   | 38.0501 |
+| 0.051         | 6.7   | 5600 | 0.0674          | 0.7523 | 0.5961 | 0.7478 | 48.4828   | 38.0501 |
+| 0.067         | 6.94  | 5800 | 0.0681          | 0.7532 | 0.5978 | 0.7492 | 48.7253   | 38.0501 |
+| 0.075         | 7.18  | 6000 | 0.0684          | 0.7534 | 0.5969 | 0.7492 | 48.7053   | 38.0501 |
+| 0.1323        | 7.42  | 6200 | 0.0671          | 0.755  | 0.5991 | 0.7511 | 48.9922   | 38.0501 |
+| 0.0383        | 7.66  | 6400 | 0.0671          | 0.7551 | 0.5994 | 0.7511 | 49.0028   | 38.0501 |
+| 0.0599        | 7.89  | 6600 | 0.0672          | 0.7548 | 0.5989 | 0.7509 | 49.0373   | 38.0501 |
+### Framework versions
+- Transformers 4.33.0
+- Pytorch 2.1.2+cu121
+- Datasets 2.14.4
+- Tokenizers 0.13.3
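
The README's hyperparameters (`learning_rate: 0.0001`, `lr_scheduler_type: linear`, `lr_scheduler_warmup_steps: 500`) imply a learning rate that ramps up linearly for 500 steps and then decays linearly to zero. The sketch below shows that shape in plain Python. The total step count is an assumption: the card does not state it, and 6700 is only a rough estimate from the step-to-epoch ratio in the results table (about 836 steps per epoch over 8 epochs). The function name `linear_schedule_lr` is hypothetical.

```python
BASE_LR = 1e-4        # learning_rate from the README
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps from the README
TOTAL_STEPS = 6700    # ASSUMPTION: estimated from the results table, not stated in the card


def linear_schedule_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
                       total_steps=TOTAL_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at a few of the evaluation steps listed in the table.
    for step in (0, 200, 400, 600, 3400, 6600):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the first two evaluation rows (steps 200 and 400) fall inside the warmup phase, which may partly explain their high losses, though the card itself does not say so.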

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "_from_model_config": true,
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.33.0"
+}
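
In the config above, `decoder_start_token_id` and `pad_token_id` are both 0. T5-family models reuse the pad token as the first decoder input. The sketch below illustrates how decoder inputs are typically derived from labels under that convention: shift right, prepend the start token, and replace the `-100` ignore index with the pad id. The token ids in the example are made up, and `shift_right` is a standalone illustration, not the library's own function.

```python
DECODER_START_TOKEN_ID = 0  # from generation_config.json
PAD_TOKEN_ID = 0            # from generation_config.json
EOS_TOKEN_ID = 1            # from generation_config.json
IGNORE_INDEX = -100         # label value conventionally ignored by the loss


def shift_right(labels, decoder_start_token_id=DECODER_START_TOKEN_ID,
                pad_token_id=PAD_TOKEN_ID):
    """Build decoder input ids from label ids, T5-style."""
    # Prepend the start token and drop the last label so lengths match.
    shifted = [decoder_start_token_id] + list(labels[:-1])
    # Masked label positions cannot be fed to the embedding layer; use pad instead.
    return [pad_token_id if tok == IGNORE_INDEX else tok for tok in shifted]


if __name__ == "__main__":
    # Hypothetical label ids: two content tokens, EOS, then masked padding.
    labels = [259, 1042, EOS_TOKEN_ID, IGNORE_INDEX, IGNORE_INDEX]
    print(shift_right(labels))
```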

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5eb01946a0d90e805338d81afe3bb0fc59f69a5f2bd283c633f501ea28e2d87d
+oid sha256:8d4f23fe452d12251638e9246ad4661c9aa4456231727a6090b1f4d1396156cd
 size 2329703026
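
Only the `oid` changes in this pointer: the weights file keeps the same byte size but has new contents. A downloaded file can be checked against a Git LFS pointer by comparing its size and SHA-256 digest. The sketch below is one way to do that with the standard library. The helper names `parse_lfs_pointer` and `stream_matches_pointer` are hypothetical, and the parser assumes the simple `key value` line layout shown in the diff.

```python
import hashlib
import io


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def stream_matches_pointer(stream, pointer, chunk_size=1 << 20):
    """Return True if the binary stream has the size and digest named in the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    hasher = hashlib.sha256()
    total = 0
    # Read in chunks so multi-gigabyte weight files are never held in memory at once.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8d4f23fe452d12251638e9246ad4661c9aa4456231727a6090b1f4d1396156cd
size 2329703026
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(NEW_POINTER)
    print(pointer["algo"], pointer["size"])
    # To check a real download, open it in binary mode, for example:
    #   with open("pytorch_model.bin", "rb") as fh:
    #       print(stream_matches_pointer(fh, pointer))
```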

special_tokens_map.json ADDED

	@@ -0,0 +1,5 @@

+{
+  "eos_token": "</s>",
+  "pad_token": "<pad>",
+  "unk_token": "<unk>"
+}

spiece.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
+size 4309802

tokenizer_config.json ADDED

	@@ -0,0 +1,13 @@

+{
+  "additional_special_tokens": null,
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "</s>",
+  "extra_ids": 0,
+  "legacy": true,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "<pad>",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "T5Tokenizer",
+  "tokenizer_file": null,
+  "unk_token": "<unk>"
+}
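
The `model_max_length` value above is not a real limit. It equals `int(1e30)`, the placeholder that `transformers` writes when no maximum length was configured for the tokenizer. Code that truncates inputs therefore probably needs to supply its own limit. The sketch below detects the placeholder and falls back to an explicit value. The helper name `effective_max_length` and the fallback of 512 are assumptions chosen for illustration, not settings taken from this commit.

```python
# Placeholder meaning "no maximum length configured"; int(1e30) is not exactly
# 10**30 because 1e30 is first rounded to the nearest double-precision float.
VERY_LARGE_INTEGER = int(1e30)

# Value copied from tokenizer_config.json in this commit.
CONFIGURED_MAX_LENGTH = 1000000000000000019884624838656


def effective_max_length(model_max_length, fallback):
    """Return a usable truncation length, ignoring the 'unset' placeholder."""
    if model_max_length >= VERY_LARGE_INTEGER:
        return fallback
    return model_max_length


if __name__ == "__main__":
    print(CONFIGURED_MAX_LENGTH == VERY_LARGE_INTEGER)
    # ASSUMPTION: 512 is a hypothetical fallback for illustration only.
    print(effective_max_length(CONFIGURED_MAX_LENGTH, fallback=512))
```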