finetune-t5-base-on-opus100-Ar2En-with-lora
README.md CHANGED
```diff
@@ -14,15 +14,16 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/FinalProject_/T5/runs/
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/FinalProject_/T5/runs/vvxagyr8)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/FinalProject_/T5/runs/vvxagyr8)
 # finetune-t5-base-on-opus100-Ar2En-with-lora
 
 This model is a fine-tuned version of [UBC-NLP/AraT5v2-base-1024](https://huggingface.co/UBC-NLP/AraT5v2-base-1024) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.
-- Bleu:
-- Rouge: 0.
-- Gen Len:
+- Loss: 3.7552
+- Bleu: 4.3018
+- Rouge: 0.2386
+- Gen Len: 10.572
 
 ## Model description
@@ -47,40 +48,20 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 7
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch | Step
-
-| 6.
-| 5.
-| 5.
-| 4.
-| 4.
-| 4.
-| 4.5242 | 8.0 | 5600 | 3.5183 | 5.1712 | 0.2633 | 11.188 |
-| 4.4707 | 9.0 | 6300 | 3.4968 | 5.4002 | 0.2652 | 11.0745 |
-| 4.4178 | 10.0 | 7000 | 3.4632 | 5.697 | 0.2704 | 11.442 |
-| 4.3806 | 11.0 | 7700 | 3.4465 | 5.8389 | 0.278 | 11.3195 |
-| 4.3356 | 12.0 | 8400 | 3.4190 | 5.8889 | 0.2789 | 11.389 |
-| 4.3172 | 13.0 | 9100 | 3.4074 | 6.1714 | 0.2869 | 11.1865 |
-| 4.262 | 14.0 | 9800 | 3.3940 | 6.2538 | 0.291 | 11.2835 |
-| 4.2517 | 15.0 | 10500 | 3.3711 | 6.6821 | 0.2947 | 11.4705 |
-| 4.2225 | 16.0 | 11200 | 3.3631 | 6.6337 | 0.2915 | 11.5375 |
-| 4.2032 | 17.0 | 11900 | 3.3606 | 6.6492 | 0.2954 | 11.3655 |
-| 4.2062 | 18.0 | 12600 | 3.3476 | 6.5354 | 0.2956 | 11.388 |
-| 4.1743 | 19.0 | 13300 | 3.3420 | 6.7065 | 0.2986 | 11.5025 |
-| 4.1636 | 20.0 | 14000 | 3.3332 | 6.7179 | 0.299 | 11.538 |
-| 4.1448 | 21.0 | 14700 | 3.3278 | 6.6867 | 0.298 | 11.502 |
-| 4.1378 | 22.0 | 15400 | 3.3209 | 6.8417 | 0.2993 | 11.5215 |
-| 4.127 | 23.0 | 16100 | 3.3182 | 6.7923 | 0.298 | 11.5035 |
-| 4.1259 | 24.0 | 16800 | 3.3141 | 6.8933 | 0.3033 | 11.5165 |
-| 4.1239 | 25.0 | 17500 | 3.3119 | 6.8698 | 0.3022 | 11.5275 |
-| 4.1299 | 26.0 | 18200 | 3.3108 | 6.8569 | 0.3022 | 11.5385 |
-| 4.1101 | 27.0 | 18900 | 3.3112 | 6.969 | 0.3029 | 11.5485 |
+| Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:-------:|
+| 6.6745 | 1.0 | 700 | 4.6813 | 3.2487 | 0.2249 | 10.726 |
+| 6.1243 | 2.0 | 1400 | 4.0666 | 3.3995 | 0.2273 | 10.0245 |
+| 5.3863 | 3.0 | 2100 | 3.9208 | 3.8728 | 0.2335 | 10.3965 |
+| 5.1275 | 4.0 | 2800 | 3.8485 | 3.9535 | 0.2331 | 10.5655 |
+| 4.975 | 5.0 | 3500 | 3.7971 | 3.9941 | 0.2318 | 10.572 |
+| 4.8991 | 6.0 | 4200 | 3.7639 | 4.0786 | 0.2349 | 10.6005 |
+| 4.857 | 7.0 | 4900 | 3.7552 | 4.3018 | 0.2386 | 10.572 |
 
 
 ### Framework versions
```
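The training setup the card describes can be sketched with `transformers` and `peft`. Only the base checkpoint, seed 42, Adam betas/epsilon, linear scheduler, 7 epochs, and Native AMP come from the card; the LoRA rank, alpha, dropout, target modules, batch size, and learning rate below are illustrative assumptions, since the card does not state them. This is a configuration sketch, not the author's exact script.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, Seq2SeqTrainingArguments
from peft import LoraConfig, get_peft_model, TaskType

base = "UBC-NLP/AraT5v2-base-1024"  # base checkpoint named in the card
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)

# LoRA adapter config — rank/alpha/dropout/target_modules are assumptions;
# ["q", "v"] are the usual T5 attention projections targeted by LoRA.
lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q", "v"],
)
model = get_peft_model(model, lora_cfg)

args = Seq2SeqTrainingArguments(
    output_dir="finetune-t5-base-on-opus100-Ar2En-with-lora",
    seed=42,                        # from the card
    num_train_epochs=7,             # from the card
    lr_scheduler_type="linear",     # from the card
    adam_beta1=0.9,                 # from the card
    adam_beta2=0.999,               # from the card
    adam_epsilon=1e-8,              # from the card
    fp16=True,                      # "Native AMP" in the card
    per_device_train_batch_size=32, # assumed; not stated in this diff hunk
    evaluation_strategy="epoch",
    predict_with_generate=True,     # needed for BLEU/ROUGE/gen-len eval
)
```

The adapter-wrapped model and `args` would then be handed to a `Seq2SeqTrainer` together with the tokenized OPUS-100 Ar→En pairs; with 700 optimizer steps per epoch (as the Step column shows), 7 epochs yields the 4900 steps in the final table row.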