End of training

Files changed (4)

README.md +100 -0
generation_config.json +16 -0
model.safetensors +1 -1
runs/Apr03_10-39-07_8eb53fe594f9/events.out.tfevents.1712140842.8eb53fe594f9.445.0 +2 -2

README.md ADDED

	@@ -0,0 +1,100 @@

+---
+license: apache-2.0
+base_model: Helsinki-NLP/opus-mt-en-ro
+tags:
+- generated_from_trainer
+datasets:
+- arrow
+metrics:
+- bleu
+model-index:
+- name: opus-mt-en-bkm
+  results:
+  - task:
+      name: Sequence-to-sequence Language Modeling
+      type: text2text-generation
+    dataset:
+      name: arrow
+      type: arrow
+      config: default
+      split: train
+      args: default
+    metrics:
+    - name: Bleu
+      type: bleu
+      value: 14.5684
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# opus-mt-en-bkm
+This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ro](https://huggingface.co/Helsinki-NLP/opus-mt-en-ro) on the arrow dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.1597
+- Bleu: 14.5684
+- Gen Len: 58.4294
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 25
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
+| 3.3983        | 1.0   | 974   | 1.9251          | 3.7894  | 60.1579 |
+| 1.9429        | 2.0   | 1948  | 1.6720          | 5.7083  | 58.6443 |
+| 1.7118        | 3.0   | 2922  | 1.5389          | 7.1977  | 58.8536 |
+| 1.5647        | 4.0   | 3896  | 1.4484          | 8.4631  | 57.9068 |
+| 1.4611        | 5.0   | 4870  | 1.3836          | 9.5314  | 59.3106 |
+| 1.3735        | 6.0   | 5844  | 1.3357          | 10.1879 | 59.5501 |
+| 1.3078        | 7.0   | 6818  | 1.3014          | 10.9172 | 59.4968 |
+| 1.245         | 8.0   | 7792  | 1.2737          | 11.445  | 59.585  |
+| 1.2048        | 9.0   | 8766  | 1.2485          | 11.9346 | 58.3275 |
+| 1.1648        | 10.0  | 9740  | 1.2298          | 12.3049 | 58.7768 |
+| 1.1272        | 11.0  | 10714 | 1.2176          | 12.7287 | 58.1549 |
+| 1.086         | 12.0  | 11688 | 1.2043          | 13.0962 | 59.2217 |
+| 1.0595        | 13.0  | 12662 | 1.1973          | 13.3375 | 58.6736 |
+| 1.0343        | 14.0  | 13636 | 1.1844          | 13.3963 | 58.2763 |
+| 1.0174        | 15.0  | 14610 | 1.1797          | 13.7067 | 58.1738 |
+| 0.9923        | 16.0  | 15584 | 1.1757          | 13.9467 | 59.3246 |
+| 0.9703        | 17.0  | 16558 | 1.1704          | 14.1023 | 58.9813 |
+| 0.9589        | 18.0  | 17532 | 1.1663          | 14.2842 | 58.401  |
+| 0.9472        | 19.0  | 18506 | 1.1662          | 14.2109 | 58.4796 |
+| 0.9262        | 20.0  | 19480 | 1.1635          | 14.3872 | 58.1601 |
+| 0.9147        | 21.0  | 20454 | 1.1606          | 14.4983 | 58.7417 |
+| 0.9162        | 22.0  | 21428 | 1.1630          | 14.5229 | 58.4345 |
+| 0.9012        | 23.0  | 22402 | 1.1607          | 14.6204 | 58.0767 |
+| 0.899         | 24.0  | 23376 | 1.1600          | 14.5681 | 58.4357 |
+| 0.8934        | 25.0  | 24350 | 1.1597          | 14.5684 | 58.4294 |
+### Framework versions
+- Transformers 4.39.3
+- Pytorch 2.2.1+cu121
+- Datasets 2.18.0
+- Tokenizers 0.15.2
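
The hyperparameters and the step column in the README above imply some arithmetic that is easy to check by hand. The following is a minimal sketch, not the Trainer's own code: it assumes zero warmup steps (the card lists none), a single device, no gradient accumulation, and that the last partial batch of each epoch is kept. The function names `linear_lr` and `train_set_size_bounds` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the model card above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 25
STEPS_PER_EPOCH = 974  # step count logged at epoch 1.0

# 25 epochs * 974 steps matches the final logged step, 24350.
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Smallest and largest example counts that yield this many steps per epoch.

    Assumes one optimizer step per batch and that a final partial batch is kept.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    print("total steps:", TOTAL_STEPS)
    print("lr at the end of epoch 1:", linear_lr(STEPS_PER_EPOCH))
    print("training examples (min, max):", train_set_size_bounds())
```

Under these assumptions the learning rate falls to half its initial value midway through epoch 13 and reaches zero at the last step, which is consistent with the validation loss flattening over the final epochs of the table.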

generation_config.json ADDED

	@@ -0,0 +1,16 @@

+{
+  "bad_words_ids": [
+    [
+      59542
+    ]
+  ],
+  "bos_token_id": 0,
+  "decoder_start_token_id": 59542,
+  "eos_token_id": 0,
+  "forced_eos_token_id": 0,
+  "max_length": 512,
+  "num_beams": 4,
+  "pad_token_id": 59542,
+  "renormalize_logits": true,
+  "transformers_version": "4.39.3"
+}
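
Several fields in this generation config are related to one another: the decoder start token and the pad token share the id 59542, that same id is listed under `bad_words_ids` so it cannot be emitted as output, and the forced end-of-sequence id equals `eos_token_id`. The sketch below only parses the JSON shown above with the standard library and checks those relationships; the `summarize` helper is a hypothetical name, and nothing here calls the Transformers library.

```python
import json

# The generation config exactly as added in this commit.
GENERATION_CONFIG = """
{
  "bad_words_ids": [[59542]],
  "bos_token_id": 0,
  "decoder_start_token_id": 59542,
  "eos_token_id": 0,
  "forced_eos_token_id": 0,
  "max_length": 512,
  "num_beams": 4,
  "pad_token_id": 59542,
  "renormalize_logits": true,
  "transformers_version": "4.39.3"
}
"""

config = json.loads(GENERATION_CONFIG)


def summarize(cfg):
    """Report how the token-id fields of a generation config relate to each other."""
    pad = cfg["pad_token_id"]
    return {
        # Decoding starts from the pad token rather than a dedicated start token.
        "decoder_starts_with_pad": cfg["decoder_start_token_id"] == pad,
        # The pad token is banned, so it cannot appear in generated text.
        "pad_is_banned_from_output": [pad] in cfg["bad_words_ids"],
        # The token forced when max_length is reached is the regular EOS token.
        "forced_eos_matches_eos": cfg["forced_eos_token_id"] == cfg["eos_token_id"],
        # More than one beam means beam search rather than greedy decoding.
        "uses_beam_search": cfg["num_beams"] > 1,
    }


if __name__ == "__main__":
    for name, value in summarize(config).items():
        print(f"{name}: {value}")
```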

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c25676f4df2cfe0665556f528fbed4104ceb154835409bf65c72240fec0e4c83
+oid sha256:b9b409101e277016b808ef0bde1a53e41c18c1ab6752752267ad0e6537dca2ca
 size 298765276

runs/Apr03_10-39-07_8eb53fe594f9/events.out.tfevents.1712140842.8eb53fe594f9.445.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:941ff14c806a6721569d954a829bdee2248006233f601ce13b0120a8cdfc4a1b
-size 24597
+oid sha256:8404e7c2dda4d5a228cc3fc650cb1c458994d123931c88e5ffd7387a1739309b
+size 25334
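
The two binary files in this commit are stored as Git LFS pointers, so the diff shows only a SHA-256 object id and a byte size rather than the file contents. As a rough sketch of how such a pointer could be parsed and used to check a downloaded file, the helpers below (`parse_lfs_pointer` and `matches_pointer`, both hypothetical names) use only the standard library and assume the three-line `version` / `oid` / `size` layout shown above.

```python
import hashlib

# New pointer for the TensorBoard event file, copied from the diff above.
NEW_EVENTS_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8404e7c2dda4d5a228cc3fc650cb1c458994d123931c88e5ffd7387a1739309b
size 25334
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and digest recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_EVENTS_POINTER)

if __name__ == "__main__":
    print(pointer["algo"], pointer["size"])
```

To check a real download, one would read the file in binary mode and pass its bytes to `matches_pointer`. For `model.safetensors` the size stays at 298765276 bytes while the object id changes, which is what one would expect when the weights are updated without changing the architecture.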