---
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-ro
tags:
- generated_from_trainer
datasets:
- arrow
metrics:
- bleu
model-index:
- name: opus-mt-en-bkm
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: arrow
      type: arrow
      config: default
      split: train
      args: default
    metrics:
    - name: Bleu
      type: bleu
      value: 17.7574
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# opus-mt-en-bkm

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ro](https://huggingface.co/Helsinki-NLP/opus-mt-en-ro) on the arrow dataset (a dataset saved locally in the Apache Arrow format; the Trainer records only the storage format as the dataset name).
It achieves the following results on the evaluation set (a short usage sketch follows the results):
- Loss: 1.1790
- Bleu: 17.7574
- Gen Len: 58.4209
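
A minimal inference sketch with the `transformers` translation pipeline is given below. The repository id is a hypothetical placeholder, not the actual location of this checkpoint.

```python
from transformers import pipeline

# Hypothetical repository id; replace it with wherever this checkpoint is hosted.
model_id = "your-namespace/opus-mt-en-bkm"

# MarianMT checkpoints work with the standard translation pipeline.
translator = pipeline("translation", model=model_id)

print(translator("Good morning, how are you?", max_length=128)[0]["translation_text"])
```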

## Model description

The model name and base checkpoint indicate a MarianMT sequence-to-sequence model fine-tuned to translate from English into Kom (`bkm`), a language spoken in northwest Cameroon, starting from the English-Romanian OPUS-MT checkpoint. Details of the fine-tuning corpus and preprocessing have not been documented.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
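
These values map onto `Seq2SeqTrainingArguments` roughly as sketched below. `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions rather than recorded values; the Adam settings are spelled out to match the optimizer line above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the hyperparameters listed above; output_dir,
# evaluation_strategy and predict_with_generate are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-en-bkm",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",
    predict_with_generate=True,
)
```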

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.1758        | 1.0   | 1113  | 1.8681          | 4.1739  | 58.6351 |
| 1.8143        | 2.0   | 2226  | 1.6288          | 6.2869  | 62.8396 |
| 1.635         | 3.0   | 3339  | 1.4789          | 7.8756  | 58.5721 |
| 1.4988        | 4.0   | 4452  | 1.3930          | 9.2821  | 59.5793 |
| 1.3753        | 5.0   | 5565  | 1.3288          | 10.4942 | 58.924  |
| 1.3015        | 6.0   | 6678  | 1.2773          | 11.3724 | 60.0849 |
| 1.2424        | 7.0   | 7791  | 1.2419          | 12.1525 | 60.724  |
| 1.1758        | 8.0   | 8904  | 1.2131          | 12.5595 | 58.5216 |
| 1.1263        | 9.0   | 10017 | 1.1882          | 13.4807 | 58.1827 |
| 1.0781        | 10.0  | 11130 | 1.1720          | 13.6583 | 56.953  |
| 1.0377        | 11.0  | 12243 | 1.1571          | 14.2744 | 58.1146 |
| 1.0014        | 12.0  | 13356 | 1.1437          | 14.5804 | 57.9928 |
| 0.9737        | 13.0  | 14469 | 1.1326          | 14.9612 | 57.4652 |
| 0.9384        | 14.0  | 15582 | 1.1263          | 15.1647 | 58.4813 |
| 0.9061        | 15.0  | 16695 | 1.1262          | 15.3948 | 57.8562 |
| 0.8854        | 16.0  | 17808 | 1.1164          | 15.7348 | 57.8652 |
| 0.8657        | 17.0  | 18921 | 1.1179          | 15.9306 | 57.5578 |
| 0.837         | 18.0  | 20034 | 1.1140          | 16.0704 | 58.2836 |
| 0.8208        | 19.0  | 21147 | 1.1135          | 16.1836 | 57.6796 |
| 0.7919        | 20.0  | 22260 | 1.1117          | 16.4418 | 57.7658 |
| 0.7645        | 21.0  | 23373 | 1.1134          | 16.3838 | 58.2189 |
| 0.7519        | 22.0  | 24486 | 1.1157          | 16.4369 | 57.7701 |
| 0.7375        | 23.0  | 25599 | 1.1178          | 16.4328 | 57.5811 |
| 0.7221        | 24.0  | 26712 | 1.1186          | 16.8289 | 57.3139 |
| 0.7009        | 25.0  | 27825 | 1.1190          | 16.9092 | 57.9038 |
| 0.6882        | 26.0  | 28938 | 1.1254          | 17.0946 | 58.229  |
| 0.6778        | 27.0  | 30051 | 1.1246          | 17.1689 | 58.5953 |
| 0.6668        | 28.0  | 31164 | 1.1281          | 17.1734 | 58.1258 |
| 0.6589        | 29.0  | 32277 | 1.1322          | 16.9988 | 58.0218 |
| 0.639         | 30.0  | 33390 | 1.1297          | 17.2725 | 58.3717 |
| 0.6318        | 31.0  | 34503 | 1.1392          | 17.3926 | 57.9088 |
| 0.6174        | 32.0  | 35616 | 1.1429          | 17.385  | 58.6474 |
| 0.6105        | 33.0  | 36729 | 1.1443          | 17.4034 | 58.7521 |
| 0.5953        | 34.0  | 37842 | 1.1485          | 17.4571 | 58.4733 |
| 0.5897        | 35.0  | 38955 | 1.1491          | 17.4854 | 58.9544 |
| 0.5807        | 36.0  | 40068 | 1.1572          | 17.544  | 58.1013 |
| 0.5774        | 37.0  | 41181 | 1.1588          | 17.5858 | 58.4694 |
| 0.5633        | 38.0  | 42294 | 1.1588          | 17.604  | 58.2328 |
| 0.5565        | 39.0  | 43407 | 1.1640          | 17.7342 | 58.3148 |
| 0.5556        | 40.0  | 44520 | 1.1642          | 17.6596 | 58.6809 |
| 0.5469        | 41.0  | 45633 | 1.1671          | 17.5064 | 58.1013 |
| 0.5428        | 42.0  | 46746 | 1.1686          | 17.7473 | 58.5171 |
| 0.5342        | 43.0  | 47859 | 1.1719          | 17.749  | 58.8335 |
| 0.5292        | 44.0  | 48972 | 1.1730          | 17.6552 | 58.4492 |
| 0.5314        | 45.0  | 50085 | 1.1728          | 17.7932 | 58.6007 |
| 0.5283        | 46.0  | 51198 | 1.1770          | 17.7351 | 58.4564 |
| 0.5252        | 47.0  | 52311 | 1.1778          | 17.803  | 58.5793 |
| 0.5227        | 48.0  | 53424 | 1.1782          | 17.7729 | 58.3533 |
| 0.5206        | 49.0  | 54537 | 1.1788          | 17.7547 | 58.5108 |
| 0.5186        | 50.0  | 55650 | 1.1790          | 17.7574 | 58.4209 |
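
The Bleu and Gen Len columns are the kind of values produced by a `compute_metrics` hook passed to `Seq2SeqTrainer`. The sketch below shows one common way to compute them with sacreBLEU via the `evaluate` library; it is an assumption about the evaluation setup, not the original training code, and it assumes the base checkpoint's tokenizer was reused.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumed: the fine-tuned model reuses the base checkpoint's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-ro")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    # "Gen Len" in the table above is the mean length of the generated sequences.
    gen_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```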


### Framework versions

- Transformers 4.38.2
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2