davelotito committed on
Commit
abd96ec
1 Parent(s): 12961ec

End of training

README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [naver-clova-ix/donut-base](https://huggingface.co/naver-clova-ix/donut-base) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.8193
- - Bleu: 0.0433
- - Precisions: [0.6616702355460385, 0.5073170731707317, 0.43342776203966005, 0.36824324324324326]
- - Brevity Penalty: 0.0899
- - Length Ratio: 0.2933
- - Translation Length: 467
- - Reference Length: 1592
- - Cer: 0.7585
- - Wer: 0.8689
+ - Loss: 0.5114
+ - Bleu: 0.0672
+ - Precisions: [0.7849898580121704, 0.7041284403669725, 0.6490765171503958, 0.6024844720496895]
+ - Brevity Penalty: 0.0986
+ - Length Ratio: 0.3015
+ - Translation Length: 493
+ - Reference Length: 1635
+ - Cer: 0.7616
+ - Wer: 0.8389
  
  ## Model description
  
@@ -46,11 +46,11 @@ More information needed
  
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
+ - train_batch_size: 1
+ - eval_batch_size: 1
  - seed: 42
  - gradient_accumulation_steps: 2
- - total_train_batch_size: 4
+ - total_train_batch_size: 2
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - num_epochs: 4
@@ -58,12 +58,12 @@ The following hyperparameters were used during training:
  
  ### Training results
  
- | Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Cer | Wer |
- |:-------------:|:------:|:----:|:---------------:|:------:|:-------------------------------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|
- | 6.3552 | 0.9960 | 126 | 2.6182 | 0.0001 | [0.38207547169811323, 0.09876543209876543, 0.03508771929824561, 0.012195121951219513] | 0.0015 | 0.1332 | 212 | 1592 | 0.8972 | 0.9604 |
- | 2.8839 | 2.0 | 253 | 1.3248 | 0.0102 | [0.5095238095238095, 0.20110192837465565, 0.11726384364820847, 0.06374501992031872] | 0.0614 | 0.2638 | 420 | 1592 | 0.8207 | 0.9242 |
- | 1.8618 | 2.9960 | 379 | 0.9383 | 0.0231 | [0.5866983372921615, 0.41483516483516486, 0.32247557003257327, 0.248] | 0.0619 | 0.2644 | 421 | 1592 | 0.7806 | 0.8966 |
- | 1.0999 | 3.9842 | 504 | 0.8193 | 0.0433 | [0.6616702355460385, 0.5073170731707317, 0.43342776203966005, 0.36824324324324326] | 0.0899 | 0.2933 | 467 | 1592 | 0.7585 | 0.8689 |
+ | Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Cer | Wer |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:----------------------------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|
+ | 3.7646 | 1.0 | 253 | 1.3802 | 0.0163 | [0.5175257731958763, 0.21962616822429906, 0.1293800539083558, 0.06369426751592357] | 0.0934 | 0.2966 | 485 | 1635 | 0.8225 | 0.9292 |
+ | 1.2442 | 2.0 | 506 | 0.7108 | 0.0454 | [0.6456211812627292, 0.4930875576036866, 0.41379310344827586, 0.359375] | 0.0973 | 0.3003 | 491 | 1635 | 0.7755 | 0.8908 |
+ | 0.8189 | 3.0 | 759 | 0.5739 | 0.0574 | [0.757700205338809, 0.6395348837209303, 0.5656836461126006, 0.4936708860759494] | 0.0947 | 0.2979 | 487 | 1635 | 0.7606 | 0.8539 |
+ | 0.6444 | 4.0 | 1012 | 0.5114 | 0.0672 | [0.7849898580121704, 0.7041284403669725, 0.6490765171503958, 0.6024844720496895] | 0.0986 | 0.3015 | 493 | 1635 | 0.7616 | 0.8389 |
  
  
  ### Framework versions
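The Brevity Penalty and Length Ratio reported above follow directly from the translation and reference lengths in the same row. A minimal sketch of that relationship (plain Python, values taken from the evaluation results above):

```python
import math

def brevity_penalty(translation_length: int, reference_length: int) -> float:
    """BLEU brevity penalty: 1.0 if the candidate is at least as long as
    the reference, otherwise exp(1 - ref_len / trans_len)."""
    if translation_length >= reference_length:
        return 1.0
    return math.exp(1 - reference_length / translation_length)

# Final evaluation row of the updated card
trans_len, ref_len = 493, 1635
print(round(trans_len / ref_len, 4))                  # length ratio -> 0.3015
print(round(brevity_penalty(trans_len, ref_len), 4))  # brevity penalty -> 0.0986
```

The very low BLEU despite high n-gram precisions is explained by this penalty: the model's outputs are only ~30% of the reference length, so the score is scaled down by roughly a factor of ten.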
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5f4323c52bcbcbd6a98d301c0d264220fd32c499224d1f244bb43815e8374f64
+ oid sha256:92a377fc86bb73a03f7f399667171f5cfb5a57a7c66379d25f4bfcd2cb77d9ab
  size 809103512
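Both versions of model.safetensors are Git LFS pointer files: the repository tracks only a version line, the SHA-256 object id, and the byte size, while the ~809 MB weight blob itself lives in LFS storage. A minimal sketch of reading such a pointer (plain Python; the pointer text is copied from the diff above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:92a377fc86bb73a03f7f399667171f5cfb5a57a7c66379d25f4bfcd2cb77d9ab
size 809103512
"""
info = parse_lfs_pointer(pointer)
print(info["oid"])        # the new checkpoint's object id
print(int(info["size"]))  # 809103512 bytes
```

Note that the size is unchanged between the two commits; only the object id differs, as expected when retraining overwrites weights of identical shape.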
runs/May07_20-18-21_ip-172-16-118-186.ec2.internal/events.out.tfevents.1715113101.ip-172-16-118-186.ec2.internal.2573.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:81e850bc7319d645d2be50bc807c54e24f618744c28e2434b008792c4d9013ab
- size 12747
+ oid sha256:fbfbb8b251ee9dc1d643875e29fa1b3d911f584f0822f8a3c489c371e2bf90b7
+ size 14384
tokenizer.json CHANGED
@@ -1,7 +1,21 @@
  {
    "version": "1.0",
-   "truncation": null,
-   "padding": null,
+   "truncation": {
+     "direction": "Right",
+     "max_length": 512,
+     "strategy": "LongestFirst",
+     "stride": 0
+   },
+   "padding": {
+     "strategy": {
+       "Fixed": 512
+     },
+     "direction": "Right",
+     "pad_to_multiple_of": null,
+     "pad_id": 1,
+     "pad_type_id": 0,
+     "pad_token": "<pad>"
+   },
    "added_tokens": [
      {
        "id": 0,