davelotito committed on
Commit
abd96ec
1 Parent(s): 12961ec

End of training

README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [naver-clova-ix/donut-base](https://huggingface.co/naver-clova-ix/donut-base) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.8193
- - Bleu: 0.0433
- - Precisions: [0.6616702355460385, 0.5073170731707317, 0.43342776203966005, 0.36824324324324326]
- - Brevity Penalty: 0.0899
- - Length Ratio: 0.2933
- - Translation Length: 467
- - Reference Length: 1592
- - Cer: 0.7585
- - Wer: 0.8689
+ - Loss: 0.5114
+ - Bleu: 0.0672
+ - Precisions: [0.7849898580121704, 0.7041284403669725, 0.6490765171503958, 0.6024844720496895]
+ - Brevity Penalty: 0.0986
+ - Length Ratio: 0.3015
+ - Translation Length: 493
+ - Reference Length: 1635
+ - Cer: 0.7616
+ - Wer: 0.8389
  
  ## Model description
  
@@ -46,11 +46,11 @@ More information needed
  
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 2
- - eval_batch_size: 2
+ - train_batch_size: 1
+ - eval_batch_size: 1
  - seed: 42
  - gradient_accumulation_steps: 2
- - total_train_batch_size: 4
+ - total_train_batch_size: 2
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - num_epochs: 4
@@ -58,12 +58,12 @@ The following hyperparameters were used during training:
  
  ### Training results
  
- | Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Cer | Wer |
- |:-------------:|:------:|:----:|:---------------:|:------:|:-------------------------------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|
- | 6.3552 | 0.9960 | 126 | 2.6182 | 0.0001 | [0.38207547169811323, 0.09876543209876543, 0.03508771929824561, 0.012195121951219513] | 0.0015 | 0.1332 | 212 | 1592 | 0.8972 | 0.9604 |
- | 2.8839 | 2.0 | 253 | 1.3248 | 0.0102 | [0.5095238095238095, 0.20110192837465565, 0.11726384364820847, 0.06374501992031872] | 0.0614 | 0.2638 | 420 | 1592 | 0.8207 | 0.9242 |
- | 1.8618 | 2.9960 | 379 | 0.9383 | 0.0231 | [0.5866983372921615, 0.41483516483516486, 0.32247557003257327, 0.248] | 0.0619 | 0.2644 | 421 | 1592 | 0.7806 | 0.8966 |
- | 1.0999 | 3.9842 | 504 | 0.8193 | 0.0433 | [0.6616702355460385, 0.5073170731707317, 0.43342776203966005, 0.36824324324324326] | 0.0899 | 0.2933 | 467 | 1592 | 0.7585 | 0.8689 |
+ | Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Cer | Wer |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:----------------------------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|
+ | 3.7646 | 1.0 | 253 | 1.3802 | 0.0163 | [0.5175257731958763, 0.21962616822429906, 0.1293800539083558, 0.06369426751592357] | 0.0934 | 0.2966 | 485 | 1635 | 0.8225 | 0.9292 |
+ | 1.2442 | 2.0 | 506 | 0.7108 | 0.0454 | [0.6456211812627292, 0.4930875576036866, 0.41379310344827586, 0.359375] | 0.0973 | 0.3003 | 491 | 1635 | 0.7755 | 0.8908 |
+ | 0.8189 | 3.0 | 759 | 0.5739 | 0.0574 | [0.757700205338809, 0.6395348837209303, 0.5656836461126006, 0.4936708860759494] | 0.0947 | 0.2979 | 487 | 1635 | 0.7606 | 0.8539 |
+ | 0.6444 | 4.0 | 1012 | 0.5114 | 0.0672 | [0.7849898580121704, 0.7041284403669725, 0.6490765171503958, 0.6024844720496895] | 0.0986 | 0.3015 | 493 | 1635 | 0.7616 | 0.8389 |
  
  
  ### Framework versions
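The Brevity Penalty and Length Ratio reported above follow directly from the translation and reference lengths in the same row. A minimal sketch of that relationship (plain Python, values taken from the evaluation results above):

```python
import math

def brevity_penalty(translation_length: int, reference_length: int) -> float:
    """BLEU brevity penalty: 1.0 if the candidate is at least as long as
    the reference, otherwise exp(1 - ref_len / trans_len)."""
    if translation_length >= reference_length:
        return 1.0
    return math.exp(1 - reference_length / translation_length)

# Final evaluation row of the updated card
trans_len, ref_len = 493, 1635
print(round(trans_len / ref_len, 4))                  # length ratio -> 0.3015
print(round(brevity_penalty(trans_len, ref_len), 4))  # brevity penalty -> 0.0986
```

The very low BLEU despite high n-gram precisions is explained by this penalty: the model's outputs are only ~30% of the reference length, so the score is scaled down by roughly a factor of ten.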
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5f4323c52bcbcbd6a98d301c0d264220fd32c499224d1f244bb43815e8374f64
+ oid sha256:92a377fc86bb73a03f7f399667171f5cfb5a57a7c66379d25f4bfcd2cb77d9ab
  size 809103512
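Both versions of model.safetensors are Git LFS pointer files: the repository tracks only a version line, the SHA-256 object id, and the byte size, while the ~809 MB weight blob itself lives in LFS storage. A minimal sketch of reading such a pointer (plain Python; the pointer text is copied from the diff above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:92a377fc86bb73a03f7f399667171f5cfb5a57a7c66379d25f4bfcd2cb77d9ab
size 809103512
"""
info = parse_lfs_pointer(pointer)
print(info["oid"])        # the new checkpoint's object id
print(int(info["size"]))  # 809103512 bytes
```

Note that the size is unchanged between the two commits; only the object id differs, as expected when retraining overwrites weights of identical shape.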
runs/May07_20-18-21_ip-172-16-118-186.ec2.internal/events.out.tfevents.1715113101.ip-172-16-118-186.ec2.internal.2573.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:81e850bc7319d645d2be50bc807c54e24f618744c28e2434b008792c4d9013ab
- size 12747
+ oid sha256:fbfbb8b251ee9dc1d643875e29fa1b3d911f584f0822f8a3c489c371e2bf90b7
+ size 14384
tokenizer.json CHANGED
@@ -1,7 +1,21 @@
  {
    "version": "1.0",
-   "truncation": null,
-   "padding": null,
+   "truncation": {
+     "direction": "Right",
+     "max_length": 512,
+     "strategy": "LongestFirst",
+     "stride": 0
+   },
+   "padding": {
+     "strategy": {
+       "Fixed": 512
+     },
+     "direction": "Right",
+     "pad_to_multiple_of": null,
+     "pad_id": 1,
+     "pad_type_id": 0,
+     "pad_token": "<pad>"
+   },
    "added_tokens": [
      {
        "id": 0,