Trkkk committed on
Commit 68a27e7 · verified · 1 Parent(s): 2a49d48

End of training

Files changed (3)
  1. README.md +121 -76
  2. generation_config.json +7 -7
  3. model.safetensors +1 -1
README.md CHANGED
@@ -1,76 +1,121 @@
- ---
- base_model: microsoft/git-base
- library_name: transformers
- license: mit
- tags:
- - generated_from_trainer
- model-index:
- - name: git-base-bdd100k
- results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # git-base-bdd100k
-
- This model is a fine-tuned version of [microsoft/git-base](https://huggingface.co/microsoft/git-base) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.4317
- - Wer Score: 0.7406
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0003
- - train_batch_size: 10
- - eval_batch_size: 10
- - seed: 42
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 20
- - optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- - lr_scheduler_type: linear
- - num_epochs: 15
- - mixed_precision_training: Native AMP
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Wer Score |
- |:-------------:|:-----:|:----:|:---------------:|:---------:|
- | 10.8372 | 1.0 | 3 | 10.0676 | 12.2594 |
- | 9.8981 | 2.0 | 6 | 8.6733 | 9.4194 |
- | 8.1063 | 3.0 | 9 | 6.8677 | 0.8955 |
- | 6.5282 | 4.0 | 12 | 5.5489 | 3.5342 |
- | 5.2838 | 5.0 | 15 | 4.3715 | 2.2077 |
- | 4.16 | 6.0 | 18 | 3.3203 | 3.32 |
- | 3.1554 | 7.0 | 21 | 2.3917 | 1.5897 |
- | 2.2691 | 8.0 | 24 | 1.6361 | 0.7832 |
- | 1.553 | 9.0 | 27 | 1.1034 | 0.7703 |
- | 1.0453 | 10.0 | 30 | 0.7820 | 0.7781 |
- | 0.7256 | 11.0 | 33 | 0.6073 | 0.7703 |
- | 0.5425 | 12.0 | 36 | 0.5168 | 0.7548 |
- | 0.4393 | 13.0 | 39 | 0.4689 | 0.7419 |
- | 0.3801 | 14.0 | 42 | 0.4449 | 0.7445 |
- | 0.3404 | 15.0 | 45 | 0.4317 | 0.7406 |
-
-
- ### Framework versions
-
- - Transformers 4.46.0.dev0
- - Pytorch 2.0.1+cu117
- - Datasets 3.0.1
- - Tokenizers 0.20.1

+ ---
+ library_name: transformers
+ license: mit
+ base_model: microsoft/git-base
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: git-base-bdd100k
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # git-base-bdd100k
+
+ This model is a fine-tuned version of [microsoft/git-base](https://huggingface.co/microsoft/git-base) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.7580
+ - Wer Score: 2.4791
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 25
+ - eval_batch_size: 25
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 50
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 60
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Wer Score |
+ |:-------------:|:-----:|:----:|:---------------:|:---------:|
+ | 10.5887 | 1.0 | 3 | 9.3421 | 6.9151 |
+ | 9.214 | 2.0 | 6 | 8.8516 | 6.0480 |
+ | 8.7067 | 3.0 | 9 | 8.2609 | 5.4057 |
+ | 8.1888 | 4.0 | 12 | 7.8240 | 6.3589 |
+ | 7.7891 | 5.0 | 15 | 7.4809 | 7.0127 |
+ | 7.4625 | 6.0 | 18 | 7.1805 | 8.0121 |
+ | 7.1743 | 7.0 | 21 | 6.9085 | 8.0926 |
+ | 6.9065 | 8.0 | 24 | 6.6502 | 7.8826 |
+ | 6.6509 | 9.0 | 27 | 6.4007 | 6.7806 |
+ | 6.4049 | 10.0 | 30 | 6.1576 | 5.3401 |
+ | 6.163 | 11.0 | 33 | 5.9207 | 3.7040 |
+ | 5.9257 | 12.0 | 36 | 5.6873 | 2.8456 |
+ | 5.6928 | 13.0 | 39 | 5.4572 | 2.7166 |
+ | 5.4619 | 14.0 | 42 | 5.2319 | 2.2856 |
+ | 5.235 | 15.0 | 45 | 5.0112 | 2.5138 |
+ | 5.0122 | 16.0 | 48 | 4.7931 | 2.0353 |
+ | 4.7935 | 17.0 | 51 | 4.5815 | 2.0843 |
+ | 4.5784 | 18.0 | 54 | 4.3751 | 2.1378 |
+ | 4.3684 | 19.0 | 57 | 4.1720 | 1.9609 |
+ | 4.1622 | 20.0 | 60 | 3.9752 | 1.9994 |
+ | 3.9616 | 21.0 | 63 | 3.7828 | 2.1312 |
+ | 3.764 | 22.0 | 66 | 3.5945 | 2.1163 |
+ | 3.5738 | 23.0 | 69 | 3.4124 | 2.1417 |
+ | 3.3868 | 24.0 | 72 | 3.2380 | 2.2707 |
+ | 3.2067 | 25.0 | 75 | 3.0658 | 2.2205 |
+ | 3.0298 | 26.0 | 78 | 2.9021 | 2.2029 |
+ | 2.8614 | 27.0 | 81 | 2.7425 | 2.3682 |
+ | 2.6981 | 28.0 | 84 | 2.5918 | 2.2133 |
+ | 2.5412 | 29.0 | 87 | 2.4445 | 2.2889 |
+ | 2.3899 | 30.0 | 90 | 2.3042 | 2.2795 |
+ | 2.2443 | 31.0 | 93 | 2.1726 | 2.3831 |
+ | 2.1068 | 32.0 | 96 | 2.0445 | 2.3649 |
+ | 1.975 | 33.0 | 99 | 1.9276 | 2.3291 |
+ | 1.8509 | 34.0 | 102 | 1.8173 | 2.3252 |
+ | 1.733 | 35.0 | 105 | 1.7116 | 2.3809 |
+ | 1.6231 | 36.0 | 108 | 1.6166 | 2.3743 |
+ | 1.5204 | 37.0 | 111 | 1.5221 | 2.4256 |
+ | 1.4227 | 38.0 | 114 | 1.4396 | 2.4305 |
+ | 1.3334 | 39.0 | 117 | 1.3620 | 2.5766 |
+ | 1.2509 | 40.0 | 120 | 1.2913 | 2.4140 |
+ | 1.1736 | 41.0 | 123 | 1.2291 | 2.4140 |
+ | 1.1027 | 42.0 | 126 | 1.1664 | 2.4162 |
+ | 1.0378 | 43.0 | 129 | 1.1151 | 2.4531 |
+ | 0.9774 | 44.0 | 132 | 1.0686 | 2.4013 |
+ | 0.9234 | 45.0 | 135 | 1.0257 | 2.4548 |
+ | 0.8731 | 46.0 | 138 | 0.9856 | 2.4603 |
+ | 0.8301 | 47.0 | 141 | 0.9499 | 2.5463 |
+ | 0.7886 | 48.0 | 144 | 0.9213 | 2.3953 |
+ | 0.7511 | 49.0 | 147 | 0.8932 | 2.5083 |
+ | 0.7193 | 50.0 | 150 | 0.8675 | 2.4542 |
+ | 0.6894 | 51.0 | 153 | 0.8475 | 2.4713 |
+ | 0.664 | 52.0 | 156 | 0.8284 | 2.4030 |
+ | 0.6405 | 53.0 | 159 | 0.8146 | 2.4548 |
+ | 0.6205 | 54.0 | 162 | 0.7990 | 2.5424 |
+ | 0.6042 | 55.0 | 165 | 0.7881 | 2.4961 |
+ | 0.5893 | 56.0 | 168 | 0.7785 | 2.4664 |
+ | 0.5766 | 57.0 | 171 | 0.7710 | 2.4598 |
+ | 0.5664 | 58.0 | 174 | 0.7650 | 2.4564 |
+ | 0.5598 | 59.0 | 177 | 0.7613 | 2.4895 |
+ | 0.5539 | 60.0 | 180 | 0.7580 | 2.4791 |
+
+
+ ### Framework versions
+
+ - Transformers 4.45.2
+ - Pytorch 2.1.0+cu118
+ - Datasets 3.0.1
+ - Tokenizers 0.20.1
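The card reports a raw `Wer Score` next to each validation loss. As a rough, hypothetical sketch (not the Trainer's actual metric code, which typically delegates to the `evaluate`/`jiwer` WER implementation), word error rate is the word-level edit distance divided by the reference length:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution in a six-word reference (example captions are made up)
print(wer("a car drives down the road", "a car drives on the road"))  # → 0.1666...
```

Because insertions count against the reference length, WER can exceed 1.0 when the model over-generates, which is consistent with the scores above 6 in the first epochs of this run.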
generation_config.json CHANGED
@@ -1,7 +1,7 @@
- {
-   "_from_model_config": true,
-   "bos_token_id": 101,
-   "eos_token_id": 102,
-   "pad_token_id": 0,
-   "transformers_version": "4.46.0.dev0"
- }
 
+ {
+   "_from_model_config": true,
+   "bos_token_id": 101,
+   "eos_token_id": 102,
+   "pad_token_id": 0,
+   "transformers_version": "4.45.2"
+ }
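Only `transformers_version` changed here; the special-token ids are the same BERT-style vocabulary ids that GIT's text decoder uses ([PAD] = 0, [CLS] = 101 as BOS, [SEP] = 102 as EOS). A minimal sketch of reading such a config as plain JSON (the string below mirrors the committed file):

```python
import json

# Contents of the committed generation_config.json
config_text = """{
  "_from_model_config": true,
  "bos_token_id": 101,
  "eos_token_id": 102,
  "pad_token_id": 0,
  "transformers_version": "4.45.2"
}"""

config = json.loads(config_text)
# BERT-style special-token ids: [CLS]=101 (BOS), [SEP]=102 (EOS), [PAD]=0
print(config["bos_token_id"], config["eos_token_id"], config["pad_token_id"])  # → 101 102 0
```

In practice this file is loaded for you by `transformers` when the model is instantiated, so generation stops at the [SEP] token without any extra configuration.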
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e75e8688a40c9d96faa6c02427f95df2466642c893f7a5157951c3d99539813e
  size 706516040
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:0e06eca1e70144ebcaed81fb57ea39a29ad6764cf3bd572e4e99b821e60e4a58
  size 706516040