oliverguhr committed on
Commit 8dcc7cf · 1 Parent(s): 6bc7eba

updated model card

Files changed (1)
  1. README.md +16 -199
README.md CHANGED
@@ -3,215 +3,32 @@ license: apache-2.0
  tags:
  - generated_from_trainer
  model-index:
- - name: flan-t5-spelling-de-base-fullds2
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # flan-t5-spelling-de-base-fullds2
-
- This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0525
- - Cer: 0.0077
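
(For context: Cer is the character error rate, i.e. the edit distance between prediction and reference divided by the reference length in characters. Purely as an illustration — the evaluation code is not part of this card — such a score can be computed with the Hugging Face `evaluate` library:)

```python
# Illustration only: computing a character error rate (CER) with the
# Hugging Face evaluate library (requires the jiwer backend).
# The example strings are made up, not taken from this model's eval set.
import evaluate

cer_metric = evaluate.load("cer")
score = cer_metric.compute(
    predictions=["das ist ein neuer test"],
    references=["das ist ein neuer Test"],
)
print(score)  # 1 differing character / 22 reference characters ≈ 0.045
```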

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.003
- - train_batch_size: 16
- - eval_batch_size: 16
- - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 256
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 2.0
-
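
(These values come from the auto-generated trainer log. As a rough sketch only — not the author's actual training script — they correspond to a `transformers` `Seq2SeqTrainingArguments` configuration roughly like the following; the output path is a placeholder.)

```python
# Illustrative sketch: the logged hyperparameters expressed as
# Seq2SeqTrainingArguments. Hypothetical, not the author's script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./flan-t5-spelling-de-base",  # placeholder path
    learning_rate=3e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,  # 16 * 16 = 256 effective batch size
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2.0,
    adam_beta1=0.9,     # Adam defaults, as reported in the log
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```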
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Cer |
- |:-------------:|:-----:|:-----:|:---------------:|:------:|
- | 0.444 | 0.01 | 500 | 0.3389 | 0.6733 |
- | 0.3429 | 0.03 | 1000 | 0.2655 | 0.6706 |
- | 0.2914 | 0.04 | 1500 | 0.2277 | 0.6705 |
- | 0.264 | 0.05 | 2000 | 0.2078 | 0.6698 |
- | 0.2506 | 0.06 | 2500 | 0.1894 | 0.6694 |
- | 0.2305 | 0.08 | 3000 | 0.1787 | 0.6685 |
- | 0.2206 | 0.09 | 3500 | 0.1685 | 0.6688 |
- | 0.2086 | 0.1 | 4000 | 0.1607 | 0.6685 |
- | 0.1955 | 0.11 | 4500 | 0.1518 | 0.6683 |
- | 0.1903 | 0.13 | 5000 | 0.1475 | 0.6686 |
- | 0.1827 | 0.14 | 5500 | 0.1430 | 0.6684 |
- | 0.1775 | 0.15 | 6000 | 0.1369 | 0.6681 |
- | 0.1748 | 0.16 | 6500 | 0.1359 | 0.6681 |
- | 0.1725 | 0.18 | 7000 | 0.1312 | 0.6677 |
- | 0.1638 | 0.19 | 7500 | 0.1254 | 0.6674 |
- | 0.1575 | 0.2 | 8000 | 0.1255 | 0.6680 |
- | 0.1537 | 0.21 | 8500 | 0.1204 | 0.6678 |
- | 0.1516 | 0.23 | 9000 | 0.1188 | 0.6671 |
- | 0.1526 | 0.24 | 9500 | 0.1150 | 0.6673 |
- | 0.148 | 0.25 | 10000 | 0.1131 | 0.6676 |
- | 0.1445 | 0.26 | 10500 | 0.1107 | 0.6675 |
- | 0.1378 | 0.28 | 11000 | 0.1113 | 0.6664 |
- | 6.1099 | 0.29 | 11500 | 6.0484 | 0.8805 |
- | 4.9528 | 0.3 | 12000 | 4.6614 | 0.8115 |
- | 0.2066 | 0.31 | 12500 | 0.1495 | 0.6679 |
- | 0.1654 | 0.33 | 13000 | 0.1228 | 0.6678 |
- | 0.1552 | 0.34 | 13500 | 0.1153 | 0.6670 |
- | 0.1443 | 0.35 | 14000 | 0.1110 | 0.6670 |
- | 0.1397 | 0.36 | 14500 | 0.1073 | 0.6670 |
- | 0.1366 | 0.38 | 15000 | 0.1067 | 0.6664 |
- | 0.1362 | 0.39 | 15500 | 0.1043 | 0.6669 |
- | 0.1375 | 0.4 | 16000 | 0.1012 | 0.6668 |
- | 0.1325 | 0.41 | 16500 | 0.0996 | 0.6672 |
- | 0.1277 | 0.43 | 17000 | 0.0993 | 0.6664 |
- | 0.1261 | 0.44 | 17500 | 0.0977 | 0.6667 |
- | 0.1274 | 0.45 | 18000 | 0.0978 | 0.6666 |
- | 0.127 | 0.46 | 18500 | 0.0952 | 0.6670 |
- | 0.1218 | 0.48 | 19000 | 0.0933 | 0.6666 |
- | 0.1196 | 0.49 | 19500 | 0.0923 | 0.6670 |
- | 0.1192 | 0.5 | 20000 | 0.0920 | 0.6665 |
- | 0.1171 | 0.52 | 20500 | 0.0910 | 0.6664 |
- | 0.1153 | 0.53 | 21000 | 0.0906 | 0.6667 |
- | 0.1102 | 0.54 | 21500 | 0.0890 | 0.6669 |
- | 0.1147 | 0.55 | 22000 | 0.0886 | 0.6667 |
- | 0.1144 | 0.57 | 22500 | 0.0868 | 0.6664 |
- | 0.1132 | 0.58 | 23000 | 0.0858 | 0.6666 |
- | 0.1073 | 0.59 | 23500 | 0.0853 | 0.6667 |
- | 0.109 | 0.6 | 24000 | 0.0845 | 0.6663 |
- | 0.1073 | 0.62 | 24500 | 0.0842 | 0.6662 |
- | 0.1062 | 0.63 | 25000 | 0.0831 | 0.6662 |
- | 0.1018 | 0.64 | 25500 | 0.0830 | 0.6662 |
- | 0.1052 | 0.65 | 26000 | 0.0818 | 0.6666 |
- | 0.1072 | 0.67 | 26500 | 0.0811 | 0.6662 |
- | 0.1023 | 0.68 | 27000 | 0.0807 | 0.6661 |
- | 0.1013 | 0.69 | 27500 | 0.0801 | 0.6664 |
- | 0.0986 | 0.7 | 28000 | 0.0797 | 0.6664 |
- | 0.1022 | 0.72 | 28500 | 0.0786 | 0.6662 |
- | 0.0984 | 0.73 | 29000 | 0.0781 | 0.6659 |
- | 0.0971 | 0.74 | 29500 | 0.0778 | 0.6662 |
- | 0.0963 | 0.75 | 30000 | 0.0773 | 0.6660 |
- | 0.0958 | 0.77 | 30500 | 0.0760 | 0.6662 |
- | 0.0999 | 0.78 | 31000 | 0.0760 | 0.6661 |
- | 0.0953 | 0.79 | 31500 | 0.0752 | 0.6661 |
- | 0.095 | 0.8 | 32000 | 0.0749 | 0.6662 |
- | 0.09 | 0.82 | 32500 | 0.0748 | 0.6663 |
- | 0.0927 | 0.83 | 33000 | 0.0740 | 0.6656 |
- | 0.0914 | 0.84 | 33500 | 0.0739 | 0.6662 |
- | 0.0889 | 0.85 | 34000 | 0.0737 | 0.6659 |
- | 0.0924 | 0.87 | 34500 | 0.0726 | 0.6660 |
- | 0.0898 | 0.88 | 35000 | 0.0719 | 0.6659 |
- | 0.0913 | 0.89 | 35500 | 0.0721 | 0.6657 |
- | 0.0897 | 0.9 | 36000 | 0.0715 | 0.6657 |
- | 0.0887 | 0.92 | 36500 | 0.0708 | 0.6659 |
- | 0.0922 | 0.93 | 37000 | 0.0712 | 0.6653 |
- | 0.0905 | 0.94 | 37500 | 0.0707 | 0.6660 |
- | 0.0881 | 0.95 | 38000 | 0.0700 | 0.6658 |
- | 0.0858 | 0.97 | 38500 | 0.0693 | 0.6658 |
- | 0.0882 | 0.98 | 39000 | 0.0690 | 0.6657 |
- | 0.0858 | 0.99 | 39500 | 0.0688 | 0.6656 |
- | 0.0808 | 1.0 | 40000 | 0.0680 | 0.6658 |
- | 0.0783 | 1.02 | 40500 | 0.0680 | 0.6657 |
- | 0.0822 | 1.03 | 41000 | 0.0676 | 0.6658 |
- | 0.077 | 1.04 | 41500 | 0.0675 | 0.6657 |
- | 0.0788 | 1.06 | 42000 | 0.0673 | 0.6655 |
- | 0.0754 | 1.07 | 42500 | 0.0667 | 0.6660 |
- | 0.0762 | 1.08 | 43000 | 0.0669 | 0.6656 |
- | 0.075 | 1.09 | 43500 | 0.0660 | 0.6660 |
- | 0.0816 | 1.11 | 44000 | 0.0661 | 0.6657 |
- | 0.0758 | 1.12 | 44500 | 0.0659 | 0.6657 |
- | 0.0767 | 1.13 | 45000 | 0.0653 | 0.6658 |
- | 0.076 | 1.14 | 45500 | 0.0649 | 0.6656 |
- | 0.0727 | 1.16 | 46000 | 0.0651 | 0.6656 |
- | 0.0768 | 1.17 | 46500 | 0.0641 | 0.6656 |
- | 0.0722 | 1.18 | 47000 | 0.0640 | 0.6655 |
- | 0.0763 | 1.19 | 47500 | 0.0646 | 0.6654 |
- | 0.0766 | 1.21 | 48000 | 0.0636 | 0.6658 |
- | 0.0774 | 1.22 | 48500 | 0.0636 | 0.6654 |
- | 0.0759 | 1.23 | 49000 | 0.0633 | 0.6654 |
- | 0.0779 | 1.24 | 49500 | 0.0625 | 0.6658 |
- | 0.074 | 1.26 | 50000 | 0.0628 | 0.6654 |
- | 0.0761 | 1.27 | 50500 | 0.0623 | 0.6656 |
- | 0.0763 | 1.28 | 51000 | 0.0617 | 0.6655 |
- | 0.072 | 1.29 | 51500 | 0.0617 | 0.6656 |
- | 0.0718 | 1.31 | 52000 | 0.0618 | 0.6653 |
- | 0.0703 | 1.32 | 52500 | 0.0611 | 0.6655 |
- | 0.0718 | 1.33 | 53000 | 0.0608 | 0.6655 |
- | 0.0686 | 1.34 | 53500 | 0.0610 | 0.6653 |
- | 0.0688 | 1.36 | 54000 | 0.0604 | 0.6657 |
- | 0.0694 | 1.37 | 54500 | 0.0604 | 0.6656 |
- | 0.0736 | 1.38 | 55000 | 0.0598 | 0.6655 |
- | 0.0674 | 1.39 | 55500 | 0.0599 | 0.6653 |
- | 0.0681 | 1.41 | 56000 | 0.0592 | 0.6655 |
- | 0.07 | 1.42 | 56500 | 0.0592 | 0.6653 |
- | 0.0704 | 1.43 | 57000 | 0.0591 | 0.6656 |
- | 0.0719 | 1.44 | 57500 | 0.0588 | 0.6653 |
- | 0.0667 | 1.46 | 58000 | 0.0587 | 0.6653 |
- | 0.0694 | 1.47 | 58500 | 0.0583 | 0.6653 |
- | 0.0709 | 1.48 | 59000 | 0.0579 | 0.6655 |
- | 0.0661 | 1.49 | 59500 | 0.0578 | 0.6655 |
- | 0.0682 | 1.51 | 60000 | 0.0575 | 0.6655 |
- | 0.0668 | 1.52 | 60500 | 0.0578 | 0.6654 |
- | 0.0684 | 1.53 | 61000 | 0.0575 | 0.6653 |
- | 0.0688 | 1.55 | 61500 | 0.0571 | 0.6652 |
- | 0.068 | 1.56 | 62000 | 0.0572 | 0.6653 |
- | 0.0694 | 1.57 | 62500 | 0.0566 | 0.6654 |
- | 0.0642 | 1.58 | 63000 | 0.0569 | 0.6653 |
- | 0.0646 | 1.6 | 63500 | 0.0564 | 0.6655 |
- | 0.0633 | 1.61 | 64000 | 0.0566 | 0.6653 |
- | 0.0677 | 1.62 | 64500 | 0.0563 | 0.6653 |
- | 0.0649 | 1.63 | 65000 | 0.0560 | 0.6652 |
- | 0.0654 | 1.65 | 65500 | 0.0558 | 0.6654 |
- | 0.0675 | 1.66 | 66000 | 0.0557 | 0.6654 |
- | 0.0642 | 1.67 | 66500 | 0.0554 | 0.6653 |
- | 0.0631 | 1.68 | 67000 | 0.0552 | 0.6653 |
- | 0.0628 | 1.7 | 67500 | 0.0552 | 0.6652 |
- | 0.0658 | 1.71 | 68000 | 0.0550 | 0.6652 |
- | 0.0654 | 1.72 | 68500 | 0.0547 | 0.6653 |
- | 0.0648 | 1.73 | 69000 | 0.0544 | 0.6652 |
- | 0.0634 | 1.75 | 69500 | 0.0547 | 0.6652 |
- | 0.0642 | 1.76 | 70000 | 0.0544 | 0.6654 |
- | 0.0649 | 1.77 | 70500 | 0.0542 | 0.6652 |
- | 0.0641 | 1.78 | 71000 | 0.0540 | 0.6652 |
- | 0.0659 | 1.8 | 71500 | 0.0540 | 0.6653 |
- | 0.0651 | 1.81 | 72000 | 0.0536 | 0.6652 |
- | 0.0625 | 1.82 | 72500 | 0.0536 | 0.6652 |
- | 0.0631 | 1.83 | 73000 | 0.0536 | 0.6651 |
- | 0.0614 | 1.85 | 73500 | 0.0535 | 0.6651 |
- | 0.0637 | 1.86 | 74000 | 0.0533 | 0.6652 |
- | 0.0619 | 1.87 | 74500 | 0.0532 | 0.6652 |
- | 0.061 | 1.88 | 75000 | 0.0531 | 0.6652 |
- | 0.0598 | 1.9 | 75500 | 0.0530 | 0.6652 |
- | 0.0643 | 1.91 | 76000 | 0.0529 | 0.6652 |
- | 0.0609 | 1.92 | 76500 | 0.0527 | 0.6651 |
- | 0.06 | 1.93 | 77000 | 0.0527 | 0.6652 |
- | 0.0627 | 1.95 | 77500 | 0.0527 | 0.6652 |
- | 0.0607 | 1.96 | 78000 | 0.0526 | 0.6651 |
- | 0.0607 | 1.97 | 78500 | 0.0525 | 0.6651 |
- | 0.0608 | 1.98 | 79000 | 0.0525 | 0.6651 |
- | 0.0609 | 2.0 | 79500 | 0.0525 | 0.6651 |

- ### Framework versions

- - Transformers 4.27.4
- - Pytorch 2.0.0+cu117
- - Datasets 2.11.0
- - Tokenizers 0.13.2

  tags:
  - generated_from_trainer
  model-index:
+ - name: bart-base-spelling-de
  results: []
+ widget:
+ - text: "ein dransformer isd ein mthode mit der ein compuder eine volge von zeichn in eine andrere folge von zeichen übersetzn kann dies kan zb genutzt werdne um text von einer spracge in eine andrere zu übersetzen"
+   example_title: "1"
+ - text: "das idst ein neuZr test"
+   example_title: "2"
  ---

+ This is an experimental model that should fix your typos and punctuation.
+ If you would like to run your own experiments or train a model for a different language, have a look at [the code](https://github.com/oliverguhr/spelling).

  ## Model description

+ This is a proof-of-concept spelling correction model for German.

  ## Intended uses & limitations

+ This is a work in progress; be aware that the model can produce artefacts.
+ You can test the model using the pipeline interface:

+ ```python
+ from transformers import pipeline

+ # load the spelling-correction model as a text2text-generation pipeline
+ fix_spelling = pipeline("text2text-generation", model="oliverguhr/spelling-correction-german-base")

+ print(fix_spelling("das idst ein neuZr test", max_length=2048))
+ ```
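
(For reference: the pipeline returns a list of dictionaries, so the corrected text is at `fix_spelling(...)[0]["generated_text"]`; `max_length=2048` only gives the decoder room for long inputs and can be lowered for short sentences.)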