---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: bart-base-spelling-nl
results: []
---
# bart-base-spelling-nl
This model is a version of
[facebook/bart-base](https://huggingface.co/facebook/bart-base)
fine-tuned for Dutch spelling correction.
It achieves the following results on the evaluation set (CER is the
character error rate; a sketch of the metric follows the list):
- Loss: 0.0217
- CER: 0.0147
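
For readers unfamiliar with CER, a minimal sketch of computing it with
the `evaluate` library (not necessarily the exact script used for this
card):

```python
import evaluate

# CER (character error rate) = character-level edit distance between
# prediction and reference, divided by reference length; lower is better.
cer = evaluate.load("cer")

predictions = ["dit is een zin zonder fouten"]
references = ["dit is een zin zonder fouten"]
print(cer.compute(predictions=predictions, references=references))  # 0.0
```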
## Model description
This is a text-to-text version of
[facebook/bart-base](https://huggingface.co/facebook/bart-base)
fine-tuned for spelling correction. It builds on the excellent work by
Oliver Guhr ([github](https://github.com/oliverguhr/spelling),
[huggingface](https://huggingface.co/oliverguhr/spelling-correction-english-base)).
Training was performed on a single GPU on an AWS EC2 instance
(g5.xlarge).
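
The card ships without a usage snippet; a minimal sketch, assuming the
model follows the standard BART text-to-text interface (the hub ID
below may need a user namespace prefix, and the generation settings are
illustrative, not the card's):

```python
from transformers import pipeline

# "text2text-generation" is the standard pipeline task for BART-style
# seq2seq models.
corrector = pipeline("text2text-generation", model="bart-base-spelling-nl")

noisy = "Dit is een zinn met spelingsfouten."  # deliberately misspelled Dutch
print(corrector(noisy, max_length=128)[0]["generated_text"])
```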
## Intended uses & limitations
This model is intended to serve as a component of the
[Valkuil.net](https://valkuil.net) context-sensitive spelling
checker. A future version of the model will be trained on more data.
## Training and evaluation data
The model was trained on a Dutch dataset of 1,500,000 lines of text
from three public Dutch sources, downloaded from the [OPUS
corpus](https://opus.nlpl.eu/) (a sketch of one way to turn such lines
into training pairs follows the list):
- nl-europarlv7.100k.txt (500,000 lines)
- nl-opensubtitles2016.100k.txt (500,000 lines)
- nl-wikipedia.100k.txt (500,000 lines)
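
The card does not describe how (noisy, clean) training pairs were built
from these lines; in Oliver Guhr's setup, spelling errors are injected
synthetically into clean text. A hypothetical sketch of such a
corruption step (the edit operations and the 10% rate are assumptions,
not the card's recipe):

```python
import random

def corrupt(line: str, rate: float = 0.1) -> str:
    """Inject random character-level errors (delete, swap, duplicate)
    into a clean line; the result serves as the model input and the
    original line as the target."""
    chars, out, i = list(line), [], 0
    while i < len(chars):
        if random.random() < rate:
            op = random.choice(["delete", "swap", "duplicate"])
            if op == "delete":
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], chars[i]])
                i += 2
                continue
            out.append(chars[i])  # fall through: duplicate current char
        out.append(chars[i])
        i += 1
    return "".join(out)

clean = "Dit is een zin zonder fouten."
pair = (corrupt(clean), clean)  # (model input, model target)
```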
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a
reconstruction as `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 0.0003
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2.0
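
These settings map onto `transformers.Seq2SeqTrainingArguments` roughly
as below; the output directory and `predict_with_generate` are
assumptions, and the Adam betas/epsilon are the library defaults, so
they need no explicit arguments:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="bart-base-spelling-nl",  # assumed
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,      # 2 x 16 = 32 effective batch size
    lr_scheduler_type="linear",
    num_train_epochs=2.0,
    seed=42,
    predict_with_generate=True,          # assumed: needed for CER evaluation
)
```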
### Training results
| Training Loss | Epoch | Step  | Validation Loss | CER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 0.2546 | 0.02 | 1000 | 0.1801 | 0.9245 |
| 0.1646 | 0.04 | 2000 | 0.1203 | 0.9243 |
| 0.1456 | 0.06 | 3000 | 0.1016 | 0.9242 |
| 0.1204 | 0.09 | 4000 | 0.0849 | 0.9242 |
| 0.1226 | 0.11 | 5000 | 0.0736 | 0.9241 |
| 0.1049 | 0.13 | 6000 | 0.0680 | 0.9240 |
| 0.1071 | 0.15 | 7000 | 0.0671 | 0.9241 |
| 0.1038 | 0.17 | 8000 | 0.0615 | 0.9240 |
| 0.0815 | 0.19 | 9000 | 0.0575 | 0.9240 |
| 0.0828 | 0.21 | 10000 | 0.0572 | 0.9241 |
| 0.0851 | 0.24 | 11000 | 0.0533 | 0.9241 |
| 0.0787 | 0.26 | 12000 | 0.0529 | 0.9241 |
| 0.0795 | 0.28 | 13000 | 0.0518 | 0.9239 |
| 0.0864 | 0.3 | 14000 | 0.0492 | 0.9239 |
| 0.0806 | 0.32 | 15000 | 0.0471 | 0.9239 |
| 0.0808 | 0.34 | 16000 | 0.0483 | 0.9238 |
| 0.071 | 0.36 | 17000 | 0.0469 | 0.9239 |
| 0.0661 | 0.38 | 18000 | 0.0446 | 0.9239 |
| 0.0641 | 0.41 | 19000 | 0.0437 | 0.9239 |
| 0.0686 | 0.43 | 20000 | 0.0428 | 0.9238 |
| 0.0597 | 0.45 | 21000 | 0.0431 | 0.9238 |
| 0.0585 | 0.47 | 22000 | 0.0417 | 0.9238 |
| 0.0675 | 0.49 | 23000 | 0.0406 | 0.9238 |
| 0.0678 | 0.51 | 24000 | 0.0395 | 0.9238 |
| 0.0581 | 0.53 | 25000 | 0.0393 | 0.9238 |
| 0.0569 | 0.56 | 26000 | 0.0371 | 0.9239 |
| 0.0632 | 0.58 | 27000 | 0.0378 | 0.9238 |
| 0.0589 | 0.6 | 28000 | 0.0377 | 0.9238 |
| 0.0511 | 0.62 | 29000 | 0.0366 | 0.9237 |
| 0.0651 | 0.64 | 30000 | 0.0358 | 0.9239 |
| 0.0594 | 0.66 | 31000 | 0.0356 | 0.9238 |
| 0.054 | 0.68 | 32000 | 0.0368 | 0.9238 |
| 0.0498 | 0.71 | 33000 | 0.0353 | 0.9238 |
| 0.0559 | 0.73 | 34000 | 0.0337 | 0.9238 |
| 0.0502 | 0.75 | 35000 | 0.0341 | 0.9238 |
| 0.0588 | 0.77 | 36000 | 0.0339 | 0.9239 |
| 0.0487 | 0.79 | 37000 | 0.0338 | 0.9237 |
| 0.0489 | 0.81 | 38000 | 0.0333 | 0.9236 |
| 0.0493 | 0.83 | 39000 | 0.0331 | 0.9237 |
| 0.0481 | 0.85 | 40000 | 0.0323 | 0.9237 |
| 0.0444 | 0.88 | 41000 | 0.0318 | 0.9237 |
| 0.0446 | 0.9 | 42000 | 0.0311 | 0.9238 |
| 0.0469 | 0.92 | 43000 | 0.0311 | 0.9237 |
| 0.0525 | 0.94 | 44000 | 0.0312 | 0.9237 |
| 0.042 | 0.96 | 45000 | 0.0312 | 0.9236 |
| 0.0541 | 0.98 | 46000 | 0.0304 | 0.9237 |
| 0.0417 | 1.0 | 47000 | 0.0293 | 0.9238 |
| 0.0369 | 1.03 | 48000 | 0.0305 | 0.9237 |
| 0.0357 | 1.05 | 49000 | 0.0297 | 0.9237 |
| 0.0394 | 1.07 | 50000 | 0.0296 | 0.9237 |
| 0.0343 | 1.09 | 51000 | 0.0288 | 0.9237 |
| 0.037 | 1.11 | 52000 | 0.0286 | 0.9237 |
| 0.0367 | 1.13 | 53000 | 0.0281 | 0.9237 |
| 0.0336 | 1.15 | 54000 | 0.0287 | 0.9236 |
| 0.0331 | 1.18 | 55000 | 0.0275 | 0.9237 |
| 0.0381 | 1.2 | 56000 | 0.0276 | 0.9237 |
| 0.0438 | 1.22 | 57000 | 0.0269 | 0.9237 |
| 0.0319 | 1.24 | 58000 | 0.0274 | 0.9236 |
| 0.0364 | 1.26 | 59000 | 0.0265 | 0.9237 |
| 0.0402 | 1.28 | 60000 | 0.0262 | 0.9237 |
| 0.0341 | 1.3 | 61000 | 0.0259 | 0.9237 |
| 0.0346 | 1.32 | 62000 | 0.0258 | 0.9237 |
| 0.0378 | 1.35 | 63000 | 0.0258 | 0.9236 |
| 0.0372 | 1.37 | 64000 | 0.0253 | 0.9237 |
| 0.0375 | 1.39 | 65000 | 0.0248 | 0.9237 |
| 0.0336 | 1.41 | 66000 | 0.0246 | 0.9236 |
| 0.031 | 1.43 | 67000 | 0.0246 | 0.9237 |
| 0.0344 | 1.45 | 68000 | 0.0248 | 0.9236 |
| 0.0307 | 1.47 | 69000 | 0.0244 | 0.9236 |
| 0.0293 | 1.5 | 70000 | 0.0239 | 0.9237 |
| 0.0406 | 1.52 | 71000 | 0.0235 | 0.9236 |
| 0.0273 | 1.54 | 72000 | 0.0235 | 0.9236 |
| 0.0316 | 1.56 | 73000 | 0.0234 | 0.9235 |
| 0.0308 | 1.58 | 74000 | 0.0229 | 0.9236 |
| 0.0291 | 1.6 | 75000 | 0.0229 | 0.9236 |
| 0.0325 | 1.62 | 76000 | 0.0229 | 0.9236 |
| 0.0347 | 1.65 | 77000 | 0.0224 | 0.9237 |
| 0.0268 | 1.67 | 78000 | 0.0226 | 0.9237 |
| 0.0279 | 1.69 | 79000 | 0.0219 | 0.9236 |
| 0.0247 | 1.71 | 80000 | 0.0220 | 0.9235 |
| 0.0259 | 1.73 | 81000 | 0.0215 | 0.9236 |
| 0.0294 | 1.75 | 82000 | 0.0217 | 0.9235 |
| 0.0267 | 1.77 | 83000 | 0.0217 | 0.9236 |
| 0.0273 | 1.79 | 84000 | 0.0213 | 0.9236 |
| 0.0242 | 1.82 | 85000 | 0.0213 | 0.9236 |
| 0.0254 | 1.84 | 86000 | 0.0210 | 0.9236 |
| 0.0273 | 1.86 | 87000 | 0.0209 | 0.9236 |
| 0.0261 | 1.88 | 88000 | 0.0210 | 0.9235 |
| 0.0244 | 1.9 | 89000 | 0.0206 | 0.9235 |
| 0.0256 | 1.92 | 90000 | 0.0206 | 0.9235 |
| 0.0283 | 1.94 | 91000 | 0.0205 | 0.9235 |
| 0.0255 | 1.97 | 92000 | 0.0204 | 0.9235 |
| 0.022 | 1.99 | 93000 | 0.0203 | 0.9235 |
### Framework versions
- Transformers 4.27.3
- Pytorch 2.0.0+cu117
- Datasets 2.10.1
- Tokenizers 0.13.2