---
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: bart-base-spelling-nl-1m-3
    results: []
---

# bart-base-spelling-nl-1m-3

This model is a version of facebook/bart-base fine-tuned for Dutch spelling correction.

It achieves the following results on the evaluation set:

- Loss: 0.0053
- Cer: 0.0117
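
Cer is the character error rate: the number of character-level edits between the model's output and the reference, normalized by the reference length. A minimal sketch of how it can be computed with the `evaluate` library (the exact evaluation code is an assumption, not taken from the training script):

```python
import evaluate

# Hedged sketch: CER is the character-level edit distance between
# prediction and reference, divided by the reference length.
cer = evaluate.load("cer")
score = cer.compute(
    predictions=["ik heb een boek gelezen"],
    references=["ik heb een boek gelezen."],
)
print(score)  # one missing character out of 24 -> ~0.042
```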

## Model description

This is a fine-tuned version of facebook/bart-base trained for Dutch spelling correction. It leans on the excellent work by Oliver Guhr (github, huggingface). Training was performed on a single GPU on an AWS EC2 instance (g5.xlarge) and took about two days.
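
For quick experimentation, the model can be loaded as a standard seq2seq pipeline. A minimal sketch, assuming the hub id `antalvdb/bart-base-spelling-nl-1m-3` (the input sentence is illustrative):

```python
from transformers import pipeline

# Hedged sketch: load the model as a text2text-generation pipeline.
# The hub id below is an assumption based on this card's name.
corrector = pipeline(
    "text2text-generation",
    model="antalvdb/bart-base-spelling-nl-1m-3",
)

# Illustrative Dutch input with spelling errors ("hep" -> "heb", "gelesen" -> "gelezen").
text = "Ik hep gisteren een boek gelesen."
result = corrector(text, max_length=128)
print(result[0]["generated_text"])
```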

## Intended uses & limitations

The intended use for this model is to be a component of the Valkuil.net context-sensitive spelling checker.

## Training and evaluation data

The model was trained on a Dutch dataset composed of 2,964,203 lines of text from three public Dutch sources, downloaded from the Opus corpus (see the sketch after this list for how training pairs can be derived from such text):

- nl-europarlv7.1m.txt (1,000,000 lines)
- nl-opensubtitles2016.1m.txt (1,000,000 lines)
- nl-wikipedia.txt (964,203 lines)
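
These are corpora of clean text; in Oliver Guhr's approach, (noisy, clean) training pairs are typically produced by injecting synthetic character-level errors into each clean line. A minimal sketch of such a corruption step (the probabilities and edit types are illustrative assumptions, not the exact recipe used for this model):

```python
import random

def corrupt(line: str, p: float = 0.05) -> str:
    """Inject random character-level errors (deletion, transposition,
    duplication) into a clean line. Illustrative assumption, not the
    exact noise recipe used to train this model."""
    out = []
    i = 0
    while i < len(line):
        r = random.random()
        if r < p:
            i += 1                         # deletion: drop this character
        elif r < 2 * p and i + 1 < len(line):
            out += [line[i + 1], line[i]]  # transpose adjacent characters
            i += 2
        elif r < 3 * p:
            out += [line[i], line[i]]      # duplication
            i += 1
        else:
            out.append(line[i])            # keep unchanged
            i += 1
    return "".join(out)

clean = "Dit is een correcte Nederlandse zin."
pair = (corrupt(clean), clean)             # (noisy input, clean target)
print(pair)
```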

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

- learning_rate: 0.0003
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2.0
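
For reference, these settings roughly correspond to the following `Seq2SeqTrainingArguments` configuration (a hedged sketch; `output_dir` and the evaluation settings are assumptions, with the 1000-step eval cadence inferred from the results table below):

```python
from transformers import Seq2SeqTrainingArguments

# Hedged sketch mirroring the hyperparameters listed above.
args = Seq2SeqTrainingArguments(
    output_dir="bart-base-spelling-nl-1m-3",  # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,   # 2 x 16 = effective batch size 32
    num_train_epochs=2.0,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="steps",
    eval_steps=1000,
    predict_with_generate=True,       # generate text at eval time to compute CER
)
```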

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Cer    |
|:-------------:|:-----:|:------:|:---------------:|:------:|
| 0.0834        | 0.01  | 1000   | 0.0603          | 0.9216 |
| 0.0566        | 0.02  | 2000   | 0.0467          | 0.9217 |
| 0.0534        | 0.03  | 3000   | 0.0436          | 0.9216 |
| 0.0461        | 0.04  | 4000   | 0.0392          | 0.9216 |
| 0.0416        | 0.05  | 5000   | 0.0354          | 0.9216 |
| 0.0433        | 0.06  | 6000   | 0.0336          | 0.9216 |
| 0.045         | 0.08  | 7000   | 0.0315          | 0.9216 |
| 0.0452        | 0.09  | 8000   | 0.0305          | 0.9216 |
| 0.04          | 0.1   | 9000   | 0.0281          | 0.9216 |
| 0.0307        | 0.11  | 10000  | 0.0273          | 0.9216 |
| 0.0382        | 0.12  | 11000  | 0.0269          | 0.9216 |
| 0.036         | 0.13  | 12000  | 0.0254          | 0.9216 |
| 0.0412        | 0.14  | 13000  | 0.0258          | 0.9216 |
| 0.0404        | 0.15  | 14000  | 0.0238          | 0.9216 |
| 0.0265        | 0.16  | 15000  | 0.0239          | 0.9216 |
| 0.029         | 0.17  | 16000  | 0.0235          | 0.9216 |
| 0.0295        | 0.18  | 17000  | 0.0218          | 0.9216 |
| 0.0262        | 0.19  | 18000  | 0.0214          | 0.9216 |
| 0.0274        | 0.21  | 19000  | 0.0222          | 0.9216 |
| 0.0317        | 0.22  | 20000  | 0.0204          | 0.9216 |
| 0.0248        | 0.23  | 21000  | 0.0204          | 0.9216 |
| 0.0258        | 0.24  | 22000  | 0.0195          | 0.9216 |
| 0.0247        | 0.25  | 23000  | 0.0188          | 0.9216 |
| 0.0285        | 0.26  | 24000  | 0.0191          | 0.9215 |
| 0.031         | 0.27  | 25000  | 0.0192          | 0.9216 |
| 0.0267        | 0.28  | 26000  | 0.0188          | 0.9216 |
| 0.0245        | 0.29  | 27000  | 0.0177          | 0.9216 |
| 0.0258        | 0.3   | 28000  | 0.0177          | 0.9216 |
| 0.0235        | 0.31  | 29000  | 0.0169          | 0.9216 |
| 0.0235        | 0.32  | 30000  | 0.0176          | 0.9216 |
| 0.0223        | 0.34  | 31000  | 0.0165          | 0.9216 |
| 0.0219        | 0.35  | 32000  | 0.0167          | 0.9216 |
| 0.0214        | 0.36  | 33000  | 0.0165          | 0.9216 |
| 0.0232        | 0.37  | 34000  | 0.0163          | 0.9216 |
| 0.0192        | 0.38  | 35000  | 0.0162          | 0.9216 |
| 0.0159        | 0.39  | 36000  | 0.0160          | 0.9216 |
| 0.0205        | 0.4   | 37000  | 0.0150          | 0.9216 |
| 0.0197        | 0.41  | 38000  | 0.0152          | 0.9216 |
| 0.0205        | 0.42  | 39000  | 0.0150          | 0.9216 |
| 0.0182        | 0.43  | 40000  | 0.0145          | 0.9216 |
| 0.0204        | 0.44  | 41000  | 0.0139          | 0.9216 |
| 0.0201        | 0.45  | 42000  | 0.0146          | 0.9216 |
| 0.0202        | 0.46  | 43000  | 0.0132          | 0.9215 |
| 0.0219        | 0.48  | 44000  | 0.0146          | 0.9216 |
| 0.0161        | 0.49  | 45000  | 0.0134          | 0.9215 |
| 0.0172        | 0.5   | 46000  | 0.0137          | 0.9216 |
| 0.0199        | 0.51  | 47000  | 0.0133          | 0.9215 |
| 0.0211        | 0.52  | 48000  | 0.0132          | 0.9215 |
| 0.0184        | 0.53  | 49000  | 0.0136          | 0.9216 |
| 0.0191        | 0.54  | 50000  | 0.0129          | 0.9216 |
| 0.017         | 0.55  | 51000  | 0.0127          | 0.9216 |
| 0.0188        | 0.56  | 52000  | 0.0127          | 0.9215 |
| 0.0157        | 0.57  | 53000  | 0.0128          | 0.9216 |
| 0.0158        | 0.58  | 54000  | 0.0127          | 0.9216 |
| 0.0145        | 0.59  | 55000  | 0.0117          | 0.9216 |
| 0.0148        | 0.61  | 56000  | 0.0123          | 0.9216 |
| 0.0153        | 0.62  | 57000  | 0.0115          | 0.9216 |
| 0.0182        | 0.63  | 58000  | 0.0115          | 0.9216 |
| 0.0178        | 0.64  | 59000  | 0.0112          | 0.9215 |
| 0.0187        | 0.65  | 60000  | 0.0113          | 0.9215 |
| 0.0174        | 0.66  | 61000  | 0.0119          | 0.9216 |
| 0.0135        | 0.67  | 62000  | 0.0115          | 0.9215 |
| 0.0167        | 0.68  | 63000  | 0.0112          | 0.9216 |
| 0.0163        | 0.69  | 64000  | 0.0111          | 0.9215 |
| 0.0128        | 0.7   | 65000  | 0.0110          | 0.9215 |
| 0.0178        | 0.71  | 66000  | 0.0113          | 0.9215 |
| 0.0142        | 0.72  | 67000  | 0.0110          | 0.9215 |
| 0.0143        | 0.74  | 68000  | 0.0110          | 0.9215 |
| 0.0168        | 0.75  | 69000  | 0.0106          | 0.9216 |
| 0.0136        | 0.76  | 70000  | 0.0107          | 0.9215 |
| 0.0141        | 0.77  | 71000  | 0.0104          | 0.9215 |
| 0.0217        | 0.78  | 72000  | 0.0115          | 0.9216 |
| 0.012         | 0.79  | 73000  | 0.0105          | 0.9215 |
| 0.0141        | 0.8   | 74000  | 0.0100          | 0.9215 |
| 0.0136        | 0.81  | 75000  | 0.0096          | 0.9215 |
| 0.0106        | 0.82  | 76000  | 0.0104          | 0.9216 |
| 0.0176        | 0.83  | 77000  | 0.0102          | 0.9216 |
| 0.0169        | 0.84  | 78000  | 0.0099          | 0.9215 |
| 0.0118        | 0.85  | 79000  | 0.0102          | 0.9215 |
| 0.0178        | 0.86  | 80000  | 0.0095          | 0.9215 |
| 0.0145        | 0.88  | 81000  | 0.0097          | 0.9216 |
| 0.0154        | 0.89  | 82000  | 0.0099          | 0.9215 |
| 0.0129        | 0.9   | 83000  | 0.0094          | 0.9215 |
| 0.0125        | 0.91  | 84000  | 0.0097          | 0.9215 |
| 0.0147        | 0.92  | 85000  | 0.0093          | 0.9215 |
| 0.0145        | 0.93  | 86000  | 0.0091          | 0.9215 |
| 0.0121        | 0.94  | 87000  | 0.0089          | 0.9215 |
| 0.0125        | 0.95  | 88000  | 0.0094          | 0.9215 |
| 0.0113        | 0.96  | 89000  | 0.0088          | 0.9216 |
| 0.0098        | 0.97  | 90000  | 0.0094          | 0.9216 |
| 0.0137        | 0.98  | 91000  | 0.0089          | 0.9215 |
| 0.0105        | 0.99  | 92000  | 0.0091          | 0.9215 |
| 0.01          | 1.01  | 93000  | 0.0090          | 0.9216 |
| 0.0103        | 1.02  | 94000  | 0.0087          | 0.9216 |
| 0.0103        | 1.03  | 95000  | 0.0091          | 0.9215 |
| 0.0107        | 1.04  | 96000  | 0.0088          | 0.9216 |
| 0.0109        | 1.05  | 97000  | 0.0087          | 0.9215 |
| 0.0102        | 1.06  | 98000  | 0.0090          | 0.9216 |
| 0.0109        | 1.07  | 99000  | 0.0087          | 0.9215 |
| 0.0094        | 1.08  | 100000 | 0.0084          | 0.9215 |
| 0.009         | 1.09  | 101000 | 0.0085          | 0.9215 |
| 0.0085        | 1.1   | 102000 | 0.0084          | 0.9216 |
| 0.0123        | 1.11  | 103000 | 0.0085          | 0.9215 |
| 0.0094        | 1.12  | 104000 | 0.0084          | 0.9215 |
| 0.0076        | 1.14  | 105000 | 0.0081          | 0.9215 |
| 0.0119        | 1.15  | 106000 | 0.0079          | 0.9216 |
| 0.0079        | 1.16  | 107000 | 0.0081          | 0.9216 |
| 0.0108        | 1.17  | 108000 | 0.0080          | 0.9216 |
| 0.01          | 1.18  | 109000 | 0.0077          | 0.9216 |
| 0.0112        | 1.19  | 110000 | 0.0077          | 0.9216 |
| 0.0092        | 1.2   | 111000 | 0.0076          | 0.9215 |
| 0.0097        | 1.21  | 112000 | 0.0077          | 0.9215 |
| 0.0093        | 1.22  | 113000 | 0.0078          | 0.9215 |
| 0.0106        | 1.23  | 114000 | 0.0077          | 0.9215 |
| 0.0107        | 1.24  | 115000 | 0.0076          | 0.9215 |
| 0.0111        | 1.25  | 116000 | 0.0077          | 0.9215 |
| 0.0118        | 1.26  | 117000 | 0.0076          | 0.9215 |
| 0.0088        | 1.28  | 118000 | 0.0076          | 0.9215 |
| 0.01          | 1.29  | 119000 | 0.0076          | 0.9215 |
| 0.0102        | 1.3   | 120000 | 0.0076          | 0.9215 |
| 0.0106        | 1.31  | 121000 | 0.0076          | 0.9215 |
| 0.0099        | 1.32  | 122000 | 0.0077          | 0.9215 |
| 0.0099        | 1.33  | 123000 | 0.0077          | 0.9216 |
| 0.0105        | 1.34  | 124000 | 0.0075          | 0.9216 |
| 0.0082        | 1.35  | 125000 | 0.0074          | 0.9216 |
| 0.0088        | 1.36  | 126000 | 0.0072          | 0.9215 |
| 0.0077        | 1.37  | 127000 | 0.0070          | 0.9215 |
| 0.0063        | 1.38  | 128000 | 0.0074          | 0.9216 |
| 0.0084        | 1.39  | 129000 | 0.0069          | 0.9215 |
| 0.0085        | 1.41  | 130000 | 0.0071          | 0.9215 |
| 0.0107        | 1.42  | 131000 | 0.0067          | 0.9215 |
| 0.0064        | 1.43  | 132000 | 0.0068          | 0.9215 |
| 0.0064        | 1.44  | 133000 | 0.0069          | 0.9215 |
| 0.0139        | 1.45  | 134000 | 0.0067          | 0.9216 |
| 0.0093        | 1.46  | 135000 | 0.0068          | 0.9216 |
| 0.009         | 1.47  | 136000 | 0.0067          | 0.9215 |
| 0.0083        | 1.48  | 137000 | 0.0065          | 0.9216 |
| 0.0108        | 1.49  | 138000 | 0.0064          | 0.9215 |
| 0.0074        | 1.5   | 139000 | 0.0066          | 0.9215 |
| 0.009         | 1.51  | 140000 | 0.0064          | 0.9216 |
| 0.0062        | 1.52  | 141000 | 0.0064          | 0.9215 |
| 0.007         | 1.54  | 142000 | 0.0063          | 0.9215 |
| 0.0082        | 1.55  | 143000 | 0.0062          | 0.9215 |
| 0.0077        | 1.56  | 144000 | 0.0064          | 0.9215 |
| 0.0094        | 1.57  | 145000 | 0.0062          | 0.9215 |
| 0.0085        | 1.58  | 146000 | 0.0063          | 0.9215 |
| 0.0091        | 1.59  | 147000 | 0.0062          | 0.9215 |
| 0.0087        | 1.6   | 148000 | 0.0061          | 0.9215 |
| 0.0066        | 1.61  | 149000 | 0.0062          | 0.9215 |
| 0.0087        | 1.62  | 150000 | 0.0061          | 0.9215 |
| 0.0059        | 1.63  | 151000 | 0.0059          | 0.9215 |
| 0.0086        | 1.64  | 152000 | 0.0059          | 0.9215 |
| 0.0066        | 1.65  | 153000 | 0.0059          | 0.9215 |
| 0.0076        | 1.66  | 154000 | 0.0058          | 0.9215 |
| 0.0073        | 1.68  | 155000 | 0.0060          | 0.9215 |
| 0.0118        | 1.69  | 156000 | 0.0060          | 0.9215 |
| 0.0058        | 1.7   | 157000 | 0.0059          | 0.9215 |
| 0.0093        | 1.71  | 158000 | 0.0058          | 0.9215 |
| 0.0079        | 1.72  | 159000 | 0.0058          | 0.9215 |
| 0.0063        | 1.73  | 160000 | 0.0059          | 0.9215 |
| 0.0065        | 1.74  | 161000 | 0.0056          | 0.9215 |
| 0.0105        | 1.75  | 162000 | 0.0057          | 0.9215 |
| 0.0075        | 1.76  | 163000 | 0.0055          | 0.9215 |
| 0.0069        | 1.77  | 164000 | 0.0056          | 0.9215 |
| 0.0075        | 1.78  | 165000 | 0.0056          | 0.9215 |
| 0.0067        | 1.79  | 166000 | 0.0055          | 0.9215 |
| 0.0069        | 1.81  | 167000 | 0.0056          | 0.9215 |
| 0.0063        | 1.82  | 168000 | 0.0056          | 0.9215 |
| 0.0058        | 1.83  | 169000 | 0.0055          | 0.9215 |
| 0.0058        | 1.84  | 170000 | 0.0054          | 0.9215 |
| 0.0081        | 1.85  | 171000 | 0.0055          | 0.9215 |
| 0.0071        | 1.86  | 172000 | 0.0054          | 0.9215 |
| 0.0077        | 1.87  | 173000 | 0.0054          | 0.9215 |
| 0.0053        | 1.88  | 174000 | 0.0053          | 0.9215 |
| 0.0067        | 1.89  | 175000 | 0.0053          | 0.9215 |
| 0.0066        | 1.9   | 176000 | 0.0053          | 0.9215 |
| 0.0084        | 1.91  | 177000 | 0.0053          | 0.9215 |
| 0.0066        | 1.92  | 178000 | 0.0052          | 0.9215 |
| 0.0057        | 1.94  | 179000 | 0.0053          | 0.9215 |
| 0.0059        | 1.95  | 180000 | 0.0052          | 0.9215 |
| 0.0053        | 1.96  | 181000 | 0.0053          | 0.9215 |
| 0.0056        | 1.97  | 182000 | 0.0052          | 0.9215 |
| 0.0054        | 1.98  | 183000 | 0.0052          | 0.9215 |
| 0.0053        | 1.99  | 184000 | 0.0052          | 0.9215 |
| 0.0066        | 2.0   | 185000 | 0.0052          | 0.9215 |

### Framework versions

- Transformers 4.27.3
- Pytorch 2.0.0+cu117
- Datasets 2.10.1
- Tokenizers 0.13.2