metadata

license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: bart-base-spelling-nl
    results: []

bart-base-spelling-nl

This model is a version of facebook/bart-base fine-tuned on Dutch text.

It achieves the following results on the evaluation set:

Loss: 0.0217
Cer: 0.0147
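
The card does not say how Cer was computed. As a minimal sketch, assuming it is the standard character error rate (character-level edit distance divided by the number of reference characters), it could be calculated as below. The function names `edit_distance` and `cer` and the example sentences are illustrative and do not come from the training code.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level character error rate: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# Hypothetical example: the model output keeps one wrong character ("wordt").
reference = ["Ik word morgen opgehaald."]
hypothesis = ["Ik wordt morgen opgehaald."]
print(cer(reference, hypothesis))  # 1 edit over 25 reference characters
```

A score of 0 means every output matches its reference exactly. Under this definition, lower is better.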

Model description

This is a text-to-text version of facebook/bart-base fine-tuned for spelling correction. It builds on the excellent work by Oliver Guhr (github, huggingface). Training was performed on a single GPU on an AWS EC2 g5.xlarge instance.

Intended uses & limitations

This model is intended as a component of the Valkuil.net context-sensitive spelling checker. A future version of the model will be trained on more data.

Training and evaluation data

The model was trained on a Dutch dataset of 1,500,000 lines of text drawn from three public Dutch sources, downloaded from the OPUS corpus:

nl-europarlv7.100k.txt (500,000 lines)
nl-opensubtitles2016.100k.txt (500,000 lines)
nl-wikipedia.100k.txt (500,000 lines)

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 2
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 2.0
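
The relation between these values can be checked with a little arithmetic. The sketch below derives the effective batch size and an approximate step count, and shows what a linear learning-rate schedule looks like. It assumes that all 1,500,000 lines were used for training and that there was no warmup. The card states neither, so the step counts are an upper bound rather than the exact figures from the run. The names `linear_lr` and `warmup_steps` are illustrative.

```python
import math

# Values taken from the hyperparameter list above.
per_device_batch = 2
grad_accum_steps = 16
num_epochs = 2
peak_lr = 3e-4

# Assumption: every line of the 1,500,000-line dataset is a training example.
num_lines = 1_500_000

# Gradients from 16 micro-batches of 2 are accumulated before each optimizer step.
effective_batch = per_device_batch * grad_accum_steps      # 32
steps_per_epoch = math.ceil(num_lines / effective_batch)   # 46875
total_steps = steps_per_epoch * num_epochs                 # 93750


def linear_lr(step: int, total: int, peak: float, warmup_steps: int = 0) -> float:
    """Linear schedule: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    remaining = max(0, total - step)
    return peak * remaining / max(1, total - warmup_steps)


print(effective_batch, steps_per_epoch, total_steps)
print(linear_lr(0, total_steps, peak_lr))                # learning rate at the start
print(linear_lr(total_steps // 2, total_steps, peak_lr)) # halfway through training
```

The effective batch size of 32 matches total_train_batch_size, and the estimate of about 46,875 steps per epoch is broadly in line with the training results, where epoch 1.0 is logged at step 47000.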

Training results

Training Loss	Epoch	Step	Validation Loss	Cer
0.2546	0.02	1000	0.1801	0.9245
0.1646	0.04	2000	0.1203	0.9243
0.1456	0.06	3000	0.1016	0.9242
0.1204	0.09	4000	0.0849	0.9242
0.1226	0.11	5000	0.0736	0.9241
0.1049	0.13	6000	0.0680	0.9240
0.1071	0.15	7000	0.0671	0.9241
0.1038	0.17	8000	0.0615	0.9240
0.0815	0.19	9000	0.0575	0.9240
0.0828	0.21	10000	0.0572	0.9241
0.0851	0.24	11000	0.0533	0.9241
0.0787	0.26	12000	0.0529	0.9241
0.0795	0.28	13000	0.0518	0.9239
0.0864	0.3	14000	0.0492	0.9239
0.0806	0.32	15000	0.0471	0.9239
0.0808	0.34	16000	0.0483	0.9238
0.071	0.36	17000	0.0469	0.9239
0.0661	0.38	18000	0.0446	0.9239
0.0641	0.41	19000	0.0437	0.9239
0.0686	0.43	20000	0.0428	0.9238
0.0597	0.45	21000	0.0431	0.9238
0.0585	0.47	22000	0.0417	0.9238
0.0675	0.49	23000	0.0406	0.9238
0.0678	0.51	24000	0.0395	0.9238
0.0581	0.53	25000	0.0393	0.9238
0.0569	0.56	26000	0.0371	0.9239
0.0632	0.58	27000	0.0378	0.9238
0.0589	0.6	28000	0.0377	0.9238
0.0511	0.62	29000	0.0366	0.9237
0.0651	0.64	30000	0.0358	0.9239
0.0594	0.66	31000	0.0356	0.9238
0.054	0.68	32000	0.0368	0.9238
0.0498	0.71	33000	0.0353	0.9238
0.0559	0.73	34000	0.0337	0.9238
0.0502	0.75	35000	0.0341	0.9238
0.0588	0.77	36000	0.0339	0.9239
0.0487	0.79	37000	0.0338	0.9237
0.0489	0.81	38000	0.0333	0.9236
0.0493	0.83	39000	0.0331	0.9237
0.0481	0.85	40000	0.0323	0.9237
0.0444	0.88	41000	0.0318	0.9237
0.0446	0.9	42000	0.0311	0.9238
0.0469	0.92	43000	0.0311	0.9237
0.0525	0.94	44000	0.0312	0.9237
0.042	0.96	45000	0.0312	0.9236
0.0541	0.98	46000	0.0304	0.9237
0.0417	1.0	47000	0.0293	0.9238
0.0369	1.03	48000	0.0305	0.9237
0.0357	1.05	49000	0.0297	0.9237
0.0394	1.07	50000	0.0296	0.9237
0.0343	1.09	51000	0.0288	0.9237
0.037	1.11	52000	0.0286	0.9237
0.0367	1.13	53000	0.0281	0.9237
0.0336	1.15	54000	0.0287	0.9236
0.0331	1.18	55000	0.0275	0.9237
0.0381	1.2	56000	0.0276	0.9237
0.0438	1.22	57000	0.0269	0.9237
0.0319	1.24	58000	0.0274	0.9236
0.0364	1.26	59000	0.0265	0.9237
0.0402	1.28	60000	0.0262	0.9237
0.0341	1.3	61000	0.0259	0.9237
0.0346	1.32	62000	0.0258	0.9237
0.0378	1.35	63000	0.0258	0.9236
0.0372	1.37	64000	0.0253	0.9237
0.0375	1.39	65000	0.0248	0.9237
0.0336	1.41	66000	0.0246	0.9236
0.031	1.43	67000	0.0246	0.9237
0.0344	1.45	68000	0.0248	0.9236
0.0307	1.47	69000	0.0244	0.9236
0.0293	1.5	70000	0.0239	0.9237
0.0406	1.52	71000	0.0235	0.9236
0.0273	1.54	72000	0.0235	0.9236
0.0316	1.56	73000	0.0234	0.9235
0.0308	1.58	74000	0.0229	0.9236
0.0291	1.6	75000	0.0229	0.9236
0.0325	1.62	76000	0.0229	0.9236
0.0347	1.65	77000	0.0224	0.9237
0.0268	1.67	78000	0.0226	0.9237
0.0279	1.69	79000	0.0219	0.9236
0.0247	1.71	80000	0.0220	0.9235
0.0259	1.73	81000	0.0215	0.9236
0.0294	1.75	82000	0.0217	0.9235
0.0267	1.77	83000	0.0217	0.9236
0.0273	1.79	84000	0.0213	0.9236
0.0242	1.82	85000	0.0213	0.9236
0.0254	1.84	86000	0.0210	0.9236
0.0273	1.86	87000	0.0209	0.9236
0.0261	1.88	88000	0.0210	0.9235
0.0244	1.9	89000	0.0206	0.9235
0.0256	1.92	90000	0.0206	0.9235
0.0283	1.94	91000	0.0205	0.9235
0.0255	1.97	92000	0.0204	0.9235
0.022	1.99	93000	0.0203	0.9235

Framework versions

Transformers 4.27.3
Pytorch 2.0.0+cu117
Datasets 2.10.1
Tokenizers 0.13.2