xlm-roberta-large-finetuned-wikiner-fr

This model is a fine-tuned version of xlm-roberta-large, trained on the Alizee/wikiner_fr_mixed_caps dataset.

Why this model?

Credit goes to Jean-Baptiste for building "camembert-ner", the current "best" model for French NER, which is based on wikiNER (Jean-Baptiste/wikiner_fr).

xlm-roberta-large models fine-tuned on CoNLL-2003 English, and especially German, outperformed the camembert-ner model in my own tasks. This inspired me to build a French counterpart of those xlm-roberta-large models based on the wikiNER dataset, in the hope of setting a slightly improved standard for French 4-entity NER.

Intended uses & limitations

4-entity NER for French, with the following tags:

Abbreviation	Description
O	Outside of a named entity
MISC	Miscellaneous entity
PER	Person’s name
ORG	Organization
LOC	Location
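A minimal usage sketch follows, assuming the model is published on the Hugging Face Hub under the id Alizee/xlm-roberta-large-finetuned-wikiner-fr (taken from the repository name) and that the transformers library is installed. The helper names load_ner_pipeline and entities_by_tag are illustrative and not part of any library.

```python
from typing import Dict, List

# Assumed Hub id, derived from the repository name; adjust if the model lives elsewhere.
MODEL_ID = "Alizee/xlm-roberta-large-finetuned-wikiner-fr"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the grouping helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def entities_by_tag(predictions: List[Dict]) -> Dict[str, List[str]]:
    """Group aggregated pipeline predictions by tag (MISC, PER, ORG, LOC)."""
    grouped: Dict[str, List[str]] = {}
    for pred in predictions:
        tag = pred["entity_group"]
        if tag == "O":  # outside of a named entity, nothing to collect
            continue
        grouped.setdefault(tag, []).append(pred["word"].strip())
    return grouped


# Example call (requires network access and enough memory for a large model):
#   ner = load_ner_pipeline()
#   preds = ner("Marie Curie a travaillé à l'Institut du Radium à Paris.")
#   print(entities_by_tag(preds))
```

Grouping by tag is only one convenient way to consume the output; the raw predictions also carry a confidence score and character offsets for each span.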

Performance

The model achieves the following results on the evaluation set (these match the final checkpoint at step 11217):

Loss: 0.0518
Precision: 0.8881
Recall: 0.9014
F1: 0.8947
Accuracy: 0.9855
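As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall, assuming F1 here is the usual harmonic mean of the two. The function name f1_score is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the list above.
reported_precision = 0.8881
reported_recall = 0.9014

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))  # 0.8947, matching the reported F1
```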

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1.5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.02
num_epochs: 3
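To make the learning-rate settings concrete, here is a sketch of a cosine schedule with linear warmup in plain Python. It assumes a total of 11217 optimizer steps (the last step in the training results), warmup steps computed as the ceiling of ratio times total steps, and a single half-cosine decay to zero. The exact schedule used during training may differ in details, and cosine_lr is an illustrative name.

```python
import math

BASE_LR = 1.5e-5      # learning_rate
WARMUP_RATIO = 0.02   # lr_scheduler_warmup_ratio
TOTAL_STEPS = 11217   # assumption: final step reported in the training results


def cosine_lr(
    step: int,
    base_lr: float = BASE_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_ratio: float = WARMUP_RATIO,
) -> float:
    """Learning rate at a given step: linear warmup, then half-cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for s in (0, 225, 5721, 11217):
    print(s, cosine_lr(s))
```

Under these assumptions warmup lasts 225 steps, the rate peaks at 1.5e-05 at step 225, falls to about half of that midway through the decay, and reaches zero at the final step.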

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.1032	0.1	374	0.0853	0.7645	0.8170	0.7899	0.9742
0.0767	0.2	748	0.0721	0.8111	0.8423	0.8264	0.9785
0.074	0.3	1122	0.0655	0.8252	0.8502	0.8375	0.9797
0.0634	0.4	1496	0.0629	0.8423	0.8694	0.8556	0.9809
0.0605	0.5	1870	0.0610	0.8515	0.8711	0.8612	0.9808
0.0578	0.6	2244	0.0594	0.8633	0.8744	0.8688	0.9822
0.0592	0.7	2618	0.0555	0.8624	0.8833	0.8727	0.9825
0.0567	0.8	2992	0.0534	0.8626	0.8838	0.8731	0.9830
0.0522	0.9	3366	0.0563	0.8560	0.8771	0.8664	0.9818
0.0516	1.0	3739	0.0556	0.8702	0.8869	0.8785	0.9831
0.0438	1.0	3740	0.0558	0.8712	0.8873	0.8792	0.9831
0.0395	1.1	4114	0.0565	0.8696	0.8856	0.8775	0.9830
0.0371	1.2	4488	0.0536	0.8762	0.8910	0.8835	0.9838
0.0403	1.3	4862	0.0531	0.8709	0.8887	0.8797	0.9835
0.0366	1.4	5236	0.0517	0.8791	0.8912	0.8851	0.9843
0.037	1.5	5610	0.0510	0.8830	0.8936	0.8883	0.9847
0.0368	1.6	5984	0.0492	0.8795	0.8940	0.8867	0.9845
0.0359	1.7	6358	0.0501	0.8833	0.8986	0.8909	0.9850
0.034	1.8	6732	0.0496	0.8852	0.8986	0.8918	0.9852
0.0327	1.9	7106	0.0512	0.8762	0.8948	0.8854	0.9843
0.0325	2.0	7478	0.0512	0.8829	0.8945	0.8887	0.9844
0.01	2.0	7480	0.0512	0.8836	0.8945	0.8890	0.9843
0.0232	2.1	7854	0.0526	0.8870	0.9002	0.8936	0.9852
0.0235	2.2	8228	0.0530	0.8841	0.8983	0.8911	0.9848
0.0211	2.3	8602	0.0542	0.8875	0.9008	0.8941	0.9852
0.0235	2.4	8976	0.0525	0.8883	0.9008	0.8945	0.9855
0.0232	2.5	9350	0.0525	0.8874	0.9013	0.8943	0.9855
0.0238	2.6	9724	0.0517	0.8861	0.9011	0.8935	0.9854
0.0223	2.7	10098	0.0513	0.8893	0.9016	0.8954	0.9856
0.0226	2.8	10472	0.0517	0.8892	0.9017	0.8954	0.9856
0.0228	2.9	10846	0.0517	0.8879	0.9013	0.8945	0.9855
0.0235	3.0	11217	0.0518	0.8881	0.9014	0.8947	0.9855
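The table can be read in two ways: by validation loss or by F1. The sketch below, using a hand-copied subset of rows (step, validation loss, F1), shows that the two criteria point to different checkpoints. The EvalRow type and the row selection are illustrative choices, not part of the training code.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    f1: float


# Subset of rows copied from the table above.
ROWS: List[EvalRow] = [
    EvalRow(3739, 0.0556, 0.8785),
    EvalRow(5984, 0.0492, 0.8867),
    EvalRow(7478, 0.0512, 0.8887),
    EvalRow(10098, 0.0513, 0.8954),
    EvalRow(10472, 0.0517, 0.8954),
    EvalRow(11217, 0.0518, 0.8947),
]

# max() and min() return the first row among ties.
best_f1 = max(ROWS, key=lambda r: r.f1)
lowest_loss = min(ROWS, key=lambda r: r.val_loss)

print(best_f1.step, lowest_loss.step)  # 10098 5984
```

Validation loss is lowest at step 5984 (epoch 1.6), while F1 keeps improving and peaks at steps 10098 and 10472, so the choice of checkpoint depends on which metric matters for the task.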

Framework versions

Transformers 4.36.2
Pytorch 2.0.1
Datasets 2.16.1
Tokenizers 0.15.0
