textdetox/xlmr-large-toxicity-classifier-v2

Text Classification

dardem committed on Mar 20 · Commit 94331a8 · verified · 1 Parent(s): b94d83f

Update README.md

Files changed (1)

README.md +53 -1

@@ -15,6 +15,58 @@ language:
 - he
 - am
 - de
+license: openrail++
+datasets:
+- textdetox/multilingual_toxicity_dataset
+metrics:
+- f1
+base_model:
+- FacebookAI/xlm-roberta-large
 ---
-## Multilingual Toxicity Classifier for 15 Languages
+## Multilingual Toxicity Classifier for 15 Languages (2025)
+This is an instance of [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) fine-tuned for binary toxicity classification on our updated (2025) dataset [textdetox/multilingual_toxicity_dataset](https://huggingface.co/datasets/textdetox/multilingual_toxicity_dataset).
+The model now covers 15 languages from various language families:
+* English (en)
+* Russian (ru)
+* Ukrainian (uk)
+* German (de)
+* Spanish (es)
+* Arabic (ar)
+* Amharic (am)
+* Hindi (hi)
+* Chinese (zh)
+* Italian (it)
+* French (fr)
+* Hinglish (hin)
+* Hebrew (he)
+* Japanese (ja)
+* Tatar (tt)
+The evaluation results on the test set are the following:
+|          | F1    |
+|----------|-------|
+| en       | 0.9650|
+| ru       | 0.9790|
+| uk       | 0.9251|
+| de       | 0.8758|
+| es       | 0.8700|
+| ar       | 0.7780|
+| am       | 0.7780|
+| hi       | 0.9360|
+| zh       | 0.7315|
+| it       |       |
+| fr       |       |
+| hin      |       |
+| he       |       |
+| ja       |       |
+| tt       |       |
+## Citation
+This model was prepared for the TextDetox 2025 Shared Task evaluation.
+Citation to be added soon.
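
Note (not part of the commit): the README added here describes the classifier but gives no usage snippet. The following is a minimal sketch of how such a checkpoint could be queried with the `transformers` library, assuming the repository id `textdetox/xlmr-large-toxicity-classifier-v2` shown at the top of this page and a standard sequence-classification head. The helper names `softmax` and `classify` are hypothetical, and the label names are read from the model config rather than assumed, since the README does not state which index means "toxic".

```python
import math

# Repository id as it appears on this page; treat it as an assumption.
MODEL_ID = "textdetox/xlmr-large-toxicity-classifier-v2"


def softmax(logits):
    """Convert a list of raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(texts, model_id=MODEL_ID):
    """Return the most likely label and its probability for each input text."""
    # Imported lazily so softmax() can be used without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**batch).logits

    results = []
    for text, row in zip(texts, logits.tolist()):
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        # The label text comes from the checkpoint's own config.
        label = model.config.id2label[best]
        results.append({"text": text, "label": label, "score": probs[best]})
    return results


if __name__ == "__main__":
    # Downloads the checkpoint on first use.
    for result in classify(["Have a nice day!", "Ich wünsche dir einen schönen Tag."]):
        print(result)
```

Reading the label from `model.config.id2label` keeps the sketch independent of how the authors ordered the two classes.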