## Model Information

This is a multilingual 3.7B text detoxification model for 9 languages, developed for the [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html) and based on [mT0-xl](https://huggingface.co/bigscience/mt0-xl). The model was trained in two steps: first, full fine-tuning on several parallel text detoxification datasets; then, ORPO alignment on a self-annotated preference dataset collected with toxicity and similarity classifiers. See the paper for more details.

The model achieves state-of-the-art performance for Ukrainian on the [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html), top-2 scores for Arabic, and near state-of-the-art results for the other languages. Overall, it is the second-best approach on the full human-rated leaderboard.

## Example usage
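
A minimal sketch of running the model with the `transformers` seq2seq API. The `LANG_PROMPTS` prefixes and the generation settings below are illustrative assumptions; the exact prompt format and hyperparameters used for the shared task may differ.
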
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained('s-nlp/mt0-xl-detox-orpo', device_map='auto')
tokenizer = AutoTokenizer.from_pretrained('s-nlp/mt0-xl-detox-orpo')

# Language-specific prompt prefixes (illustrative; only a few of the nine
# supported languages are shown, and the exact prefixes are assumptions).
LANG_PROMPTS = {
    'en': 'Detoxify: ',
    'ru': 'Детоксифицируй: ',
    'uk': 'Детоксифікуй: ',
}

def detoxify(text, lang, model, tokenizer):
    # Prepend the language prompt and encode the toxic input sentence.
    encodings = tokenizer(LANG_PROMPTS[lang] + text, return_tensors='pt').to(model.device)
    # Beam-search settings here are assumed defaults, not tuned values.
    outputs = model.generate(**encodings, max_length=128, num_beams=5)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```
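For instance, `detoxify('your toxic text', 'en', model, tokenizer)` returns a list containing the detoxified rewrite.
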
## Automatic evaluation