Text2Text Generation · Transformers · Safetensors · mt5 · Inference Endpoints
lmeribal committed e3e2445 (1 parent: 0495ca1)

Update README.md

Files changed (1): README.md (+3 -4)
README.md CHANGED
@@ -27,6 +27,8 @@ pipeline_tag: text2text-generation
 ## Model Information
 This is a multilingual 3.7B text detoxification model for 9 languages built on [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html) based on [mT0-xl](https://huggingface.co/bigscience/mt0-xl). The model was trained in a two-step setup: the first step is full fine-tuning on different parallel text detoxification datasets, and the second step is ORPO alignment on a self-annotated preference dataset collected using toxicity and similarity classifiers. See the paper for more details.
 
+The model shows state-of-the-art performance for the Ukrainian language on the [TextDetox 2024 shared task](https://pan.webis.de/clef24/pan24-web/text-detoxification.html), top-2 scores for Arabic, and near state-of-the-art performance for other languages. Overall, the model is the second best approach on the entire human-rated leaderboard.
+
 ## Example usage
 
 ```python
@@ -64,7 +66,4 @@ tokenizer = AutoTokenizer.from_pretrained('s-nlp/mt0-xl-detox-orpo')
     return tokenizer.batch_decode(outputs, skip_special_tokens=True)
 ```
 
-## Human evaluation
-
-
-## Automatic evaluation
+
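The second hunk only shows the tail of the `## Example usage` block, so the full helper is not visible in this commit. A minimal sketch of how such a helper is typically wired up with `transformers`, assuming standard seq2seq loading for `s-nlp/mt0-xl-detox-orpo` (the generation settings below are illustrative placeholders, not the card's exact code):

```python
# Illustrative sketch only -- not the model card's exact example.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = 's-nlp/mt0-xl-detox-orpo'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.to('cuda' if torch.cuda.is_available() else 'cpu')

def detoxify(texts):
    # Tokenize a batch of toxic inputs and generate detoxified rewrites.
    inputs = tokenizer(texts, return_tensors='pt', padding=True, truncation=True).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(detoxify(['Your toxic sentence here.']))
```

Beam search and `max_new_tokens` are arbitrary defaults here; the actual card may prepend language-specific prompts expected by mT0.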
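The `## Model Information` paragraph also mentions a second training stage: ORPO alignment on a self-annotated preference dataset. That code is not part of this commit; the sketch below only illustrates what such a stage commonly looks like with the `trl` library, with a hypothetical dataset file, column layout, and hyperparameters standing in for the authors' actual setup:

```python
# Illustrative ORPO alignment sketch with trl -- not the authors' training code.
# Assumes a preference dataset with 'prompt', 'chosen', 'rejected' columns
# (file name and all hyperparameters are hypothetical).
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = 'bigscience/mt0-xl'  # in practice, the fully fine-tuned step-one checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

# Hypothetical self-annotated preference data: toxic prompt, preferred vs. rejected rewrite.
dataset = load_dataset('json', data_files='detox_preferences.jsonl', split='train')

config = ORPOConfig(
    output_dir='mt0-xl-detox-orpo',
    beta=0.1,               # odds-ratio loss weight (placeholder)
    max_length=256,         # placeholder sequence limits
    max_prompt_length=128,
    per_device_train_batch_size=4,
    num_train_epochs=1,
)

# Newer trl versions take processing_class= instead of tokenizer=.
trainer = ORPOTrainer(model=model, args=config, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```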