---
datasets:
- PleIAs/ToxicCommons
language:
- en
- nl
- es
- de
- pl
- la
- it
- fr
- pt
pipeline_tag: text-classification
---

# Celadon Toxicity Classifier


Celadon is a DeBERTa-v3-small finetune with five classification heads, trained on 600k samples from [Toxic Commons](https://huggingface.co/datasets/PleIAs/ToxicCommons).

It classifies toxicity along five dimensions:
*  **Race and origin-based bias**: includes racism as well as bias against someone’s country or region of origin or immigration status, especially immigrant or refugee status. 
*  **Gender and sexuality-based bias**: includes sexism and misogyny, homophobia, transphobia, and sexual harassment. 
*  **Religious bias**: any bias or stereotype based on someone’s religion. 
*  **Ability bias**: bias according to someone’s physical, mental, or intellectual ability or disability. 
*  **Violence and abuse**: overly graphic descriptions of violence, threats of violence, or calls for or incitement of violence.


Read more about the training details in the paper, [Toxicity of the Commons: Curating Open-Source Pre-Training Data](https://arxiv.org/pdf/2410.22587) by [Catherine Arnett](https://huggingface.co/catherinearnett), [Eliot Jones](https://huggingface.co/eliotj), [Ivan P. Yamshchikov](https://huggingface.co/ivan-the-bearable), and [Pierre-Carl Langlais](https://huggingface.co/Pclanglais).
For the code used to generate the annotations in Toxic Commons, train the model, and run it, please refer to the official [GitHub](https://github.com/Pleias/toxic-commons) repository.


# How to Use

```
import torch
from transformers import AutoTokenizer
from celadon.model import MultiHeadDebertaForSequenceClassification

# Load the tokenizer and the multi-head classifier
tokenizer = AutoTokenizer.from_pretrained("celadon")
model = MultiHeadDebertaForSequenceClassification.from_pretrained("celadon")
model.eval()

sample_text = "This is an example of a normal sentence"

inputs = tokenizer(sample_text, return_tensors="pt", padding=True, truncation=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'])

categories = ['Race/Origin', 'Gender/Sex', 'Religion', 'Ability', 'Violence']
# Pick the highest-scoring class for each of the five heads
predictions = outputs.argmax(dim=-1).squeeze().tolist()

# Print the classification results for each category
print(f"Text: {sample_text}")
for i, category in enumerate(categories):
    print(f"Prediction for Category {category}: {predictions[i]}")
```
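
The snippet above takes an argmax over each head's scores. As a minimal sketch of that step, the helper below pairs each category name with the index of its highest score, using plain Python lists instead of tensors. The function name `label_predictions` and the example scores are hypothetical, and the four classes per head shown here are an assumption made for illustration only.

```python
CATEGORIES = ["Race/Origin", "Gender/Sex", "Religion", "Ability", "Violence"]


def label_predictions(scores_per_head, categories=CATEGORIES):
    """Return {category: index of the highest score} for each head.

    scores_per_head is a list with one inner list of scores per category.
    This is a hypothetical helper, not part of the Celadon code base.
    """
    if len(scores_per_head) != len(categories):
        raise ValueError("expected one list of scores per category")
    result = {}
    for category, head_scores in zip(categories, scores_per_head):
        # Index of the largest score; ties resolve to the lowest index
        result[category] = max(range(len(head_scores)), key=head_scores.__getitem__)
    return result


# Made-up scores for illustration: five heads, four classes each (assumed)
example_scores = [
    [2.1, 0.3, -1.0, -2.0],
    [1.5, 0.2, -0.5, -1.5],
    [0.1, 1.9, -0.3, -1.2],
    [3.0, -1.0, -2.0, -3.0],
    [-0.5, 0.4, 2.2, 0.1],
]
print(label_predictions(example_scores))
```

With real model output, something like `outputs.squeeze(0).tolist()` could be passed in, assuming a single input text and that `outputs` is a tensor with one row of scores per head, as the snippet above suggests.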

# How to Cite

```
@article{arnett2024toxicity,
  title={{Toxicity of the Commons: Curating Open-Source Pre-Training Data}},
  author={Arnett, Catherine and Jones, Eliot and Yamshchikov, Ivan P. and Langlais, Pierre-Carl},
  journal={arXiv preprint arXiv:2410.22587},
  url={https://arxiv.org/pdf/2410.22587},
  year={2024}
}
```

# About

Trained by [Eliot Jones](https://huggingface.co/eliotj) while working at [Pleias](https://huggingface.co/PleIAs). This project was made possible by Jean Zay compute grant #GC011015451.

## About the Name
Celadon is a type of porcelain whose European name refers to its jade-like color. The Chinese name for this type of pottery is 青瓷, which means blue-green ceramic. The earliest examples of celadon pottery date from the first century AD. Celadon was first brought to Europe by the Dutch East India Company in the 16th and 17th centuries. Because the ceramics were very expensive to bring to Europe from China, the Dutch made up fantastical properties to increase sales, for example, that celadon would change color or break in the presence of poison.