---
language:
- en
library_name: transformers
---

# Tiny-Toxic-Detector

A tiny comment toxicity classifier model at only 2M parameters. With only ~10MB ram and fast inference we bring you one of the best toxicity classifiers that outperforms models over 50 times its size.

A paper on this model is being released soon.


## Benchmarks

The Tiny-Toxic-Detector achieves an impressive 90.26% on the Toxigen benchmark and 87.34% on the Jigsaw-Toxic-Comment-Classification-Challenge. Here we compare our results against other toxic classification models:


| Model                             | Size (parameters) | Toxigen (%) | Jigsaw (%) |
| --------------------------------- | ----------------- | ----------- | ---------- |
| lmsys/toxicchat-t5-large-v1.0     | 738M              | 72.67       | 88.82      |
| s-nlp/roberta toxicity classifier | 124M              | 88.41       | **94.92**  |
| mohsenfayyaz/toxicity-classifier  | 109M              | 81.50       | 83.31      |
| martin-ha/toxic-comment-model     | 67M               | 68.02       | 91.56      |
| **Tiny-toxic-detector**           | **2M**            | **90.26**   | 87.34      |


## Usage
This model uses custom architecture and requires some extra custom code to work. Below you can find the architecture and a fully-usable example.

<details>
  <summary>
    Architecture  
  </summary>

```python
import torch
import torch.nn as nn
from transformers import PreTrainedModel, PretrainedConfig, AutoTokenizer

# Define TinyTransformer model
class TinyTransformer(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_heads, ff_dim, num_layers):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.pos_encoding = nn.Parameter(torch.zeros(1, 512, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads, dim_feedforward=ff_dim, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.fc = nn.Linear(embed_dim, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.embedding(x) + self.pos_encoding[:, :x.size(1), :]
        x = self.transformer(x)
        x = x.mean(dim=1)  # Global average pooling
        x = self.fc(x)
        return self.sigmoid(x)

class TinyTransformerConfig(PretrainedConfig):
    model_type = "tiny_transformer"

    def __init__(self, vocab_size=30522, embed_dim=64, num_heads=2, ff_dim=128, num_layers=4, max_position_embeddings=512, **kwargs):
        super().__init__(**kwargs)
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.num_heads = num_heads
        self.ff_dim = ff_dim
        self.num_layers = num_layers
        self.max_position_embeddings = max_position_embeddings

class TinyTransformerForSequenceClassification(PreTrainedModel):
    config_class = TinyTransformerConfig

    def __init__(self, config):
        super().__init__(config)
        self.num_labels = 1
        self.transformer = TinyTransformer(
            config.vocab_size,
            config.embed_dim,
            config.num_heads,
            config.ff_dim,
            config.num_layers
        )

    def forward(self, input_ids, attention_mask=None):
        outputs = self.transformer(input_ids)
        return {"logits": outputs}
```
</details>

<details>
  <summary>
    Full example  
  </summary>

```python
import torch
import torch.nn as nn
from transformers import PreTrainedModel, PretrainedConfig, AutoTokenizer

# Define TinyTransformer model
class TinyTransformer(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_heads, ff_dim, num_layers):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.pos_encoding = nn.Parameter(torch.zeros(1, 512, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads, dim_feedforward=ff_dim, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.fc = nn.Linear(embed_dim, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.embedding(x) + self.pos_encoding[:, :x.size(1), :]
        x = self.transformer(x)
        x = x.mean(dim=1)  # Global average pooling
        x = self.fc(x)
        return self.sigmoid(x)

class TinyTransformerConfig(PretrainedConfig):
    model_type = "tiny_transformer"

    def __init__(self, vocab_size=30522, embed_dim=64, num_heads=2, ff_dim=128, num_layers=4, max_position_embeddings=512, **kwargs):
        super().__init__(**kwargs)
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.num_heads = num_heads
        self.ff_dim = ff_dim
        self.num_layers = num_layers
        self.max_position_embeddings = max_position_embeddings

class TinyTransformerForSequenceClassification(PreTrainedModel):
    config_class = TinyTransformerConfig

    def __init__(self, config):
        super().__init__(config)
        self.num_labels = 1
        self.transformer = TinyTransformer(
            config.vocab_size,
            config.embed_dim,
            config.num_heads,
            config.ff_dim,
            config.num_layers
        )

    def forward(self, input_ids, attention_mask=None):
        outputs = self.transformer(input_ids)
        return {"logits": outputs}

# Load the Tiny-Toxic-Detector model and tokenizer
def load_model_and_tokenizer():
    device = torch.device("cpu") # Due to GPU overhead inference is faster on CPU!

    # Load Tiny-toxic-detector
    config = TinyTransformerConfig.from_pretrained("AssistantsLab/Tiny-Toxic-Detector")
    model = TinyTransformerForSequenceClassification.from_pretrained("AssistantsLab/Tiny-Toxic-Detector", config=config).to(device)
    tokenizer = AutoTokenizer.from_pretrained("AssistantsLab/Tiny-Toxic-Detector")

    return model, tokenizer, device

# Prediction function
def predict_toxicity(text, model, tokenizer, device):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128, padding="max_length").to(device)
    if "token_type_ids" in inputs:
        del inputs["token_type_ids"]

    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs["logits"].squeeze()
    prediction = "Toxic" if logits > 0.5 else "Not Toxic"
    return prediction

def main():
    model, tokenizer, device = load_model_and_tokenizer()

    while True:
        print("Enter text to classify (or type 'exit' to quit):")
        text = input()

        if text.lower() == 'exit':
            print("Exiting...")
            break

        if text:
            prediction = predict_toxicity(text, model, tokenizer, device)
            print(f"Prediction: {prediction}")
        else:
            print("No text provided. Please enter some text.")

if __name__ == "__main__":
    main()
```
</details>

Please note that to predict toxicity you can use the following example:
```python
# Define architecture before this!
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128, padding="max_length").to(device)
if "token_type_ids" in inputs:
  del inputs["token_type_ids"]

with torch.no_grad():
  outputs = model(**inputs)
  logits = outputs["logits"].squeeze()
  prediction = "Toxic" if logits > 0.5 else "Not Toxic"
```


## Usage and Limitations

Toxicity classification models always have certain limitations you should be aware of, and this model is no different.

### Intended Usage

The Tiny-toxic-detector is designed to classify comments for toxicity. It is particularly useful in scenarios where minimal resource usage and rapid inference are essential. Key features include:
* Low Resource Consumption: With a requirement of (roughly) only 10MB of RAM and 8MB of VRAM, this model is well-suited for environments with limited hardware resources.
* Fast Inference: The model provides high-speed inference. The Tiny-toxic-detector significantly outperforms larger models on CPU-based systems. Due to the overhead of using GPU inference, small models with a relatively small number of input tokens are often faster on CPU. This includes the Tiny-toxic-detector.

### Limitations

* Training Data
  * The Tiny-toxic-detector has been trained exclusively on English-language data, limiting its ability to classify toxicity in other languages.
* Maximum Context Length
  * The model can handle up to 512 input tokens. Comments exceeding this length are not in the scope of this model.
  * While extending the context length is possible, such modifications have not been trained for or validated. Early tests with a 4096-token context resulted in a performance drop of over 10% on the Toxigen benchmark.
* Language Ambiguity
  * The Tiny-toxic-detector may struggle with ambiguous or nuanced language as any other model would. Even though benchmarks like Toxigen evaluate the model’s performance with ambiguous language, it may still misclassify comments where toxicity is not clearly defined.