This repo contains an optimized version of Detoxify, which needs less disk space and less memory, at the cost of a small drop in accuracy.
This is an experiment for me to learn how to use 🤗 Optimum.
## Usage
Loading the model requires the 🤗 Optimum library to be installed.
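If you don't have it yet, one way to install Optimum with ONNX Runtime support is via PyPI (for GPU inference, use the `optimum[onnxruntime-gpu]` extra instead):

```bash
pip install optimum[onnxruntime]
```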
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.pipelines import pipeline as opt_pipeline
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dcferreira/detoxify-optimized")
model = ORTModelForSequenceClassification.from_pretrained("dcferreira/detoxify-optimized")

pipe = opt_pipeline(
    model=model,
    task="text-classification",
    function_to_apply="sigmoid",
    accelerator="ort",
    tokenizer=tokenizer,
    top_k=None,  # return scores for all the labels; the model was trained as multilabel
)

print(pipe(["example text", "exemple de texte", "texto de ejemplo", "testo di esempio", "texto de exemplo", "örnek metin", "пример текста"]))
```
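With `top_k=None`, the pipeline returns one list of `{"label": ..., "score": ...}` dicts per input text. A minimal sketch of reshaping that output into a score dict per text (variable names here are just for illustration):

```python
# One entry per input text; each entry is a list of per-label score dicts.
results = pipe(["example text"])
for text_scores in results:
    # Collapse the list of dicts into a single {label: score} mapping.
    print({entry["label"]: round(entry["score"], 4) for entry in text_scores})
```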
## Performance
The table below compares the original model, the same model run as-is through ONNX Runtime, and the model optimized with ONNX Runtime at optimization level O4.
| Model | Accuracy (%) | Samples/s (CPU) | Samples/s (GPU) | GPU VRAM | Disk space |
|---|---|---|---|---|---|
| original | 92.1083 | 16 | 250 | 3 GB | 1.1 GB |
| ort | 92.1067 | 19 | 340 | 4 GB | 1.1 GB |
| optimized (O4) | 92.1031 | 14 | 650 | 2 GB | 540 MB |
For details on how these numbers were obtained, check out `evaluate.py` in this repo.
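For reference, here is a minimal sketch of how a model like this can be produced with 🤗 Optimum's `ORTOptimizer`. This is not necessarily the exact script used for this repo; it assumes the multilingual Detoxify checkpoint `unitary/multilingual-toxic-xlm-roberta` as the base model and uses the built-in O4 optimization preset:

```python
from optimum.onnxruntime import (
    AutoOptimizationConfig,
    ORTModelForSequenceClassification,
    ORTOptimizer,
)

# Export the PyTorch checkpoint to ONNX. The base model here is an
# assumption, not necessarily the one used to build this repo.
model = ORTModelForSequenceClassification.from_pretrained(
    "unitary/multilingual-toxic-xlm-roberta", export=True
)

# O4 is the most aggressive built-in preset: graph fusions plus fp16
# mixed precision, intended for GPU inference.
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="detoxify-optimized",
    optimization_config=AutoOptimizationConfig.O4(),
)
```

Since O4 targets GPU inference with fp16, a slight CPU slowdown alongside a large GPU speedup, as in the table above, is the expected trade-off.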