This repo contains an optimized version of Detoxify, which needs less disk space and less memory, at the cost of a small drop in accuracy.
This is an experiment for me to learn how to use 🤗 Optimum.
## Usage
Loading the model requires the 🤗 Optimum library to be installed.
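If you don't have it yet, one way to install Optimum with ONNX Runtime support is via PyPI (for GPU inference, use the `optimum[onnxruntime-gpu]` extra instead):

```bash
pip install optimum[onnxruntime]
```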
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.pipelines import pipeline as opt_pipeline
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dcferreira/detoxify-optimized")
model = ORTModelForSequenceClassification.from_pretrained("dcferreira/detoxify-optimized")

pipe = opt_pipeline(
    model=model,
    task="text-classification",
    function_to_apply="sigmoid",
    accelerator="ort",
    tokenizer=tokenizer,
    top_k=None,  # return scores for all the labels; the model was trained as multilabel
)

print(pipe(["example text", "exemple de texte", "texto de ejemplo", "testo di esempio", "texto de exemplo", "örnek metin", "пример текста"]))
```
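With `top_k=None`, the pipeline returns one list of `{"label": ..., "score": ...}` dicts per input text. A minimal sketch of reshaping that output into a score dict per text (variable names here are just for illustration):

```python
# One entry per input text; each entry is a list of per-label score dicts.
results = pipe(["example text"])
for text_scores in results:
    # Collapse the list of dicts into a single {label: score} mapping.
    print({entry["label"]: round(entry["score"], 4) for entry in text_scores})
```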
## Performance
The table below compares the original model, the same model run as-is through ONNX Runtime, and the model optimized with ONNX Runtime at optimization level O4.
| Model | Accuracy (%) | Samples/s (CPU) | Samples/s (GPU) | GPU VRAM | Disk space |
|---|---|---|---|---|---|
| original | 92.1083 | 16 | 250 | 3 GB | 1.1 GB |
| ort | 92.1067 | 19 | 340 | 4 GB | 1.1 GB |
| optimized (O4) | 92.1031 | 14 | 650 | 2 GB | 540 MB |
For details on how these numbers were obtained, check out `evaluate.py` in this repo.
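For reference, here is a minimal sketch of how a model like this can be produced with 🤗 Optimum's `ORTOptimizer`. This is not necessarily the exact script used for this repo; it assumes the multilingual Detoxify checkpoint `unitary/multilingual-toxic-xlm-roberta` as the base model and uses the built-in O4 optimization preset:

```python
from optimum.onnxruntime import (
    AutoOptimizationConfig,
    ORTModelForSequenceClassification,
    ORTOptimizer,
)

# Export the PyTorch checkpoint to ONNX. The base model here is an
# assumption, not necessarily the one used to build this repo.
model = ORTModelForSequenceClassification.from_pretrained(
    "unitary/multilingual-toxic-xlm-roberta", export=True
)

# O4 is the most aggressive built-in preset: graph fusions plus fp16
# mixed precision, intended for GPU inference.
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="detoxify-optimized",
    optimization_config=AutoOptimizationConfig.O4(),
)
```

Since O4 targets GPU inference with fp16, a slight CPU slowdown alongside a large GPU speedup, as in the table above, is the expected trade-off.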