Quantized version of meta-llama/LlamaGuard-7b

Model Description

The model meta-llama/LlamaGuard-7b was quantized to 4-bit with group_size=128 and act-order=True, using the auto-gptq integration in transformers (https://huggingface.co/blog/gptq-integration).
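
For reference, a quantization call along these lines (following the linked blog post) reproduces the settings above. This is a minimal sketch: the calibration dataset ("c4") is an illustrative assumption, not necessarily the one actually used.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "meta-llama/LlamaGuard-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit GPTQ, group size 128, with activation ordering (desc_act=True).
# The calibration dataset below is an assumption for illustration.
gptq_config = GPTQConfig(
    bits=4,
    group_size=128,
    desc_act=True,
    dataset="c4",
    tokenizer=tokenizer,
)

# Quantization runs during loading when a GPTQConfig is passed.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=gptq_config,
)
model.save_pretrained("LlamaGuard-7b-GPTQ-4bit-128g-actorder_True")
```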

Evaluation

To evaluate the quantized model and compare it with the full precision model, I performed binary classification on the "toxicity" label from the ~5k-sample test set of lmsys/toxic-chat.
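
The exact scoring setup isn't shown in this card; the sketch below is one plausible way to run such an evaluation, assuming the probability of LlamaGuard's first verdict token ("unsafe" vs. "safe") is used as the toxicity score and scikit-learn's average_precision_score computes the metric. The dataset config name is also an assumption; check the dataset card for the current one.

```python
import torch
from datasets import load_dataset
from sklearn.metrics import average_precision_score
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SebastianSchramm/LlamaGuard-7b-GPTQ-4bit-128g-actorder_True"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Test split of lmsys/toxic-chat; the config name is an assumption.
dataset = load_dataset("lmsys/toxic-chat", "toxicchat0124", split="test")

# Token ids for LlamaGuard's verdict words ("safe"/"unsafe");
# assumed here to each encode to a single token.
safe_id = tokenizer.encode("safe", add_special_tokens=False)[0]
unsafe_id = tokenizer.encode("unsafe", add_special_tokens=False)[0]

scores, labels = [], []
for example in dataset:
    # Wrap the user message in LlamaGuard's moderation chat template.
    chat = [{"role": "user", "content": example["user_input"]}]
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # logits of the first output token
    # Normalize over the two verdict tokens and take P("unsafe") as the score.
    p_unsafe = torch.softmax(logits[[safe_id, unsafe_id]], dim=-1)[1].item()
    scores.append(p_unsafe)
    labels.append(example["toxicity"])

print(f"Average Precision: {average_precision_score(labels, scores):.4f}")
```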

📊 Full Precision Model:

Average Precision Score: 0.3625

📊 4-bit Quantized Model:

Average Precision Score: 0.3450
