---
license: other
license_name: govtech-singapore
license_link: LICENSE
datasets:
- gabrielchua/off-topic
language:
- en
metrics:
- roc_auc
- f1
- precision
- recall
base_model:
- jinaai/jina-embeddings-v2-small-en
---

# Off-Topic Classification Model

This repository contains a fine-tuned **Jina Embeddings model** for binary classification. Given a system prompt and a user prompt, the model predicts whether the user prompt is **off-topic** relative to the intended purpose defined in the system prompt.

## Model Highlights

- **Base Model**: [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en)
- **Maximum Context Length**: 1024 tokens
- **Task**: Binary classification (on-topic/off-topic)

## Performance

We evaluated our fine-tuned models on synthetic system and user prompt pairs that reflect real-world enterprise use cases of LLMs. The dataset is available [here](https://huggingface.co/datasets/gabrielchua/off-topic).

| Approach | Model | ROC-AUC | F1 | Precision | Recall |
|---------------------------------------|--------------------------------|---------|------|-----------|--------|
| [Fine-tuned bi-encoder classifier](https://huggingface.co/govtech/jina-embeddings-v2-small-en-off-topic) | jina-embeddings-v2-small-en | 0.99 | 0.97 | 0.99 | 0.95 |
| [Fine-tuned cross-encoder classifier](https://huggingface.co/govtech/stsb-roberta-base-off-topic) | stsb-roberta-base | 0.99 | 0.99 | 0.99 | 0.99 |
| Pre-trained cross-encoder | stsb-roberta-base | 0.73 | 0.68 | 0.53 | 0.93 |
| Prompt Engineering | GPT 4o (2024-08-06) | - | 0.95 | 0.94 | 0.97 |
| Prompt Engineering | GPT 4o Mini (2024-07-18) | - | 0.91 | 0.85 | 0.91 |
| Zero-shot Classification | GPT 4o Mini (2024-07-18) | 0.99 | 0.97 | 0.95 | 0.99 |

Further evaluation results on additional synthetic and external datasets (e.g., `JailbreakBench`, `HarmBench`, `TrustLLM`) are available in our [technical report](https://arxiv.org/abs/2411.12946).

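As a quick check on how the table's columns relate, F1 is the harmonic mean of precision and recall. The sketch below recomputes it for the fine-tuned bi-encoder row using the rounded values from the table. The helper name `f1_score` is ours and not part of this repository, and because the inputs are already rounded, a recomputed value can differ from a reported F1 in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values taken from the fine-tuned bi-encoder row of the table.
bi_encoder_f1 = f1_score(precision=0.99, recall=0.95)
print(f"{bi_encoder_f1:.2f}")  # prints 0.97
```
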
## Usage

1. Clone this repository and install the required dependencies:

```bash
pip install -r requirements.txt
```

2. Run the model in one of two ways. Both scripts take a single argument: a JSON array of `[system prompt, user prompt]` pairs.

**Option 1**: Using `inference_onnx.py` with the ONNX model.

```bash
python inference_onnx.py '[
    ["System prompt example 1", "User prompt example 1"],
    ["System prompt example 2", "User prompt example 2"]
]'
```

**Option 2**: Using `inference_safetensors.py` with PyTorch and SafeTensors.

```bash
python inference_safetensors.py '[
    ["System prompt example 1", "User prompt example 1"],
    ["System prompt example 2", "User prompt example 2"]
]'
```

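Hand-writing the JSON argument inside shell quotes is error-prone, as a single missing quote makes the input invalid. The minimal sketch below builds the argument with `json.dumps` instead. The helper name `build_inference_command` and the example prompts are illustrative assumptions, not part of this repository, and the scripts' output format is not shown because it is not documented here.

```python
import json
import sys


def build_inference_command(script, pairs):
    """Return an argv list that passes `pairs` to `script` as one JSON argument.

    `pairs` is a list of (system_prompt, user_prompt) tuples. This helper is
    hypothetical; it only mirrors the command shape used by the two options.
    """
    payload = json.dumps([[system, user] for system, user in pairs])
    return [sys.executable, script, payload]


# Illustrative prompts only.
command = build_inference_command(
    "inference_onnx.py",
    [
        ("You are a travel booking assistant.", "Find me a flight to Tokyo."),
        ("You are a travel booking assistant.", "Write a poem about cats."),
    ],
)
print(command[2])  # the JSON string handed to the script
```

To run it from the repository root, pass the list to `subprocess.run(command, check=True)`. Passing an argument list rather than a shell string sidesteps shell quoting entirely.
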
Read more about this model in our [technical report](https://arxiv.org/abs/2411.12946).