# DistilBERT Quantized Model for SMS Spam Detection
This repository contains a production-ready **quantized DistilBERT** model fine-tuned for **SMS spam classification**, achieving **99.94% accuracy** while optimizing inference speed using ONNX Runtime.
---
## Model Details
- **Model Architecture:** DistilBERT Base Uncased
- **Task:** Binary SMS Spam Classification (`ham=0`, `spam=1`)
- **Dataset:** Custom SMS Spam Collection (5,574 messages)
- **Quantization:** ONNX Runtime Dynamic Quantization
- **Fine-tuning Framework:** Hugging Face Transformers + Optimum
---
## Quick Start
### Installation
```bash
pip install -r requirements.txt
```
### Basic Usage
```python
from transformers import pipeline
classifier = pipeline(
    "text-classification",
    model="./spam_model_quantized",
    tokenizer="./spam_model"
)
sample = "WINNER!! Claim your $1000 prize now!"
result = classifier(sample)
print(f"Prediction: {result[0]['label']} (confidence: {result[0]['score']:.2%})")
```
---
## Performance Metrics
| Metric | Value |
|-------------|-----------|
| Accuracy | 99.94% |
| F1 Score | 0.9977 |
| Precision | 100% |
| Recall | 99.55% |
| Inference* | 2.7ms |
> \* Tested on AWS `t3.xlarge` (4 vCPUs)
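The latency figure can be checked with a small timing harness. This is a sketch using only the standard library; the lambda below is a stand-in for the real `classifier`:

```python
import time
import statistics

def benchmark(predict, sample, warmup=10, runs=200):
    """Time repeated single-message predictions and report latency in ms."""
    for _ in range(warmup):  # warm caches and lazy initialization
        predict(sample)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000)
    return {
        "p50_ms": statistics.median(timings),
        "p95_ms": statistics.quantiles(timings, n=20)[18],
        "mean_ms": statistics.fmean(timings),
    }

# Dummy predictor for illustration; swap in the real classifier
stats = benchmark(lambda text: text.lower(), "WINNER!! Claim your $1000 prize now!")
print(stats)
```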
---
## Advanced Usage
### Load Quantized Model Directly
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained(
    "./spam_model_quantized",
    provider="CPUExecutionProvider"
)
```
### Batch Processing
```python
import pandas as pd
df = pd.read_csv("messages.csv")
predictions = classifier(list(df["text"]), batch_size=32)
```
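The pipeline returns one `{'label', 'score'}` dict per input row; attaching them back to the DataFrame looks roughly like this (the `predictions` list below is a hard-coded stand-in for the pipeline output):

```python
import pandas as pd

# Stand-in for classifier output: one dict per input row
predictions = [
    {"label": "spam", "score": 0.999},
    {"label": "ham", "score": 0.998},
]
df = pd.DataFrame({"text": ["WINNER!! Claim your prize", "See you at lunch?"]})

# Unpack labels and confidences into new columns
df["label"] = [p["label"] for p in predictions]
df["score"] = [p["score"] for p in predictions]
df.to_csv("messages_scored.csv", index=False)
```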
---
## Training Details
### Hyperparameters
| Parameter | Value |
|-----------------|---------------|
| Epochs | 5 (early stopped at 3) |
| Batch Size | 12 (train), 16 (eval) |
| Learning Rate | 3e-5 |
| Warmup Steps    | 10% of training steps |
| Weight Decay | 0.01 |
### Quantization Benefits
| Metric | Original | Quantized |
|---------------|----------|-----------|
| Model Size | 255MB | 68MB |
| CPU Latency | 9.2ms | 2.7ms |
| Throughput | 110/sec | 380/sec |
---
## Repository Structure
```
.
├── spam_model/               # Original PyTorch model
│   ├── config.json
│   ├── model.safetensors
│   └── tokenizer.json
├── spam_model_quantized/     # Production-ready quantized model
│   ├── model.onnx
│   ├── quantized_model.onnx
│   └── tokenizer_config.json
├── examples/                 # Ready-to-use scripts
│   ├── predict.py            # CLI interface
│   └── api_server.py         # FastAPI service
├── requirements.txt          # Dependencies
└── README.md                 # This document
```
---
## Deployment Options
### 1. Local REST API
```bash
uvicorn examples.api_server:app --port 8000
```
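Once the server is up, a request might look like the following. The `/predict` path and JSON schema are assumptions about `examples/api_server.py`, so adjust them to match the actual service:

```shell
# Endpoint path and payload field are illustrative assumptions
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "WINNER!! Claim your $1000 prize now!"}'
```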
### 2. Docker Container
```dockerfile
FROM python:3.9-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["uvicorn", "examples.api_server:app", "--host", "0.0.0.0"]
```
---
## Limitations
- Optimized for **English** SMS messages
- May require **retraining** for regional languages or localized spam patterns
- Quantized model requires **x86 CPUs with AVX2** support
---
## Contributions
Pull requests and suggestions are welcome! Please open an issue for feature requests or bug reports.