# πŸ“© DistilBERT Quantized Model for SMS Spam Detection
This repository contains a production-ready **quantized DistilBERT** model fine-tuned for **SMS spam classification**. It reaches **99.94% accuracy** while cutting CPU inference latency through ONNX Runtime dynamic quantization.
---
## πŸ“Œ Model Details
- **Model Architecture:** DistilBERT Base Uncased
- **Task:** Binary SMS Spam Classification (`ham=0`, `spam=1`)
- **Dataset:** Custom SMS Spam Collection (5,574 messages)
- **Quantization:** ONNX Runtime Dynamic Quantization
- **Fine-tuning Framework:** Hugging Face Transformers + Optimum
---
## πŸš€ Quick Start
### 🧰 Installation
```bash
pip install -r requirements.txt
```
### βœ… Basic Usage
```python
from transformers import pipeline
# Pair the quantized ONNX model with the original tokenizer
classifier = pipeline(
    "text-classification",
    model="./spam_model_quantized",
    tokenizer="./spam_model"
)
sample = "WINNER!! Claim your $1000 prize now!"
result = classifier(sample)
print(f"Prediction: {result[0]['label']} (confidence: {result[0]['score']:.2%})")
```
---
## πŸ“ˆ Performance Metrics
| Metric | Value |
|-------------|-----------|
| Accuracy | 99.94% |
| F1 Score | 0.9977 |
| Precision | 100% |
| Recall | 99.55% |
| Inference Latency* | 2.7 ms |
> \* Tested on AWS `t3.xlarge` (4 vCPUs)
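The reported numbers can be reproduced on a held-out test split with scikit-learn (a minimal sketch; `y_true` and `y_pred` are placeholders for your real labels and model predictions):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Replace with real test labels and predictions (ham=0, spam=1)
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 0, 1]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1
)
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.4f}")
print(f"Precision: {precision:.4f} | Recall: {recall:.4f} | F1: {f1:.4f}")
```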
---
## πŸ›  Advanced Usage
### πŸ” Load Quantized Model Directly
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
# Load the ONNX model with ONNX Runtime's CPU execution provider
model = ORTModelForSequenceClassification.from_pretrained(
    "./spam_model_quantized",
    provider="CPUExecutionProvider"
)
```
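Once loaded, the model can be called directly with a separately loaded tokenizer (a minimal sketch; it assumes the tokenizer files live in `./spam_model`, as in the Quick Start example):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./spam_model")
inputs = tokenizer("WINNER!! Claim your $1000 prize now!", return_tensors="pt")

# The forward pass returns logits; argmax gives the class index (ham=0, spam=1)
logits = model(**inputs).logits
print("spam" if logits.argmax(dim=-1).item() == 1 else "ham")
```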
### πŸ“Š Batch Processing
```python
import pandas as pd

# Score a CSV of messages (one per row in a "text" column) in batches of 32
df = pd.read_csv("messages.csv")
predictions = classifier(list(df["text"]), batch_size=32)
```
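Each prediction is a dict with `label` and `score` keys, so results can be written straight back to the DataFrame (the output filename is illustrative):

```python
# Attach pipeline outputs to the DataFrame and persist them
df["label"] = [p["label"] for p in predictions]
df["score"] = [p["score"] for p in predictions]
df.to_csv("messages_scored.csv", index=False)
```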
---
## 🎯 Training Details
### πŸ”§ Hyperparameters
| Parameter | Value |
|-----------------|---------------|
| Epochs | 5 (early stopped at 3) |
| Batch Size | 12 (train), 16 (eval) |
| Learning Rate | 3e-5 |
| Warmup Steps | 10% of total training steps |
| Weight Decay | 0.01 |
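These settings map roughly onto Hugging Face `TrainingArguments` (a hedged sketch of the configuration, not the repo's actual training script):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./spam_model",
    num_train_epochs=5,               # early stopping ended training at epoch 3
    per_device_train_batch_size=12,
    per_device_eval_batch_size=16,
    learning_rate=3e-5,
    warmup_ratio=0.1,                 # warm up over 10% of total training steps
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,      # required for EarlyStoppingCallback
)
```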
### ⚑ Quantization Benefits
| Metric | Original | Quantized |
|---------------|----------|-----------|
| Model Size | 255MB | 68MB |
| CPU Latency | 9.2ms | 2.7ms |
| Throughput | 110/sec | 380/sec |
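The quantized artifact can be reproduced with Optimum's dynamic quantization (a sketch assuming the default AVX2 configuration; output file names may differ from the repo's):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the fine-tuned PyTorch model to ONNX
model = ORTModelForSequenceClassification.from_pretrained("./spam_model", export=True)
model.save_pretrained("./spam_model_quantized")

# Apply dynamic (weight-only) INT8 quantization targeting AVX2 CPUs
quantizer = ORTQuantizer.from_pretrained("./spam_model_quantized")
qconfig = AutoQuantizationConfig.avx2(is_static=False, per_channel=False)
quantizer.quantize(save_dir="./spam_model_quantized", quantization_config=qconfig)
```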
---
## πŸ“ Repository Structure
```
.
β”œβ”€β”€ spam_model/               # Original PyTorch model
β”‚   β”œβ”€β”€ config.json
β”‚   β”œβ”€β”€ model.safetensors
β”‚   └── tokenizer.json
β”œβ”€β”€ spam_model_quantized/     # Production-ready quantized model
β”‚   β”œβ”€β”€ model.onnx
β”‚   β”œβ”€β”€ quantized_model.onnx
β”‚   └── tokenizer_config.json
β”œβ”€β”€ examples/                 # Ready-to-use scripts
β”‚   β”œβ”€β”€ predict.py            # CLI interface
β”‚   └── api_server.py         # FastAPI service
β”œβ”€β”€ requirements.txt          # Dependencies
└── README.md                 # This document
```
---
## πŸš€ Deployment Options
### 1. Local REST API
```bash
uvicorn examples.api_server:app --port 8000
```
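The repo ships `examples/api_server.py`; a minimal equivalent looks like this (the route and field names here are illustrative, not necessarily the actual script's):

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
classifier = pipeline(
    "text-classification",
    model="./spam_model_quantized",
    tokenizer="./spam_model",
)

class Message(BaseModel):
    text: str

@app.post("/predict")
def predict(msg: Message):
    result = classifier(msg.text)[0]
    return {"label": result["label"], "score": result["score"]}
```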
### 2. Docker Container
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8000
CMD ["uvicorn", "examples.api_server:app", "--host", "0.0.0.0", "--port", "8000"]
```
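Build and run (the image tag is illustrative):

```bash
docker build -t sms-spam-detector .
docker run -p 8000:8000 sms-spam-detector
```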
---
## ⚠️ Limitations
- Optimized for **English** SMS messages
- May require **retraining** for regional languages or localized spam patterns
- Quantized model requires **x86 CPUs with AVX2** support
---
## πŸ™Œ Contributions
Pull requests and suggestions are welcome! Please open an issue for feature requests or bug reports.