
📩 DistilBERT Quantized Model for SMS Spam Detection

This repository contains a production-ready quantized DistilBERT model fine-tuned for SMS spam classification. It reaches 99.94% accuracy while using ONNX Runtime dynamic quantization to cut CPU inference latency.


📌 Model Details

  • Model Architecture: DistilBERT Base Uncased
  • Task: Binary SMS Spam Classification (ham=0, spam=1)
  • Dataset: Custom SMS Spam Collection (5,574 messages)
  • Quantization: ONNX Runtime Dynamic Quantization
  • Fine-tuning Framework: Hugging Face Transformers + Optimum

🚀 Quick Start

🧰 Installation

pip install -r requirements.txt

✅ Basic Usage

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

# Load the quantized ONNX graph and the tokenizer saved with the original model
model = ORTModelForSequenceClassification.from_pretrained(
    "./spam_model_quantized", file_name="quantized_model.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("./spam_model")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

sample = "WINNER!! Claim your $1000 prize now!"
result = classifier(sample)
print(f"Prediction: {result[0]['label']} (confidence: {result[0]['score']:.2%})")

📈 Performance Metrics

Metric               Value
Accuracy             99.94%
F1 Score             0.9977
Precision            100%
Recall               99.55%
Inference latency*   2.7 ms

* Tested on AWS t3.xlarge (4 vCPUs)


🛠 Advanced Usage

πŸ” Load Quantized Model Directly

from optimum.onnxruntime import ORTModelForSequenceClassification

# file_name selects the quantized graph; the directory also contains the
# unquantized model.onnx export
model = ORTModelForSequenceClassification.from_pretrained(
    "./spam_model_quantized",
    file_name="quantized_model.onnx",
    provider="CPUExecutionProvider"
)
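
From here you can run inference without the pipeline wrapper, assuming the saved config carries the usual id2label mapping (the sample text is illustrative):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./spam_model")

# Tokenize, run the ONNX graph, and read off the higher-scoring class
inputs = tokenizer("Free entry! Reply WIN to claim your prize", return_tensors="pt")
logits = model(**inputs).logits
print(model.config.id2label[int(logits.argmax(dim=-1))])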

📊 Batch Processing

import pandas as pd

# Score every message in the CSV, feeding the pipeline in batches of 32
df = pd.read_csv("messages.csv")
predictions = classifier(list(df["text"]), batch_size=32)
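
To keep the scores alongside the messages, write the results back to the frame (the "text" column above and the output path here are assumptions about messages.csv):

df["label"] = [p["label"] for p in predictions]
df["score"] = [p["score"] for p in predictions]
df.to_csv("messages_scored.csv", index=False)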

🎯 Training Details

🔧 Hyperparameters

Parameter       Value
Epochs          5 (early stopped at 3)
Batch Size      12 (train), 16 (eval)
Learning Rate   3e-5
Warmup          10% of training steps
Weight Decay    0.01
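
A Trainer configuration consistent with this table might look as follows (output_dir and any argument not listed above are assumptions):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./spam_model",        # assumed to match the repository layout
    num_train_epochs=5,               # early stopping ended training at epoch 3
    per_device_train_batch_size=12,
    per_device_eval_batch_size=16,
    learning_rate=3e-5,
    warmup_ratio=0.1,                 # 10% of training steps
    weight_decay=0.01,
)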

⚡ Quantization Benefits

Metric        Original    Quantized
Model Size    255 MB      68 MB
CPU Latency   9.2 ms      2.7 ms
Throughput    110 msg/s   380 msg/s
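
A sketch of how such an artifact can be regenerated with Optimum's ONNX Runtime tooling. Paths match the repository layout, but the exact quantization configuration used for this model is an assumption (avx2 is chosen to match the AVX2 requirement noted under Limitations):

from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the fine-tuned PyTorch model to ONNX
onnx_model = ORTModelForSequenceClassification.from_pretrained("./spam_model", export=True)
onnx_model.save_pretrained("./spam_model_quantized")

# Apply dynamic (weights-only) INT8 quantization targeting AVX2 CPUs
quantizer = ORTQuantizer.from_pretrained("./spam_model_quantized")
qconfig = AutoQuantizationConfig.avx2(is_static=False)
quantizer.quantize(save_dir="./spam_model_quantized", quantization_config=qconfig)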

πŸ“ Repository Structure

.
├── spam_model/               # Original PyTorch model
│   ├── config.json
│   ├── model.safetensors
│   └── tokenizer.json
├── spam_model_quantized/     # Production-ready quantized model
│   ├── model.onnx
│   ├── quantized_model.onnx
│   └── tokenizer_config.json
├── examples/                 # Ready-to-use scripts
│   ├── predict.py            # CLI interface
│   └── api_server.py         # FastAPI service
├── requirements.txt          # Dependencies
└── README.md                 # This document

🚀 Deployment Options

1. Local REST API

uvicorn examples.api_server:app --port 8000
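
The shipped examples/api_server.py backs this command; a minimal equivalent looks roughly like the following (the endpoint path and request schema are illustrative, not necessarily what the script actually uses):

from fastapi import FastAPI
from pydantic import BaseModel
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

app = FastAPI()

# Load the quantized model once at startup
model = ORTModelForSequenceClassification.from_pretrained(
    "spam_model_quantized", file_name="quantized_model.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("spam_model")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

class Message(BaseModel):
    text: str

@app.post("/predict")
def predict(message: Message):
    result = classifier(message.text)[0]
    return {"label": result["label"], "score": result["score"]}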

2. Docker Container

FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8000
CMD ["uvicorn", "examples.api_server:app", "--host", "0.0.0.0", "--port", "8000"]
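
Build and run, mapping the container's port 8000 to the host (the image tag is arbitrary):

docker build -t sms-spam-detector .
docker run -p 8000:8000 sms-spam-detector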

⚠️ Limitations

  • Optimized for English SMS messages
  • May require retraining for regional language or localized spam patterns
  • Quantized model requires x86 CPUs with AVX2 support

🙌 Contributions

Pull requests and suggestions are welcome! Please open an issue for feature requests or bug reports.