# 📩 DistilBERT Quantized Model for SMS Spam Detection

This repository contains a production-ready **quantized DistilBERT** model fine-tuned for **SMS spam classification**, achieving **99.94% accuracy** while cutting inference latency with ONNX Runtime dynamic quantization.

---

## 📌 Model Details

- **Model Architecture:** DistilBERT Base Uncased
- **Task:** Binary SMS Spam Classification (`ham=0`, `spam=1`)
- **Dataset:** Custom SMS Spam Collection (5,574 messages)
- **Quantization:** ONNX Runtime Dynamic Quantization
- **Fine-tuning Framework:** Hugging Face Transformers + Optimum

---

## 🚀 Quick Start

### 🧰 Installation

```bash
pip install -r requirements.txt
```

### ✅ Basic Usage

```python
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

# The quantized ONNX weights must be loaded through Optimum;
# plain transformers cannot load an ONNX model directory.
model = ORTModelForSequenceClassification.from_pretrained("./spam_model_quantized")
tokenizer = AutoTokenizer.from_pretrained("./spam_model")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

sample = "WINNER!! Claim your $1000 prize now!"
result = classifier(sample)
print(f"Prediction: {result[0]['label']} (confidence: {result[0]['score']:.2%})")
```

---

## 📈 Performance Metrics

| Metric     | Value  |
|------------|--------|
| Accuracy   | 99.94% |
| F1 Score   | 0.9977 |
| Precision  | 100%   |
| Recall     | 99.55% |
| Inference* | 2.7ms  |

> \* Tested on AWS `t3.xlarge` (4 vCPUs)

---

## 🛠 Advanced Usage

### 🔍 Load Quantized Model Directly

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

model = ORTModelForSequenceClassification.from_pretrained(
    "./spam_model_quantized",
    provider="CPUExecutionProvider"
)
```

### 📊 Batch Processing

```python
import pandas as pd

df = pd.read_csv("messages.csv")
predictions = classifier(df["text"].astype(str).tolist(), batch_size=32)
```

---

## 🎯 Training Details

### 🔧 Hyperparameters

| Parameter     | Value                  |
|---------------|------------------------|
| Epochs        | 5 (early stopped at 3) |
| Batch Size    | 12 (train), 16 (eval)  |
| Learning Rate | 3e-5                   |
| Warmup Steps  | 10% of training steps  |
| Weight Decay  | 0.01                   |

### ⚡ Quantization Benefits

| Metric      | Original | Quantized |
|-------------|----------|-----------|
| Model Size  | 255MB    | 68MB      |
| CPU Latency | 9.2ms    | 2.7ms     |
| Throughput  | 110/sec  | 380/sec   |

---

## 📁 Repository Structure

```
.
├── spam_model/               # Original PyTorch model
│   ├── config.json
│   ├── model.safetensors
│   └── tokenizer.json
├── spam_model_quantized/     # Production-ready quantized model
│   ├── model.onnx
│   ├── quantized_model.onnx
│   └── tokenizer_config.json
├── examples/                 # Ready-to-use scripts
│   ├── predict.py            # CLI interface
│   └── api_server.py         # FastAPI service
├── requirements.txt          # Dependencies
└── README.md                 # This document
```

---

## 🚀 Deployment Options

### 1. Local REST API

```bash
uvicorn examples.api_server:app --port 8000
```

### 2. Docker Container

```dockerfile
FROM python:3.9-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 8000
CMD ["uvicorn", "examples.api_server:app", "--host", "0.0.0.0"]
```

---

## ⚠️ Limitations

- Optimized for **English** SMS messages
- May require **retraining** for regional languages or localized spam patterns
- Quantized model requires **x86 CPUs with AVX2** support

---

## 🙌 Contributions

Pull requests and suggestions are welcome! Please open an issue for feature requests or bug reports.
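
---

## ⏱ Reproducing the Benchmarks

The latency and throughput figures reported above can be checked with a small timing harness like the one below. This is a sketch, not the script used for the published numbers: the `benchmark` helper is hypothetical, and the commented usage assumes the `classifier` object from the Basic Usage snippet.

```python
import time

def benchmark(fn, warmup=5, iters=100):
    """Return (mean latency in ms, throughput in calls/sec) for fn()."""
    for _ in range(warmup):      # warm up caches and lazy initialization
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1000, iters / elapsed

# Usage (assumes `classifier` from the Basic Usage snippet):
# latency_ms, throughput = benchmark(lambda: classifier("Free entry in a weekly comp!"))
# print(f"{latency_ms:.1f} ms/msg, {throughput:.0f} msg/sec")
```

Results will vary with hardware; the table above was measured on an AWS `t3.xlarge`, so expect different absolute numbers elsewhere.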