---
license: mit
language:
- en
tags:
- intent-classification
- mental-health
- transformer
- conversational-ai
pipeline_tag: text-classification
base_model: distilbert-base-uncased
---
# 🧠 Intent Classifier (MindPadi)
The `intent_classifier` is a transformer-based text classification model trained to detect **user intents** in a mental health support setting. It lets the MindPadi assistant route each conversation to the appropriate module (such as emotional support, scheduling, reflection, or journal analysis) based on the user's message.
## 📝 Model Overview
- **Model Architecture:** DistilBERT (uncased) + classification head
- **Task:** Intent Classification
- **Classes:** Over 20 intent categories (e.g., `vent`, `gratitude`, `help_request`, `journal_analysis`)
- **Model Size:** ~66M parameters
- **Files:**
  - `config.json`
  - `pytorch_model.bin` or `model.safetensors`
  - `tokenizer_config.json`, `vocab.txt`, `tokenizer.json`
  - `checkpoint-*/` (optional training checkpoints)
## ✅ Intended Use
### ✔️ Use Cases
- Detecting user intent in MindPadi mental health conversations
- Enabling context-specific dialogue flows
- Assisting with journal entry triage and tagging
- Triggering therapy-related tools (e.g., emotion check-ins, PubMed summaries)
### 🚫 Not Intended For
- Multilingual intent classification (English-only)
- Legal or medical diagnosis tasks
- Multi-label classification (currently single-label per input)
## 💡 Example Intents Detected
| Intent | Description |
|--------------------|-------------------------------------------------------|
| `vent` | User expressing frustration or emotion freely |
| `help_request` | Seeking mental health support |
| `schedule_session` | Booking a therapy check-in |
| `gratitude` | Showing appreciation for support |
| `journal_analysis` | Submitting a journal entry for AI feedback |
| `reflection` | Talking about personal growth or setbacks |
| `not_sure` | Unsure or unclear message from user |
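
As a rough illustration of how a predicted intent label might drive routing, the sketch below dispatches on the label with a dictionary and falls back to a clarification handler for `not_sure` or unrecognized labels. The handler functions and the module names they return are hypothetical placeholders, not MindPadi's actual modules.

```python
from typing import Callable, Dict

# Hypothetical handlers: stand-ins for the real MindPadi modules.
def handle_support(message: str) -> str:
    return "emotional_support"

def handle_schedule(message: str) -> str:
    return "scheduling"

def handle_journal(message: str) -> str:
    return "journal_analysis"

def handle_unclear(message: str) -> str:
    return "clarification"

# Map each intent label to the handler that should receive the message.
ROUTES: Dict[str, Callable[[str], str]] = {
    "vent": handle_support,
    "help_request": handle_support,
    "schedule_session": handle_schedule,
    "journal_analysis": handle_journal,
    "not_sure": handle_unclear,
}

def route(intent: str, message: str) -> str:
    """Return the name of the module chosen for this intent."""
    # Unknown labels are treated like `not_sure` rather than raising.
    handler = ROUTES.get(intent, handle_unclear)
    return handler(message)

print(route("schedule_session", "Can I book a session?"))
```

Defaulting unknown labels to the clarification handler keeps the router from failing if the label set changes between model versions.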
## 🛠️ Training Details
- **Base Model:** `distilbert-base-uncased`
- **Dataset:** Curated and annotated conversations (`training/datasets/finetuned/intents/`)
- **Script:** `training/train_intent_classifier.py`
- **Preprocessing:**
  - Text normalization (lowercasing, punctuation removal)
  - Label encoding
- **Loss:** CrossEntropyLoss
- **Metrics:** Accuracy, F1-score
- **Tokenizer:** WordPiece (DistilBERT tokenizer)
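
The preprocessing steps listed above could look roughly like the following. This is a minimal sketch using only the standard library, not the contents of `training/train_intent_classifier.py`: the function names are illustrative, only ASCII punctuation is removed, and labels are numbered in sorted order as an assumption about how the encoder was built.

```python
import string
from typing import Dict, List

# Translation table that deletes ASCII punctuation characters.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    cleaned = text.lower().translate(_PUNCT_TABLE)
    return " ".join(cleaned.split())

def build_label_map(labels: List[str]) -> Dict[str, int]:
    """Assign integer IDs to the distinct labels in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}

# Toy samples (made up for illustration).
samples = [
    ("I need HELP, please!", "help_request"),
    ("Thanks a lot!!", "gratitude"),
]
label_map = build_label_map([label for _, label in samples])
encoded = [(normalize(text), label_map[label]) for text, label in samples]
print(encoded)
```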
## 📊 Evaluation
| Metric | Score |
|-----------|-------------|
| Accuracy | 91.3% |
| F1-score | 89.8% |
| Recall@3 | 97.1% |
| Precision | 88.4% |
Evaluation was performed on a held-out validation split of the MindPadi intent dataset.
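
The card does not define Recall@3. One common reading, assumed here, is top-3 accuracy: the fraction of examples whose true intent is among the three highest-scoring classes. The sketch below computes accuracy and this top-k measure on made-up scores; it is not the evaluation script used for the numbers above.

```python
from typing import List, Sequence

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of examples where the predicted ID equals the true ID."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall_at_k(y_true: Sequence[int], scores: Sequence[Sequence[float]], k: int = 3) -> float:
    """Fraction of examples whose true ID is among the k highest scores."""
    hits = 0
    for true_id, row in zip(y_true, scores):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += true_id in ranked[:k]
    return hits / len(y_true)

# Toy scores for 3 examples over 4 classes (made-up numbers).
scores: List[List[float]] = [
    [0.10, 0.60, 0.20, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.30, 0.50],
]
y_true = [1, 3, 2]
# Top-1 prediction is the index of the highest score in each row.
y_pred = [max(range(len(row)), key=lambda i: row[i]) for row in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("recall@3:", recall_at_k(y_true, scores, k=3))
```

With `k=1` the top-k measure reduces to plain accuracy, which is a quick sanity check on any implementation.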
## 🔍 Example Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("mindpadi/intent_classifier")
tokenizer = AutoTokenizer.from_pretrained("mindpadi/intent_classifier")

text = "I’m struggling with my emotions today"
# Truncate long inputs; the model is not robust beyond 256 tokens.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)

# The index of the largest logit is the predicted intent ID.
predicted_class = torch.argmax(outputs.logits, dim=1).item()
print("Predicted intent ID:", predicted_class)
```
To map the predicted intent ID to its label, load the label encoder:
```python
from joblib import load

label_encoder = load("intent_encoder/label_encoder.joblib")
# `predicted_class` is the integer ID computed in the previous snippet.
print("Predicted intent:", label_encoder.inverse_transform([predicted_class])[0])
```
## 🔌 Inference Endpoint Example
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/mindpadi/intent_classifier"
headers = {"Authorization": "Bearer <your-api-token>"}  # replace with your token
payload = {"inputs": "Can I book a mental health session?"}

response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()  # raise on HTTP errors instead of printing an error body
print(response.json())
```
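
To pick the top intent out of the JSON returned by the request above, something like the following may help. It assumes the response is a list of `{"label", "score"}` dictionaries, possibly wrapped in one extra list per input, and that failures arrive as a dictionary with an `"error"` key; check the actual response of your endpoint before relying on this shape. The `top_prediction` helper and the example payload are illustrative only.

```python
from typing import Any, Dict, List

def top_prediction(result: Any) -> Dict[str, Any]:
    """Return the highest-scoring {"label", "score"} entry from a response."""
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(f"Inference API error: {result['error']}")
    # Unwrap one level of nesting if the API returned one list per input.
    if result and isinstance(result[0], list):
        candidates: List[Dict[str, Any]] = result[0]
    else:
        candidates = result
    return max(candidates, key=lambda item: item["score"])

# Made-up response body, shaped as assumed above.
example = [[
    {"label": "schedule_session", "score": 0.93},
    {"label": "help_request", "score": 0.05},
]]
best = top_prediction(example)
print(best["label"], best["score"])
```

The label strings in a real response depend on the `id2label` mapping stored in `config.json`, so they may be generic names rather than intent names if that mapping was not set.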
## ⚠️ Limitations
* Not robust to long-form texts (>256 tokens); truncate or summarize input.
* May confuse overlapping intents like `vent` and `help_request`
* False positives possible in vague or sarcastic inputs
* Requires pairing with a fallback model (`intent_fallback`) for reliability
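
One way to pair the classifier with a fallback is to defer whenever the top softmax probability is low. The sketch below shows that pattern on raw logits; the `0.6` threshold, the `classify_with_fallback` helper, and the stand-in fallback function are assumptions for illustration and do not describe how `intent_fallback` is actually invoked.

```python
import math
from typing import Callable, List, Tuple

def softmax(logits: List[float]) -> List[float]:
    """Convert logits to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_fallback(
    text: str,
    logits: List[float],
    labels: List[str],
    fallback: Callable[[str], str],
    threshold: float = 0.6,  # hypothetical cut-off; tune on validation data
) -> Tuple[str, float, bool]:
    """Return (label, top probability, whether the fallback was used)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] >= threshold:
        return labels[best], probs[best], False
    # Low confidence: hand the raw text to the fallback model.
    return fallback(text), probs[best], True

labels = ["vent", "help_request", "not_sure"]

def dummy_fallback(text: str) -> str:
    # Stand-in for a real fallback model.
    return "not_sure"

print(classify_with_fallback("I feel awful", [4.0, 0.5, 0.1], labels, dummy_fallback))
print(classify_with_fallback("hmm", [1.0, 0.9, 0.8], labels, dummy_fallback))
```

Closely scored pairs such as `vent` and `help_request` tend to produce low top probabilities, so a threshold like this sends exactly those ambiguous cases to the fallback.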
## 🔐 Ethical Considerations
* This model is for **supportive routing**, not clinical diagnosis
* Use with user consent and proper data privacy safeguards
* Intent predictions should not override human judgment in sensitive contexts
## 📂 Integration Points
| Location | Functionality |
| ---------------------------------- | --------------------------------------------- |
| `app/chatbot/intent_classifier.py` | Main classifier logic |
| `app/chatbot/intent_router.py` | Routes based on predicted intent |
| `app/utils/embedding_search.py` | Uses `intent_encoder` for similarity fallback |
| `data/processed_intents.json` | Annotated intent samples |
## 📜 License
MIT License – freely available for commercial and non-commercial use.
## 📬 Contact
* **Team:** MindPadi AI Developers
* **Profile:** [https://huggingface.co/mindpadi](https://huggingface.co/mindpadi)
* **Email:** [email protected]
*Last updated: May 2025*