---
license: mit
language:
- en
tags:
- intent-classification
- mental-health
- transformer
- conversational-ai
pipeline_tag: text-classification
base_model: distilbert-base-uncased
---

# 🧠 Intent Classifier (MindPadi)

The `intent_classifier` is a transformer-based text classification model trained to detect **user intents** in a mental health support setting. It lets the MindPadi assistant route each conversation to the appropriate module (such as emotional support, scheduling, reflection, or journal analysis) based on the user’s message.

## 📝 Model Overview

- **Model Architecture:** DistilBERT (uncased) + classification head
- **Task:** Intent Classification
- **Classes:** Over 20 intent categories (e.g., `vent`, `gratitude`, `help_request`, `journal_analysis`)
- **Model Size:** ~66M parameters
- **Files:**
  - `config.json`
  - `pytorch_model.bin` or `model.safetensors`
  - `tokenizer_config.json`, `vocab.txt`, `tokenizer.json`
  - `checkpoint-*/` (optional training checkpoints)

## ✅ Intended Use

### ✔️ Use Cases

- Detecting user intent in MindPadi mental health conversations
- Enabling context-specific dialogue flows
- Assisting with journal entry triage and tagging
- Triggering therapy-related tools (e.g., emotion check-ins, PubMed summaries)

### 🚫 Not Intended For

- Multilingual intent classification (English-only)
- Legal or medical diagnosis tasks
- Multi-label classification (currently single-label per input)

## 💡 Example Intents Detected

| Intent             | Description                                   |
|--------------------|-----------------------------------------------|
| `vent`             | User expressing frustration or emotion freely |
| `help_request`     | Seeking mental health support                 |
| `schedule_session` | Booking a therapy check-in                    |
| `gratitude`        | Showing appreciation for support              |
| `journal_analysis` | Submitting a journal entry for AI feedback    |
| `reflection`       | Talking about personal growth or setbacks     |
| `not_sure`         | Unsure or unclear message from user           |

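As a rough illustration of how predicted intents like those in the table could drive routing, the sketch below uses a plain dispatch table. The handler functions, the module names they return, and the `route` helper are all hypothetical; this card does not show the actual contents of `app/chatbot/intent_router.py`.

```python
from typing import Callable, Dict

# Hypothetical handlers: each returns the name of the module that would take over.
def handle_support(message: str) -> str:
    return "emotional_support"

def handle_schedule(message: str) -> str:
    return "scheduling"

def handle_journal(message: str) -> str:
    return "journal_analysis"

def handle_unclear(message: str) -> str:
    return "clarification_prompt"

# Intent label -> handler. Only a few of the 20+ intents are listed here.
ROUTES: Dict[str, Callable[[str], str]] = {
    "vent": handle_support,
    "help_request": handle_support,
    "schedule_session": handle_schedule,
    "journal_analysis": handle_journal,
    "not_sure": handle_unclear,
}

def route(intent: str, message: str) -> str:
    """Dispatch to the handler for `intent`; unknown labels go to the unclear handler."""
    handler = ROUTES.get(intent, handle_unclear)
    return handler(message)

print(route("schedule_session", "Can I book a session?"))
```

Defaulting unknown labels to a clarification step keeps the router safe if the label set changes between model versions.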
## 🛠️ Training Details

- **Base Model:** `distilbert-base-uncased`
- **Dataset:** Curated and annotated conversations (`training/datasets/finetuned/intents/`)
- **Script:** `training/train_intent_classifier.py`
- **Preprocessing:**
  - Text normalization (lowercasing, punctuation removal)
  - Label encoding
- **Loss:** CrossEntropyLoss
- **Metrics:** Accuracy, F1-score
- **Tokenizer:** WordPiece (DistilBERT tokenizer)

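The preprocessing steps listed above (lowercasing, punctuation removal, label encoding) might look roughly like the following. This is a standard-library sketch, not the contents of `training/train_intent_classifier.py`; the `normalize` and `fit_label_encoder` helpers and the two sample rows are made up for illustration.

```python
import string

def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse runs of whitespace."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())

def fit_label_encoder(labels):
    """Assign each distinct label an integer ID, in sorted order."""
    classes = sorted(set(labels))
    return {label: idx for idx, label in enumerate(classes)}

# Made-up samples, only to show the shape of the data.
samples = [
    ("I can't cope today!!", "vent"),
    ("Thanks, that helped.", "gratitude"),
]

label_to_id = fit_label_encoder(label for _, label in samples)
encoded = [(normalize(text), label_to_id[label]) for text, label in samples]
print(encoded)
```

Note that `string.punctuation` covers ASCII characters only, so typographic quotes such as `’` would pass through unchanged unless handled separately.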
## 📊 Evaluation

| Metric    | Score |
|-----------|-------|
| Accuracy  | 91.3% |
| F1-score  | 89.8% |
| Recall@3  | 97.1% |
| Precision | 88.4% |

Evaluation was performed on a held-out validation split of the MindPadi intent dataset.

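For readers who want to reproduce metrics of this kind on their own data, here is a small standard-library sketch. It rests on two assumptions the card does not confirm: that Recall@3 means the true intent appears among the three highest-scoring classes, and that F1 is macro-averaged. The toy labels and scores are invented.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples where the predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall_at_k(y_true, scores, k=3):
    """Fraction of examples whose true class is among the k highest-scoring classes."""
    hits = 0
    for true_id, row in zip(y_true, scores):
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += true_id in top_k
    return hits / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in labels or predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy example: 4 inputs, 4 classes, one row of class scores per input.
y_true = [0, 1, 2, 1]
scores = [
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.30, 0.05, 0.25, 0.40],
]
y_pred = [max(range(len(row)), key=row.__getitem__) for row in scores]

print(accuracy(y_true, y_pred), recall_at_k(y_true, scores), macro_f1(y_true, y_pred))
```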
## 🔍 Example Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("mindpadi/intent_classifier")
tokenizer = AutoTokenizer.from_pretrained("mindpadi/intent_classifier")

text = "I’m struggling with my emotions today"
# Truncate long inputs; see Limitations for the 256-token guidance
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)

predicted_class = torch.argmax(outputs.logits, dim=1).item()
print("Predicted intent ID:", predicted_class)
```

To map the predicted intent ID to its label, load the label encoder:

```python
from joblib import load

# Continues from the snippet above, which defines predicted_class
label_encoder = load("intent_encoder/label_encoder.joblib")
print("Predicted intent:", label_encoder.inverse_transform([predicted_class])[0])
```

## 🔌 Inference Endpoint Example

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/mindpadi/intent_classifier"
API_TOKEN = "<your-api-token>"  # replace with your Hugging Face access token
headers = {"Authorization": f"Bearer {API_TOKEN}"}
payload = {"inputs": "Can I book a mental health session?"}

response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()  # raise on HTTP errors instead of printing an error body
print(response.json())
```

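The JSON returned by the endpoint still has to be reduced to a single intent. The helper below is a sketch under the assumption that a text-classification response is a list of `{"label", "score"}` objects, possibly wrapped in one extra list, and that error responses arrive as a JSON object. `top_intent` and the sample payload are hypothetical, and the labels may come back as generic IDs if the model config does not define label names.

```python
def top_intent(response_json):
    """Return (label, score) for the highest-scoring entry, or None if none is usable."""
    if isinstance(response_json, dict):
        # Assumed error shape, e.g. {"error": "..."} while the model is loading
        return None
    # Unwrap one level of nesting if present: [[{...}, {...}]] -> [{...}, {...}]
    if response_json and isinstance(response_json[0], list):
        candidates = response_json[0]
    else:
        candidates = response_json
    if not candidates:
        return None
    best = max(candidates, key=lambda item: item["score"])
    return best["label"], best["score"]

# Made-up payload in the assumed response shape.
sample = [[
    {"label": "schedule_session", "score": 0.93},
    {"label": "help_request", "score": 0.05},
]]
print(top_intent(sample))
```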
## ⚠️ Limitations

* Not robust to long-form texts (>256 tokens); truncate or summarize input.
* May confuse overlapping intents like `vent` and `help_request`.
* False positives are possible on vague or sarcastic inputs.
* Requires pairing with a fallback model (`intent_fallback`) for reliability.

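One common way to pair a classifier with a fallback is to hand off whenever the top softmax probability is low. The sketch below illustrates that pattern only; the threshold value, the `classify_with_fallback` helper, and the stand-in fallback are assumptions, and this card does not describe how `intent_fallback` is actually invoked.

```python
import math
from typing import Callable, Dict, List

CONFIDENCE_THRESHOLD = 0.6  # hypothetical value; tune on validation data

def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_fallback(
    logits: List[float],
    id_to_label: Dict[int, str],
    fallback: Callable[[], str],
) -> str:
    """Return the top label, or the fallback's answer when confidence is below the threshold."""
    probs = softmax(logits)
    best_id = max(range(len(probs)), key=probs.__getitem__)
    if probs[best_id] < CONFIDENCE_THRESHOLD:
        return fallback()
    return id_to_label[best_id]

id_to_label = {0: "vent", 1: "help_request", 2: "gratitude"}
stand_in_fallback = lambda: "not_sure"  # placeholder for a real fallback model call

print(classify_with_fallback([4.0, 0.0, 0.0], id_to_label, stand_in_fallback))
print(classify_with_fallback([0.1, 0.0, 0.05], id_to_label, stand_in_fallback))
```

With a real model, `logits` would be `outputs.logits[0].tolist()` from the usage example.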
## 🔐 Ethical Considerations

* This model is for **supportive routing**, not clinical diagnosis.
* Use with user consent and proper data privacy safeguards.
* Intent predictions should not override human judgment in sensitive contexts.

## 📂 Integration Points

| Location                           | Functionality                                 |
| ---------------------------------- | --------------------------------------------- |
| `app/chatbot/intent_classifier.py` | Main classifier logic                         |
| `app/chatbot/intent_router.py`     | Routes based on predicted intent              |
| `app/utils/embedding_search.py`    | Uses `intent_encoder` for similarity fallback |
| `data/processed_intents.json`      | Annotated intent samples                      |

## 📜 License

Released under the MIT License, which permits both commercial and non-commercial use.

## 📬 Contact

* **Team:** MindPadi AI Developers
* **Profile:** [https://huggingface.co/mindpadi](https://huggingface.co/mindpadi)
* **Email:** [email protected]

*Last updated: May 2025*