# EndConvo-health-deberta-v2

## Model Description

EndConvo-health-deberta-v2 is a binary text classifier fine-tuned from the DeBERTa-V2 architecture. It determines whether a conversation in a health-related chatbot has reached its endpoint or should continue. By reliably detecting conversation closure, the model helps chatbot systems end sessions promptly instead of prolonging exchanges, which matters in healthcare applications where accurate and timely responses are crucial.
## Intended Use

- Primary Use Case: End-of-conversation detection in health-related chatbot systems.
- Scope of Application: Healthcare dialogues, customer-support automation, or any domain requiring conversational flow control.
- Limitations:
  - Reduced recall for the "True" (conversation-ending) class, which could affect performance in ambiguous scenarios.
  - Efficient inference on large-scale data requires GPU support.
## Training Data

- Structure: Binary classification dataset with labels `0` for "Continue conversation" and `1` for "End conversation".
- Size: 4,000 training samples and 1,000 validation samples.
- Source: Annotated conversational data designed for healthcare-related use cases.
- Preprocessing (see the sketch after this list):
  - Tokenization using the DeBERTa tokenizer.
  - Maximum sequence length of 256 tokens.
  - Truncation applied to longer conversations.
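A minimal sketch of this preprocessing, assuming the tokenizer of the base checkpoint; the original preprocessing script is not published, and the example record is hypothetical:

```python
from transformers import AutoTokenizer

# Tokenizer of the DeBERTa-V2 base model; requires the sentencepiece package
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")

def preprocess(example: dict) -> dict:
    # Tokenize and truncate to the 256-token limit used during training
    return tokenizer(example["text"], truncation=True, max_length=256)

# Hypothetical record; the annotated dataset itself is not public
encoded = preprocess({"text": "Thank you, that answers my question."})
print(encoded["input_ids"][:10])
```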
## Model Details

- Base Model: DeBERTa-V2 (microsoft/deberta-v2-xlarge)
- Training Framework: Hugging Face Transformers
- Optimizer: AdamW with weight decay
- Loss Function: Cross-entropy loss
- Batch Size: 16
- Epochs: 3
- Learning Rate: 5e-5
- Evaluation Metrics: Accuracy, Precision, Recall, F1-score
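A minimal training sketch using the Hugging Face `Trainer` with the hyperparameters above. The two-example dataset is a stand-in (the real data is not public), and the weight-decay value is assumed, since the card only says "AdamW with weight decay":

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")

# Two-example stand-in for the 4,000/1,000-sample annotated dataset;
# requires the `datasets` package (not listed in the dependencies below)
data = Dataset.from_dict({
    "text": [
        "Is this medicine safe to take daily?",
        "Thanks, that answers everything.",
    ],
    "label": [0, 1],  # 0 = continue conversation, 1 = end conversation
}).map(lambda ex: tokenizer(
    ex["text"], truncation=True, padding="max_length", max_length=256))

# Two-class head; cross-entropy is the default loss for this model class
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v2-xlarge", num_labels=2)

args = TrainingArguments(
    output_dir="endconvo-health-deberta-v2",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,  # assumed value
    eval_strategy="epoch",
)

Trainer(model=model, args=args,
        train_dataset=data, eval_dataset=data).train()
```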
## Evaluation Metrics

- Overall Accuracy: 86.6%
- Precision (positive "End" class): 86.7%
- Recall (positive "End" class): 58.0%
- F1-Score (positive "End" class): 69.5%
- Validation Loss: 0.3729
### Confusion Matrix

Values are percentages of the evaluation set:

- True Negatives (TN): 71.29%
- False Positives (FP): 2.35%
- False Negatives (FN): 11.06%
- True Positives (TP): 15.29%
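The headline metrics above follow directly from these confusion-matrix proportions; a quick check in Python:

```python
# Confusion-matrix cells as fractions of the evaluation set (from the card)
tn, fp, fn, tp = 0.7129, 0.0235, 0.1106, 0.1529

accuracy = tn + tp                  # 0.866 -> 86.6%
precision = tp / (tp + fp)          # 0.867 -> 86.7%
recall = tp / (tp + fn)             # 0.580 -> 58.0%
f1 = 2 * precision * recall / (precision + recall)  # 0.695 -> 69.5%

print(f"acc={accuracy:.3f} p={precision:.3f} r={recall:.3f} f1={f1:.3f}")
```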
### Detailed Report

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| False (Continue) | 0.87 | 0.97 | 0.91 | 313 |
| True (End) | 0.87 | 0.58 | 0.70 | 112 |
| Macro Average | 0.87 | 0.77 | 0.80 | - |
| Weighted Average | 0.87 | 0.87 | 0.86 | - |
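A report in this format can be produced with scikit-learn (not listed in the dependencies below); a sketch, where `y_true` and `y_pred` are hypothetical stand-ins for the validation labels and model predictions:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical label arrays; in practice these come from running the
# model over the validation split
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print(confusion_matrix(y_true, y_pred))
print(classification_report(
    y_true, y_pred,
    target_names=["False (Continue)", "True (End)"],
    digits=2,
))
```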
## Pipeline and Usage

- Task Type: Text classification for conversation flow.
- Pipeline: Predicts whether a conversation should continue or end.

### Example Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model.eval()

# Example text input
text = "Thank you for your help. I don't have any more questions."

# Tokenize the input, truncating to the 256-token training limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    outputs = model(**inputs)

# Prediction: 1 = End conversation, 0 = Continue conversation
prediction = outputs.logits.argmax(dim=-1).item()
print("Prediction:", "End" if prediction == 1 else "Continue")
```
## Performance Insights

Strengths:

- High accuracy and precision indicate the model identifies nearly all "Continue" conversations (0.97 recall for that class) and rarely cuts a conversation short with a false "End" prediction.

Limitations:

- Lower recall for "End" conversations suggests the need for additional data augmentation or fine-tuning to improve sensitivity; adjusting the decision threshold, sketched below, is a lighter-weight option.
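One way to trade precision for recall without retraining is to lower the probability threshold for the "End" class instead of taking the argmax. A sketch; the 0.3 threshold is an illustrative value, not one tuned on this model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model.eval()

def predicts_end(text: str, threshold: float = 0.3) -> bool:
    """Flag a conversation as ended when P(End) exceeds `threshold`.

    The default argmax decision corresponds to a 0.5 threshold; lowering
    it raises recall for the "End" class at the cost of precision.
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)
    return probs[0, 1].item() >= threshold

print(predicts_end("Okay, thanks. Bye!"))
```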
## Environment and Dependencies

- Framework: Hugging Face Transformers (v4.46.3)
- Python Version: 3.8+
- Dependencies:
  - torch
  - transformers
  - safetensors
  - numpy
### Conda Environment Configuration

```yaml
name: huggingface-env
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - torch==2.4.1
      - transformers==4.46.3
      - safetensors
```
## Model Limitations

- The model exhibits reduced recall for the "End conversation" class, which could impact its utility in edge cases.
- It requires labeled data for fine-tuning in other domains or applications.