# EndConvo-health-deberta-v2

## Model Description

EndConvo-health-deberta-v2 is a binary text classifier fine-tuned from the DeBERTa-V2 architecture. It determines whether a conversation in a health-related chatbot has reached its endpoint or should continue. By reliably detecting conversation closure, the model helps chatbot systems end sessions promptly instead of prolonging exchanges, which matters in healthcare applications where accurate and timely responses are crucial.
## Intended Use

- Primary Use Case: End-of-conversation detection in health-related chatbot systems.
- Scope of Application: Healthcare dialogues, customer-support automation, or any domain requiring conversational flow control.
- Limitations:
  - Reduced recall for the "True" (conversation-ending) class, which could affect performance in ambiguous scenarios.
  - Efficient inference on large-scale data requires GPU support.
## Training Data

- Structure: Binary classification dataset with labels `0` for "Continue conversation" and `1` for "End conversation".
- Size: 4,000 training samples and 1,000 validation samples.
- Source: Annotated conversational data designed for healthcare-related use cases.
- Preprocessing (see the sketch after this list):
  - Tokenization using the DeBERTa tokenizer.
  - Maximum sequence length of 256 tokens.
  - Truncation applied to longer conversations.
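A minimal sketch of this preprocessing, assuming the tokenizer of the base checkpoint; the original preprocessing script is not published, and the example record is hypothetical:

```python
from transformers import AutoTokenizer

# Tokenizer of the DeBERTa-V2 base model; requires the sentencepiece package
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")

def preprocess(example: dict) -> dict:
    # Tokenize and truncate to the 256-token limit used during training
    return tokenizer(example["text"], truncation=True, max_length=256)

# Hypothetical record; the annotated dataset itself is not public
encoded = preprocess({"text": "Thank you, that answers my question."})
print(encoded["input_ids"][:10])
```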
## Model Details

- Base Model: DeBERTa-V2 (microsoft/deberta-v2-xlarge)
- Training Framework: Hugging Face Transformers
- Optimizer: AdamW with weight decay
- Loss Function: Cross-entropy loss
- Batch Size: 16
- Epochs: 3
- Learning Rate: 5e-5
- Evaluation Metrics: Accuracy, Precision, Recall, F1-score
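A minimal training sketch using the Hugging Face `Trainer` with the hyperparameters above. The two-example dataset is a stand-in (the real data is not public), and the weight-decay value is assumed, since the card only says "AdamW with weight decay":

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")

# Two-example stand-in for the 4,000/1,000-sample annotated dataset;
# requires the `datasets` package (not listed in the dependencies below)
data = Dataset.from_dict({
    "text": [
        "Is this medicine safe to take daily?",
        "Thanks, that answers everything.",
    ],
    "label": [0, 1],  # 0 = continue conversation, 1 = end conversation
}).map(lambda ex: tokenizer(
    ex["text"], truncation=True, padding="max_length", max_length=256))

# Two-class head; cross-entropy is the default loss for this model class
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v2-xlarge", num_labels=2)

args = TrainingArguments(
    output_dir="endconvo-health-deberta-v2",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,  # assumed value
    eval_strategy="epoch",
)

Trainer(model=model, args=args,
        train_dataset=data, eval_dataset=data).train()
```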
## Evaluation Metrics

- Overall Accuracy: 86.6%
- Precision (positive "End" class): 86.7%
- Recall (positive "End" class): 58.0%
- F1-Score (positive "End" class): 69.5%
- Validation Loss: 0.3729
### Confusion Matrix

Values are percentages of the evaluation set:

- True Negatives (TN): 71.29%
- False Positives (FP): 2.35%
- False Negatives (FN): 11.06%
- True Positives (TP): 15.29%
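The headline metrics above follow directly from these confusion-matrix proportions; a quick check in Python:

```python
# Confusion-matrix cells as fractions of the evaluation set (from the card)
tn, fp, fn, tp = 0.7129, 0.0235, 0.1106, 0.1529

accuracy = tn + tp                  # 0.866 -> 86.6%
precision = tp / (tp + fp)          # 0.867 -> 86.7%
recall = tp / (tp + fn)             # 0.580 -> 58.0%
f1 = 2 * precision * recall / (precision + recall)  # 0.695 -> 69.5%

print(f"acc={accuracy:.3f} p={precision:.3f} r={recall:.3f} f1={f1:.3f}")
```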
### Detailed Report

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| False (Continue) | 0.87 | 0.97 | 0.91 | 313 |
| True (End) | 0.87 | 0.58 | 0.70 | 112 |
| Macro Average | 0.87 | 0.77 | 0.80 | - |
| Weighted Average | 0.87 | 0.87 | 0.86 | - |
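A report in this format can be produced with scikit-learn (not listed in the dependencies below); a sketch, where `y_true` and `y_pred` are hypothetical stand-ins for the validation labels and model predictions:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical label arrays; in practice these come from running the
# model over the validation split
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print(confusion_matrix(y_true, y_pred))
print(classification_report(
    y_true, y_pred,
    target_names=["False (Continue)", "True (End)"],
    digits=2,
))
```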
## Pipeline and Usage

- Task Type: Text classification for conversation flow.
- Pipeline: Predicts whether a conversation should continue or end.

### Example Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model.eval()

# Example text input
text = "Thank you for your help. I don't have any more questions."

# Tokenize the input, truncating to the 256-token training limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    outputs = model(**inputs)

# Prediction: 1 = End conversation, 0 = Continue conversation
prediction = outputs.logits.argmax(dim=-1).item()
print("Prediction:", "End" if prediction == 1 else "Continue")
```
## Performance Insights

Strengths:

- High accuracy and precision indicate the model identifies nearly all "Continue" conversations (0.97 recall for that class) and rarely cuts a conversation short with a false "End" prediction.

Limitations:

- Lower recall for "End" conversations suggests the need for additional data augmentation or fine-tuning to improve sensitivity; adjusting the decision threshold, sketched below, is a lighter-weight option.
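One way to trade precision for recall without retraining is to lower the probability threshold for the "End" class instead of taking the argmax. A sketch; the 0.3 threshold is an illustrative value, not one tuned on this model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model.eval()

def predicts_end(text: str, threshold: float = 0.3) -> bool:
    """Flag a conversation as ended when P(End) exceeds `threshold`.

    The default argmax decision corresponds to a 0.5 threshold; lowering
    it raises recall for the "End" class at the cost of precision.
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)
    return probs[0, 1].item() >= threshold

print(predicts_end("Okay, thanks. Bye!"))
```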
## Environment and Dependencies

- Framework: Hugging Face Transformers (v4.46.3)
- Python Version: 3.8+
- Dependencies:
  - torch
  - transformers
  - safetensors
  - numpy
### Conda Environment Configuration

```yaml
name: huggingface-env
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - torch==2.4.1
      - transformers==4.46.3
      - safetensors
```
## Model Limitations

- The model exhibits reduced recall for the "End conversation" class, which could impact its utility in edge cases.
- It requires labeled data for fine-tuning in other domains or applications.