EndConvo-health-deberta-v2

Model Description

EndConvo-health-deberta-v2 is a fine-tuned binary classification model based on the DeBERTa architecture. It determines whether a conversation in a health-related chatbot has reached its endpoint or should continue. By detecting conversation closure, the model improves the efficiency of healthcare chatbot systems, where accurate and timely responses are crucial.


Intended Use

  • Primary Use Case: End-of-conversation detection in health-related chatbot systems.
  • Scope of Application: Healthcare dialogues, customer support automation, or any domain requiring conversational flow control.
  • Limitations:
    • Reduced recall for the "True" (conversation ending) class, which could affect performance in ambiguous scenarios.
    • The model requires GPU support for efficient inference on large-scale data.

Training

  • Structure: Binary classification dataset with two labels:
    • 0 for "Continue conversation"
    • 1 for "End conversation"
  • Size: 4,000 training samples and 1,000 validation samples.
  • Source: Annotated conversational data designed for healthcare-related use cases.
  • Preprocessing:
    • Tokenization using DeBERTa tokenizer.
    • Maximum sequence length of 256 tokens.
    • Truncation applied for longer conversations.

Model Details

  • Base Model: DeBERTa-V2
  • Training Framework: Hugging Face Transformers
  • Optimizer: AdamW with weight decay
  • Loss Function: Cross-entropy loss
  • Batch Size: 16
  • Epochs: 3
  • Learning Rate: 5e-5
  • Model Size: ~184M parameters (F32)
  • Evaluation Metrics: Accuracy, Precision, Recall, F1-score
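The hyperparameters above can be sketched as a minimal PyTorch training loop. A single linear layer stands in for the DeBERTa encoder so the sketch runs without downloading weights, and `weight_decay=0.01` is an assumed value (the card does not specify one):

```python
import torch
from torch import nn

# Minimal sketch of the training setup described above: AdamW with weight
# decay, cross-entropy loss, batch size 16, learning rate 5e-5, 3 epochs.
# A linear layer stands in for the DeBERTa encoder + classification head.
torch.manual_seed(0)
model = nn.Linear(768, 2)                 # stand-in for the fine-tuned model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)
loss_fn = nn.CrossEntropyLoss()

features = torch.randn(16, 768)           # one batch of 16 pooled embeddings
labels = torch.randint(0, 2, (16,))       # 0 = continue, 1 = end

losses = []
for epoch in range(3):                    # 3 epochs, as in the card
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In the real fine-tuning run, the tokenized conversations replace the random features and the DeBERTa model replaces the linear layer; the optimizer, loss, and schedule stay the same.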

Evaluation Metrics

  • Overall Accuracy: 86.6%
  • Precision ("End" class): 86.7%
  • Recall ("End" class): 58.0%
  • F1-Score ("End" class): 69.5%
  • Validation Loss: 0.3729

Confusion Matrix (as percentages of the validation set; "End" is the positive class)

  • True Negatives (TN): 71.29%
  • False Positives (FP): 2.35%
  • False Negatives (FN): 11.06%
  • True Positives (TP): 15.29%
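The reported metrics can be recomputed directly from these percentages (TP and FP drive precision; TP and FN drive recall):

```python
# Recompute the reported metrics from the confusion-matrix percentages above.
tn, fp, fn, tp = 71.29, 2.35, 11.06, 15.29

accuracy  = tn + tp                       # correct predictions: ~86.6%
precision = tp / (tp + fp)                # of predicted "End", how many were right
recall    = tp / (tp + fn)                # of actual "End", how many were caught
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.1f}% precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

These reproduce the headline numbers: ~86.6% accuracy, ~0.867 precision, ~0.580 recall, and ~0.695 F1 for the "End" class.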

Detailed Report

Class             Precision  Recall  F1-Score  Support
False (Continue)       0.87    0.97      0.91      313
True (End)             0.87    0.58      0.70      112
Macro Average          0.87    0.77      0.80        -
Weighted Average       0.87    0.87      0.86        -

Pipeline and Usage

  • Task Type: Text classification for conversation flow.
  • Pipeline: Predicts whether a conversation should continue or end.

Example Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model = AutoModelForSequenceClassification.from_pretrained("MathewManoj/EndConvo-health-deberta-v2")
model.eval()

# Example text input
text = "Thank you for your help. I don't have any more questions."

# Tokenize with the same settings used in training (truncate to 256 tokens)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Predict without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

prediction = outputs.logits.argmax(dim=-1).item()
print("Prediction:", "End" if prediction == 1 else "Continue")

Performance Insights

Strengths:

  • High accuracy and precision indicate the model reliably identifies "Continue" conversations and rarely ends a conversation prematurely (only 2.35% false positives).

Limitations:

  • Lower recall for "End" conversations suggests the need for additional data augmentation or fine-tuning to improve sensitivity.
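Short of retraining, one common mitigation is to lower the decision threshold on the "End" probability instead of taking the argmax, trading some precision for recall. This is not part of the released model; the logits and the 0.35 threshold below are illustrative stand-ins that would need tuning on validation data:

```python
import torch

# Illustrative stand-in for model(**inputs).logits on a borderline input:
# argmax would predict "Continue" (index 0), but p_end is close to 0.5.
logits = torch.tensor([[1.2, 0.8]])
p_end = torch.softmax(logits, dim=-1)[0, 1].item()

THRESHOLD = 0.35  # assumed value; tune on held-out validation data
prediction = "End" if p_end >= THRESHOLD else "Continue"
print(f"p_end={p_end:.3f} -> {prediction}")
```

Lowering the threshold converts some of the 11.06% false negatives into true positives, at the cost of more false positives.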

Environment and Dependencies

  • Framework: Hugging Face Transformers (v4.46.3)
  • Python Version: 3.8+
  • Dependencies:
    • torch
    • transformers
    • safetensors
    • numpy
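With pip, the dependencies can be installed directly; the version pins below are taken from the Conda environment in the next section, and unpinned packages follow the listing above:

```shell
# Versions match the Conda environment below; adjust for your platform.
pip install "torch==2.4.1" "transformers==4.46.3" safetensors numpy
```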

Conda Environment Configuration

name: huggingface-env
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - torch==2.4.1
      - transformers==4.46.3
      - safetensors

Model Limitations

  1. The model exhibits reduced recall for the "End conversation" class, which could impact its utility in edge cases.
  2. Requires labeled data for fine-tuning in other domains or applications.