Model Card for DistilGPT2-DiseaseSymptomPredictor

Model Overview

The DistilGPT2-DiseaseSymptomPredictor model is a fine-tuned version of distilgpt2, optimized to predict likely symptoms from a disease name. It is designed to assist healthcare professionals, researchers, and students by suggesting symptoms commonly associated with a given disease, supporting both learning and diagnostic assistance. The model was fine-tuned on a dataset pairing diseases with their corresponding symptoms.

Model Details

  • Model type: Causal language model
  • Architecture: DistilGPT2 (distilled GPT-2)
  • Parameters: 81.9M (F32)
  • Fine-tuning Dataset: Diseases and Symptoms Dataset
  • Purpose: Predict symptoms for a given disease name.

Intended Use

This model is intended for educational and research purposes, particularly useful for:

  • Healthcare education.
  • Understanding the symptoms of common diseases.
  • Research and prototyping in health-related NLP tasks.

Usage Example:

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load model and tokenizer
model = GPT2LMHeadModel.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")
tokenizer = GPT2Tokenizer.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")

# Generate symptom predictions
input_text = "Kidney Failure"
input_ids = tokenizer.encode(input_text, return_tensors='pt')

output = model.generate(
    input_ids,
    max_length=20,
    num_return_sequences=1,
    do_sample=True,
    top_k=8,
    top_p=0.95,
    temperature=0.5,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS to avoid a warning
)

decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
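
Equivalently, the text-generation pipeline wraps tokenization, generation, and decoding in a single call. A minimal sketch using the same sampling parameters as above:

from transformers import pipeline

# The pipeline bundles the tokenizer, model, and decoding step.
generator = pipeline("text-generation", model="gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")

result = generator(
    "Kidney Failure",
    max_length=20,
    do_sample=True,
    top_k=8,
    top_p=0.95,
    temperature=0.5,
    repetition_penalty=1.2,
)
print(result[0]["generated_text"])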

Training Procedure

The model was fine-tuned on a dataset of 400 rows, each pairing a disease name with its associated symptoms. Training used a batch size of 8, a learning rate of 5e-4, and the standard cross-entropy language-modeling loss, with early stopping based on validation loss. A sketch of an equivalent setup follows.
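
The following is a minimal fine-tuning sketch consistent with these hyperparameters, not the exact training script. The dataset ID (QuyenAnhDE/Diseases_Symptoms), its column names, the "disease: symptoms" prompt format, and the early-stopping patience are assumptions for illustration; argument names follow recent transformers releases.

from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token
model = GPT2LMHeadModel.from_pretrained("distilgpt2")

# Dataset ID and column names are assumptions for illustration.
raw = load_dataset("QuyenAnhDE/Diseases_Symptoms", split="train")
splits = raw.train_test_split(test_size=0.1)

def tokenize(batch):
    # Format each row as "disease: symptoms" so the model learns the mapping.
    text = [f"{name}: {sym}" for name, sym in zip(batch["Name"], batch["Symptoms"])]
    return tokenizer(text, truncation=True, max_length=64)

tokenized = splits.map(tokenize, batched=True, remove_columns=splits["train"].column_names)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # causal LM: cross-entropy loss

args = TrainingArguments(
    output_dir="distilgpt2-disease-symptoms",
    per_device_train_batch_size=8,   # batch size from the card
    learning_rate=5e-4,              # learning rate from the card
    num_train_epochs=10,             # epochs from the card
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,     # required for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience is an assumption
)
trainer.train()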

Evaluation

  • Training Loss: averaged per epoch.
  • Validation Loss: tracked after each epoch for early stopping.
  • Epochs: 10.
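
The validation loss is the model's token-level cross-entropy, which can be reproduced for any held-out example by passing labels to the forward call. A minimal sketch; the example text and the "disease: symptoms" format are assumptions:

import math
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")
tokenizer = GPT2Tokenizer.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")
model.eval()

# Hypothetical held-out example in the assumed "disease: symptoms" format.
text = "Influenza: fever, cough, sore throat, muscle aches"
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model compute its own cross-entropy loss.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"loss: {loss.item():.4f}  perplexity: {math.exp(loss.item()):.2f}")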

Limitations and Biases

This model should be used with caution as it is not a replacement for professional healthcare advice or diagnosis. The model may not accurately predict rare or complex symptoms associated with certain diseases. Additionally, it may reflect biases present in the dataset.

License

Apache-2.0 License

Acknowledgments

  • Hugging Face team for the DistilGPT2 model.
  • Dataset by QuyenAnhDE on Hugging Face.