Model Card for DistilGPT2-DiseaseSymptomPredictor

Model Overview

The DistilGPT2-DiseaseSymptomPredictor model is a fine-tuned version of distilgpt2, optimized to predict likely symptoms from a disease name. It is designed to assist healthcare professionals, researchers, and students by suggesting symptoms commonly associated with a given disease, supporting both learning and diagnostic assistance. The model was fine-tuned on a dataset pairing diseases with their corresponding symptoms.

Model Details

  • Model type: Causal language model
  • Architecture: DistilGPT2 (distilled GPT-2)
  • Parameters: 81.9M (F32)
  • Fine-tuning Dataset: Diseases and Symptoms Dataset
  • Purpose: Predict symptoms for a given disease name.

Intended Use

This model is intended for educational and research purposes, particularly useful for:

  • Healthcare education.
  • Understanding the symptoms of common diseases.
  • Research and prototyping in health-related NLP tasks.

Usage Example:

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load model and tokenizer
model = GPT2LMHeadModel.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")
tokenizer = GPT2Tokenizer.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")

# Generate symptom predictions
input_text = "Kidney Failure"
input_ids = tokenizer.encode(input_text, return_tensors='pt')

output = model.generate(
    input_ids,
    max_length=20,
    num_return_sequences=1,
    do_sample=True,
    top_k=8,
    top_p=0.95,
    temperature=0.5,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS to avoid a warning
)

decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
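
Equivalently, the text-generation pipeline wraps tokenization, generation, and decoding in a single call. A minimal sketch using the same sampling parameters as above:

from transformers import pipeline

# The pipeline bundles the tokenizer, model, and decoding step.
generator = pipeline("text-generation", model="gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")

result = generator(
    "Kidney Failure",
    max_length=20,
    do_sample=True,
    top_k=8,
    top_p=0.95,
    temperature=0.5,
    repetition_penalty=1.2,
)
print(result[0]["generated_text"])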

Training Procedure

The model was fine-tuned on a dataset of 400 rows, each pairing a disease name with its associated symptoms. Training used a batch size of 8, a learning rate of 5e-4, and the standard cross-entropy language-modeling loss, with early stopping based on validation loss. A sketch of an equivalent setup follows.
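
The following is a minimal fine-tuning sketch consistent with these hyperparameters, not the exact training script. The dataset ID (QuyenAnhDE/Diseases_Symptoms), its column names, the "disease: symptoms" prompt format, and the early-stopping patience are assumptions for illustration; argument names follow recent transformers releases.

from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token
model = GPT2LMHeadModel.from_pretrained("distilgpt2")

# Dataset ID and column names are assumptions for illustration.
raw = load_dataset("QuyenAnhDE/Diseases_Symptoms", split="train")
splits = raw.train_test_split(test_size=0.1)

def tokenize(batch):
    # Format each row as "disease: symptoms" so the model learns the mapping.
    text = [f"{name}: {sym}" for name, sym in zip(batch["Name"], batch["Symptoms"])]
    return tokenizer(text, truncation=True, max_length=64)

tokenized = splits.map(tokenize, batched=True, remove_columns=splits["train"].column_names)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # causal LM: cross-entropy loss

args = TrainingArguments(
    output_dir="distilgpt2-disease-symptoms",
    per_device_train_batch_size=8,   # batch size from the card
    learning_rate=5e-4,              # learning rate from the card
    num_train_epochs=10,             # epochs from the card
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,     # required for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience is an assumption
)
trainer.train()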

Evaluation

  • Training Loss: averaged per epoch.
  • Validation Loss: tracked after each epoch for early stopping.
  • Epochs: 10.
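
The validation loss is the model's token-level cross-entropy, which can be reproduced for any held-out example by passing labels to the forward call. A minimal sketch; the example text and the "disease: symptoms" format are assumptions:

import math
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")
tokenizer = GPT2Tokenizer.from_pretrained("gautamraj8044/DistilGPT2-DiseaseSymptomPredictor")
model.eval()

# Hypothetical held-out example in the assumed "disease: symptoms" format.
text = "Influenza: fever, cough, sore throat, muscle aches"
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model compute its own cross-entropy loss.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"loss: {loss.item():.4f}  perplexity: {math.exp(loss.item()):.2f}")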

Limitations and Biases

This model should be used with caution as it is not a replacement for professional healthcare advice or diagnosis. The model may not accurately predict rare or complex symptoms associated with certain diseases. Additionally, it may reflect biases present in the dataset.

License

Apache-2.0 License

Acknowledgments

  • Hugging Face team for the DistilGPT2 model.
  • Dataset by QuyenAnhDE on Hugging Face.