Model Card for Medi_terms_Llama3_1_8B_instruct_model
Model Details
Model Description
Fine-Tuned Llama 3.1 8B Instruct with Medical Terms using QLoRA
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was automatically generated.
This repository contains a fine-tuned version of Meta's Llama 3.1 8B Instruct model, optimized for medical term comprehension using QLoRA (Quantized Low-Rank Adaptation). The model was fine-tuned on the dmedhi/wiki_medical_terms dataset, improving its ability to generate accurate responses to medical terminology and healthcare-related questions.
The fine-tuning process involves using QLoRA to adapt the pre-trained model while maintaining memory efficiency and computational feasibility. This technique allows for fine-tuning large-scale models on consumer-grade GPUs by leveraging NF4 4-bit quantization.
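As a reference, loading the base model in 4-bit NF4 precision with Hugging Face transformers and bitsandbytes might look like the minimal sketch below. The base checkpoint name and the double-quantization flag are assumptions for illustration, not confirmed settings of this fine-tune.

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 4-bit quantization config, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # Normal Float 4
    bnb_4bit_compute_dtype=torch.bfloat16,   # matches the bf16 training regime
    bnb_4bit_use_double_quant=True,          # common QLoRA setting; assumed here
)

# Base checkpoint name is an assumption
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)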
- Developed by (fine-tuned): Karthik Manjunath Hadagali
- Model type: Text-Generation
- Language(s) (NLP): English
- License: [More Information Needed]
- Fine-Tuned from model: Meta Llama 3.1 8B Instruct
- Fine-Tuning Method: QLoRA
- Target Task: Medical Knowledge Augmentation for Causal Language Modeling (CAUSAL_LM)
- Quantization: 4-bit NF4 (Normal Float 4) Quantization
- Hardware Used: Single NVIDIA A100 40 GB GPU, with 4-bit quantization for memory efficiency
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model and tokenizer from the Hub
model_id = "Karthik2510/Medi_terms_Llama3_1_8B_instruct_model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Example query; move inputs to whichever device the model was placed on
input_text = "What is the medical definition of pneumonia?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
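Since the base model is an Instruct variant, queries will often work better when wrapped in the model's chat template. Continuing the snippet above, a hedged variant:

# Optional: wrap the query in the chat template the Instruct model expects
messages = [{"role": "user", "content": input_text}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))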
Training Details
Training Data
The model has been fine-tuned on the dmedhi/wiki_medical_terms dataset. This dataset is designed to improve medical terminology comprehension and consists of:
✅ Medical definitions and terminologies
✅ Disease symptoms and conditions
✅ Healthcare and clinical knowledge from Wikipedia's medical section
This data is intended to help the fine-tuned model understand and respond to medical queries with improved accuracy.
Training Procedure
Preprocessing
- The dataset was cleaned and tokenized using the Llama 3.1 tokenizer, ensuring that medical terms were preserved.
- Specialized medical terminology was handled carefully to maintain context.
- The dataset was formatted into a question-answer style to align with the instruction-tuned nature of Llama 3.1 8B Instruct (see the sketch below).
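A minimal sketch of that question-answer formatting with the datasets library is shown below; the column names and prompt template are illustrative assumptions, not the exact preprocessing used for this model.

from datasets import load_dataset

dataset = load_dataset("dmedhi/wiki_medical_terms", split="train")

def to_qa_format(example):
    # Column names are assumed for illustration; check the dataset card for the real schema
    question = f"What is {example['page_title']}?"
    answer = example["page_text"]
    return {"text": f"Question: {question}\nAnswer: {answer}"}

dataset = dataset.map(to_qa_format)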
Training Hyperparameters
- Training regime: bf16 mixed precision (to balance efficiency and precision)
- Batch Size: 1 per device
- Gradient Accumulation Steps: 4 (to simulate a larger batch size)
- Learning Rate: 2e-4
- Warmup Steps: 100
- Epochs: 3
- Optimizer: paged_adamw_8bit (efficient low-memory optimizer)
- LoRA Rank (r): 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
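For reference, the hyperparameters above map roughly onto peft and transformers configuration objects as in the sketch below; target_modules, the output directory, and any setting not listed above are assumptions.

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed; not stated in this card
)

training_args = TrainingArguments(
    output_dir="./llama31-medical-qlora",  # hypothetical path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    warmup_steps=100,
    num_train_epochs=3,
    bf16=True,
    optim="paged_adamw_8bit",
)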
Speeds, Sizes, Times
- Training Hardware: Single NVIDIA A100 40 GB GPU (Google Colab)
- Model Size after Fine-Tuning: Approx. 8B parameters, plus LoRA adapters
- Training Time: ~3-4 hours per epoch on an A100 40 GB GPU
- Final Checkpoint Size: ~2.8GB (with LoRA adapters stored separately)
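If you are working from the separately stored adapters rather than merged weights, they can be attached to the base model and folded into it with peft; the checkpoint names below are assumptions.

from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

# Load the base model, attach the LoRA adapters, then merge them into the weights
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "Karthik2510/Medi_terms_Llama3_1_8B_instruct_model")
merged = model.merge_and_unload()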
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: A100 40 GB GPU
- Hours used: Approximately 3 to 4 hours
- Cloud Provider: Google Colab
- Compute Region: US-East
- Carbon Emitted: [More Information Needed]
Limitations & Considerations
❗ Not a substitute for professional medical advice
❗ May contain biases from training data
❗ Limited knowledge scope (not updated in real-time)
Citation
If you use this model, please consider citing:
@misc{llama3.1_medical_qlora,
  title={Fine-tuned Llama 3.1 8B Instruct for Medical Knowledge with QLoRA},
  author={Karthik Manjunath Hadagali},
  year={2024},
  howpublished={Hugging Face Model Repository}
}
Acknowledgments
- Meta AI for the Llama 3.1 3B Instruct Model.
- Hugging Face PEFT for QLoRA implementation.
- dmedhi/wiki_medical_terms dataset contributors.