---
library_name: transformers
tags:
- trl
- sft
datasets:
- dmedhi/wiki_medical_terms
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---

# Model Card for Medi_terms_Llama3_1_8B_instruct_model

## Model Details

### Model Description

**Fine-Tuned Llama 3.1 8B Instruct with Medical Terms using QLoRA**

This is the model card of a πŸ€— transformers model that has been pushed to the Hub.

This repository contains a fine-tuned version of **Meta's Llama 3.1 8B Instruct** model, optimized for medical term comprehension using **QLoRA** (Quantized Low-Rank Adaptation). The model has been fine-tuned on the **dmedhi/wiki_medical_terms** dataset, improving its ability to generate accurate responses to questions about medical terminology and healthcare.

QLoRA adapts the pre-trained model while keeping memory use and compute requirements low: the base weights are loaded with **NF4** 4-bit quantization and only small low-rank adapter matrices are trained, which makes fine-tuning a model of this scale feasible on a single GPU.

- **Developed by:** Karthik Manjunath Hadagali
- **Model type:** Text generation
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Fine-tuned from model:** meta-llama/Llama-3.1-8B-Instruct
- **Fine-tuning method:** QLoRA
- **Target task:** Medical knowledge augmentation for causal language modeling (CAUSAL_LM)
- **Quantization:** 4-bit NF4 (Normal Float 4)
- **Hardware used:** Single A100 40 GB GPU, with 4-bit quantization for memory efficiency

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model
model_id = "Karthik2510/Medi_terms_Llama3_1_8B_instruct_model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example query
input_text = "What is the medical definition of pneumonia?"
# Move inputs to wherever device_map placed the model (avoids hard-coding "cuda")
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

The model has been fine-tuned on the **dmedhi/wiki_medical_terms** dataset, which is designed to improve medical terminology comprehension and consists of:

βœ… Medical definitions and terminologies

βœ… Disease symptoms and conditions

βœ… Healthcare and clinical knowledge from Wikipedia's medical articles

Fine-tuning on this dataset improves the model's accuracy when understanding and responding to medical queries.

### Training Procedure

#### Preprocessing

- The dataset was cleaned and tokenized using the Llama 3.1 tokenizer, ensuring that medical terms were preserved.
- Special medical terminologies were handled carefully to maintain context.
- The dataset was formatted into a question-answer style to align with the instruction-tuned nature of Llama 3.1 8B Instruct.

A sketch of the corresponding QLoRA model setup follows.
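The exact fine-tuning code is not part of this card, but the description above pins down the key pieces: NF4 4-bit quantization of the base weights plus trainable LoRA adapters. Below is a minimal sketch of that setup using transformers' `BitsAndBytesConfig` and the PEFT library, with the LoRA rank, alpha, and dropout taken from the hyperparameters listed in the next section. The `target_modules` selection is an illustrative assumption, not taken from this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/Llama-3.1-8B-Instruct"

# 4-bit NF4 quantization of the base weights, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter matching the hyperparameters reported in the next section;
# target_modules is an assumed, commonly used choice for Llama-style models
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```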
#### Training Hyperparameters

- **Training regime:** bf16 mixed precision (to balance efficiency and precision)
- **Batch size:** 1 per device
- **Gradient accumulation steps:** 4 (to simulate a larger effective batch size)
- **Learning rate:** 2e-4
- **Warmup steps:** 100
- **Epochs:** 3
- **Optimizer:** paged_adamw_8bit (memory-efficient paged 8-bit AdamW)
- **LoRA rank (r):** 16
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05

A sketch of how these settings could be wired into a training script is given at the end of this card.

#### Speeds, Sizes, Times

- **Training hardware:** Single NVIDIA A100 40 GB GPU
- **Model size after fine-tuning:** Approx. 8B base parameters plus LoRA adapters
- **Training time:** ~3–4 hours per epoch on an A100 40 GB GPU
- **Final checkpoint size:** ~2.8 GB (LoRA adapters stored separately)

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware type:** A100 40 GB GPU
- **Hours used:** Approximately 3 to 4 hours
- **Cloud provider:** Google Colab
- **Compute region:** US-East
- **Carbon emitted:** [More Information Needed]

## Limitations & Considerations

❗ Not a substitute for professional medical advice

❗ May contain biases from the training data

❗ Limited knowledge scope (not updated in real time)

## Citation

If you use this model, please consider citing:

```bibtex
@misc{llama3.1_medical_qlora,
  title        = {Fine-tuned Llama 3.1 8B Instruct for Medical Knowledge with QLoRA},
  author       = {Karthik Manjunath Hadagali},
  year         = {2024},
  howpublished = {Hugging Face Model Repository}
}
```

## Acknowledgments

- Meta AI for the Llama 3.1 8B Instruct model.
- Hugging Face PEFT for the QLoRA implementation.
- dmedhi/wiki_medical_terms dataset contributors.
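## Training Script Sketch

The original training script is not included in this repository. As an illustration only, the sketch below shows how the hyperparameters reported above could be passed to trl's `SFTTrainer` (this repository's `trl`/`sft` tags indicate supervised fine-tuning with trl). It continues from the QLoRA preparation sketch under *Preprocessing*; the `output_dir`, `logging_steps`, split name, and the assumption that examples were rendered into a single `text` column are illustrative, not taken from this repo.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# `model` and `tokenizer` as prepared in the QLoRA sketch earlier in this card.
# Assumes a "train" split and that preprocessing produced a question-answer
# formatted "text" column, as described under Preprocessing.
dataset = load_dataset("dmedhi/wiki_medical_terms", split="train")

training_args = SFTConfig(
    output_dir="./llama31-8b-medical-qlora",  # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    warmup_steps=100,
    num_train_epochs=3,
    optim="paged_adamw_8bit",
    bf16=True,
    dataset_text_field="text",  # assumed column name
    logging_steps=25,           # illustrative
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older trl versions use the `tokenizer` argument
)
trainer.train()
```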