---
language: en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- ruslanmv
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
- ruslanmv/ai-medical-chatbot
---
# Medical-Llama3-8B-16bit: Fine-Tuned Llama3 for Medical Q&A
This repository provides a fine-tuned version of the powerful Llama3 8B model, specifically designed to answer medical questions in an informative way. It leverages the rich knowledge contained in the AI Medical Chatbot dataset ([ruslanmv/ai-medical-chatbot](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot)).
**Model & Development**
- **Developed by:** ruslanmv
- **License:** Apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
**Key Features**
- **Medical Focus:** Optimized to address health-related inquiries.
- **Knowledge Base:** Trained on a comprehensive medical chatbot dataset.
- **Text Generation:** Generates informative and potentially helpful responses.
**Installation**
This model is accessible through the Hugging Face Transformers library. Install the library with pip:
```bash
pip install transformers
```
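The usage example below also assumes a PyTorch environment. If `torch` is not already installed (and, optionally, `accelerate` for device-mapped loading), you can add them with pip as well:
```bash
pip install torch accelerate
```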
**Usage Example**
Here's a Python code snippet demonstrating how to interact with the `Medical-Llama3-8B-16bit` model and generate answers to your medical questions:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit")
model = AutoModelForCausalLM.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit").to("cuda")  # If using GPU

# Format the question with a medical prompt, generate a response, and decode it
def askme(question):
    medical_prompt = """You are an AI Medical Assistant trained on a vast dataset of health information. Below is a medical question:
Question: {}
Please provide an informative and comprehensive answer:
Answer: """.format(question)
    inputs = tokenizer(medical_prompt, return_tensors="pt").to("cuda")  # If using GPU
    outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)  # Adjust max_new_tokens for longer responses
    generated = outputs[0][inputs["input_ids"].shape[-1]:]  # Keep only the newly generated tokens, not the prompt
    answer = tokenizer.decode(generated, skip_special_tokens=True).strip()
    return answer

# Example usage
question = "What should I do to reduce my weight gained due to genetic hypothyroidism?"
print(askme(question))
```
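For longer or more varied answers, you can pass additional generation parameters when calling `model.generate` inside `askme`. The sketch below reuses `inputs` from the function above; the specific sampling values are illustrative assumptions, not settings tuned for this model:
```python
# Illustrative generation settings (assumed values, not tuned for this model)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # allow a longer answer
    do_sample=True,      # enable sampling for more varied responses
    temperature=0.7,     # lower values make output more deterministic
    top_p=0.9,           # nucleus sampling cutoff
)
```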
**Important Note**
This model is intended for informational purposes only and should not be used as a substitute for professional medical advice. Always consult with a qualified healthcare provider for any medical concerns.
**License**
This model is distributed under the Apache License 2.0 (see LICENSE file for details).
**Contributing**
We welcome contributions to this repository! If you have improvements or suggestions, feel free to create a pull request.
**Disclaimer**
While we strive to provide informative responses, the accuracy of the model's outputs cannot be guaranteed. It is crucial to consult a doctor or other healthcare professional for definitive medical advice.