|
--- |
|
language: en |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- ruslanmv |
|
- llama |
|
- trl |
|
base_model: meta-llama/Meta-Llama-3-8B |
|
datasets: |
|
- ruslanmv/ai-medical-chatbot |
|
--- |
|
|
|
# Medical-Llama3-8B-GGUF |
|
[![](future.jpg)](https://ruslanmv.com/) |
|
This is a fine-tuned version of the Llama 3 8B model, designed specifically to answer medical questions.
|
The model was trained on the AI Medical Chatbot dataset, which can be found at [ruslanmv/ai-medical-chatbot](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot). It is distributed in the GGUF format, the quantized model file format used by llama.cpp, for efficient inference on modest hardware (the examples below use the Q5_K_M quantization).
|
|
|
**Model:** [ruslanmv/Medical-Llama3-8B-GGUF](https://huggingface.co/ruslanmv/Medical-Llama3-8B-GGUF) |
|
|
|
- **Developed by:** ruslanmv |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B |
|
|
|
## Installation |
|
|
|
**Prerequisites:** |
|
|
|
- A system with CUDA support is highly recommended for optimal performance. |
|
- Python 3.10 or later |
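
A quick way to verify both prerequisites from a notebook (a minimal sketch; `nvidia-smi` is only present on hosts with an NVIDIA driver installed):

```python
import sys, shutil

# Python version should be 3.10 or later
print(sys.version)

# nvidia-smi on PATH is a quick proxy for a working CUDA setup
print("nvidia-smi found" if shutil.which("nvidia-smi") else "nvidia-smi not found")
```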
|
|
|
|
|
1. **Install the required Python libraries** (the commands below use Jupyter/Colab notebook syntax: the leading `!` runs a shell command and `%%capture` suppresses cell output):
|
|
|
|
|
```bash |
|
# Build llama-cpp-python with CUDA (cuBLAS) support for GPU inference
|
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose |
|
``` |
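
After the build finishes, a quick sanity check confirms the library imports (a minimal sketch; the printed version will vary with your install):

```python
# Confirm llama-cpp-python is importable and report its version
import llama_cpp

print(llama_cpp.__version__)
```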
|
|
|
```bash |
|
%%capture
# Install the Hugging Face Hub client and the hf-transfer fast-download backend
!pip install huggingface-hub hf-transfer
|
``` |
|
|
|
2. **Download the quantized model:**
|
```python
import os

# Enable the hf-transfer backend for faster downloads
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

!huggingface-cli download \
  ruslanmv/Medical-Llama3-8B-GGUF \
  medical-llama3-8b.Q5_K_M.gguf \
  --local-dir . \
  --local-dir-use-symlinks False

# Path to the downloaded file (assumes a Colab /content working directory)
MODEL_PATH = "/content/medical-llama3-8b.Q5_K_M.gguf"
```
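
Alternatively, the same file can be fetched with the `hf_hub_download` Python API, which returns the local path directly (a sketch equivalent to the CLI call above):

```python
from huggingface_hub import hf_hub_download

# Download the Q5_K_M quantization and capture the local file path
MODEL_PATH = hf_hub_download(
    repo_id="ruslanmv/Medical-Llama3-8B-GGUF",
    filename="medical-llama3-8b.Q5_K_M.gguf",
)
print(MODEL_PATH)
```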
|
|
|
|
|
## Usage example
|
|
|
Here's an example of how to use the quantized Medical-Llama3-8B-GGUF model (Q5_K_M) to generate an answer to a medical question:
|
|
|
```python |
|
from llama_cpp import Llama

# Llama-2-style instruction and system tags used by this prompt template
B_INST, E_INST = "<s>[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

DEFAULT_SYSTEM_PROMPT = """\
You are an AI Medical Chatbot Assistant equipped with a wealth of medical knowledge derived from extensive datasets. Provide comprehensive and informative responses to inquiries, but note that while you strive for accuracy, your responses should not replace professional medical advice. Keep answers concise.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."""

SYSTEM_PROMPT = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS

def create_prompt(user_query):
    # Wrap the system prompt and the user's question in the instruction tags
    instruction = f"User asks: {user_query}\n"
    prompt = B_INST + SYSTEM_PROMPT + instruction + E_INST
    return prompt.strip()

user_query = "I'm a 35-year-old male experiencing symptoms like fatigue, increased sensitivity to cold, and dry, itchy skin. Could these be indicative of hypothyroidism?"
prompt = create_prompt(user_query)
print(prompt)

# Load the quantized model; n_gpu_layers=-1 offloads all layers to the GPU
llm = Llama(model_path=MODEL_PATH, n_gpu_layers=-1)

result = llm(
    prompt=prompt,
    max_tokens=100,  # cap on response length; raise for fuller answers
    echo=False,
)

print(result['choices'][0]['text'])
|
``` |
|
|
|
Example output:
|
```bash |
|
Hi, thank you for your query. |
|
Hypothyroidism is characterized by fatigue, sensitivity to cold, weight gain, depression, hair loss and mental dullness. I would suggest that you get a complete blood count with thyroid profile including TSH (thyroid stimulating hormone), free thyroxine level, and anti-thyroglobulin antibodies. These tests will help in establishing the diagnosis of hypothyroidism. |
|
If there is no family history of autoimmune disorders, then it might be due |
|
``` |
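
The answer above is cut off mid-sentence because the example caps generation at `max_tokens=100`. For complete answers, raise the limit; llama-cpp-python can also stream tokens as they are generated, a sketch reusing the `llm` and `create_prompt` defined above:

```python
# Stream the completion token by token instead of waiting for the full answer
for chunk in llm(
    prompt=create_prompt(user_query),
    max_tokens=512,  # allow a longer, complete answer
    echo=False,
    stream=True,     # yield partial results as they are generated
):
    print(chunk["choices"][0]["text"], end="", flush=True)
```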
|
|
|
|
|
## License |
|
|
|
This model is licensed under the Apache License 2.0. You can find the full license in the LICENSE file. |