|
--- |
|
language: en |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- ruslanmv |
|
- llama |
|
- trl |
|
base_model: meta-llama/Meta-Llama-3-8B |
|
datasets: |
|
- ruslanmv/ai-medical-chatbot |
|
--- |
|
|
|
# Medical-Llama3-8B-GGUF |
|
[![](future.jpg)](https://ruslanmv.com/) |
|
This is a fine-tuned version of the Llama 3 8B model, designed specifically to answer medical questions.
|
The model was trained on the AI Medical Chatbot dataset, which can be found at [ruslanmv/ai-medical-chatbot](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot). It is distributed in the GGUF format, the quantized model file format used by llama.cpp, for efficient inference on modest hardware (the examples below use the Q5_K_M quantization).
|
|
|
**Model:** [ruslanmv/Medical-Llama3-8B-GGUF](https://huggingface.co/ruslanmv/Medical-Llama3-8B-GGUF) |
|
|
|
- **Developed by:** ruslanmv |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B |
|
|
|
## Installation |
|
|
|
**Prerequisites:** |
|
|
|
- A system with CUDA support is highly recommended for optimal performance. |
|
- Python 3.10 or later |
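
A quick way to verify both prerequisites from a notebook (a minimal sketch; `nvidia-smi` is only present on hosts with an NVIDIA driver installed):

```python
import sys, shutil

# Python version should be 3.10 or later
print(sys.version)

# nvidia-smi on PATH is a quick proxy for a working CUDA setup
print("nvidia-smi found" if shutil.which("nvidia-smi") else "nvidia-smi not found")
```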
|
|
|
|
|
1. **Install the required Python libraries** (the commands below use Jupyter/Colab notebook syntax: the leading `!` runs a shell command and `%%capture` suppresses cell output):
|
|
|
|
|
```bash |
|
# Build llama-cpp-python with CUDA (cuBLAS) support for GPU inference
|
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose |
|
``` |
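
After the build finishes, a quick sanity check confirms the library imports (a minimal sketch; the printed version will vary with your install):

```python
# Confirm llama-cpp-python is importable and report its version
import llama_cpp

print(llama_cpp.__version__)
```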
|
|
|
```bash |
|
%%capture
# Install the Hugging Face Hub client and the hf-transfer fast-download backend
!pip install huggingface-hub hf-transfer
|
``` |
|
|
|
2. **Download the quantized model:**
|
```python
import os

# Enable the hf-transfer backend for faster downloads
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

!huggingface-cli download \
  ruslanmv/Medical-Llama3-8B-GGUF \
  medical-llama3-8b.Q5_K_M.gguf \
  --local-dir . \
  --local-dir-use-symlinks False

# Path to the downloaded file (assumes a Colab /content working directory)
MODEL_PATH = "/content/medical-llama3-8b.Q5_K_M.gguf"
```
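
Alternatively, the same file can be fetched with the `hf_hub_download` Python API, which returns the local path directly (a sketch equivalent to the CLI call above):

```python
from huggingface_hub import hf_hub_download

# Download the Q5_K_M quantization and capture the local file path
MODEL_PATH = hf_hub_download(
    repo_id="ruslanmv/Medical-Llama3-8B-GGUF",
    filename="medical-llama3-8b.Q5_K_M.gguf",
)
print(MODEL_PATH)
```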
|
|
|
|
|
## Usage example
|
|
|
Here's an example of how to use the quantized Medical-Llama3-8B-GGUF model (Q5_K_M) to generate an answer to a medical question:
|
|
|
```python |
|
from llama_cpp import Llama

# Llama-2-style instruction and system tags used by this prompt template
B_INST, E_INST = "<s>[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

DEFAULT_SYSTEM_PROMPT = """\
You are an AI Medical Chatbot Assistant equipped with a wealth of medical knowledge derived from extensive datasets. Provide comprehensive and informative responses to inquiries, but note that while you strive for accuracy, your responses should not replace professional medical advice. Keep answers concise.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."""

SYSTEM_PROMPT = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS

def create_prompt(user_query):
    # Wrap the system prompt and the user's question in the instruction tags
    instruction = f"User asks: {user_query}\n"
    prompt = B_INST + SYSTEM_PROMPT + instruction + E_INST
    return prompt.strip()

user_query = "I'm a 35-year-old male experiencing symptoms like fatigue, increased sensitivity to cold, and dry, itchy skin. Could these be indicative of hypothyroidism?"
prompt = create_prompt(user_query)
print(prompt)

# Load the quantized model; n_gpu_layers=-1 offloads all layers to the GPU
llm = Llama(model_path=MODEL_PATH, n_gpu_layers=-1)

result = llm(
    prompt=prompt,
    max_tokens=100,  # cap on response length; raise for fuller answers
    echo=False,
)

print(result['choices'][0]['text'])
|
``` |
|
|
|
Example output:
|
```bash |
|
Hi, thank you for your query. |
|
Hypothyroidism is characterized by fatigue, sensitivity to cold, weight gain, depression, hair loss and mental dullness. I would suggest that you get a complete blood count with thyroid profile including TSH (thyroid stimulating hormone), free thyroxine level, and anti-thyroglobulin antibodies. These tests will help in establishing the diagnosis of hypothyroidism. |
|
If there is no family history of autoimmune disorders, then it might be due |
|
``` |
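
The answer above is cut off mid-sentence because the example caps generation at `max_tokens=100`. For complete answers, raise the limit; llama-cpp-python can also stream tokens as they are generated, a sketch reusing the `llm` and `create_prompt` defined above:

```python
# Stream the completion token by token instead of waiting for the full answer
for chunk in llm(
    prompt=create_prompt(user_query),
    max_tokens=512,  # allow a longer, complete answer
    echo=False,
    stream=True,     # yield partial results as they are generated
):
    print(chunk["choices"][0]["text"], end="", flush=True)
```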
|
|
|
|
|
## License |
|
|
|
This model is licensed under the Apache License 2.0. You can find the full license in the LICENSE file. |