---
library_name: peft
base_model: yahma/llama-7b-hf
language:
- en
pipeline_tag: text-generation
tags:
- text-generation-inference
---
# About :
AlpaRA 7B is a model for medical dialogue understanding. It was fine-tuned with the Alpaca configuration on a curated dataset of 5,000 instructions capturing the nuances of patient-doctor conversations. Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA) keeps the model trainable and runnable on consumer-grade GPUs.
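The exact LoRA hyperparameters are not listed on this card; the snippet below is a minimal sketch, assuming typical Alpaca-LoRA settings (rank 16, adapters on the attention projections), of how such an adapter could be configured with `peft`. The values are illustrative, not the settings actually used to train AlpaRA.
```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import LlamaForCausalLM

# Assumed hyperparameters for illustration only; the values used to train
# AlpaRA may differ.
lora_config = LoraConfig(
    r=16,                                 # low-rank dimension
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

base = LlamaForCausalLM.from_pretrained(
    "yahma/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)  # make the 8-bit model trainable
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```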
## How to Use :
## Load the AlpaRA model
```python
from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig

# Load the tokenizer and the 8-bit quantized base model.
tokenizer = LlamaTokenizer.from_pretrained("yahma/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained(
    "yahma/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)

# Attach the AlpaRA LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, "KalbeDigitalLab/alpara-7b-peft")
```
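Optionally, the adapter weights can be merged into the base model for slightly faster inference. This is a sketch of a standard `peft` workflow, not a step required by this card; note that merging generally assumes the base model was loaded without 8-bit quantization.
```python
# Optional: fold the LoRA weights into the base model.
# Assumes the base model was loaded in full or half precision (not 8-bit).
merged_model = model.merge_and_unload()
```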
## Prompt Template :
Feel free to change the instruction; keep the rest of the Alpaca-style template unchanged.
```python
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
"how to cure flu?"
### Response:"""
```
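For convenience, the template can be wrapped in a small helper. `build_prompt` below is a hypothetical function added here for illustration; it is not part of the released code.
```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style template shown above."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n"
        "### Instruction:\n"
        f"{instruction}\n"
        "### Response:"
    )

PROMPT = build_prompt("how to cure flu?")
```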
## Evaluation
```python
# Tokenize the prompt and move the input ids to the GPU.
inputs = tokenizer(PROMPT, return_tensors="pt")
input_ids = inputs["input_ids"].cuda()

print("Generating...")
generation_output = model.generate(
    input_ids=input_ids,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=512,
)

# Decode the generated tokens and keep only the text after "### Response:".
for s in generation_output.sequences:
    result = tokenizer.decode(s).split("### Response:")[1]
    print(result)
``` |
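`GenerationConfig` is imported above but not used in the snippet. If you want less deterministic output, sampling parameters can be passed to `generate` through it. The values below are illustrative defaults, not settings recommended by this card.
```python
# Illustrative sampling settings; tune them for your use case.
generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    max_new_tokens=512,
)
```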