	---
	library_name: peft
	base_model: yahma/llama-7b-hf
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- text-generation-inference
	---

	# About :
	AlpaRA 7B is a model for medical dialogue understanding. It was fine-tuned using the Alpaca configuration on a curated 5,000-instruction dataset that captures nuances in patient-doctor conversations. Training used Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA), which makes the model efficient to run on consumer-grade GPUs.

	## How to Use :
	## Load the AlpaRA model

	```python
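	from peft import PeftModel
	from transformers import LlamaForCausalLM, LlamaTokenizer

	# Tokenizer of the base model named in the card metadata.
	tokenizer = LlamaTokenizer.from_pretrained("yahma/llama-7b-hf")

	# 8-bit loading requires the bitsandbytes package; device_map="auto" requires accelerate.
	model = LlamaForCausalLM.from_pretrained(
	    "yahma/llama-7b-hf",
	    load_in_8bit=True,
	    device_map="auto",
	)
	# Attach the AlpaRA LoRA adapter on top of the base weights.
	model = PeftModel.from_pretrained(model, "KalbeDigitalLab/alpara-7b-peft")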
	```
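As a rough illustration of why 8-bit loading helps on consumer-grade GPUs, the sketch below estimates the memory taken by the weights alone. It assumes a round 7 billion parameters and ignores activations, the KV cache, and the LoRA adapter, so treat the numbers as a lower bound rather than a measurement. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    return n_params * bytes_per_param / 1024**3


# Assumption: "7B" is treated as exactly 7e9 parameters for this estimate.
N_PARAMS = 7e9

fp16_gib = weight_memory_gib(N_PARAMS, 2)  # 2 bytes per weight in float16
int8_gib = weight_memory_gib(N_PARAMS, 1)  # 1 byte per weight in 8-bit

print(f"fp16 weights: {fp16_gib:.1f} GiB")
print(f"int8 weights: {int8_gib:.1f} GiB")
```

Under these assumptions the 8-bit weights take roughly half the memory of the float16 weights.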

	## Prompt Template :

	Feel free to replace the text under `### Instruction:` with your own request.

	```python
	PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.


	### Instruction:
	"how to cure flu?"

	### Response:"""
	```
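If you want to try several instructions, a small helper can fill the template for you. This is a minimal sketch, not part of the original card: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, and the sketch assumes a single blank line before `### Instruction:` and no quotation marks around the instruction.

```python
# Hypothetical helper: fills the Alpaca-style template with an instruction.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:"
)


def build_prompt(instruction: str) -> str:
    """Return the full prompt for one instruction (whitespace trimmed)."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


PROMPT = build_prompt("how to cure flu?")
print(PROMPT)
```

The resulting `PROMPT` string can be passed to the tokenizer in the same way as the hand-written one.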

	## Evaluation

	```python
	# Tokenize the prompt and move the input IDs to the GPU.
	inputs = tokenizer(
	    PROMPT,
	    return_tensors="pt",
	)
	input_ids = inputs["input_ids"].cuda()

	print("Generating...")
	generation_output = model.generate(
	    input_ids=input_ids,
	    return_dict_in_generate=True,
	    output_scores=True,
	    max_new_tokens=512,
	)
	# Decode each generated sequence and keep only the text after the response marker.
	for s in generation_output.sequences:
	    result = tokenizer.decode(s).split("### Response:")[1]
	    print(result)
	```
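The `split("### Response:")[1]` step raises an `IndexError` if the marker is missing from the decoded text, and the result may still contain the end-of-sequence token. A slightly more defensive variant could look like the sketch below. The function name `extract_response` is hypothetical, and the sketch assumes the end-of-sequence token decodes to the literal string `</s>`, as it does for the LLaMA tokenizer.

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Return the text after the first response marker, without EOS text."""
    # str.partition never raises; `found` is empty when the marker is absent.
    _, found, tail = decoded.partition(marker)
    if not found:
        # No marker: fall back to the whole decoded string.
        return decoded.strip()
    # Assumption: the EOS token appears as the literal text "</s>".
    return tail.replace("</s>", "").strip()


sample = "<s> Below is an instruction...\n\n### Response:\nRest and drink fluids.</s>"
print(extract_response(sample))
```

In the loop above, you could call `extract_response(tokenizer.decode(s))` in place of the inline `split`.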