README.md · MrOvkill/Phi-3-Instruct-Bloated at main

Phi-3-Instruct-Bloated / README.md

MrOvkill

Update README.md

0ffc2ad verified 10 months ago

preview code

raw

history blame contribute delete

2.12 kB

	---
	tags:
	- merge
	- mergekit
	- lazymergekit
	- microsoft/Phi-3-mini-128k-instruct
	- NexaAIDev/Octopus-v4
	base_model:
	- microsoft/Phi-3-mini-128k-instruct
	- NexaAIDev/Octopus-v4
	license: mit
	language:
	- en
	library_name: transformers
	pipeline_tag: text-generation
	---

	# Phi-3-Instruct-Bloated

	Phi-3-Instruct-Bloated is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
	* [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)
	* [NexaAIDev/Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4)

	## 🧩 Configuration

	```yaml
	slices:
	- sources:
	- model: microsoft/Phi-3-mini-128k-instruct
	layer_range: [0, 32]
	- model: NexaAIDev/Octopus-v4
	layer_range: [0, 32]
	merge_method: slerp
	base_model: microsoft/Phi-3-mini-128k-instruct
	parameters:
	t:
	- filter: self_attn
	value: [0, 0.5, 0.3, 0.7, 1]
	- filter: mlp
	value: [1, 0.5, 0.7, 0.3, 0]
	- value: 0.5
	dtype: bfloat16
	```

	## 💻 Usage

	```python
	# Installation
	!pip install -qU transformers accelerate

	# Imports
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	# Loading
	tokenizer = AutoTokenizer.from_pretrained("MrOvkill/Phi-3-Instruct-Bloated")
	model = AutoModelForCausalLM.from_pretrained("MrOvkill/Phi-3-Instruct-Bloated")

	# Completion function
	def infer(prompt, **kwargs):
	inputs = tokenizer(prompt, return_tensors="pt")
	with torch.no_grad():
	outputs = model.generate(inputs, kwargs)
	return tokenizer.decode(outputs[0], skip_special_tokens=True)

	# Some silliness
	infer("<\|user\|>\nBen is going to the store for some Ice Cream. So is Jerry. They mix up the ice cream at the store. Is the ice cream: (a. Ben's (b. Jerry's (c. Ben and Jerry's <\|end\|>\n<\|assistant\|>\nMy answer is (", max_new_tokens=1024)

	# A proper test
	infer(
	"""
	<\|user\|>
	Explain what a Mixture of Experts is in less than 100 words.
	<\|assistant\|>
	""",
	max_new_tokens=1024,
	do_sample=False,
	temperature=0.0,
	top_k=50,
	top_p=0.89,
	)
	```