---
language:
- en
license: apache-2.0
library_name: peft
tags:
- trl
- unsloth
- nlp
- code
base_model: unsloth/Phi-3-mini-4k-instruct-bnb-4bit
datasets:
- reciperesearch/dolphin-sft-v0.1-preference
pipeline_tag: text-generation
widget:
- messages:
- role: user
content: Can you provide ways to eat combinations of bananas and dragonfruits?
---
## Model Summary
Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained on the Phi-3 datasets, which include both synthetic data and filtered publicly available web data, with a focus on high-quality, reasoning-dense content.
### Chat Format
Given the nature of the training data, this fine-tune of Phi-3-Mini-4K-Instruct is best suited to prompts using the Alpaca-style instruction format shown below.
```python
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
```
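As a sanity check, the template can be filled with plain Python string formatting before any model call. The instruction text below is only an illustrative placeholder:

```python
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""

# Fill the three slots: instruction, optional input, and an empty
# response slot that the model will complete during generation.
prompt = alpaca_prompt.format("Summarize ORPO in one sentence.", "", "")
print(prompt)
```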
### Sample inference code
This code snippet shows how to quickly get started running the model on a GPU:
```bash
pip install peft transformers bitsandbytes accelerate
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in 4-bit precision (requires bitsandbytes).
model = AutoModelForCausalLM.from_pretrained(
    "rishiraj/Phi-3-mini-4k-ORPO",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained("rishiraj/Phi-3-mini-4k-ORPO")

# alpaca_prompt = ...  # You MUST copy the template from above!
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "What is a famous tall tower in Paris?",  # instruction
            "",  # input
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
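Because the model echoes the full prompt, the decoded output contains the template as well as the answer. A minimal helper (not part of the model card; `extract_response` is a hypothetical name) can strip everything before the `### Response:` marker, assuming the Alpaca template above:

```python
def extract_response(decoded: str) -> str:
    # Everything after the "### Response:" marker is the model's answer.
    return decoded.split("### Response:")[-1].strip()

# Example decoded output (hand-written here for illustration only).
decoded = """### Instruction:
What is a famous tall tower in Paris?
### Input:

### Response:
The Eiffel Tower."""
print(extract_response(decoded))
```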