Phi-3-22b is a depth-upsampled version of the 14b [Phi-3-medium-128k-instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct). We removed the bottom 8 layers of one copy of the 14b model and the top 8 layers of another copy, then stacked the remaining layers. We plan to do continued pretraining to improve performance.
Since this model has not yet undergone continued pretraining, output quality may vary.
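The layer surgery above can be sketched in a few lines. This is an illustrative sketch, not the actual merge script; the helper name and the 40-decoder-layer count for Phi-3-medium are assumptions.

```python
# Illustrative sketch of the depth upsampling (not the actual merge script).
# Assuming Phi-3-medium has 40 decoder layers, trimming 8 from each copy and
# stacking the remainders yields 64 layers, roughly a 22b-parameter model.
def depth_upsample(layers_a, layers_b, n_trim=8):
    bottom = layers_a[: len(layers_a) - n_trim]  # copy A with its top 8 layers removed
    top = layers_b[n_trim:]                      # copy B with its bottom 8 layers removed
    return bottom + top

# stand-in lists of layer indices instead of real nn.Module layers
merged = depth_upsample(list(range(40)), list(range(40)))
print(len(merged))  # 64
```

Note the two halves overlap: layers 8-31 of the base model appear twice in the stacked network, which is why continued pretraining is needed to smooth the seam.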

Loading:
```
!pip install flash-attn --no-build-isolation
!pip install peft bitsandbytes accelerate transformers
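# peft and bitsandbytes are optional here: they are only needed if you want to
# load the model quantized instead of with torch_dtype="auto" below, e.g.
# (untested sketch):
#   from transformers import BitsAndBytesConfig
#   model = AutoModelForCausalLM.from_pretrained("ontocord/phi-3-22b",
#       quantization_config=BitsAndBytesConfig(load_in_4bit=True),
#       trust_remote_code=True)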
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("ontocord/phi-3-22b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ontocord/phi-3-22b",
    torch_dtype="auto", device_map="auto", trust_remote_code=True)
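# The test below builds the prompt string by hand; Phi-3's chat format wraps a
# turn as "<|user|>\n{message}<|end|>\n<|assistant|>\n". A small helper
# (hypothetical, not part of the model repo) that produces the same string:
def phi3_prompt(user_message):
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"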

# basic test: generate a short completion
with torch.no_grad():
    print(tokenizer.batch_decode(model.generate(**tokenizer("<|user|>\nHow to explain Internet for a medieval knight?<|end|>\n<|assistant|>\n", return_tensors="pt").to('cuda'), max_new_tokens=128, use_cache=True))[0])
```
Will produce:
```
<|user|> How to explain Internet for a medieval knight?<|end|><|assistant|> Ah, noble knight, let me attempt to explain this mystical realm known as the Internet in terms that might resonate with your medieval understanding.