license: mit

Model Summary

Phi-3-22b is a depth-upsampled version of the 14b Phi-3-medium-128k-instruct. We took two copies of the 14b model, removed the bottom 8 layers from one and the top 8 layers from the other, and stacked the results. We plan to do continued pretraining to improve performance. Because this model has not yet undergone continued pretraining, output quality may vary.
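The layer bookkeeping behind the stacking described above can be sketched in a few lines. This is only an illustration, not the script used to build the model: the helper name `depth_upsample_indices` is hypothetical, the figure of 40 decoder layers for the 14b base is an assumption not stated in this card, and so is the reading that "bottom" means the layers closest to the input and that the copy missing its top layers is placed first.

```python
def depth_upsample_indices(n_layers: int, n_drop: int) -> list[int]:
    """Return the source-layer index for each layer of the stacked model.

    Hypothetical helper: assumes layer 0 is the "bottom" (closest to the
    input) and that the copy with its top layers removed is placed first.
    """
    # Copy A: drop the top n_drop layers, keep layers 0 .. n_layers - n_drop - 1
    lower = list(range(0, n_layers - n_drop))
    # Copy B: drop the bottom n_drop layers, keep layers n_drop .. n_layers - 1
    upper = list(range(n_drop, n_layers))
    return lower + upper


# Assumption: the 14b base model has 40 decoder layers.
indices = depth_upsample_indices(n_layers=40, n_drop=8)
print(len(indices))  # 64 layers in the stacked model under this assumption
```

Under these assumptions, each copy contributes 32 layers, and the 24 middle layers of the base model (indices 8 to 31) appear twice in the stack.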

!pip install transformers accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# trust_remote_code=True lets transformers run the model code shipped with the checkpoint
tokenizer = AutoTokenizer.from_pretrained("ontocord/phi-3-22b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "ontocord/phi-3-22b",
    torch_dtype="auto", device_map="auto", trust_remote_code=True,
)
prompt = "<|user|>\nHow to explain Internet for a medieval knight?<|end|>\n<|assistant|>\n"
# Move the tokenized prompt to the device that holds the model's first layers
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128, use_cache=True)
print(tokenizer.batch_decode(outputs)[0])

This will produce:

<|user|> How to explain Internet for a medieval knight?<|end|><|assistant|> Ah, noble knight, let me attempt to explain this mystical realm known as the Internet in terms that might resonate with your medieval understanding.

Imagine, if you will, a vast kingdom stretching beyond the horizon, where countless villages, towns, and cities are connected by a network of roads, bridges, and pathways. This kingdom is not bound by physical borders, but instead, it exists in a realm beyond our own, accessible only through magical devices known as computers, tablets, and smartphs.

In this kingdom, information flows like a mighty river,...
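The prompt in the example above wraps the user message in `<|user|>`, `<|end|>`, and `<|assistant|>` markers. A small helper can build that string for a single user turn; this is a minimal sketch that mirrors only the layout shown in this card. The name `build_prompt` is hypothetical, and system prompts and multi-turn conversations are not covered.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn layout used in this card.

    Hypothetical helper: it reproduces only the format shown in the
    example above and does not handle system or multi-turn messages.
    """
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"


prompt = build_prompt("How to explain Internet for a medieval knight?")
print(repr(prompt))
```

The resulting string can be passed to the tokenizer in place of the hard-coded prompt.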

See the Phi-3-medium-128k-instruct model card for more details.