RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
Which version of PyTorch should I use in this case?
Are you running it on a CPU or a GPU?
Did you push the model to GPU before running?
I have the same problem. Any idea how to deal with it?
@HassanStar I got the same error with Torch 2.1.2 on a Mac when I put the model on the CPU, but if I call torch.set_default_device("mps") to use Metal acceleration, it works just fine.
Hello everyone!
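FP16 on CPU does not work here because PyTorch (at least in the versions discussed in this thread) has no FP16 LayerNorm kernel implemented for CPU. Either load the model in float32 on the CPU, or move it to a GPU.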
Best regards,
Gustavo.
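A minimal sketch of the workaround described above, assuming the only goal is to avoid half precision on the CPU. The helper name pick_dtype is hypothetical and not part of PyTorch; it uses a small stand-alone LayerNorm instead of the full model so it runs quickly.

```python
import torch
import torch.nn as nn


def pick_dtype(device: str) -> torch.dtype:
    """Hypothetical helper: full precision on CPU, half precision elsewhere."""
    return torch.float32 if device == "cpu" else torch.float16


device = "cpu"
dtype = pick_dtype(device)

# LayerNorm in float32 runs on CPU, unlike the float16 case from the error.
layer_norm = nn.LayerNorm(8).to(device=device, dtype=dtype)
x = torch.randn(2, 8, device=device, dtype=dtype)
y = layer_norm(x)

print(y.dtype, tuple(y.shape))
```

The same idea applies when loading a model: pass a float32 dtype when the target device is the CPU.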
Did you push the model to GPU before running?
How do I do this?
Hi @andreariboni, if you have an NVIDIA GPU you can do model.to("cuda"), or if you are working on Apple silicon, do model.to("mps"). BTW, don't forget to do the same to the inputs.
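Here is a rough sketch of that advice, assuming you want one script that works on CUDA, Apple silicon, or plain CPU. The helper pick_device and the tiny stand-in model are hypothetical and only illustrate moving both the model and its inputs to the same device.

```python
import torch
import torch.nn as nn


def pick_device() -> str:
    """Hypothetical helper: prefer CUDA, then Apple's MPS backend, then CPU."""
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()

# Stand-in for the real model; it stays in float32, so it also runs on CPU.
model = nn.Sequential(nn.LayerNorm(8), nn.Linear(8, 4)).to(device)

# The inputs must live on the same device as the model.
inputs = torch.randn(2, 8).to(device)

with torch.no_grad():
    outputs = model(inputs)

print(device, tuple(outputs.shape))
```

With a Hugging Face tokenizer, the returned batch can be moved the same way, for example inputs = inputs.to(device), before calling model.generate.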
Hi all,
For CPU you can use this code, which loads the model in float32 instead of float16.
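import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Create all new tensors on the CPU by default.
torch.set_default_device("cpu")

# Load the weights in float32: the FP16 LayerNorm kernel is what fails on CPU.
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype=torch.float32, device_map="cpu", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)

# Prompt for the model to complete.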
inputs = tokenizer('''def print_prime(n):
"""
Print all primes between 1 and n
"""''', return_tensors="pt", return_attention_mask=False)
outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)