Model fails to load with AutoModelForCausalLM
I see that the model was updated about 16 hours ago. Loading it with AutoModelForCausalLM now fails with the traceback below. Could you take a look?
>>> model = AutoModelForCausalLM.from_pretrained("Xenova/llama2.c-stories15M", device_map="auto")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 262, in _wrapper
return func(*args, **kwargs)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4397, in from_pretrained
dispatch_model(model, **device_map_kwargs)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/accelerate/big_modeling.py", line 496, in dispatch_model
model.to(device)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3162, in to
return super().to(*args, **kwargs)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1343, in to
return self._apply(convert)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
module._apply(fn)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
module._apply(fn)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 930, in _apply
param_applied = fn(param)
File "/home/gohashi/tmp2/llm-compressor/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1336, in convert
raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
Confirmed that the parameter left on the meta device is lm_head.weight, and that the error does not occur when setting use_safetensors=False.
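For reference, the corresponding call with that workaround applied (same model and device_map as above, just disabling safetensors so the .bin weights are used instead):

>>> model = AutoModelForCausalLM.from_pretrained("Xenova/llama2.c-stories15M", device_map="auto", use_safetensors=False)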
>>> model = AutoModelForCausalLM.from_pretrained("Xenova/llama2.c-stories15M")
seems to work for me (without device_map). Strange.
Indeed that seems to be it!
From what I can tell, the issue is that this model does not include the embed_tokens.weight tensor in its state dict. As a result, the model is loaded with lm_head.weight on the execution device but embed_tokens.weight on the meta device.
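One way to double-check this is to list the tensor names stored in the safetensors checkpoint directly; a quick sketch, assuming the repo's single shard is named model.safetensors:

>>> from huggingface_hub import hf_hub_download
>>> from safetensors import safe_open
>>> path = hf_hub_download("Xenova/llama2.c-stories15M", "model.safetensors")  # assumed shard name
>>> with safe_open(path, framework="pt") as f:
...     print(sorted(f.keys()))  # the embed_tokens weight should be missing from this list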
Later, when model.tie_weights() is called, the output embeddings weight (lm_head.weight) is tied to the input embeddings weight (embed_tokens.weight), leaving both lm_head and embed_tokens on the meta device.
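Here is a minimal sketch of that tying step in isolation (the module shapes and names are illustrative, not the actual transformers code path):

# embed_tokens.weight is missing from the checkpoint, so it is left on the meta device;
# lm_head.weight is present, so it lands on the execution device.
import torch.nn as nn

embed_tokens = nn.Embedding(32000, 288, device="meta")  # illustrative vocab/hidden sizes
lm_head = nn.Linear(288, 32000, bias=False)

# tie_weights() effectively does: output_embeddings.weight = input_embeddings.weight
lm_head.weight = embed_tokens.weight

print(embed_tokens.weight.device, lm_head.weight.device)  # both now report meta
# A subsequent .to(device) on these modules raises the same
# "Cannot copy out of meta tensor; no data!" error seen in the traceback above.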