
text-generation-webui: AttributeError: 'Offload_LlamaModel' object has no attribute 'preload', when trying to generate text

#21
by hpnyaggerman - opened

Traceback (most recent call last):
  File "D:\oobabooga\text-generation-webui\modules\callbacks.py", line 66, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "D:\oobabooga\text-generation-webui\modules\text_generation.py", line 290, in generate_with_callback
    shared.model.generate(**kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
    return self.sample(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
    outputs = self(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\models\llama\modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\oobabooga\text-generation-webui\repositories\GPTQ-for-LLaMa\llama_inference_offload.py", line 135, in forward
    if idx <= (self.preload - 1):
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Offload_LlamaModel' object has no attribute 'preload'

Note: this happens when trying to load the GPTQ safetensors model.
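The failing line in llama_inference_offload.py reads self.preload, which only a matching version of the loader ever assigns; on a stale checkout the attribute is never set, so the lookup raises. A minimal sketch of the same failure mode (hypothetical names, not the actual GPTQ-for-LLaMa code):

```python
# Hypothetical minimal reproduction of the error (not the real
# GPTQ-for-LLaMa code): forward() reads self.preload, but if the
# attribute was never set in __init__, Python raises AttributeError.
class OffloadModelSketch:
    def __init__(self, preload=None):
        # A matching loader version passes a preload layer count;
        # a mismatched/outdated one never sets the attribute at all.
        if preload is not None:
            self.preload = preload

    def forward(self, idx):
        # Mirrors the failing check: `if idx <= (self.preload - 1):`
        return idx <= (self.preload - 1)


print(OffloadModelSketch(preload=20).forward(5))   # works: True
try:
    OffloadModelSketch().forward(5)                # outdated repo scenario
except AttributeError as e:
    print(type(e).__name__, e)
```

A defensive band-aid would be reading the attribute with `getattr(self, "preload", 0)` (my suggestion, not an official patch), but updating the repo as described later in this thread is the real fix.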

Try the ooba fork of GPTQ-for-LLaMa. Do not use the new upstream one.

That doesn't seem to help, unless I'm using the wrong version. I'm using https://github.com/oobabooga/GPTQ-for-LLaMa/

That is definitely the right one. I only have this set up on Linux, though.

Worked for me after I downloaded the config.json from the other folder.

What folder? Where? How?

https://huggingface.co/reeducator/vicuna-13b-free/tree/main/hf-output

It also has FP16 weights, so you can convert it to whatever you want, e.g. act-order with no group size.

The issue for me was that I was using an outdated GPTQ-for-LLaMa repo. I checked the readme, and it says to delete that folder before updating.

For anyone who needs instructions:

Windows: delete the GPTQ-for-LLaMa folder (located at text-generation-webui/repositories/), then run update_windows.bat if you used the Windows one-click installer.

Linux: delete the same folder, then clone the latest oobabooga fork back into the repositories folder with:

git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda

cd into the newly cloned directory, then:

python -m pip install -r requirements.txt
