Help needed to load model
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose
n_gpu_layers = 40 # Change this value based on your model and your GPU VRAM pool.
n_batch = 256 # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
# Loading the model (LlamaCpp is LangChain's wrapper; callback_manager assumed defined earlier)
llm = LlamaCpp(
model_path=model_path,
max_tokens=256,
n_gpu_layers=n_gpu_layers,
n_batch=n_batch,
callback_manager=callback_manager,
n_ctx=1024,
verbose=False,
)
ValidationError: 1 validation error for LlamaCpp
__root__
Could not load Llama model from path: /root/.cache/huggingface/hub/models--TheBloke--Llama-2-13B-chat-GGML/snapshots/47d28ef5de4f3de523c421f325a2e4e039035bab/llama-2-13b-chat.ggmlv3.q5_1.bin. Received error fileno (type=value_error)
same problem
Same problem :(
llama.cpp and llama-cpp-python only support GGUF (not GGML) after a certain version (0.1.78 is the last llama-cpp-python release that still loads GGML), so try this:
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip -qq install --upgrade --force-reinstall llama-cpp-python==0.1.78 --no-cache-dir
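If you're not sure which build actually ended up active in the runtime after all the reinstalls, a quick check:
!pip show llama-cpp-python | grep Version
If that prints anything newer than 0.1.78, the install you're running expects GGUF files.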
I will be making GGUFs for these models tonight, so they're coming very soon
@actionpace I tried
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip -qq install --upgrade --force-reinstall llama-cpp-python==0.1.78 --no-cache-dir
with the same result :(
So we will have to wait for the GGUF versions :)
Have you tried my version in my repo?
Yup @akarshanbiswas, same result.
/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py in __init__(__pydantic_self__, **data)
339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
340 if validation_error:
--> 341 raise validation_error
342 try:
343 object_setattr(__pydantic_self__, '__dict__', values)
ValidationError: 1 validation error for LlamaCpp
__root__
Could not load Llama model from path: /root/.cache/huggingface/hub/models--akarshanbiswas--llama-2-chat-13b-gguf/snapshots/141acdcfecba05f5c0e046ee0339863fc9621004/ggml-llama-2-13b-chat-q4_k_m.gguf. Received error fileno (type=value_error)
I just do
!pip install llama-cpp-python
and then
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose
also tried with
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip -qq install --upgrade --force-reinstall llama-cpp-python==0.1.78 --no-cache-dir
from huggingface_hub import hf_hub_download

model_name_or_path = "akarshanbiswas/llama-2-chat-13b-gguf"
model_basename = "ggml-llama-2-13b-chat-q4_k_m.gguf"
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)
n_gpu_layers = 40
n_batch = 256
# Loading the model
llm = LlamaCpp(
model_path=model_path,
max_tokens=256,
n_gpu_layers=n_gpu_layers,
n_batch=n_batch,
callback_manager=callback_manager,
n_ctx=1024,
verbose=False,
)
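One thing worth ruling out is a corrupted or partial download. A small sketch (GGUF files begin with the 4-byte ASCII magic GGUF, so this just reads the file header):
# Check the file header: a valid GGUF model starts with b'GGUF'
with open(model_path, "rb") as f:
    print(f.read(4))  # expect b'GGUF'
If that prints anything else, the file on disk isn't a valid GGUF model.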
Try downloading it using a browser. Save it somewhere and pass that file path to the class.
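Something like this (the path is just an illustration; point it at wherever you saved the file):
# Hypothetical local path to a manually downloaded GGUF file
llm = LlamaCpp(
    model_path="/content/llama-2-13b-chat.Q5_K_M.gguf",
    n_ctx=1024,
)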
Same result on Colab, sorry :(
Try with:
curl -OL https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q5_1.bin
Oh, I see, you need the GGUF version
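So something like this instead (assuming TheBloke's GGUF repo keeps the matching filename):
curl -OL https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q5_K_M.gguf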
I have the same problem and haven't found a solution yet.
Fix for "Could not load Llama model from path":
Download GGUF model from this link:
https://huggingface.co/TheBloke/CodeLlama-13B-Python-GGUF
Code Example:
model_name_or_path = "TheBloke/CodeLlama-13B-Python-GGUF"
model_basename = "codellama-13b-python.Q5_K_M.gguf"
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)
Then change verbose=False to verbose=True, as in the following code (the fileno error seems to come from llama-cpp-python trying to silence llama.cpp's output via sys.stdout.fileno(), which notebook output streams don't support):
llm = LlamaCpp(
model_path=model_path,
max_tokens=256,
n_gpu_layers=n_gpu_layers,
n_batch=n_batch,
callback_manager=callback_manager,
n_ctx=1024,
verbose=True,
)
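For anyone landing here, a minimal end-to-end sketch of the fix above (assuming LangChain's LlamaCpp wrapper and its streaming stdout callback handler; the n_gpu_layers value is just an example to tune to your VRAM):
from huggingface_hub import hf_hub_download
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import LlamaCpp

# Download the GGUF file from the Hub and get its local cache path
model_path = hf_hub_download(
    repo_id="TheBloke/CodeLlama-13B-Python-GGUF",
    filename="codellama-13b-python.Q5_K_M.gguf",
)

# Stream tokens to stdout as they are generated
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path=model_path,
    max_tokens=256,
    n_gpu_layers=40,  # example value, tune to your GPU's VRAM
    n_batch=256,
    callback_manager=callback_manager,
    n_ctx=1024,
    verbose=True,  # keep True in notebooks; see the fileno note above
)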
Please @TheBloke, is there a GGUF for 7B-Chat yet? I can't seem to find one.
Thank you, @TheBloke
Thank you. This worked for me. Any ideas why this might be the case?