Could you provide a usage example?
#1 by simsim314 - opened
I can't get it to run.
After installing exllamav2:
!pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.16/exllamav2-0.0.16+cu118-cp310-cp310-linux_x86_64.whl
and downloading the model repository:
huggingface-cli download turboderp/dbrx-instruct-exl2 --revision "2.75bpw" --local-dir dbrx_275
I tried to run examples/chat.py:
python examples/chat.py -m "dbrx_275" -mode raw --gpu_split auto
I get this error:
-- Model: dbrx_275
-- Options: ['gpu_split: auto']
!! Warning, unknown architecture: DbrxForCausalLM
!! Loading as LlamaForCausalLM
Traceback (most recent call last):
File "/workspace/exllamav2/examples/chat.py", line 93, in <module>
model, tokenizer = model_init.init(args, allow_auto_split = True, max_output_len = 16)
File "/usr/local/lib/python3.10/dist-packages/exllamav2/model_init.py", line 82, in init
config.prepare()
File "/usr/local/lib/python3.10/dist-packages/exllamav2/config.py", line 100, in prepare
self.num_hidden_layers = read_config["num_hidden_layers"]
KeyError: 'num_hidden_layers'
Please advise how to run this model.
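For context, the KeyError in the traceback happens because the loader falls back to the Llama config schema, which expects a "num_hidden_layers" key that DBRX's config.json doesn't provide. A minimal sketch of that failure mode (the dict contents here are illustrative, not the actual DBRX config):

```python
# Llama-style configs expose "num_hidden_layers"; a DBRX-style config uses
# different key names, so the Llama fallback path fails on a plain dict lookup.
llama_cfg = {"num_hidden_layers": 32}   # illustrative values
dbrx_cfg = {"n_layers": 40}             # illustrative key name

def read_num_layers(cfg):
    # mirrors the failing line in config.prepare(): a direct lookup, no fallback
    return cfg["num_hidden_layers"]

print(read_num_layers(llama_cfg))  # works

try:
    read_num_layers(dbrx_cfg)
except KeyError as e:
    print(f"KeyError: {e}")  # same KeyError as in the traceback above
```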
You'll need a newer version of ExLlamaV2; DBRX is only supported on master, not in any released version.
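For anyone hitting the same error: a minimal sketch of replacing the release wheel with an install from master. This assumes a working CUDA/PyTorch build environment; exact build requirements may vary.

```shell
# Remove the v0.0.16 wheel, which predates DBRX support
pip uninstall -y exllamav2

# Build and install from the master branch instead
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install .
```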
simsim314 changed discussion status to closed