QMB15/VicUnlocked-30B-gptq-cuda

	Made by merging the following LoRA:
	https://huggingface.co/Neko-Institute-of-Science/VicUnLocked-30b-LoRA

	The merged model was then quantized to 4 bits with oobabooga's old CUDA branch of GPTQ-for-LLaMa:
	```
	python llama.py vicunlocked-30b c4 --wbits 4 --true-sequential --act-order --save_safetensors vicunlocked-30b-4bit.safetensors
	```
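
The card does not say how the LoRA was merged, so the following is only a minimal sketch of one common way to do it, assuming the adapter is in PEFT format and that a local copy of the base LLaMA 30B weights is available in Hugging Face format. The function names `merge_lora` and `quantize_command`, and every path passed to them, are hypothetical; only the quantization flags come from the command above.

```python
def merge_lora(base_model_dir: str, lora_id: str, out_dir: str) -> None:
    """Fold a LoRA adapter into its base model and save the merged weights.

    Assumption: the adapter is loadable with PEFT. The card does not
    confirm which tool was actually used for the merge.
    """
    # Imported lazily so quantize_command() works without these heavy packages.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_model_dir, torch_dtype=torch.float16
    )
    model = PeftModel.from_pretrained(base, lora_id)
    merged = model.merge_and_unload()  # adds the LoRA deltas into the base weights
    merged.save_pretrained(out_dir)
    # Save the tokenizer alongside the weights so the directory is self-contained.
    AutoTokenizer.from_pretrained(base_model_dir).save_pretrained(out_dir)


def quantize_command(model_dir: str, out_file: str) -> list[str]:
    """Build the llama.py invocation from the card as an argument list."""
    return [
        "python", "llama.py", model_dir, "c4",
        "--wbits", "4",
        "--true-sequential",
        "--act-order",
        "--save_safetensors", out_file,
    ]
```

With a hypothetical base model directory, `merge_lora("path/to/llama-30b-hf", "Neko-Institute-of-Science/VicUnLocked-30b-LoRA", "vicunlocked-30b")` would write the merged model to `vicunlocked-30b`, and `quantize_command("vicunlocked-30b", "vicunlocked-30b-4bit.safetensors")` returns the argument list for the command shown above, ready to pass to `subprocess.run` from inside the GPTQ-for-LLaMa checkout.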