OPT-30B-Erebus-4bit-128g

Model description

Warning: THIS model is NOT suitable for use by minors. The model will output X-rated content.

This is a 4-bit GPTQ quantization (group size 128) of OPT-30B-Erebus. Original model: https://huggingface.co/KoboldAI/OPT-30B-Erebus

Quantization Information

Quantized with: https://github.com/0cc4m/GPTQ-for-LLaMa

python opt.py --wbits 4 models/OPT-30B-Erebus c4 --groupsize 128 --save models/OPT-30B-Erebus-4bit-128g/OPT-30B-Erebus-4bit-128g.pt
python opt.py --wbits 4 models/OPT-30B-Erebus c4 --groupsize 128 --save_safetensors models/OPT-30B-Erebus-4bit-128g/OPT-30B-Erebus-4bit-128g.safetensors
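The --wbits 4 --groupsize 128 flags mean each weight is stored in 4 bits, with a separate scale and offset per group of 128 weights. The NumPy sketch below illustrates that idea only; it is plain round-to-nearest quantization, not the actual GPTQ algorithm, which additionally corrects rounding error against calibration data (here, C4):

```python
import numpy as np

def quantize_groupwise(w, wbits=4, groupsize=128):
    """Asymmetric round-to-nearest quantization, one scale/offset per group.
    Illustrative sketch only, not the GPTQ procedure itself."""
    qmax = 2**wbits - 1                       # 15 for 4-bit
    w = w.reshape(-1, groupsize)              # one row per group
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax
    scale[scale == 0] = 1.0                   # guard against constant groups
    q = np.clip(np.round((w - wmin) / scale), 0, qmax).astype(np.uint8)
    return q, scale, wmin

def dequantize(q, scale, wmin):
    return q * scale + wmin

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale, wmin = quantize_groupwise(w)
w_hat = dequantize(q, scale, wmin).reshape(-1)
assert q.max() <= 15                          # every value fits in 4 bits
print("max reconstruction error:", np.abs(w - w_hat).max())
```

Smaller group sizes give finer-grained scales (lower error) at the cost of more stored metadata; 128 is a common middle ground.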

Example generation speed:

Output generated in 54.23 seconds (0.87 tokens/s, 47 tokens, context 44, seed 593020441)
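As a sanity check, the reported tokens/s figure follows from tokens generated divided by wall time:

```python
# Figures taken from the generation log above.
tokens = 47
seconds = 54.23
print(round(tokens / seconds, 2))  # → 0.87
```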

Command for text-generation-webui:

https://github.com/oobabooga/text-generation-webui

call python server.py --model_type gptj --model OPT-30B-Erebus-4bit-128g --chat --wbits 4 --groupsize 128 --xformers --sdp-attention

Credit

https://huggingface.co/notstoic

License

OPT-30B is licensed under the OPT-175B license, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
