---
title: Llama Hqq 1 Bit
emoji: 📊
colorFrom: green
colorTo: pink
sdk: gradio
sdk_version: 4.24.0
app_file: app.py
license: llama2
train: false
inference: false
pipeline_tag: text-generation
---

Demo of the HQQ 1-bit quantized (binary weights) Llama2-7B-chat model, which uses a low-rank adapter to improve performance (referred to as HQQ+). You will need a GPU to run it.

Model: https://huggingface.co/mobiuslabsgmbh/Llama-2-7b-chat-hf_1bitgs8_hqq
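
To give a rough intuition for what "1-bit weights plus a low-rank adapter" means, here is a minimal NumPy sketch. It is not the HQQ algorithm and not how the HQQ+ adapter was obtained: it uses plain round-to-nearest quantization with a per-group scale and offset, and it builds the low-rank term from a truncated SVD of the quantization error rather than by training. The group size of 8 is an assumption read off the `1bitgs8` part of the model name, and the function names, matrix size, and rank are all hypothetical.

```python
import numpy as np


def quantize_1bit(W, group_size=8):
    """Round-to-nearest 1-bit quantization with a per-group scale and offset.

    Illustrative only: HQQ itself fits these parameters with an optimizer.
    """
    assert W.shape[1] % group_size == 0, "columns must be divisible by group_size"
    groups = W.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    # Guard against constant groups, where max == min.
    scale = np.where(w_max > w_min, w_max - w_min, 1.0)
    bits = np.clip(np.round((groups - w_min) / scale), 0, 1).astype(np.uint8)
    return bits, scale, w_min


def dequantize(bits, scale, w_min, shape):
    """Map each stored bit back to its group's low or high value."""
    return (bits * scale + w_min).reshape(shape)


def low_rank_correction(residual, rank):
    """Best rank-`rank` approximation of the residual via truncated SVD.

    Stand-in for a trained adapter; returns factors A, B with A @ B ~ residual.
    """
    U, S, Vt = np.linalg.svd(residual, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank]


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)  # toy weight matrix

bits, scale, w_min = quantize_1bit(W, group_size=8)
W_q = dequantize(bits, scale, w_min, W.shape)

A, B = low_rank_correction(W - W_q, rank=8)

err_q = np.linalg.norm(W - W_q)               # error of 1-bit weights alone
err_q_lr = np.linalg.norm(W - (W_q + A @ B))  # error with the low-rank term added

print(f"1-bit only:       {err_q:.3f}")
print(f"1-bit + low-rank: {err_q_lr:.3f}")
```

The second error is always smaller than the first here, because the truncated SVD removes the largest components of the quantization error. In the actual model the adapter is a separate set of weights shipped with the checkpoint, so this sketch only shows why adding a low-rank term on top of binary weights can help.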