---
title: Llama Hqq 1 Bit
emoji: 📊
colorFrom: green
colorTo: pink
sdk: gradio
sdk_version: 4.24.0
app_file: app.py
license: llama2
train: false
inference: false
pipeline_tag: text-generation
---

Demo of an HQQ 1-bit quantized (binary weights) Llama2-7B-chat model that uses a low-rank adapter to recover some of the quality lost to quantization (this combination is referred to as HQQ+).
You will need a GPU to run the demo.
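
To make the idea concrete, here is a minimal pure-Python sketch of 1-bit group-wise quantization followed by a low-rank correction. It is an illustration under assumptions, not the HQQ implementation: the group size of 8 is inferred from `gs8` in the model name, the scale and zero-point are taken from a plain min/max fit rather than HQQ's optimizer, and all function names (`quantize_group_1bit`, `dequantize_group`, `forward_with_adapter`) are hypothetical.

```python
# Illustrative sketch only: min/max 1-bit quantization, not HQQ's actual solver.
GROUP_SIZE = 8  # assumption: "gs8" in the model name means group size 8


def quantize_group_1bit(group):
    """Map each weight to a bit (0 or 1) with one scale and zero-point per group."""
    lo, hi = min(group), max(group)
    scale = hi - lo
    if scale == 0.0:
        # All weights are equal; any bit pattern reconstructs them via the zero-point.
        return [0] * len(group), 1.0, lo
    bits = [1 if (w - lo) / scale >= 0.5 else 0 for w in group]
    return bits, scale, lo


def dequantize_group(bits, scale, zero):
    """Reconstruct approximate weights: bit 0 -> zero, bit 1 -> zero + scale."""
    return [zero + scale * b for b in bits]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def forward_with_adapter(w_deq, lora_a, lora_b, x):
    """Compute y = W_deq x + B (A x), the usual low-rank adapter form."""
    base = matvec(w_deq, x)
    delta = matvec(lora_b, matvec(lora_a, x))
    return [b + d for b, d in zip(base, delta)]


if __name__ == "__main__":
    weights = [-0.4, -0.1, 0.3, 0.5, -0.5, 0.2, 0.1, -0.3]  # one group of 8
    bits, scale, zero = quantize_group_1bit(weights)
    print("bits:", bits)
    print("reconstructed:", dequantize_group(bits, scale, zero))
```

Each group of 8 weights is stored as 8 bits plus one scale and one zero-point, so every weight collapses to one of two values. The adapter matrices `lora_a` and `lora_b` stay in higher precision and add a small correction on top of the binary weights.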

Model: https://huggingface.co/mobiuslabsgmbh/Llama-2-7b-chat-hf_1bitgs8_hqq