File size: 749 Bytes
04fe3e6 45c53f2 04fe3e6 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
---
license: apache-2.0
language:
- en
base_model:
- Qwen/QwQ-32B-Preview
pipeline_tag: text-generation
tags:
- chat
- qwen2
---
# QwQ-32B-Preview-bnb-4bit
## Introduction
QwQ-32B-Preview-bnb-4bit is a 4-bit quantized version of the [QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) model, utilizing the Bits and Bytes (bnb) quantization technique. This quantization significantly reduces the model's size and inference latency, making it more accessible for deployment on resource-constrained hardware.
## Model Details
- **Quantization:** 4-bit using Bits and Bytes (bnb)
- **Base Model:** [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview)
- **Parameters:** 32.5 billion
- **Context Length:** Up to 32,768 tokens |