---
license: apache-2.0
language:
- en
base_model:
- Qwen/QwQ-32B-Preview
pipeline_tag: text-generation
tags:
- chat
- qwen2
---
# QwQ-32B-Preview-bnb-4bit

## Introduction

QwQ-32B-Preview-bnb-4bit is a 4-bit quantized version of the [QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) model, produced with the bitsandbytes (bnb) library. Quantizing the weights to 4 bits substantially reduces the model's size and memory footprint, making it easier to deploy on resource-constrained hardware.
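As an illustration of what 4-bit loading with bitsandbytes looks like in practice, here is a minimal sketch using the `transformers` library. It is not necessarily how this checkpoint was produced: the `nf4` quantization type and the `bfloat16` compute dtype are assumptions, and `MODEL_ID` points at the base model named in this card, so substitute this repository's id to load the pre-quantized weights instead. The helper name `load_4bit` is hypothetical.

```python
# Base model named in this card. Assumption: replace with this repository's
# id to load the already-quantized weights instead of quantizing on the fly.
MODEL_ID = "Qwen/QwQ-32B-Preview"


def load_4bit(model_id: str = MODEL_ID):
    """Load a causal LM with 4-bit bitsandbytes quantization (sketch)."""
    # Imported lazily so this snippet can be read and imported without
    # the heavy dependencies installed.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Assumed settings: NF4 quantization with bfloat16 compute.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # place layers on the available devices
    )
    return tokenizer, model
```

Calling `load_4bit()` downloads the weights and requires a CUDA-capable GPU with the `bitsandbytes` and `accelerate` packages installed.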

## Model Details

- **Quantization:** 4-bit, using bitsandbytes (bnb)
- **Base Model:** [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview)
- **Parameters:** 32.5 billion
- **Context Length:** Up to 32,768 tokens
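To give a rough sense of the size reduction, the parameter count above can be turned into a back-of-the-envelope estimate of weight storage. This sketch assumes every parameter is stored at the stated bit width and that the unquantized baseline uses 16-bit weights. Real checkpoints are somewhat larger, since bitsandbytes stores quantization constants and typically leaves some layers unquantized, and the figure excludes activations and the KV cache. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 32.5e9  # parameter count listed in this card

bf16_gb = weight_memory_gb(PARAMS, 16)  # assumed 16-bit baseline
int4_gb = weight_memory_gb(PARAMS, 4)   # idealized 4-bit weights

print(f"16-bit weights: ~{bf16_gb:.2f} GB")  # ~65.00 GB
print(f" 4-bit weights: ~{int4_gb:.2f} GB")  # ~16.25 GB
```

Under these assumptions the weights shrink by a factor of four, from about 65 GB to about 16 GB.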