---
license: mit
tags:
- text-generation
- quantized
- bitsandbytes
- deepseek
- 4bit
---

# Quantized DeepSeek-R1-Distill-Qwen-1.5B

![Model Preview](https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true)

This is a **4-bit quantized version** of the [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) model, quantized with `bitsandbytes`.

## Model Details

- **Base Model:** `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
- **Quantization:** 4-bit (`NF4`) with double quantization
- **Library:** [bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
- **Framework:** `transformers`
- **Use Case:** Text generation, chatbot applications, and other NLP tasks

## How to Load the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "Deepak7376/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit"

# 4-bit NF4 quantization config with double quantization enabled
bnb_config_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config_4bit)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=1024,
    truncation=True,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

messages = [
    {"role": "user", "content": "Suggest some of the top movies of 2021."},
]
pipe(messages)
```

Or, more simply:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Deepak7376/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit")
messages = [
    {"role": "user", "content": "Suggest some of the top movies of 2021."},
]
pipe(messages)
```

## Model Performance

Quantizing the model to 4 bits cuts its weight memory footprint by more than half while keeping generation quality close to the base model. Approximate memory footprints are listed below; see the sketches appended at the end of this card for how these numbers can be verified.

| Model Version     | Memory Usage |
|-------------------|--------------|
| Base Model (FP16) | ~3.5 GB      |
| 4-bit Quantized   | ~1.5 GB      |

## License

This model is distributed under the `mit` license, inherited from the base [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) model.

## Acknowledgments

- [DeepSeek-AI](https://huggingface.co/deepseek-ai) for the original model.
- [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) for quantization support.
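
## Verifying the Memory Footprint

As a sanity check on the numbers in the performance table, the footprint of the quantized weights can be read directly from the loaded model. This is a minimal sketch using the standard `get_memory_footprint()` method from `transformers`; the repo id matches the loading examples above, and the exact figure you see will vary slightly with library versions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Deepak7376/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit"

# Same 4-bit NF4 config as in the loading example above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Report the weight (and buffer) memory footprint in GB.
print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")
```

Loading the base model without a quantization config and printing the same quantity should yield roughly the FP16 figure from the table.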
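
## Reproducing the Quantization

The exact commands used to produce this checkpoint are not documented on this card, but a 4-bit `bitsandbytes` export of the base model can be reproduced along the following lines. The settings mirror the `BitsAndBytesConfig` shown in the loading example, and the target repo id `your-username/...` is a placeholder; serializing 4-bit bitsandbytes weights requires a recent `transformers` release and a logged-in `huggingface_hub` session.

```python
# Hypothetical reproduction sketch; the author's actual export steps are assumed,
# not documented, and the config below matches the one shown earlier on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Passing a quantization config quantizes the weights on the fly at load time.
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Push the quantized weights and tokenizer to the Hub (placeholder repo id).
model.push_to_hub("your-username/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit")
tokenizer.push_to_hub("your-username/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit")
```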