---
license: mit
tags:
- text-generation
- quantized
- bitsandbytes
- deepseek
- 4bit
---

# Quantized DeepSeek-R1-Distill-Qwen-1.5B

![Model Preview](https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true)

This is a **4-bit quantized version** of the [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) model, quantized with `bitsandbytes`.

## Model Details

- **Base Model:** `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
- **Quantization:** 4-bit (`NF4`) with double quantization
- **Library:** [bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
- **Framework:** `transformers`
- **Use Case:** Text generation, chatbot applications, and other NLP tasks

## How to Load the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "Deepak7376/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit"

# 4-bit NF4 quantization config with double quantization enabled
bnb_config_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config_4bit)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=1024,
    truncation=True,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

messages = [
    {"role": "user", "content": "Suggest some of the top movies of 2021."},
]
pipe(messages)
```

Or, more simply:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Deepak7376/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit")
messages = [
    {"role": "user", "content": "Suggest some of the top movies of 2021."},
]
pipe(messages)
```

## Model Performance

Quantizing the model to 4 bits cuts its weight memory footprint by more than half while keeping generation quality close to the base model. Approximate memory footprints are listed below; see the sketches appended at the end of this card for how these numbers can be verified.

| Model Version     | Memory Usage |
|-------------------|--------------|
| Base Model (FP16) | ~3.5 GB      |
| 4-bit Quantized   | ~1.5 GB      |

## License

This model is distributed under the `mit` license, inherited from the base [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) model.

## Acknowledgments

- [DeepSeek-AI](https://huggingface.co/deepseek-ai) for the original model.
- [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) for quantization support.
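
## Verifying the Memory Footprint

As a sanity check on the numbers in the performance table, the footprint of the quantized weights can be read directly from the loaded model. This is a minimal sketch using the standard `get_memory_footprint()` method from `transformers`; the repo id matches the loading examples above, and the exact figure you see will vary slightly with library versions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Deepak7376/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit"

# Same 4-bit NF4 config as in the loading example above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Report the weight (and buffer) memory footprint in GB.
print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")
```

Loading the base model without a quantization config and printing the same quantity should yield roughly the FP16 figure from the table.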
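
## Reproducing the Quantization

The exact commands used to produce this checkpoint are not documented on this card, but a 4-bit `bitsandbytes` export of the base model can be reproduced along the following lines. The settings mirror the `BitsAndBytesConfig` shown in the loading example, and the target repo id `your-username/...` is a placeholder; serializing 4-bit bitsandbytes weights requires a recent `transformers` release and a logged-in `huggingface_hub` session.

```python
# Hypothetical reproduction sketch; the author's actual export steps are assumed,
# not documented, and the config below matches the one shown earlier on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Passing a quantization config quantizes the weights on the fly at load time.
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Push the quantized weights and tokenizer to the Hub (placeholder repo id).
model.push_to_hub("your-username/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit")
tokenizer.push_to_hub("your-username/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit")
```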