---
license: llama3.2
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---
This model is a 4-bit GPTQ-quantized version of meta-llama/Llama-3.2-1B-Instruct.
The code used to produce it is as follows:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "meta-llama/Llama-3.2-1B-Instruct"

# The tokenizer is needed so GPTQ can tokenize the calibration dataset.
tokenizer = AutoTokenizer.from_pretrained(model_id)

quantization_config = GPTQConfig(
    bits=4,           # quantize weights to 4 bits
    group_size=128,   # quantization group size
    dataset="c4",     # calibration dataset
    desc_act=False,   # no activation-order (act-order) quantization
    tokenizer=tokenizer,
)

# Quantization runs while the model is loaded.
quant_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)
```