---
license: apache-2.0
base_model:
- ibm-granite/granite-3.3-8b-instruct
---
|
|
|
# Micro-G3.3-8B-Instruct-1B |
|
|
|
**Model Summary:** |
|
Micro-G3.3-8B-Instruct-1B is a 1-billion-parameter micro language model fine-tuned for reasoning and instruction following. Built on top of Granite-3.3-8B-Instruct but with only 3 hidden layers, the model is trained to maximize performance and hardware compatibility at minimal compute cost.
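
As a quick sanity check of the architecture, you can inspect the published configuration and count the parameters with `transformers`. The sketch below assumes the checkpoint lives at `ibm-ai-platform/micro-g3.3-8b-instruct-1b` (the repository used in the generation example further down):

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_path = "ibm-ai-platform/micro-g3.3-8b-instruct-1b"

# Fetching the config is cheap: no weights are downloaded here.
config = AutoConfig.from_pretrained(model_path)
print(config.num_hidden_layers)  # expected: 3

# Loading the weights lets us count parameters directly (~1B expected).
model = AutoModelForCausalLM.from_pretrained(model_path)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```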
|
|
|
**Generation:** |
|
This is a simple example of how to use the Micro-G3.3-8B-Instruct-1B model.
|
|
|
Install the following libraries: |
|
|
|
```shell
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
```
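
The generation snippet below places the model on a GPU (`device="cuda"`). If you want to confirm that PyTorch can see an accelerator first, a quick check like the following will do (an illustrative sketch, not part of the required setup):

```python
import torch

# Fall back to CPU if no CUDA device is visible; bfloat16 generation
# still works on CPU but is substantially slower.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
```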
|
Then, run the snippet below.
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
import torch

model_path = "ibm-ai-platform/micro-g3.3-8b-instruct-1b"
device = "cuda"

# Load the model in bfloat16 and place it on the GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map=device,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# A single-turn conversation in the chat format the tokenizer expects.
conv = [{"role": "user", "content": "What is your favorite color?"}]

# thinking=True enables the model's reasoning mode via the chat template.
input_ids = tokenizer.apply_chat_template(
    conv,
    return_tensors="pt",
    thinking=True,
    return_dict=True,
    add_generation_prompt=True,
).to(device)

set_seed(42)
output = model.generate(
    **input_ids,
    max_new_tokens=8,
)

# Decode only the newly generated tokens, skipping the prompt.
prediction = tokenizer.decode(
    output[0, input_ids["input_ids"].shape[1]:], skip_special_tokens=True
)
print(prediction)
```
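
If you prefer the higher-level `pipeline` API, an equivalent call looks roughly like this. This is a sketch under the same assumptions as above; the pipeline applies the chat template internally when given a list of messages, though the `thinking` flag from the snippet above is not passed here:

```python
from transformers import pipeline
import torch

# Build a text-generation pipeline around the same checkpoint.
pipe = pipeline(
    "text-generation",
    model="ibm-ai-platform/micro-g3.3-8b-instruct-1b",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

conv = [{"role": "user", "content": "What is your favorite color?"}]
result = pipe(conv, max_new_tokens=8)

# For chat input, generated_text is the conversation with the
# assistant's reply appended as the last message.
print(result[0]["generated_text"][-1]["content"])
```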
|
|
|
|