ibm-ai-platform/micro-g3.3-8b-instruct-1b

mwjohnson committed 24 days ago

Commit 6573b3b · verified · 1 Parent(s): be6dc55

Update README.md

Files changed (1)

README.md +52 -3

@@ -1,3 +1,52 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+base_model:
+- ibm-granite/granite-3.3-8b-instruct
+---
+# Micro-G3.3-8B-Instruct-1B
+**Model Summary:**
+Micro-G3.3-8B-Instruct-1B is a 1-billion-parameter micro language model fine-tuned for reasoning and instruction following. Built on top of Granite-3.3-8B-Instruct with only 3 hidden layers, it is trained to maximize performance and hardware compatibility at minimal compute cost.
+**Generation:**
+This is a simple example of how to use the Micro-G3.3-8B-Instruct-1B model.
+Install the following libraries:
+```shell
+pip install torch torchvision torchaudio
+pip install accelerate
+pip install transformers
+```
+Then copy and run the snippet below:
+```python
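+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
+
+model_path = "ibm-ai-platform/micro-g3.3-8b-instruct-1b"
+device = "cuda"  # assumes a CUDA-capable GPU is available
+
+# Load the weights in bfloat16 directly onto the target device.
+model = AutoModelForCausalLM.from_pretrained(
+    model_path,
+    device_map=device,
+    torch_dtype=torch.bfloat16,
+)
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+
+# Build the prompt with the model's chat template. return_dict=True yields
+# both input_ids and attention_mask, so the result is unpacked into generate().
+conv = [{"role": "user", "content": "What is your favorite color?"}]
+inputs = tokenizer.apply_chat_template(
+    conv,
+    return_tensors="pt",
+    thinking=True,  # extra keyword passed through to the chat template
+    return_dict=True,
+    add_generation_prompt=True,
+).to(device)
+
+set_seed(42)  # fix the random seed so any sampling is reproducible
+output = model.generate(
+    **inputs,
+    max_new_tokens=8,  # very short cap; raise it for longer replies
+)
+
+# generate() returns the prompt followed by the new tokens; decode only the new ones.
+prediction = tokenizer.decode(output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+print(prediction)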
+```
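
The final `tokenizer.decode` call in the README snippet slices the generated sequence at the prompt length, because `generate()` on a causal language model returns the prompt tokens followed by the newly generated ones. The following is a minimal sketch of that slicing step only, using plain Python lists as stand-ins for tensors. The helper name `strip_prompt` and all token IDs are hypothetical and chosen purely for illustration; they are not part of the README or of the `transformers` API.

```python
# Sketch of the prompt-stripping step, with plain lists standing in for tensors.
# The token IDs below are made up; they are not real vocabulary entries.

def strip_prompt(generated_ids, prompt_length):
    """Return only the tokens that come after the first prompt_length tokens."""
    return generated_ids[prompt_length:]

prompt_ids = [101, 7, 42, 9]               # hypothetical tokenized chat prompt
generated_ids = prompt_ids + [55, 56, 57]  # prompt followed by the new tokens

new_tokens = strip_prompt(generated_ids, len(prompt_ids))
print(new_tokens)
```

In the README snippet the same idea appears as a tensor slice along the sequence dimension, with the prompt length taken from the shape of the tokenized input.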