---
base_model: ibm-granite/granite-3.1-8b-instruct
tags:
- text-generation
- transformers
- safetensors
- english
- granite
- text-generation-inference
- ruslanmv
- trl
- grpo
- conversational
- inference-endpoints
license: apache-2.0
language:
- en
---
# Granite-3.1-8B-Reasoning (Fine-Tuned for Advanced Reasoning)
## Model Overview
This model is a **fine-tuned version** of **ibm-granite/granite-3.1-8b-instruct**, optimized for **logical reasoning and analytical tasks**. Fine-tuning targets **structured problem-solving, long-context comprehension, and instruction following**.
- **Developed by:** [ruslanmv](https://huggingface.co/ruslanmv)
- **License:** Apache 2.0
- **Base Model:** [ibm-granite/granite-3.1-8b-instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct)
- **Fine-tuned for:** Logical reasoning, structured problem-solving, and long-context tasks
- **Training Framework:** **Unsloth & Hugging Face TRL** (Unsloth advertises roughly 2x faster training)
- **Supported Languages:** English
- **Model Size:** **8.17B params**
- **Tensor Type:** **BF16**
---
## Why Use This Model?
This **fine-tuned model** improves upon the base **Granite-3.1-8B** model by enhancing its **reasoning capabilities** while retaining its general text-generation abilities.
- ✅ **Optimized for complex reasoning tasks**
- ✅ **Enhanced long-context understanding**
- ✅ **Improved instruction-following abilities**
- ✅ **Fine-tuned for structured analytical thinking**

---
## Installation & Usage
Install the required dependencies:
```bash
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
pip install bitsandbytes  # required for the 4-bit loading used below
```
### Running the Model
Use the following Python snippet to load and generate text with **Granite-3.1-8B-Reasoning**:
```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
)

# Model and tokenizer
model_name = "ruslanmv/granite-3.1-8b-Reasoning"  # Or "ruslanmv/granite-3.1-2b-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",  # or "cuda" if you have only one GPU
    torch_dtype=torch.float16,  # float16 for faster, less memory-intensive inference
    # 4-bit quantization for lower memory usage; requires bitsandbytes
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
)

# Prepare the prompt
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Calculate pi."},
], tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)  # Move inputs to the model's device

# Sampling parameters
generation_config = GenerationConfig(
    do_sample=True,  # required for temperature and top_p to take effect
    temperature=0.8,
    top_p=0.95,
    max_new_tokens=1024,  # upper bound on the number of generated tokens
)

# Inference
with torch.inference_mode():  # disable autograd bookkeeping during generation
    outputs = model.generate(**inputs, generation_config=generation_config)

# Decode only the newly generated tokens so the prompt is not echoed back
prompt_length = inputs["input_ids"].shape[1]
output = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
print(output)
```
You will get something like:
```
<reasoning>
Pi is an irrational number, which means it cannot be exactly calculated as it has an infinite number of decimal places. However, we can approximate pi using various mathematical formulas. One of the simplest methods is the Leibniz formula for pi, which is an infinite series:
pi = 4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 +...)
This series converges to pi as more terms are added.
</reasoning>
<answer>
The exact value of pi cannot be calculated due to its infinite decimal places. However, using the Leibniz formula, we can approximate pi to a certain number of decimal places. For example, after calculating the first 500 terms of the series, we get an approximation of pi as 3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679.
</answer>
```
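
The Leibniz series quoted in the sample output converges slowly, so it is worth checking what 500 terms actually give. The sketch below is independent of the model; `leibniz_pi` is a hypothetical helper that sums the first `n_terms` of the series. The partial sum is accurate only to about `1/n_terms`, which suggests the 100-digit value in the sample answer was recalled by the model rather than computed from the series.

```python
import math


def leibniz_pi(n_terms: int) -> float:
    """Approximate pi with the first n_terms of the Leibniz series."""
    total = 0.0
    for k in range(n_terms):
        # Terms alternate in sign: 1 - 1/3 + 1/5 - 1/7 + ...
        total += (-1) ** k / (2 * k + 1)
    return 4 * total


approx = leibniz_pi(500)
error = abs(math.pi - approx)
print(f"{approx:.6f} (error {error:.6f})")  # prints: 3.139593 (error 0.002000)
```

Treat numeric claims in generated answers as unverified until they are checked this way.
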
---
## Intended Use
Granite-3.1-8B-Reasoning is designed for **tasks requiring structured and logical reasoning**, including:
- **Logical and analytical problem-solving**
- **Text-based reasoning tasks**
- **Mathematical and symbolic reasoning**
- **Advanced instruction-following**
- **Conversational AI with a focus on structured responses**
This model is particularly useful for **enterprise AI applications, research, and large-scale NLP tasks**.
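
Applications that consume the structured responses usually need to separate the reasoning from the final answer. The sketch below assumes the model followed the `<reasoning>`/`<answer>` format requested by the system prompt in the usage example. `parse_response` and `_extract` are hypothetical helpers, not part of this model or of `transformers`, and a missing section is returned as `None`.

```python
import re
from typing import Optional, Tuple


def _extract(tag: str, text: str) -> Optional[str]:
    """Return the stripped content of the first <tag>...</tag> pair, if any."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_response(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a model response into (reasoning, answer)."""
    return _extract("reasoning", text), _extract("answer", text)


sample = "<reasoning>\n2 + 2 = 4\n</reasoning>\n<answer>\n4\n</answer>"
reasoning, answer = parse_response(sample)
print(answer)  # prints: 4
```

Because generation is sampled, a response may occasionally omit or truncate a tag, so callers should handle the `None` case instead of assuming both sections exist.
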
---
## License & Acknowledgments
This model is released under the **Apache 2.0** license. It is fine-tuned from IBM’s **Granite 3.1-8B-Instruct** model. Special thanks to the **IBM Granite Team** for developing the base model.
For more details on the base model family, visit the [IBM Granite organization page on Hugging Face](https://huggingface.co/ibm-granite).
---
### Citation
If you use this model in your research or applications, please cite:
```
@misc{ruslanmv2025granite,
  title={Fine-Tuning Granite-3.1-8B for Advanced Reasoning},
  author={Ruslan M.V.},
  year={2025},
  url={https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning}
}
```