Update README.md
license: apache-2.0
language:
- en
library_name: exllamav2
---

# Granite-3.1-8B-Reasoning-exl2

Original model: [granite-3.1-8b-Reasoning](https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning) by [ruslanmv](https://huggingface.co/ruslanmv)

Based on: [granite-3.1-8b-instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) by [Granite Team, IBM](https://huggingface.co/ibm-granite)

## Quants

[4bpw h6 (main)](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/main)
[4.5bpw h6](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/4.5bpw-h6)
[5bpw h6](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/8bpw-h8)
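As a rough rule of thumb, the storage needed for the quantized weights scales linearly with the bits per weight (bpw) in each branch name. A minimal sketch of that arithmetic, assuming an 8B-class parameter count (the exact figure isn't stated here) and ignoring KV-cache and activation overhead:

```python
def approx_weight_gb(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    return n_params * bpw / 8 / 1e9  # bits -> bytes -> GB

N_PARAMS = 8e9  # assumed parameter count for an 8B-class model

for bpw in (4.0, 4.5, 5.0, 6.0, 8.0):
    print(f"{bpw:>3.1f} bpw -> ~{approx_weight_gb(N_PARAMS, bpw):.1f} GB of weights")
```

Actual VRAM use is higher once context length and cache are factored in, so pick a quant with some headroom.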

## Quantization notes

Made with Exllamav2 0.2.8 and its default calibration dataset. These quants require Exllamav2 0.2.7 or newer.
They are meant to be used with apps that support exl2 models, such as TabbyAPI, Text-Generation-WebUI and others.
On Windows this requires an Nvidia RTX 2000-series or newer GPU; on Linux it can be used with Nvidia RTX or AMD ROCm cards.
Models must be fully loaded into the GPU; native RAM offloading isn't supported.
If you need RAM offloading or have some other GPU, try GGUF quants instead.
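Since the quants require Exllamav2 0.2.7 or newer, it can be worth checking the installed version before loading. A minimal sketch of the version comparison (plain string handling, not an exllamav2 API call):

```python
def meets_minimum(installed: str, minimum: str = "0.2.7") -> bool:
    """Compare dotted version strings numerically, e.g. '0.2.8' >= '0.2.7'."""
    def to_tuple(v: str):
        return tuple(int(part) for part in v.split("."))
    return to_tuple(installed) >= to_tuple(minimum)

# e.g. pass exllamav2.__version__ from your environment
print(meets_minimum("0.2.8"))
```

For real projects, `packaging.version.Version` handles pre-release suffixes that this sketch does not.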

# Original model card

# Granite-3.1-8B-Reasoning (Fine-Tuned for Advanced Reasoning)

## Model Overview

…

If you use this model in your research or applications, please cite:

```
…
year={2025},
url={https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning}
}
```