cgus committed on
Commit 8e14ef6 · verified · 1 Parent(s): 25e2134

Update README.md

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -15,8 +15,26 @@ tags:
15
  license: apache-2.0
16
  language:
17
  - en
 
18
  ---
19
-
20
  # Granite-3.1-8B-Reasoning (Fine-Tuned for Advanced Reasoning)
21
 
22
  ## Model Overview
@@ -162,5 +180,4 @@ If you use this model in your research or applications, please cite:
162
  year={2025},
163
  url={https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning}
164
  }
165
- ```
166
-
 
15
  license: apache-2.0
16
  language:
17
  - en
18
+ library_name: exllamav2
19
  ---
20
+ # Granite-3.1-8B-Reasoning-exl2
21
+ Original model: [granite-3.1-8b-Reasoning](https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning) by [ruslanmv](https://huggingface.co/ruslanmv)
22
+ Based on: [granite-3.1-8b-instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) by [Granite Team, IBM](https://huggingface.co/ibm-granite)
23
+
24
+ ## Quants
25
+ [4bpw h6 (main)](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/main)
26
+ [4.5bpw h6](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/4.5bpw-h6)
27
+ [5bpw h6](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/5bpw-h6)
28
+ [6bpw h6](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/6bpw-h6)
29
+ [8bpw h8](https://huggingface.co/cgus/granite-3.1-8b-Reasoning-exl2/tree/8bpw-h8)
30
+
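Each quant above lives on its own branch of the repository, so a specific one can be fetched by passing the branch name as the revision. A minimal sketch, assuming the `huggingface_hub` package is installed; the helper names `branch_for` and `download_quant` are hypothetical and not part of the README:

```python
REPO_ID = "cgus/granite-3.1-8b-Reasoning-exl2"

# Branch names copied from the quant links above; the 4bpw h6 quant is on main.
BRANCHES = {
    (4.0, 6): "main",
    (4.5, 6): "4.5bpw-h6",
    (5.0, 6): "5bpw-h6",
    (6.0, 6): "6bpw-h6",
    (8.0, 8): "8bpw-h8",
}


def branch_for(bpw: float, head_bits: int) -> str:
    """Return the branch that holds the quant with the given bpw and head bits."""
    key = (float(bpw), int(head_bits))
    if key not in BRANCHES:
        raise ValueError(f"no quant listed for {bpw}bpw h{head_bits}")
    return BRANCHES[key]


def download_quant(bpw: float, head_bits: int, local_dir=None) -> str:
    """Download one quant and return the local folder path (hypothetical helper)."""
    # Imported here so branch_for() works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for(bpw, head_bits),
        local_dir=local_dir,
    )


# Example call (downloads several GB): download_quant(4.5, 6)
```

The returned folder can then be pointed to from an exl2-capable app such as TabbyAPI or Text-Generation-WebUI.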
31
+ ## Quantization notes
32
+ Made with Exllamav2 0.2.8 using the default dataset. These quants require Exllamav2 0.2.7 or newer.
33
+ They are meant to be used with apps that support exl2 models, such as TabbyAPI, Text-Generation-WebUI and others.
34
+ On Windows, they require an Nvidia RTX 2xxx or newer GPU; on Linux, they can be used with Nvidia RTX or AMD ROCm cards.
35
+ Models must be fully loaded into GPU memory; native RAM offloading isn't supported.
36
+ If you need RAM offloading or have some other GPU, try GGUF quants instead.
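Because the whole model has to fit in GPU memory, it can help to estimate the weight size of each quant before choosing one. A rough sketch, assuming about 8e9 parameters (read off the "8B" in the model name, not stated in the README) and ignoring the separately quantized head layer, the KV cache, and activations, all of which add to the real footprint; `approx_weight_gib` is a hypothetical helper:

```python
GIB = 2**30


def approx_weight_gib(n_params: float, bpw: float) -> float:
    """Rough quantized weight size in GiB: parameters * bits per weight / 8 bytes."""
    return n_params * bpw / 8 / GIB


# Assumption: roughly 8e9 parameters, taken from the "8B" in the model name.
N_PARAMS = 8e9

for bpw in (4.0, 4.5, 5.0, 6.0, 8.0):
    size = approx_weight_gib(N_PARAMS, bpw)
    print(f"{bpw} bpw: about {size:.1f} GiB of weights")
```

Treat the result as a lower bound: context length and cache settings determine how much extra memory is needed on top of the weights.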
37
+ # Original model card
38
  # Granite-3.1-8B-Reasoning (Fine-Tuned for Advanced Reasoning)
39
 
40
  ## Model Overview
 
180
  year={2025},
181
  url={https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning}
182
  }
183
+ ```