Commit 2707373 by mabaochang (parent: 5681744): Update README.md

README.md (CHANGED):
language:
- en
---

# GPTQ-for-Bloom

8-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323)

GPTQ is a SOTA one-shot weight quantization method.

The inference code can be found in our GitHub project repository: https://github.com/LianjiaTech/BELLE/gptq.

Basically, 8-bit quantization and a group size of 128 are recommended.
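To make the "8-bit, group size 128" setting concrete, here is a minimal NumPy sketch of plain group-wise symmetric round-to-nearest quantization: every run of 128 consecutive weights shares one scale, so the quantization error adapts to local weight magnitudes. This is only an illustration of the storage format, not GPTQ's actual error-compensating algorithm or the kernels in the BELLE repository; the function names are hypothetical.

```python
import numpy as np

def quantize_groupwise(w, bits=8, groupsize=128):
    """Illustrative group-wise symmetric quantization (round-to-nearest).

    Each group of `groupsize` consecutive weights shares one fp scale,
    which is what "128 groupsize" refers to.
    """
    orig_shape = w.shape
    w = w.reshape(-1, groupsize)                      # one row per group
    qmax = 2 ** (bits - 1) - 1                        # 127 for 8 bits
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)          # guard all-zero groups
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale, orig_shape

def dequantize(q, scale, shape):
    # Recover an approximation of the original weights.
    return (q.astype(np.float32) * scale).reshape(shape)

# Round-trip a random weight matrix; the per-weight error is bounded by
# half a quantization step within each group.
w = np.random.randn(1024, 1024).astype(np.float32)
q, s, shape = quantize_groupwise(w, bits=8, groupsize=128)
w_hat = dequantize(q, s, shape)
err = np.abs(w - w_hat).max()
```

GPTQ improves on this baseline by quantizing columns one at a time and updating the remaining weights to compensate for the error, but the packed int8-plus-scales layout is the same.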
**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**

## Model list