TheBloke committed on
Commit
b564638
·
1 Parent(s): 4ea20c1

Update README.md

Files changed (1)
  1. README.md +21 -10
README.md CHANGED
@@ -28,20 +28,30 @@ It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQi
28
  * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/CAMEL-33B-Combined-Data-GPTQ)
29
  * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/baichuan-inc/baichuan-7B)
30
 
31
- ## Experimental first GPTQ, requires AutoGPTQ PR
32
 
33
  This is a first quantisation of a brand new model type.
34
 
35
- It will only work with AutoGPTQ, and only by merging [LaaZa's PR](https://github.com/PanQiWei/AutoGPTQ/pull/164).
36
 
37
- To merge this PR, please follow these steps to install AutoGPTQ from source:
 
38
  ```
39
  pip uninstall -y auto-gptq
40
- git clone -b Baichuan https://github.com/LaaZa/AutoGPTQ baichuan_AutoGPTQ
41
- cd baichuan_AutoGPTQ
42
  GITHUB_ACTIONS=true pip install .
43
  ```
44
 
45
  ## Trust Remote Code
46
 
47
  As this is a new model type, not yet supported by Transformers, you must run inference with Trust Remote Code set.
@@ -59,7 +69,6 @@ The example given in the README is a 1-shot categorisation:
59
  Hamlet->Shakespeare\nOne Hundred Years of Solitude->
60
  ```
61
 
62
-
63
  ## How to easily download and use this model in text-generation-webui
64
 
65
  Please make sure you're using the latest version of text-generation-webui
@@ -78,7 +87,7 @@ Please make sure you're using the latest version of text-generation-webui
78
 
79
  ## How to use this GPTQ model from Python code
80
 
81
- First make sure you have the [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) PR installed as mentioned above.
82
 
83
  Then try the following example code:
84
 
@@ -86,7 +95,9 @@ Then try the following example code:
86
  from transformers import AutoTokenizer
87
  from auto_gptq import AutoGPTQForCausalLM
88
 
89
- model_name_or_path = "/workspace/process/baichuan-7B/gptq"
 
 
90
 
91
  tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
92
 
@@ -112,10 +123,10 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
112
 
113
  **gptq_model-4bit-128g.safetensors**
114
 
115
- This will work only with [AutoGPTQ using LaaZa's PR](https://github.com/PanQiWei/AutoGPTQ/pull/164).
116
 
117
  * `gptq_model-4bit-128g.safetensors`
118
- * Works only with AutoGPTQ, currently requiring using [LaaZa's PR](https://github.com/PanQiWei/AutoGPTQ/pull/164).
119
  * Requires `trust_remote_code`.
120
  * Works with text-generation-webui, but not yet with one-click-installers unless you manually re-compile AutoGPTQ.
121
  * Parameters: Groupsize = 128. Act Order / desc_act = False.
 
28
  * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/CAMEL-33B-Combined-Data-GPTQ)
29
  * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/baichuan-inc/baichuan-7B)
30
 
31
+ ## Experimental first GPTQ, requires latest AutoGPTQ code
32
 
33
  This is a first quantisation of a brand new model type.
34
 
35
+ It will only work with the latest version of AutoGPTQ, compiled from source.
36
 
37
+ Please follow these steps to install the latest AutoGPTQ from source:
38
+ **Linux**
39
  ```
40
  pip uninstall -y auto-gptq
41
+ git clone https://github.com/PanQiWei/AutoGPTQ
42
+ cd AutoGPTQ
43
  GITHUB_ACTIONS=true pip install .
44
  ```
45
 
46
+ **Windows (command prompt)**
47
+ ```
48
+ pip uninstall -y auto-gptq
49
+ git clone https://github.com/PanQiWei/AutoGPTQ
50
+ cd AutoGPTQ
51
+ set GITHUB_ACTIONS=true
52
+ pip install .
53
+ ```
54
+
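
After running the source-install steps above, it can help to confirm that an `auto-gptq` package is actually present in the active environment. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical and not part of AutoGPTQ. The check reports the installed version string only; it cannot tell whether the package was compiled from source or installed from a wheel.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "auto-gptq") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not present in the current environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("auto-gptq is not installed; follow the source-install steps above.")
    else:
        print(f"auto-gptq {version} is installed.")
```

If the script reports that the package is missing right after `pip install .`, the most likely cause is that `pip` and `python` point at different environments.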
55
  ## Trust Remote Code
56
 
57
  As this is a new model type, not yet supported by Transformers, you must run inference with Trust Remote Code set.
 
69
  Hamlet->Shakespeare\nOne Hundred Years of Solitude->
70
  ```
71
 
 
72
  ## How to easily download and use this model in text-generation-webui
73
 
74
  Please make sure you're using the latest version of text-generation-webui
 
87
 
88
  ## How to use this GPTQ model from Python code
89
 
90
+ First make sure you have the latest [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) installed from source as mentioned above.
91
 
92
  Then try the following example code:
93
 
 
95
  from transformers import AutoTokenizer
96
  from auto_gptq import AutoGPTQForCausalLM
97
 
98
+ model_name_or_path = 'TheBloke/baichuan-7B-GPTQ'
99
+ # Or you can clone the model locally and reference it on disk, e.g. with:
100
+ # model_name_or_path = "/path/to/TheBloke_baichuan-7B"
101
 
102
  tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
103
 
 
123
 
124
  **gptq_model-4bit-128g.safetensors**
125
 
126
+ This will currently only work with the latest [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), compiled from source.
127
 
128
  * `gptq_model-4bit-128g.safetensors`
129
+ * Works only with the latest AutoGPTQ, compiled from source.
130
  * Requires `trust_remote_code`.
131
  * Works with text-generation-webui, but not yet with one-click-installers unless you manually re-compile AutoGPTQ.
132
  * Parameters: Groupsize = 128. Act Order / desc_act = False.
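
The compatibility notes above (safetensors file, `trust_remote_code` required) map onto the model-loading call roughly as sketched below. This is an illustration under assumptions, not the README's own code: the helper names `quantized_load_kwargs` and `load_model` are hypothetical, and the keyword arguments (`model_basename`, `use_safetensors`, `trust_remote_code`, `device`) are those accepted by `AutoGPTQForCausalLM.from_quantized` in AutoGPTQ releases from around the time of this commit; later versions may differ.

```python
from typing import Any, Dict

MODEL_ID = "TheBloke/baichuan-7B-GPTQ"
# The weights file is gptq_model-4bit-128g.safetensors; the basename omits the extension.
MODEL_BASENAME = "gptq_model-4bit-128g"


def quantized_load_kwargs(device: str = "cuda:0") -> Dict[str, Any]:
    """Collect the loading options implied by the compatibility notes (hypothetical helper)."""
    return {
        "model_basename": MODEL_BASENAME,
        "use_safetensors": True,    # the provided file is a .safetensors file
        "trust_remote_code": True,  # the model type is not yet supported by Transformers
        "device": device,
    }


def load_model(model_name_or_path: str = MODEL_ID, device: str = "cuda:0"):
    """Load tokenizer and quantised model; requires AutoGPTQ installed from source and a GPU."""
    # Imported lazily so the helper above can be inspected without these packages installed.
    from transformers import AutoTokenizer
    from auto_gptq import AutoGPTQForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
    model = AutoGPTQForCausalLM.from_quantized(
        model_name_or_path, **quantized_load_kwargs(device)
    )
    return tokenizer, model
```

Keeping the options in one dictionary makes it easy to see that `trust_remote_code=True` is passed to both the tokenizer and the model, which is the step most often missed with new model types.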