TheBloke committed
Commit 9e96020 · Parent(s): 73ee3c7

Initial GPTQ model commit

Files changed (1): README.md (+78 −11)

README.md CHANGED
@@ -37,15 +37,13 @@ Multiple GPTQ parameter permutations are provided; see Provided Files below for
  * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/stablecode-instruct-alpha-3b-GGML)
  * [StabilityAI's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/stabilityai/stablecode-instruct-alpha-3b)

- ## Prompt template: Alpaca
+ ## Prompt template: StableCode

  ```
- Below is an instruction that describes a task. Write a response that appropriately completes the request.
-
- ### Instruction:
+ ###Instruction:
  {prompt}

- ### Response:
+ ###Response:
  ```

  ## Provided files and GPTQ parameters
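
The new template is a plain format string, so it can be filled in one step. A minimal sketch of doing so (the `build_prompt` helper is illustrative, not from the README):

```python
# Fill the StableCode instruction template shown above
def build_prompt(prompt: str) -> str:
    return f"###Instruction:\n{prompt}\n\n###Response:\n"

print(build_prompt("Write a function that reverses a list"))
```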
@@ -76,7 +74,7 @@ All GPTQ files are made with AutoGPTQ.
  | [gptq-4bit-64g-actorder_True](https://huggingface.co/TheBloke/stablecode-instruct-alpha-3b-GPTQ/tree/gptq-4bit-64g-actorder_True) | 4 | 64 | Yes | 0.1 | [Evol Instruct Code](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) | 4096 | 1.86 GB | No | 4-bit, with Act Order and group size 64g. Uses less VRAM than 32g, but with slightly lower accuracy. Poor AutoGPTQ CUDA speed. |
  | [gptq-4bit-128g-actorder_True](https://huggingface.co/TheBloke/stablecode-instruct-alpha-3b-GPTQ/tree/gptq-4bit-128g-actorder_True) | 4 | 128 | Yes | 0.1 | [Evol Instruct Code](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) | 4096 | 1.82 GB | No | 4-bit, with Act Order and group size 128g. Uses even less VRAM than 64g, but with slightly lower accuracy. Poor AutoGPTQ CUDA speed. |
  | [gptq-8bit-128g-actorder_True](https://huggingface.co/TheBloke/stablecode-instruct-alpha-3b-GPTQ/tree/gptq-8bit-128g-actorder_True) | 8 | 128 | Yes | 0.1 | [Evol Instruct Code](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) | 4096 | 3.08 GB | No | 8-bit, with group size 128g for higher inference quality and with Act Order for even higher accuracy. Poor AutoGPTQ CUDA speed. |
- | [gptq-8bit-64g-actorder_True](https://huggingface.co/TheBloke/stablecode-instruct-alpha-3b-GPTQ/tree/gptq-8bit-64g-actorder_True) | 8 | 64 | Yes | 0.1 | [Evol Instruct Code](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) | 4096 | 3.14 GB | No | 8-bit, with group size 64g and Act Order for maximum inference quality. Poor AutoGPTQ CUDA speed. |
+ | [gptq-8bit-64g-actorder_True](https://huggingface.co/TheBloke/stablecode-instruct-alpha-3b-GPTQ/tree/gptq-8bit-64g-actorder_True) | 8 | 64 | Yes | 0.1 | [Evol Instruct Code](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) | 4096 | 3.14 GB | No | 8-bit, with group size 64g and Act Order for even higher inference quality. Poor AutoGPTQ CUDA speed. |

  ## How to download from branches

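Each row in the table above lives on its own repo branch, so fetching a specific quantisation means targeting that branch by name. A minimal sketch using the `huggingface_hub` client (branch name taken from the table; adjust to the variant you want):

```python
# Download a single quantisation branch of the GPTQ repo
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="TheBloke/stablecode-instruct-alpha-3b-GPTQ",
    revision="gptq-4bit-128g-actorder_True",  # one of the branches listed above
)
print(local_path)  # local cache directory containing the model files
```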
@@ -154,12 +152,10 @@ model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
154
  """
155
 
156
  prompt = "Tell me about AI"
157
- prompt_template=f'''Below is an instruction that describes a task. Write a response that appropriately completes the request.
158
-
159
- ### Instruction:
160
  {prompt}
161
 
162
- ### Response:
163
  '''
164
 
165
  print("\n\n*** Generate:")
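
For reference, the README's example goes on to feed this template to the model. A minimal sketch of that step, assuming the `tokenizer` and `model` objects created earlier in the README's snippet:

```python
# Tokenize the filled template and sample a completion on the GPU
input_ids = tokenizer(prompt_template, return_tensors="pt").input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))
```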
@@ -224,4 +220,75 @@ Thank you to all my generous patrons and donaters!

  # Original model card: StabilityAI's Stablecode Instruct Alpha 3B

- No original model card was provided.
+ # `StableCode-Instruct-Alpha-3B`
+
+ ## Model Description
+
+ `StableCode-Instruct-Alpha-3B` is a 3-billion-parameter decoder-only, instruction-tuned code model pre-trained on the diverse set of programming languages that topped the Stack Overflow developer survey.
+
+ ## Usage
+ The model is intended to follow instructions to generate code. The dataset used to train the model is formatted in the Alpaca format.
+ Get started generating code with `StableCode-Instruct-Alpha-3B` by using the following code snippet:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the tokenizer and model, letting transformers pick the dtype stored in the checkpoint
+ tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablecode-instruct-alpha-3b")
+ model = AutoModelForCausalLM.from_pretrained(
+     "stabilityai/stablecode-instruct-alpha-3b",
+     trust_remote_code=True,
+     torch_dtype="auto",
+ )
+ model.cuda()
+
+ # Tokenize an ###Instruction / ###Response prompt and sample a short completion
+ inputs = tokenizer("###Instruction\nGenerate a python function to find number of CPU cores###Response\n", return_tensors="pt").to("cuda")
+ tokens = model.generate(
+     **inputs,
+     max_new_tokens=48,
+     temperature=0.2,
+     do_sample=True,
+ )
+ print(tokenizer.decode(tokens[0], skip_special_tokens=True))
+ ```
+
+ ## Model Details
+
+ * **Developed by**: [Stability AI](https://stability.ai/)
+ * **Model type**: `StableCode-Instruct-Alpha-3B` models are auto-regressive language models based on the transformer decoder architecture.
+ * **Language(s)**: Code
+ * **Library**: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
+ * **License**: Model checkpoints are licensed under the [StableCode Research License](https://huggingface.co/stabilityai/stablecode-instruct-alpha-3b/blob/main/LICENSE.md). Copyright (c) Stability AI Ltd. All Rights Reserved.
+ * **Contact**: For questions and comments about the model, please email `[email protected]`
+
+ ### Model Architecture
+
+ | Parameters    | Hidden Size | Layers | Heads | Sequence Length |
+ |---------------|-------------|--------|-------|-----------------|
+ | 2,796,431,360 | 2560        | 32     | 32    | 4096            |
+
+ * **Decoder Layer**: Parallel Attention and MLP residuals with a single input LayerNorm ([Wang & Komatsuzaki, 2021](https://github.com/kingoflolz/mesh-transformer-jax/tree/master))
+ * **Position Embeddings**: Rotary Position Embeddings ([Su et al., 2021](https://arxiv.org/abs/2104.09864))
+ * **Bias**: LayerNorm bias terms only
+
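The bullet list above pins down the block structure. As a concrete illustration, here is a minimal PyTorch sketch of such a parallel-residual decoder layer (dimensions taken from the table above; the module choices are simplifying assumptions, not the GPT-NeoX implementation, and rotary embeddings are omitted):

```python
import torch
import torch.nn as nn

class ParallelResidualBlock(nn.Module):
    """Attention and MLP branches share one input LayerNorm and are summed in parallel."""

    def __init__(self, d_model: int = 2560, n_heads: int = 32):
        super().__init__()
        self.ln = nn.LayerNorm(d_model)  # the single input LayerNorm
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.ln(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        # Parallel formulation: x + Attn(LN(x)) + MLP(LN(x))
        return x + attn_out + self.mlp(h)
```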
+ ## Training
+
+ `StableCode-Instruct-Alpha-3B` is the instruction-finetuned version of [StableCode-Completion-Alpha-3B](https://huggingface.co/stabilityai/stablecode-completion-alpha-3b), trained on code instruction datasets.
+
+ ## Use and Limitations
+
+ ### Intended Use
+
+ StableCode-Instruct-Alpha-3B independently generates new code completions, but we recommend that you use StableCode-Instruct-Alpha-3B together with the tool developed by BigCode and HuggingFace [(huggingface/huggingface-vscode: Code completion VSCode extension for OSS models (github.com))](https://github.com/huggingface/huggingface-vscode) to identify and, if necessary, attribute any outputs that match training code.
+
+ ### Limitations and bias
+
+ This model is intended to be used responsibly. It is not intended to be used to create unlawful content of any kind, to further any unlawful activity, or to engage in activities with a high risk of physical or economic harm.
+
+ ## How to cite
+
+ ```bibtex
+ @misc{StableCodeInstructAlpha,
+   url={https://huggingface.co/stabilityai/stablecode-instruct-alpha-3b},
+   title={Stable Code Instruct Alpha},
+   author={Adithyan, Reshinth and Phung, Duy and Cooper, Nathan and Pinnaparaju, Nikhil and Laforte, Christian}
+ }
+ ```