rpand002 commited on
Commit
6454988
·
verified ·
1 Parent(s): 61baea1

update context length

Browse files
Files changed (1) hide show
  1. README.md +7 -7
README.md CHANGED
@@ -1,6 +1,6 @@
1
  ---
2
  pipeline_tag: text-generation
3
- base_model: ibm-granite/granite-20b-code-base
4
  inference: true
5
  license: apache-2.0
6
  datasets:
@@ -19,7 +19,7 @@ tags:
19
  - code
20
  - granite
21
  model-index:
22
- - name: granite-20b-code-instruct
23
  results:
24
  - task:
25
  type: text-generation
@@ -205,10 +205,10 @@ model-index:
205
 
206
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png)
207
 
208
- # Granite-20B-Code-Instruct
209
 
210
  ## Model Summary
211
- **Granite-20B-Code-Instruct** is a 20B parameter model fine tuned from *Granite-20B-Code-Base* on a combination of **permissively licensed** instruction data to enhance instruction following capabilities including logical reasoning and problem-solving skills.
212
 
213
  - **Developers:** IBM Research
214
  - **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
@@ -223,13 +223,13 @@ The model is designed to respond to coding related instructions and can be used
223
  <!-- TO DO: Check starcoder2 instruct code example that includes the template https://huggingface.co/bigcode/starcoder2-15b-instruct-v0.1 -->
224
 
225
  ### Generation
226
- This is a simple example of how to use **Granite-20B-Code-Instruct** model.
227
 
228
  ```python
229
  import torch
230
  from transformers import AutoModelForCausalLM, AutoTokenizer
231
  device = "cuda" # or "cpu"
232
- model_path = "ibm-granite/granite-20b-code-instruct"
233
  tokenizer = AutoTokenizer.from_pretrained(model_path)
234
  # drop device_map if running on CPU
235
  model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
@@ -265,4 +265,4 @@ Granite Code Instruct models are trained on the following types of data.
265
  We train the Granite Code models using two of IBM's super computing clusters, namely Vela and Blue Vela, both outfitted with NVIDIA A100 and H100 GPUs respectively. These clusters provide a scalable and efficient infrastructure for training our models over thousands of GPUs.
266
 
267
  ## Ethical Considerations and Limitations
268
- Granite code instruct models are primarily finetuned using instruction-response pairs across a specific set of programming languages. Thus, their performance may be limited with out-of-domain programming languages. In this situation, it is beneficial providing few-shot examples to steer the model's output. Moreover, developers should perform safety testing and target-specific tuning before deploying these models on critical applications. The model also inherits ethical considerations and limitations from its base model. For more information, please refer to *[Granite-20B-Code-Base](https://huggingface.co/ibm-granite/granite-20b-code-base)* model card.
 
1
  ---
2
  pipeline_tag: text-generation
3
+ base_model: ibm-granite/granite-20b-code-base-8k
4
  inference: true
5
  license: apache-2.0
6
  datasets:
 
19
  - code
20
  - granite
21
  model-index:
22
+ - name: granite-20b-code-instruct-8k
23
  results:
24
  - task:
25
  type: text-generation
 
205
 
206
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png)
207
 
208
+ # Granite-20B-Code-Instruct-8K
209
 
210
  ## Model Summary
211
+ **Granite-20B-Code-Instruct-8K** is a 20B parameter model fine tuned from *Granite-20B-Code-Base-8K* on a combination of **permissively licensed** instruction data to enhance instruction following capabilities including logical reasoning and problem-solving skills.
212
 
213
  - **Developers:** IBM Research
214
  - **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
 
223
  <!-- TO DO: Check starcoder2 instruct code example that includes the template https://huggingface.co/bigcode/starcoder2-15b-instruct-v0.1 -->
224
 
225
  ### Generation
226
+ This is a simple example of how to use **Granite-20B-Code-Instruct-8K** model.
227
 
228
  ```python
229
  import torch
230
  from transformers import AutoModelForCausalLM, AutoTokenizer
231
  device = "cuda" # or "cpu"
232
+ model_path = "ibm-granite/granite-20b-code-instruct-8k"
233
  tokenizer = AutoTokenizer.from_pretrained(model_path)
234
  # drop device_map if running on CPU
235
  model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
 
265
  We train the Granite Code models using two of IBM's super computing clusters, namely Vela and Blue Vela, both outfitted with NVIDIA A100 and H100 GPUs respectively. These clusters provide a scalable and efficient infrastructure for training our models over thousands of GPUs.
266
 
267
  ## Ethical Considerations and Limitations
268
+ Granite code instruct models are primarily finetuned using instruction-response pairs across a specific set of programming languages. Thus, their performance may be limited with out-of-domain programming languages. In this situation, it is beneficial providing few-shot examples to steer the model's output. Moreover, developers should perform safety testing and target-specific tuning before deploying these models on critical applications. The model also inherits ethical considerations and limitations from its base model. For more information, please refer to *[Granite-20B-Code-Base-8K](https://huggingface.co/ibm-granite/granite-20b-code-base-8k)* model card.