
mayank-mishra committed · Commit e32422a · verified · 1 Parent(s): e7ee55e

Update README.md
Files changed (1): README.md (+2 -8)
README.md CHANGED
@@ -124,13 +124,7 @@ model-index:
 # Granite-8B-Code-Instruct-128K
 
 ## Model Summary
-**Granite-8B-Code-Instruct-128K** is an 8B-parameter long-context instruct model fine-tuned from *Granite-8B-Code-Base-128K* on a combination of **permissively licensed** data used in training the original Granite code instruct models, along with synthetically generated code instruction datasets tailored for solving long-context problems. By exposing the model to both short and long context data, we aim to enhance its long-context capability without sacrificing code generation performance at short input context.
-
-- **Developers:** IBM Research
-- **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
-- **Paper:** [Scaling Granite Code Models to 128K Context](https://arxiv.org/abs/2407.13739)
-- **Release Date:** July 18th, 2024
-- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+PowerLM-3B is a 3B-parameter state-of-the-art small language model trained with the Power learning rate scheduler. It is trained on a wide range of open-source and synthetic datasets with permissive licenses. PowerLM-3B shows promising results compared to other models in its size category across various benchmarks, including natural language multiple-choice, code generation, and math reasoning.
 
 ## Usage
 ### Intended use
@@ -142,7 +136,7 @@ This is a simple example of how to use the **PowerLM-3b** model.
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 device = "cuda" # or "cpu"
-model_path = "ibm-granite/granite-8B-Code-instruct-128k"
+model_path = "ibm/PowerLM-3b"
 tokenizer = AutoTokenizer.from_pretrained(model_path)
 # drop device_map if running on CPU
 model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
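
The snippet in the diff stops at model loading. For completeness, here is a minimal sketch of how the updated example might be run end to end, using the standard `transformers` generate API; the prompt text and the `max_new_tokens` value are illustrative assumptions, not part of the commit:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # or "cpu"
model_path = "ibm/PowerLM-3b"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

# illustrative prompt, not part of the commit
prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# max_new_tokens is an assumed setting for this sketch
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```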