mayank-mishra
committed
Update README.md
README.md CHANGED
@@ -124,13 +124,7 @@ model-index:
 # Granite-8B-Code-Instruct-128K
 
 ## Model Summary
-
-
-- **Developers:** IBM Research
-- **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
-- **Paper:** [Scaling Granite Code Models to 128K Context](https://arxiv.org/abs/2407.13739)
-- **Release Date**: July 18th, 2024
-- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+PowerLM-3B is a 3B state-of-the-art small language model trained with the Power learning rate scheduler. It is trained on a wide range of open-source and synthetic datasets with permissive licenses. PowerLM-3B has shown promising results compared to other models in the size categories across various benchmarks, including natural language multi-choices, code generation, and math reasoning.
 
 ## Usage
 ### Intended use

@@ -142,7 +136,7 @@ This is a simple example of how to use **PowerLM-3b** model.
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 device = "cuda" # or "cpu"
-model_path = "ibm
+model_path = "ibm/PowerLM-3b"
 tokenizer = AutoTokenizer.from_pretrained(model_path)
 # drop device_map if running on CPU
 model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)