mayank-mishra
committed
Update README.md
README.md CHANGED
@@ -124,13 +124,7 @@ model-index:
 # Granite-8B-Code-Instruct-128K
 
 ## Model Summary
-
-
-- **Developers:** IBM Research
-- **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
-- **Paper:** [Scaling Granite Code Models to 128K Context](https://arxiv.org/abs/2407.13739)
-- **Release Date**: July 18th, 2024
-- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+PowerLM-3B is a 3B state-of-the-art small language model trained with the Power learning rate scheduler. It is trained on a wide range of open-source and synthetic datasets with permissive licenses. PowerLM-3B has shown promising results compared to other models in the size categories across various benchmarks, including natural language multi-choices, code generation, and math reasoning.
 
 ## Usage
 ### Intended use

@@ -142,7 +136,7 @@ This is a simple example of how to use **PowerLM-3b** model.
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 device = "cuda" # or "cpu"
-model_path = "ibm
+model_path = "ibm/PowerLM-3b"
 tokenizer = AutoTokenizer.from_pretrained(model_path)
 # drop device_map if running on CPU
 model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)