nuprl/MultiPL-T-StarCoderBase_1b

cassanof committed on Aug 19, 2023

Commit 1fc7947 · 1 Parent(s): 16cc1dc

Update README.md

Files changed (1)

README.md +26 -7

@@ -15,17 +15,36 @@ State-of-the-art StarCoder-based models for low-resource languages
 ## Language Revision Index
-This is the revision index for the best-performing models on their respective HumanEval benchmarks.
 | Language      | Revision ID | Epoch |
 | ------------- | ----------- | ----- |
 | Lua           | `7e96d931547e342ad0661cdd91236fe4ccf52545`         | 3    |
-| Racket           | `2cdc541bee1db4da80c0b43384b0d6a0cacca5b2`         | 5    |
-| OCaml           | `e8a24f9e2149cbda8c3cca264a53c2b361b7a031`         | 6 |
 ## Usage
-To utilize one of the models in this repository, you must first select a commit revision for that model.

 ## Language Revision Index
+This is the revision index for the best-performing model for each language.
 | Language      | Revision ID | Epoch |
 | ------------- | ----------- | ----- |
 | Lua           | `7e96d931547e342ad0661cdd91236fe4ccf52545`         | 3    |
+| Racket        | `2cdc541bee1db4da80c0b43384b0d6a0cacca5b2`         | 5    |
+| OCaml         | `e8a24f9e2149cbda8c3cca264a53c2b361b7a031`         | 6    |
 ## Usage
+To utilize one of the models in this repository, you must first select a commit revision for that model from the table above.
+For example, to use the Lua model:
+```py
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("nuprl/MultiPLCoder-1b")
+lua_revision = "7e96d931547e342ad0661cdd91236fe4ccf52545"
+model = AutoModelForCausalLM.from_pretrained("nuprl/MultiPLCoder-1b", revision=lua_revision)
+```
+Note that the model's default configuration does not enable caching, so you must pass `use_cache=True` when generating.
+```py
+toks = tokenizer.encode("-- Hello World", return_tensors="pt")
+out = model.generate(toks, use_cache=True, do_sample=True, temperature=0.2, top_p=0.95, max_length=50)
+print(tokenizer.decode(out[0], skip_special_tokens=True))
+```
+```
+-- Hello World!
+-- :param name: The name of the person to say hello to
+-- :return: A greeting
+local function say_hello(name)
+  return "Hello ".. name
+end
+```
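
The README's usage steps (pick a revision from the table, then pass it as `revision=`) can be wrapped in a small lookup helper. The following is a minimal sketch, not part of the commit: the names `REVISIONS`, `REPO_ID`, `revision_for`, and `load_model` are hypothetical, the revision IDs are copied from the table above, and the repository name is taken as written in the README snippet.

```python
# Revision IDs copied from the Language Revision Index table above.
REVISIONS = {
    "lua": "7e96d931547e342ad0661cdd91236fe4ccf52545",
    "racket": "2cdc541bee1db4da80c0b43384b0d6a0cacca5b2",
    "ocaml": "e8a24f9e2149cbda8c3cca264a53c2b361b7a031",
}

# Repository name as it appears in the README snippet (assumed to resolve on the Hub).
REPO_ID = "nuprl/MultiPLCoder-1b"


def revision_for(language: str) -> str:
    """Return the commit revision listed for a language (case-insensitive)."""
    key = language.strip().lower()
    if key not in REVISIONS:
        known = ", ".join(sorted(REVISIONS))
        raise KeyError(f"no revision listed for {language!r}; known languages: {known}")
    return REVISIONS[key]


def load_model(language: str):
    """Load the tokenizer and the model pinned to the revision for `language`."""
    # Imported lazily so the lookup helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(
        REPO_ID, revision=revision_for(language)
    )
    return tokenizer, model
```

Calling `load_model("Racket")` would then download the Racket checkpoint; generation still needs `use_cache=True`, as the README notes.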