BEE-spoke-data/smol_llama-220M-open_instruct

pszemraj committed on Jan 2

Commit e3d0683 • 1 Parent(s): 342bc7f

Create README.md

Files changed (1)

README.md +85 -0

	@@ -0,0 +1,85 @@

+---
+license: apache-2.0
+base_model: BEE-spoke-data/smol_llama-220M-GQA
+datasets:
+- VMware/open-instruct
+inference:
+  parameters:
+    do_sample: true
+    renormalize_logits: true
+    temperature: 0.25
+    top_p: 0.95
+    top_k: 50
+    min_new_tokens: 2
+    max_new_tokens: 96
+    repetition_penalty: 1.04
+    no_repeat_ngram_size: 6
+    epsilon_cutoff: 0.0006
+widget:
+- text: >
+    Below is an instruction that describes a task, paired with an input that
+    provides further context. Write a response that appropriately completes the
+    request.
+    ### Instruction:
+    Write an ode to Chipotle burritos.
+    ### Response:
+  example_title: burritos
+---
+# BEE-spoke-data/smol_llama-220M-open_instruct
+> Please note that this is an experiment, and the model has limitations because it is smol.
+
+The prompt format is Alpaca:
+```
+Below is an instruction that describes a task, paired with an input that
+provides further context. Write a response that appropriately completes
+the request.
+### Instruction:
+How can I increase my meme production/output? Currently, I only create them in ancient babylonian which is time consuming.
+### Response:
+```
+This was **not** trained using a separate 'inputs' field (as `VMware/open-instruct` doesn't use one).
+## Example
+Output for the prompt above. The inference API is set to sample with a low temperature, so you should see (_at least slightly_) different generations each time.
+Note that the inference API parameters used here are an initial educated guess and may be updated over time:
+```yml
+inference:
+  parameters:
+    do_sample: true
+    renormalize_logits: true
+    temperature: 0.25
+    top_p: 0.95
+    top_k: 50
+    min_new_tokens: 2
+    max_new_tokens: 96
+    repetition_penalty: 1.04
+    no_repeat_ngram_size: 6
+    epsilon_cutoff: 0.0006
+```
+Feel free to experiment with the parameters using the model in Python, and let us know if you get improved results with other parameters!
+## Data
+This was trained on `VMware/open-instruct`, so do whatever you want, provided it falls under the base apache-2.0 license :)
+---
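
The README above invites experimenting with the generation parameters from Python but does not show how. The following is a minimal sketch of one way to do that with the `transformers` library, not the author's own script. The parameter values and the prompt preamble are copied from the README. The helper names (`build_prompt`, `generate`), joining the preamble into a single line, and the blank lines between prompt sections (following the usual Alpaca layout) are assumptions made for illustration.

```python
"""Sketch: build the Alpaca-style prompt and sample with the README's parameters."""

MODEL_ID = "BEE-spoke-data/smol_llama-220M-open_instruct"

# Preamble copied from the README; joining it into one line is an assumption.
PREAMBLE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request."
)

# Sampling settings copied from the README's `inference.parameters` block.
GENERATION_KWARGS = {
    "do_sample": True,
    "renormalize_logits": True,
    "temperature": 0.25,
    "top_p": 0.95,
    "top_k": 50,
    "min_new_tokens": 2,
    "max_new_tokens": 96,
    "repetition_penalty": 1.04,
    "no_repeat_ngram_size": 6,
    "epsilon_cutoff": 0.0006,
}


def build_prompt(instruction: str) -> str:
    """Format an instruction in the prompt layout shown in the README.

    Assumption: sections are separated by blank lines, as in the usual
    Alpaca template. No separate input field is included.
    """
    return (
        f"{PREAMBLE}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        "### Response:\n"
    )


def generate(instruction: str) -> str:
    """Download the model (on first call) and sample one response."""
    # Imported lazily so build_prompt works without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Write an ode to Chipotle burritos.")` should return a short sampled response; because `do_sample` is enabled, the text will vary between calls. To try other settings, copy `GENERATION_KWARGS`, change a value such as `temperature`, and pass the modified dictionary to `model.generate` instead.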