---
license: apache-2.0
base_model: BEE-spoke-data/smol_llama-220M-GQA
datasets:
- VMware/open-instruct
inference:
  parameters:
    do_sample: true
    renormalize_logits: true
    temperature: 0.25
    top_p: 0.95
    top_k: 50
    min_new_tokens: 2
    max_new_tokens: 96
    repetition_penalty: 1.04
    no_repeat_ngram_size: 6
    epsilon_cutoff: 0.0006
widget:
- text: >
    Below is an instruction that describes a task, paired with an input that
    provides further context. Write a response that appropriately completes the
    request.

    ### Instruction:

    Write an ode to Chipotle burritos.

    ### Response:
  example_title: burritos
---

# BEE-spoke-data/smol_llama-220M-open_instruct

> Please note that this is an experiment, and the model has limitations because it is smol.

The prompt format is Alpaca:

```
Below is an instruction that describes a task, paired with an input that
provides further context. Write a response that appropriately completes
the request.

### Instruction:

How can I increase my meme production/output? Currently, I only create them in Ancient Babylonian, which is time consuming.

### Response:
```

This model was **not** trained using a separate 'inputs' field (as `VMware/open-instruct` doesn't use one).
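
For local use, the Alpaca-style prompt can be assembled with a small helper. This is a sketch: `build_prompt` is an illustrative name, not a function shipped with the model.

```python
def build_prompt(instruction: str) -> str:
    """Format an instruction in the Alpaca style used by this model.

    Note there is no separate 'inputs' section, matching how the model
    was trained on VMware/open-instruct.
    """
    return (
        "Below is an instruction that describes a task, paired with an input that "
        "provides further context. Write a response that appropriately completes "
        "the request.\n\n"
        "### Instruction:\n\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )


print(build_prompt("Write an ode to Chipotle burritos."))
```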

## Example

Output for the prompt above ^. The inference API is set to sample with a low temperature, so you should see (_at least slightly_) different generations each time.

Note that the inference API parameters used here are an initial educated guess and may be updated over time:

```yml
inference:
  parameters:
    do_sample: true
    renormalize_logits: true
    temperature: 0.25
    top_p: 0.95
    top_k: 50
    min_new_tokens: 2
    max_new_tokens: 96
    repetition_penalty: 1.04
    no_repeat_ngram_size: 6
    epsilon_cutoff: 0.0006
```

Feel free to experiment with the parameters using the model in Python and let us know if you get improved results with other params!
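
For instance, a minimal sketch using the `transformers` text-generation `pipeline` (assumes `transformers` and `torch` are installed; `GEN_KWARGS` and `generate` are illustrative names, with values copied from the parameters above):

```python
# Sampling parameters mirroring this model card's inference config.
GEN_KWARGS = dict(
    do_sample=True,
    renormalize_logits=True,
    temperature=0.25,
    top_p=0.95,
    top_k=50,
    min_new_tokens=2,
    max_new_tokens=96,
    repetition_penalty=1.04,
    no_repeat_ngram_size=6,
    epsilon_cutoff=0.0006,
)


def generate(prompt: str) -> str:
    """Generate a completion for an already-formatted Alpaca prompt.

    Requires `pip install transformers torch`; the import is kept inside
    the function so the parameter dict above can be reused without them.
    """
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="BEE-spoke-data/smol_llama-220M-open_instruct",
    )
    out = pipe(prompt, return_full_text=False, **GEN_KWARGS)
    return out[0]["generated_text"]
```

Tweaking `temperature` and `repetition_penalty` in `GEN_KWARGS` is the quickest way to trade off determinism against repetition.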

## Data

This model was trained on `VMware/open-instruct`, so do whatever you want with it, provided it falls under the base apache-2.0 license :)

---