euclaise committed on
Commit 38dc0f6
1 Parent(s): 12e5655

Update README.md

Files changed (1)
  1. README.md +24 -0
README.md CHANGED
@@ -29,6 +29,30 @@ I then performed the following steps 4 times:
 
  This should be more efficient than either STaR or SPIN, as it uses a ranking loss rather than rejection sampling (unlike STaR), and verifies correctness instead of assuming all model responses are incorrect (unlike SPIN).
 
+
+ ## Prompt formats
+
+ The format for reddit-instruct and oasst2 was:
+
+ ```
+ ### User:
+ [insert instruction here]
+ ### Assistant:
+ [insert response here]
+ ### User:
+ ...
+ ```
+
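As a rough illustration of the multi-turn format added above, the following Python sketch renders a conversation into the `### User:` / `### Assistant:` layout. The function name `format_chat`, the `(role, text)` turn representation, and the exact whitespace (a single newline after each header, no blank line between turns) are assumptions read off the template, not something the commit confirms.

```python
# Hypothetical helper: renders turns in the "### User:" / "### Assistant:" layout.
# Whitespace handling is an assumption based on the template in the README.
HEADERS = {"user": "### User:", "assistant": "### Assistant:"}


def format_chat(turns):
    """Render a list of (role, text) pairs, where role is 'user' or 'assistant'."""
    lines = []
    for role, text in turns:
        lines.append(HEADERS[role])  # header line for this turn
        lines.append(text)           # turn content on the following line(s)
    return "\n".join(lines)


prompt = format_chat([
    ("user", "What is the capital of France?"),
    ("assistant", "Paris."),
    ("user", "And of Italy?"),
])
print(prompt)
```

For generation, one would presumably append a final `### Assistant:` header and let the model continue from there; the commit does not say this explicitly.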
+ The format for TinyCot was:
+ ```
+ ### User:
+ [insert instruction here]
+ ### Rationale:
+ [insert reasoning here]
+ ### Answer:
+ [insert direct answer here]
+ ```
+
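The rationale-then-answer format above could be built and parsed along the following lines. This is a minimal sketch: `format_tinycot` and `extract_answer` are hypothetical names, the newline placement is inferred from the template, and the idea that the text after `### Answer:` is what gets compared to a reference when verifying correctness is an assumption rather than something stated in the commit.

```python
# Hypothetical helpers for the "### Rationale:" / "### Answer:" layout.
ANSWER_HEADER = "### Answer:"


def format_tinycot(instruction, rationale, answer):
    """Build one training example in the rationale-then-answer layout."""
    return "\n".join([
        "### User:", instruction,
        "### Rationale:", rationale,
        ANSWER_HEADER, answer,
    ])


def extract_answer(completion):
    """Return the text after the last answer header, or None if it is absent."""
    _, sep, tail = completion.rpartition(ANSWER_HEADER)
    return tail.strip() if sep else None


example = format_tinycot("What is 2+2?", "2 plus 2 equals 4.", "4")
print(example)
print(extract_answer(example))
```

Using `rpartition` takes the text after the last header, so a rationale that happens to quote `### Answer:` earlier does not truncate the result.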
  ## Hyperparameters
 
  For the initial supervised finetuning step: