crumb committed
Commit c0c50b8 · Parent: 9ecdcde

Update README.md

Files changed (1): README.md (+6 -0)
README.md CHANGED
@@ -65,6 +65,12 @@ Nearly every base model that isn't finetuned for a specific task was trained on
 
 ```
 
+"Instruct" models have these special tokens:
+
+```
+<prompt> your prompt goes here <output> the model outputs a result here.
+```
+
 Some applications where I can imagine these being useful are: warm-starting very small encoder-decoder models, fitting a new scaling law that takes smaller models into account, or having a "fuzzy wrapper" around an API. They could also be usable on their own (for classification or other tasks) when finetuned on more specific datasets. I don't expect the 3.3m models to be useful for any task whatsoever. Every model was trained on a single GPU: an RTX 2060, an RTX 3060, or a T4.
 
 I'd, uh, appreciate help evaluating all these models, probably with lm harness!!
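
For anyone trying the new prompt format, here is a minimal sketch of what generation might look like, assuming these checkpoints load through `transformers`' causal-LM auto classes; the model id below is a hypothetical placeholder, not one of the real checkpoint names:

```python
# Hedged sketch: query an "instruct" model using the <prompt>/<output>
# special tokens from the diff above. The model id is a made-up placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "crumb/example-instruct-model"  # placeholder, substitute a real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Wrap the request in the documented format; the model should continue
# generating after the <output> token.
text = "<prompt> your prompt goes here <output>"
inputs = tokenizer(text, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=False))
```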
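
On the scaling-law application: a hedged sketch of the kind of fit that idea suggests, a saturating power law loss(N) = a * N^(-alpha) + c over (parameter count, final loss) pairs. Every data point below is an invented placeholder, not a measured result from these models:

```python
# Hedged sketch: fit loss(N) = a * N**(-alpha) + c to (model size, loss) pairs.
# All numbers here are invented placeholders, not results from these models.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    return a * n ** (-alpha) + c

n_params = np.array([3.3e6, 1.2e7, 5.1e7, 1.6e8])  # placeholder parameter counts
losses = np.array([5.8, 4.9, 4.1, 3.6])            # placeholder final losses

(a, alpha, c), _ = curve_fit(power_law, n_params, losses, p0=(100.0, 0.2, 2.0))
print(f"loss(N) ~= {a:.3g} * N**(-{alpha:.3f}) + {c:.2f}")
```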
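
And for the evaluation ask: one way to drive EleutherAI's lm-evaluation-harness ("lm harness") from Python, assuming a v0.4-style API where `lm_eval.simple_evaluate` is available; entry points and argument names may differ between harness versions, and the model id is again a placeholder:

```python
# Hedged sketch: evaluate one checkpoint with lm-evaluation-harness
# (pip install lm-eval). Assumes the v0.4+ Python API; the model id is made up.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                   # Hugging Face transformers backend
    model_args="pretrained=crumb/example-model",  # placeholder checkpoint id
    tasks=["lambada_openai"],
    batch_size=8,
)
print(results["results"])
```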