Update README.md
README.md
CHANGED
@@ -19,9 +19,9 @@ datasets:
 
 Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations
 
-This is our most spectacular outcome ever. FFT, all parameters, 16bit. 70.9 MMLU on 9b
+This is our most spectacular outcome ever. FFT, all parameters, 16bit. 70.9 MMLU on 9b! And it talks like a dream.
 
-Although the max positional embeddings is 4k, we used rope theta of 1000000.0 and we trained with sequence length
+Although the max positional embeddings is 4k, we used rope theta of 1000000.0 and we trained with sequence length 12k. We plan to train on the upcoming 32k version as well.
 
 Discord: https://discord.gg/8fbBeC7ZGx
 
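
The README line about rope theta can be illustrated with a small sketch. It assumes the standard RoPE formulation, where rotary pair i has inverse frequency theta^(-2i/d), and compares the commonly used base of 10000 with the 1000000.0 mentioned above. The head dimension of 128 and the reading of "12k" as 12,000 tokens are assumptions made only for illustration; neither is stated in the README, and this is not the authors' training code.

```python
import math


def rope_wavelengths(theta: float, head_dim: int) -> list[float]:
    """Wavelength, in tokens, of each rotary frequency pair.

    Standard RoPE uses inverse frequencies theta ** (-2i / head_dim)
    for i in range(head_dim // 2); the wavelength is 2*pi / inv_freq.
    """
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


HEAD_DIM = 128          # assumption for illustration, not taken from the README
TRAIN_SEQ_LEN = 12_000  # "12k" from the README, read here as 12,000 tokens (assumption)

for theta in (10_000.0, 1_000_000.0):
    waves = rope_wavelengths(theta, HEAD_DIM)
    # Count the pairs whose rotation does not wrap around within the training length.
    longer = sum(w > TRAIN_SEQ_LEN for w in waves)
    print(
        f"theta={theta:>9.0f}: longest wavelength ~{waves[-1]:,.0f} tokens, "
        f"{longer}/{len(waves)} pairs do not complete a full rotation "
        f"within {TRAIN_SEQ_LEN} tokens"
    )
```

With a larger theta, more frequency pairs rotate slowly enough that positions across the whole training window remain distinguishable. That is one plausible reading of why a sequence length above the nominal 4k limit was usable here.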