Update README.md
README.md CHANGED
@@ -3,13 +3,15 @@ Training Details:
 Trained at 8K Context -> Expanded to 32K Context due to context extension with PoSE training.
 
 Dataset Modifications:
-- Further Cleaned up Roleplaying Samples
+- Further Cleaned up Roleplaying Samples -> Quality Check
 - Removed Low Quality Samples from Manual Check
-- More Creative Writing Samples
+- More Creative Writing Samples -> 2x
 - Remade and Refined Detailed Instruct Data
 
 Needle in a Haystack Results:
+![Results](Linkhere)
 
+Coherent at 32K Context. Not as good as a natively trained 32K model, but much better than regular rope scaling.
 
 ```
 sequence_len: 8192
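
For context on the PoSE step mentioned in the diff: PoSE (Positional Skip-wisE) trains on sequences at the shorter length (8K here) while scattering their position indices across the longer target range (32K), so the model sees large relative positions without paying the memory cost of full-length sequences. The sketch below is a minimal illustration of that position-id sampling, not the training code used for this model; the function name, the two-chunk split, and the defaults are assumptions for the example.

```python
import torch


def pose_position_ids(train_len: int = 8192,
                      target_len: int = 32768,
                      num_chunks: int = 2,
                      generator: torch.Generator | None = None) -> torch.Tensor:
    """Sample PoSE-style skip-wise position ids for one training sequence (sketch).

    The sequence itself stays train_len tokens long; only its position ids are
    spread over [0, target_len) by splitting it into chunks and inserting a
    random "skip" in position space before each chunk.
    """
    # Split the training window into roughly equal chunks.
    chunk_sizes = [train_len // num_chunks] * num_chunks
    chunk_sizes[-1] += train_len - sum(chunk_sizes)

    # Total slack in position space that can be skipped over.
    total_skip = target_len - train_len

    # Draw num_chunks sorted cut points in [0, total_skip]; their successive
    # differences give non-negative skips whose sum never exceeds total_skip.
    cuts = torch.sort(
        torch.randint(0, total_skip + 1, (num_chunks,), generator=generator)
    ).values
    skips = torch.diff(cuts, prepend=torch.zeros(1, dtype=cuts.dtype))

    position_ids, start = [], 0
    for size, skip in zip(chunk_sizes, skips.tolist()):
        start += skip                                  # jump ahead in position space
        position_ids.append(torch.arange(start, start + size))
        start += size                                  # positions inside a chunk stay contiguous
    return torch.cat(position_ids)                     # shape (train_len,), max value < target_len


# Example: 8K tokens whose position ids reach into the 32K range.
pos = pose_position_ids()
print(pos.shape, int(pos.max()))  # torch.Size([8192]), some value below 32768
```

With the defaults above, the batch content is unchanged at 8,192 tokens; only the position ids fed to the rotary embeddings are stretched toward 32K, which is what lets an 8K-context training run produce a model that stays coherent at the longer window.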