AuriAetherwiing committed
Commit c01c14f
1 Parent(s): 33b3b99

Structure readme better

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -15,13 +15,18 @@ pipeline_tag: text-generation
 # Qwen2.5-14B Sugarquill v1
 
 A continued pretrain of SuperNova-Medius on assorted short story data from the web. SuperNova already had nice prose, but diversifying it a bit definitely doesn't hurt.
-Also, finally a storywriter model with enough context for something more than a short story, which is also nice. It's a fair bit more temperamental than Gemma, but can be tamed with some sampling.
+Also, finally a storywriter model with enough context for something more than a short story, which is also nice.
+
+It's a fair bit more temperamental than Gemma, but can be tamed with some sampling.
 Instruction following also stayed rather strong, so it works for both RP and storywriting, both in chat mode via back-and-forth co-writing and on raw completion.
+
 Overall, I'd say it successfully transfers the essence of what I liked about Gemma Sugarquill. I will also make a Qwen version of Aletheia, but with a brand new LoRA, based on a brand new RP dataset that's in the making right now.
 
 
 Model was trained by Auri.
 
+---
+
 **Training notes**
 
 This model was trained for 2 epochs on 10k rows (~18.7M tokens), taken equally from the Erebus-87k and r_shortstories_24k datasets. It was trained on a 5x3090Ti workstation for 7.5 hours with rsLoRA.
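The training notes above mention rsLoRA (rank-stabilized LoRA). For readers unfamiliar with it, here is a minimal sketch of what such a setup can look like with Hugging Face peft; the rank, alpha, and target modules are illustrative assumptions, as the commit does not state the actual configuration, and the base-model repo id is my reading of "SuperNova-Medius".

```python
# Minimal rsLoRA setup sketch with Hugging Face peft.
# Rank, alpha, and target modules are assumptions for illustration;
# the README does not state the actual training configuration.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "arcee-ai/SuperNova-Medius",  # base model named in the README (repo id assumed)
    torch_dtype=torch.bfloat16,
)

config = LoraConfig(
    r=64,             # assumed adapter rank
    lora_alpha=64,    # assumed scaling numerator
    use_rslora=True,  # rank-stabilized: scale = lora_alpha / sqrt(r), not lora_alpha / r
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights train
```

The point of the `use_rslora` flag is that the adapter scaling no longer shrinks linearly with rank, which tends to behave better at the higher ranks often used for continued pretraining.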
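On the note that the model "can be tamed with some sampling", a minimal generation sketch with transformers is below. The repo id is hypothetical and every sampler value is an illustrative assumption, not a setting recommended by this README.

```python
# Illustrative only: conservative sampling settings of the kind used to
# "tame" a temperamental storywriter model. All values are assumptions,
# not the author's recommendation; the repo id is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "AuriAetherwiing/Qwen2.5-14B-Sugarquill-v1"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "The lighthouse keeper counted the ships that never came back."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,          # assumed: lower temperature reins in wilder outputs
    top_p=0.95,               # assumed nucleus sampling cutoff
    repetition_penalty=1.05,  # assumed light anti-repetition nudge
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```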