# Qwen2.5-14B Sugarquill v1

A continued pretrain of SuperNova-Medius on assorted short-story data from the web. SuperNova already had nice prose, but diversifying it a bit definitely doesn't hurt.

Also, it's finally a storywriter model with enough context for something longer than a short story, which is also nice.

It's a fair bit more temperamental than Gemma, but it can be tamed with some sampling.

Instruction following also stayed rather strong, so it works for both RP and storywriting, both in chat mode via back-and-forth co-writing and on raw completion.

Overall, I'd say it successfully transfers the essence of what I liked about Gemma Sugarquill. I will also make a Qwen version of Aletheia, but with a brand-new LoRA, based on a brand-new RP dataset that's in the making right now.

Model was trained by Auri.

---

**Training notes**

This model was trained for 2 epochs on 10k rows (~18.7M tokens), taken equally from the Erebus-87k and r_shortstories_24k datasets. It was trained on a 5x3090Ti workstation for 7.5 hours with rsLoRA.
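For context, rsLoRA (rank-stabilized LoRA) differs from standard LoRA only in how the low-rank update is scaled: alpha/sqrt(r) instead of alpha/r, which keeps the update magnitude stable at higher ranks. A minimal sketch of the difference — the alpha and rank values here are purely illustrative, not this model's actual hyperparameters:

```python
import math

def lora_scale(alpha: float, r: int, rslora: bool = False) -> float:
    """Multiplier applied to the low-rank update B @ A.

    Standard LoRA scales by alpha / r; rsLoRA scales by alpha / sqrt(r),
    so the update doesn't shrink as aggressively when the rank grows.
    """
    return alpha / math.sqrt(r) if rslora else alpha / r

# At rank 64 with alpha 16, standard LoRA shrinks the update
# 8x more than rsLoRA does:
standard = lora_scale(16, 64)                       # 16 / 64      = 0.25
rank_stabilized = lora_scale(16, 64, rslora=True)   # 16 / sqrt(64) = 2.0
```

In practice this is just a flag in trainers built on PEFT (`use_rslora=True` in `LoraConfig`), not a separate adapter format.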