Update README.md
README.md
CHANGED
@@ -7,4 +7,38 @@ datasets:
- allura-org/fujin-cleaned-stage-2
base_model:
- internlm/internlm3-8b-instruct
---

# Ruby-Music-8B

*Note that this model is based on InternLM3, **not** LLaMA 3.*

A roleplaying/creative-writing fine-tune of [internlm/internlm3-8b-instruct](https://huggingface.co/internlm/internlm3-8b-instruct), provided as an alternative to L3 8B for folks with 8 GB of VRAM.

This was trained on a mix of private instruct data (~1k samples) and roleplaying data (~2.5k human and ~1k synthetic samples), along with the following public datasets:

- allenai/tulu-3-sft-personas-instruction-following (~500 samples)
- PocketDoc/Dans-Prosemaxx-Gutenberg (all samples)
- ToastyPigeon/SpringDragon-Instruct (~500 samples)
- allura-org/fujin-cleaned-stage-2 (~500 samples)

The instruct format is standard ChatML:

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
{assistant response}<|im_end|>
```
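
If you're prompting the model from code, the tokenizer's bundled chat template should produce this same layout for you. Here's a minimal sketch with `transformers` (the repo id is a placeholder for wherever this model is hosted, and InternLM3-based tokenizers generally need `trust_remote_code=True`):

```python
from transformers import AutoTokenizer

# Placeholder repo id: substitute the actual Ruby-Music-8B repository.
tokenizer = AutoTokenizer.from_pretrained("your-org/Ruby-Music-8B", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a narrator for an interactive story."},
    {"role": "user", "content": "Open the scene on a rain-soaked rooftop."},
]

# Renders the ChatML layout shown above and appends an open assistant header
# so the model continues as the assistant.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```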

## Recommended sampler settings:

- temp 1
- smoothing factor 0.5, smoothing curve 1
- DRY 0.5/1.75/5/1024

There may be better sampler settings out there, but these have at least proven stable in my testing. InternLM3 needs fairly aggressive tail filtering (a high min-p, top-a, or something similar) to keep it from making strange typos and spelling mistakes. *Note: this might be a current issue with llama.cpp and the GGUF versions I tested.*
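
For reference, here's roughly how those settings might map onto a request to a KoboldCpp-style backend. The endpoint, field names, and response shape are assumptions based on the KoboldCpp generate API (and not every backend exposes smoothing or DRY), so check your backend's docs before relying on this sketch:

```python
import requests

# Assumed KoboldCpp-style endpoint; adjust host/port and field names for your backend.
API_URL = "http://127.0.0.1:5001/api/v1/generate"

payload = {
    "prompt": "<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n",
    "max_length": 300,
    "temperature": 1.0,         # temp 1
    "smoothing_factor": 0.5,    # smoothing factor 0.5
    "smoothing_curve": 1.0,     # smoothing curve 1
    "min_p": 0.1,               # extra tail filtering, per the note above
    "dry_multiplier": 0.5,      # DRY 0.5/1.75/5/1024
    "dry_base": 1.75,
    "dry_allowed_length": 5,
    "dry_penalty_range": 1024,
    "stop_sequence": ["<|im_end|>"],
}

response = requests.post(API_URL, json=payload)
print(response.json()["results"][0]["text"])
```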

## Notes:

I've noticed this model sometimes has trouble outputting the EOS token (despite confirming that `<|im_end|>` appears at the end of every turn in the training data). This can cause it to ramble at the end of a message instead of ending its turn.

You can either cut the end off its messages until it picks up the right response length, or use logit bias. I've had success getting right-sized turns by setting the logit bias for `<|im_end|>` to 2.
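
If you're running the model with `transformers` rather than a frontend that exposes logit bias directly, a small custom logits processor can apply the same +2 nudge to `<|im_end|>`. A minimal sketch (again, the repo id is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessor, LogitsProcessorList

class TokenBias(LogitsProcessor):
    """Adds a fixed bias to the logits of the given token ids at every decoding step."""

    def __init__(self, token_ids, bias):
        self.token_ids = token_ids
        self.bias = bias

    def __call__(self, input_ids, scores):
        scores[:, self.token_ids] += self.bias
        return scores

# Placeholder repo id: substitute the actual Ruby-Music-8B repository.
model_id = "your-org/Ruby-Music-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, trust_remote_code=True)

im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Describe the harbor at dusk."}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt")

# Bias <|im_end|> upward by 2 so the model is more willing to end its turn.
output = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,
    temperature=1.0,
    logits_processor=LogitsProcessorList([TokenBias([im_end_id], 2.0)]),
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```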