ToastyPigeon committed
Commit a2aae38 · verified · 1 Parent(s): ebf74f1

Update README.md

Files changed (1): README.md (+35 -1)
README.md CHANGED
@@ -7,4 +7,38 @@ datasets:
- allura-org/fujin-cleaned-stage-2
base_model:
- internlm/internlm3-8b-instruct
---

# Ruby-Music-8B

*Note that this model is based on InternLM3, **not** LLaMA 3.*

A roleplaying/creative-writing fine-tune of [internlm/internlm3-8b-instruct](https://huggingface.co/internlm/internlm3-8b-instruct), provided as an alternative to L3 8B for folks with 8GB VRAM.

This was trained on a mix of private instruct data (~1k samples) and roleplaying data (~2.5k human and ~1k synthetic samples), along with the following public datasets:
- allenai/tulu-3-sft-personas-instruction-following (~500 samples)
- PocketDoc/Dans-Prosemaxx-Gutenberg (all samples)
- ToastyPigeon/SpringDragon-Instruct (~500 samples)
- allura-org/fujin-cleaned-stage-2 (~500 samples)

The instruct format is standard ChatML:
```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
{assistant response}<|im_end|>
```

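If you're scripting against the model with `transformers`, a minimal sketch of building this prompt looks something like the following; the repo id and messages are placeholders, and it assumes the tokenizer ships a matching ChatML chat template:

```python
# Minimal sketch: constructing the ChatML prompt via the tokenizer's chat template.
# The repo id below is a placeholder; swap in the actual model repo.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Ruby-Music-8B", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a narrator for a collaborative story."},
    {"role": "user", "content": "Open the scene at a rain-soaked harbor."},
]

# tokenize=False returns the raw prompt string; add_generation_prompt appends the
# opening <|im_start|>assistant tag so the model continues as the assistant.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```
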
## Recommended sampler settings:
- temp 1
- smoothing factor 0.5, smoothing curve 1
- DRY 0.5/1.75/5/1024

There may be better sampler settings, but these have at least proven stable in my testing. InternLM3 requires fairly aggressive tail filtering (a high min-p, top-a, or something similar) to avoid strange typos and spelling mistakes. *Note: this might be a current issue with llama.cpp and the GGUF versions I tested.*

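As a rough illustration, the settings above can be written down as a preset like the one below; the field names are hypothetical stand-ins for whatever your frontend calls these samplers, and the min-p value is only an assumed starting point for the tail filtering mentioned above:

```python
# Illustrative sampler preset for the recommendations above. Field names are
# hypothetical; map them onto your frontend's equivalents before use.
RUBY_MUSIC_SAMPLERS = {
    "temperature": 1.0,
    "smoothing_factor": 0.5,   # quadratic smoothing
    "smoothing_curve": 1.0,
    "dry_multiplier": 0.5,     # DRY 0.5/1.75/5/1024
    "dry_base": 1.75,
    "dry_allowed_length": 5,
    "dry_penalty_range": 1024,
    "min_p": 0.1,              # assumed value for tail filtering; tune to taste
}
```
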
## Notes:
I've noticed this model sometimes has trouble outputting the EOS token (despite confirming that `<|im_end|>` appears at the end of every turn in the training data). This can cause it to ramble at the end of a message instead of ending its turn.

You can either cut the ends off its messages until it picks up the right response length, or use logit bias. I've had success getting right-sized turns by setting a logit bias of 2 on `<|im_end|>`.
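
With `transformers`, a comparable workaround is a sketch like the following, which uses the `sequence_bias` generation argument to put a +2 bias on `<|im_end|>`; the repo id, messages, and sampling values are placeholders:

```python
# Sketch: biasing generation toward the end-of-turn token with transformers'
# sequence_bias. Repo id, messages, and sampling values are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ruby-Music-8B"  # replace with the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are a collaborative storyteller."},
    {"role": "user", "content": "Continue the scene at the harbor."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

eot_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
output = model.generate(
    input_ids,
    max_new_tokens=400,
    do_sample=True,
    temperature=1.0,
    sequence_bias={(eot_id,): 2.0},  # +2 logit bias on <|im_end|>
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=False))
```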