Fizzarolli commited on
Commit
71ebbd5
·
verified ·
1 Parent(s): 8b89f67

Update non-lore-README.md

Browse files
Files changed (1) hide show
  1. non-lore-README.md +50 -49
non-lore-README.md CHANGED
@@ -1,50 +1,51 @@
1
- [English](./non-lore-README.md) | [简体中文](./non-lore-README-cn.md)
2
-
3
- # Bigger Body 8b
4
- ![image/png](AETEG6110A00KPFHTKMZVNG5C0.jpeg)
5
- A roleplay-focused pseudo full-finetune of Ministral Instruct 2410.
6
- The successor to the Ink series.
7
-
8
- ## Dataset
9
- The Bigger Body (referred to as Ink v2.1, because that's still the internal name) mix is absolutely disgusting. It's even more cursed than the original Ink mix.
10
-
11
- <details>
12
- <summary>(Public) Original Datasets</summary>
13
-
14
- <ul>
15
- <li><a href="https://huggingface.co/datasets/Fizzarolli/limarp-processed">Fizzarolli/limarp-processed</a></li>
16
- <li><a href="https://huggingface.co/datasets/Norquinal/OpenCAI">Norquinal/OpenCAI</a> - <code>two_users</code> split</li>
17
- <li><a href="https://huggingface.co/datasets/allura-org/Celeste1.x-data-mixture">allura-org/Celeste1.x-data-mixture</a></li>
18
- <li><a href="https://huggingface.co/datasets/mapsila/PIPPA-ShareGPT-formatted-named">mapsila/PIPPA-ShareGPT-formatted-named</a></li>
19
- <li><a href="https://huggingface.co/datasets/allenai/tulu-3-sft-personas-instruction-following">allenai/tulu-3-sft-personas-instruction-following</a></li>
20
- <li><a href="https://huggingface.co/datasets/readmehay/medical-01-reasoning-SFT-json">readmehay/medical-01-reasoning-SFT-json</a></li>
21
- <li><a href="https://huggingface.co/datasets/LooksJuicy/ruozhiba">LooksJuicy/ruozhiba</a></li>
22
- <li><a href="https://huggingface.co/datasets/shibing624/roleplay-zh-sharegpt-gpt4-data">shibing624/roleplay-zh-sharegpt-gpt4-data</a></li>
23
- <li><a href="https://huggingface.co/datasets/CausalLM/Retrieval-SFT-Chat">CausalLM/Retrieval-SFT-Chat</a></li>
24
- <li><a href="https://huggingface.co/datasets/ToastyPigeon/fujin-filtered-instruct">ToastyPigeon/fujin-filtered-instruct</a></li>
25
- </ul>
26
- </details>
27
-
28
- ## Quants
29
- TODO!
30
-
31
- ## Recommended Settings
32
- Chat template: Mistral *v7-tekken* (NOT v3-tekken !!!! the main difference is that v7 has specific `[SYSTEM_PROMPT]` and `[/SYSTEM_PROMPT]` tags)
33
- Recommended samplers (not the be-all-end-all, try some on your own!):
34
- - I have literally no idea. you're on your own.
35
-
36
- ## Hyperparams
37
- ### General
38
- - Epochs = 2
39
- - LR = 2e-6
40
- - LR Scheduler = Cosine
41
- - Optimizer = [Apollo-mini](https://github.com/zhuhanqing/APOLLO)
42
- - Optimizer target modules = `all_linear`
43
- - Effective batch size = 16
44
- - Weight Decay = 0.01
45
- - Warmup steps = 50
46
- - Total steps = 920
47
-
48
- ## Credits
49
- Humongous thanks to the people who created the data.
 
50
  Big thanks to all Allura members for testing and emotional support ilya /platonic
 
1
+ [English](./non-lore-README.md) | [简体中文](./non-lore-README-cn.md)
2
+
3
+ # Bigger Body 8b
4
+ ![image/png](AETEG6110A00KPFHTKMZVNG5C0.jpeg)
5
+ A roleplay-focused pseudo full-finetune of Ministral Instruct 2410.
6
+ The successor to the Ink series.
7
+
8
+ ## Dataset
9
+ The Bigger Body (referred to as Ink v2.1, because that's still the internal name) mix is absolutely disgusting. It's even more cursed than the original Ink mix.
10
+
11
+ <details>
12
+ <summary>(Public) Original Datasets</summary>
13
+
14
+ <ul>
15
+ <li><a href="https://huggingface.co/datasets/Fizzarolli/limarp-processed">Fizzarolli/limarp-processed</a></li>
16
+ <li><a href="https://huggingface.co/datasets/Norquinal/OpenCAI">Norquinal/OpenCAI</a> - <code>two_users</code> split</li>
17
+ <li><a href="https://huggingface.co/datasets/allura-org/Celeste1.x-data-mixture">allura-org/Celeste1.x-data-mixture</a></li>
18
+ <li><a href="https://huggingface.co/datasets/mapsila/PIPPA-ShareGPT-formatted-named">mapsila/PIPPA-ShareGPT-formatted-named</a></li>
19
+ <li><a href="https://huggingface.co/datasets/allenai/tulu-3-sft-personas-instruction-following">allenai/tulu-3-sft-personas-instruction-following</a></li>
20
+ <li><a href="https://huggingface.co/datasets/readmehay/medical-01-reasoning-SFT-json">readmehay/medical-01-reasoning-SFT-json</a></li>
21
+ <li><a href="https://huggingface.co/datasets/LooksJuicy/ruozhiba">LooksJuicy/ruozhiba</a></li>
22
+ <li><a href="https://huggingface.co/datasets/shibing624/roleplay-zh-sharegpt-gpt4-data">shibing624/roleplay-zh-sharegpt-gpt4-data</a></li>
23
+ <li><a href="https://huggingface.co/datasets/CausalLM/Retrieval-SFT-Chat">CausalLM/Retrieval-SFT-Chat</a></li>
24
+ <li><a href="https://huggingface.co/datasets/ToastyPigeon/fujin-filtered-instruct">ToastyPigeon/fujin-filtered-instruct</a></li>
25
+ </ul>
26
+ </details>
27
+
28
+ ## Quants
29
+ - [bartowski's imatrix ggufs](https://huggingface.co/bartowski/allura-org_Bigger-Body-8b-GGUF)
30
+ thanks to all quanters <3
31
+
32
+ ## Recommended Settings
33
+ Chat template: Mistral *v7-tekken* (NOT v3-tekken !!!! the main difference is that v7 has specific `[SYSTEM_PROMPT]` and `[/SYSTEM_PROMPT]` tags)
34
+ Recommended samplers (not the be-all-end-all, try some on your own!):
35
+ - I have literally no idea. you're on your own.
36
+
37
+ ## Hyperparams
38
+ ### General
39
+ - Epochs = 2
40
+ - LR = 2e-6
41
+ - LR Scheduler = Cosine
42
+ - Optimizer = [Apollo-mini](https://github.com/zhuhanqing/APOLLO)
43
+ - Optimizer target modules = `all_linear`
44
+ - Effective batch size = 16
45
+ - Weight Decay = 0.01
46
+ - Warmup steps = 50
47
+ - Total steps = 920
48
+
49
+ ## Credits
50
+ Humongous thanks to the people who created the data.
51
  Big thanks to all Allura members for testing and emotional support ilya /platonic