Bigger-Body-8b / non-lore-README.md
Fizzarolli's picture
Update non-lore-README.md
71ebbd5 verified
[English](./non-lore-README.md) | [简体中文](./non-lore-README-cn.md)
# Bigger Body 8b
![image/png](AETEG6110A00KPFHTKMZVNG5C0.jpeg)
A roleplay-focused pseudo full-finetune of Ministral Instruct 2410.
The successor to the Ink series.
## Dataset
The Bigger Body (referred to as Ink v2.1, because that's still the internal name) mix is absolutely disgusting. It's even more cursed than the original Ink mix.
<details>
<summary>(Public) Original Datasets</summary>
<ul>
<li><a href="https://huggingface.co/datasets/Fizzarolli/limarp-processed">Fizzarolli/limarp-processed</a></li>
<li><a href="https://huggingface.co/datasets/Norquinal/OpenCAI">Norquinal/OpenCAI</a> - <code>two_users</code> split</li>
<li><a href="https://huggingface.co/datasets/allura-org/Celeste1.x-data-mixture">allura-org/Celeste1.x-data-mixture</a></li>
<li><a href="https://huggingface.co/datasets/mapsila/PIPPA-ShareGPT-formatted-named">mapsila/PIPPA-ShareGPT-formatted-named</a></li>
<li><a href="https://huggingface.co/datasets/allenai/tulu-3-sft-personas-instruction-following">allenai/tulu-3-sft-personas-instruction-following</a></li>
<li><a href="https://huggingface.co/datasets/readmehay/medical-01-reasoning-SFT-json">readmehay/medical-01-reasoning-SFT-json</a></li>
<li><a href="https://huggingface.co/datasets/LooksJuicy/ruozhiba">LooksJuicy/ruozhiba</a></li>
<li><a href="https://huggingface.co/datasets/shibing624/roleplay-zh-sharegpt-gpt4-data">shibing624/roleplay-zh-sharegpt-gpt4-data</a></li>
<li><a href="https://huggingface.co/datasets/CausalLM/Retrieval-SFT-Chat">CausalLM/Retrieval-SFT-Chat</a></li>
<li><a href="https://huggingface.co/datasets/ToastyPigeon/fujin-filtered-instruct">ToastyPigeon/fujin-filtered-instruct</a></li>
</ul>
</details>
## Quants
- [bartowski's imatrix ggufs](https://huggingface.co/bartowski/allura-org_Bigger-Body-8b-GGUF)
thanks to all quanters <3
## Recommended Settings
Chat template: Mistral *v7-tekken* (NOT v3-tekken !!!! the main difference is that v7 has specific `[SYSTEM_PROMPT]` and `[/SYSTEM_PROMPT]` tags)
Recommended samplers (not the be-all-end-all, try some on your own!):
- I have literally no idea. you're on your own.
## Hyperparams
### General
- Epochs = 2
- LR = 2e-6
- LR Scheduler = Cosine
- Optimizer = [Apollo-mini](https://github.com/zhuhanqing/APOLLO)
- Optimizer target modules = `all_linear`
- Effective batch size = 16
- Weight Decay = 0.01
- Warmup steps = 50
- Total steps = 920
## Credits
Humongous thanks to the people who created the data.
Big thanks to all Allura members for testing and emotional support ilya /platonic