---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---
# Antler 7B Evolve
<img src="https://huggingface.co/Elizezen/Antler-7B/resolve/main/OIG3.UAjshTXCEJU.jpg" alt="drawing" style="width:512px;"/>
## Model Description
This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit) using **Evolutionary Model Merging**.
It is generally better than Antler-7B at writing novels, especially at maintaining context, but can fall short of the original model on eroticism. It also tends to emit EOS tokens quite early, which I'm currently working on improving.
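One possible workaround for the early-EOS tendency (a sketch assuming the standard `transformers` generation API; the values are illustrative, not tuned for this model) is to force a minimum number of new tokens, which masks out EOS until that length is reached:

```python
from transformers import GenerationConfig

# Forcing a minimum number of new tokens keeps generate() from
# honoring the EOS token too early; 256 is illustrative, not tuned.
gen_config = GenerationConfig(
    min_new_tokens=256,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
)
# Later: model.generate(**inputs, generation_config=gen_config)
```

This only delays EOS rather than fixing the underlying tendency, so continuations may still end abruptly once the minimum length is passed.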
## Example
### Input
```
ใใฎๆ—ฅใฎๆ˜ผไธ‹ใŒใ‚Šใ€‚็งใจใ‚ใ‚„ใฏใŠๆƒใ„ใฎๆตด่กฃใ‚’่บซใซ็บใ„ใ€ๅ†ฌ็ฅญใ‚Šใ‚’ๆฅฝใ—ใ‚€ใŸใ‚ใซ็”บใธใจ็นฐใ‚Šๅ‡บใ—ใฆใ„ใŸใ€‚ใ‚€ใ‚ใ‚“ใ€ๅฟ่€…ใฎ็ด ๆ€งใ‚’้š ใ™ใŸใ‚ใซ็š†ๅค‰่ฃ…ใ—ใฆใ„ใ‚‹ใ€‚
ๆ™ฎๆฎต็€ๆ…ฃใ‚Œใชใ„ๆœ่ฃ…ใฎใŸใ‚ๅฐ‘ใ€…่ฝใก็€ใ‹ใชใ„ใ‚‚ใฎใฎใ€ๅธฏใŒ็ทฉใ‚“ใ ใ‚Š็€ๅดฉใ‚Œใ—ใชใ„ใ‚ˆใ†ๆ„่ญ˜ใ—ใชใŒใ‚‰ๆญฉใใ€‚ใ™ใ‚‹ใจใ‚ใ‚„ใ‚‚ๅŒใ˜ใ‚ˆใ†ใซใŽใ“ใกใชใๆญฉใ„ใฆใ„ใ‚‹ใฎใŒๅˆ†ใ‹ใฃใŸใ€‚
ใ‚„ใŒใฆ่ก—ไธฆใฟใฏๆดปๆฐ—ใซๆบ€ใกๅง‹ใ‚ใ€้“่กŒใไบบใ€…ใฎ่ณ‘ใ‚„ใ‹ใชๅฃฐใŒ่žใ“ใˆใฆใใ‚‹ใ€‚
ๅบƒๅ ดใซๅˆฐ็€ใ™ใ‚‹ใจใ€ใใ“ใฏๅคงๅ‹ขใฎไบบใ€…ใงใซใŽใ‚ใ„ใ€่‰ฒใจใ‚Šใฉใ‚Šใฎๆ็ฏใŒ่พบใ‚Šใ‚’็…งใ‚‰ใ—ใฆใ„ใŸใ€‚ๆง˜ใ€…ใชๅ‡บๅบ—ใŒไธฆใณใ€ๅคงๅ‹ขใฎๅญไพ›้”ใจใใฎ่ฆชๅพกใ•ใ‚“ใŒ้ง†ใ‘ๅ›žใฃใฆใ„ใ‚‹ใ€‚
ๅบƒๅ ดใฎไธญๅคฎไป˜่ฟ‘ใซใ‚ใ‚‹่ˆžๅฐใงใฏๅ‚ฌใ—็‰ฉใŒ้–‹ๅ‚ฌใ•ใ‚ŒใฆใŠใ‚Šใ€ๅคช้ผ“ใฎ้Ÿณใซๅˆใ‚ใ›ใฆๆญŒใฃใฆใ„ใ‚‹ๆผ”่€…ใŸใกใŒใ„ใŸใ€‚
ใ€Œใ‚ใ๏ฝžใ€ใใ‚Œใ„๏ผใ€
็›ฎใ‚’่ผใ‹ใ›ใฆ่พบใ‚Šใ‚’่ฆ‹ๅ›žใ™ใ‚ใ‚„ใ€‚ใ“ใ†ใ—ใฆใฟใ‚‹ใจๅนด็›ธๅฟœใฎๅญใฉใ‚‚ใซ่ฆ‹ใˆใ‚‹ใ€‚
ใ€Œใ“ใ‚‰ใ€ๅ‹ๆ‰‹ใซ่ตฐใ‚Šๅ›žใ‚‰ใชใ„ใ€
ใ€Œใˆใธใธ๏ฝžใ”ใ‚ใ‚“ใชใ•ใ„ใ€
ใŸใ—ใชใ‚ใ‚‰ใ‚ŒใชใŒใ‚‰ใ‚‚ใ€้ก”ใฏ็ถปใ‚“ใงใ„ใ‚‹ๆง˜ๅญใ‹ใ‚‰ใ‚‚ๅˆ†ใ‹ใ‚‹้€šใ‚Šใ€ๅฝผๅฅณใ‚‚ๆฅฝใ—ใฟใซใ—ใฆใ„ใ‚‹ใฎใฏๆ˜Žใ‚‰ใ‹ใ ใ‚ใ†ใ€‚
ใ‚ใ‚„ใŒๆฅฝใ—ใใ†ใ ใจใ€็งใ‚‚ๅฌ‰ใ—ใ„ใ€‚ไธๆ€่ญฐใชใ‚‚ใฎใ ใชใ€‚ไปŠใพใงใ“ใ‚“ใชๆฐ—ๆŒใกใซใชใฃใŸใ“ใจใฏใชใ‹ใฃใŸใ€‚
ๆ€ใ‚ใš็งใพใง็ฌ‘้ก”ใซใชใฃใฆใ—ใพใ†ใ€‚
ใ€Œใ“ใฎใกใ‚ƒใ‚“ใ€ใ“ใฎใกใ‚ƒใ‚“๏ผ่ฆ‹ใฆ๏ผใ“ใฎใกใ‚‡ใ†ใกใ‚“ๅฏๆ„›ใ„๏ฝž๏ผใ€
ใ€ŒใŠใŠใ€
ๅฝผๅฅณใซ่ข–ใ‚’ๅผ•ใ‹ใ‚Œใฆ้€ฃใ‚Œใฆ่กŒใ‹ใ‚ŒใŸใฎใฏใ€ๅฏๆ„›ใ‚‰ใ—ใ„ๅ‹•็‰ฉใฎ็ตตๆŸ„ใŒใ‚ใ—ใ‚‰ใ‚ใ‚ŒใŸ่กŒ็‡ˆใ€‚ใฉใ†ใ‚„ใ‚‰ๅฃฒๅบ—ใŒๅ‡บใ—ใฆใ„ใ‚‹ใ‚‰ใ—ใ„ใ€‚
ๅฑ‹ๅฐใงใฏ
```
### Output
```
ๅฑ‹ๅฐใงใฏ็„ผใใใฐใ‚„็„ผใ้ณฅใชใฉใฎๅฎš็•ชใจไธฆใ‚“ใงใ€ๅ†ฌ้™ๅฎšใฎ้‹ๅฐ‚้–€ๅบ—ใชใฉใ‚‚ๅ‡บใฆใ„ใŸใ€‚
ใ€Œใ‚ใ‚„ใฏใชใซใ‹้ฃŸในใŸใ„ใ‚‚ใฎใ‚ใ‚‹ใ‹๏ผŸใ€
ใ€Œใˆใฃใจใญ๏ฝžใ€ใŠๅฅฝใฟ็„ผใใจใ‹โ€ฆโ€ฆใ€
ใ€ŒใŠๅฅฝใฟ็„ผใ๏ผŸใ€
็งใฏๅˆใ‚ใฆ่žใๆ–™็†ๅใซ้ฆ–ใ‚’ๅ‚พใ’ใ‚‹ใ€‚ใ‚ใ‚„ใฏ็›ฎใ‚’ใ‚ญใƒฉใ‚ญใƒฉใ•ใ›ใชใŒใ‚‰ใ€ใใฎๆ–™็†ใซใคใ„ใฆ่ชฌๆ˜Žใ—ใฆใใ‚ŒใŸใ€‚
ๅฐ้บฆ็ฒ‰ใฎ็”Ÿๅœฐใซใ‚ญใƒฃใƒ™ใƒ„ใ‚„่ฑš่‚‰ใ€ๅคฉใ‹ใ™ใ€ใใ—ใฆใŠๅฅฝใฟ็„ผใใ‚ฝใƒผใ‚นใ‚’ใ‹ใ‘ใฆ็„ผใ„ใŸใ€ๅคง้˜ชๅ็‰ฉใฎๆ–™็†ใ‚‰ใ—ใ„ใ€‚
ใ€Œใใ‚ŒใฏใพใŸ้ข็™ฝใใ†ใชใ‚‚ใฎใ ใชใ€‚ใงใฏใใ“ใฎๅฑ‹ๅฐใซ่กŒใฃใฆใฟใ‚ˆใ†ใ€
็ง้”ใฏ็›ฎๆ˜Ÿใ‚’ใคใ‘ใŸๅฑ‹ๅฐใธๅ‘ใ‹ใ†ใ“ใจใซใ—ใŸใ€‚
ใŠๅฅฝใฟ็„ผใใฎๅฑ‹ๅฐใฏใ€ไบˆๆƒณไปฅไธŠใฎ็››ๆณใถใ‚Šใง่กŒๅˆ—ใŒใงใใฆใ„ใŸใ€‚ใ—ใ‹ใ—ใ€ไธฆใถใ“ใจ30ๅˆ†ใปใฉใง็งใŸใกใฎ็•ชใŒใ‚„ใฃใฆใใ‚‹ใ€‚
ใ€ŒใŠใฐใกใ‚ƒใ‚“ใ€ใ“ใ‚Œใใ ใ•ใ„ใ€
ใ€Œใ‚ใ„ใ‚ˆ๏ผใกใ‚‡ใฃใจๅพ…ใฃใฆใช๏ผใ€
ๅฑ‹ๅฐใฎใŠใฐใกใ‚ƒใ‚“ใฏๅจๅ‹ขใฎใ„ใ„ๅฃฐใง่ฟ”ไบ‹ใ‚’ใ™ใ‚‹ใจใ€ๆ‰‹ๆ…ฃใ‚ŒใŸๆง˜ๅญใง้‰„ๆฟใฎไธŠใงใŠๅฅฝใฟ็„ผใใ‚’็„ผใไธŠใ’ใ‚‹ใ€‚
ใ€Œใ“ใ‚ŒใŒใŠๅฅฝใฟ็„ผใใ ใ‚ˆใ€
ๅ‡บๆฅไธŠใŒใฃใŸใŠๅฅฝใฟ็„ผใใ‚’ๆ‰‹ใซใ—ใŸใ‚ใ‚„ใŒใ€ใ†ใฃใจใ‚Šใจใ—ใŸๆง˜ๅญใงใใ†่จ€ใฃใŸใ€‚
ใ€Œใปใ†ใ€‚่ฆ‹ใ‚‹ใ‹ใ‚‰ใซ็พŽๅ‘ณใ—ใใ†ใ ใ€
็งใ‚‚ใใฎ่‰ฒๅˆใ„ใซ่ช˜ใ‚ใ‚Œใ‚‹ใ‚ˆใ†ใซใ—ใฆใ€ไธ€ๅฃ้ ฌ
```
### Intended Use
The model is mainly intended for generating novels. It may not perform well on instruction-style prompts.
## Merge Details
### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099)-[TIES](https://arxiv.org/abs/2306.01708) merge method, with evol_merge_storage\input_models\Antler7B_2159541861 as the base model.
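As a rough per-parameter intuition for what DARE-TIES does, here is a toy stdlib sketch of the two papers' core ideas (drop-and-rescale, then sign election). This is my own simplification, not mergekit's actual implementation, and the function names are hypothetical:

```python
import random

def dare_sparsify(delta, density, seed=0):
    """DARE: keep each task-vector element with probability `density`
    and rescale survivors by 1/density, so the merge stays unbiased
    in expectation."""
    rng = random.Random(seed)
    return [d / density if rng.random() < density else 0.0 for d in delta]

def ties_merge(base, deltas, weights):
    """TIES sign election: per parameter, pick the dominant sign of the
    weighted deltas and sum only the contributions that agree with it,
    discarding conflicting updates."""
    merged = []
    for j, b in enumerate(base):
        contribs = [w * d[j] for w, d in zip(weights, deltas)]
        elected = 1.0 if sum(contribs) >= 0 else -1.0
        merged.append(b + sum(c for c in contribs if c * elected > 0))
    return merged
```

In the real merge this is applied per tensor, using the per-layer-range `density` and `weight` values from the config below; `density: 1.0` effectively disables dropping for that model's deltas.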
### Models Merged
The following models were included in the merge:
* evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
* evol_merge_storage\input_models\antler-starling-08_4074283220
* evol_merge_storage\input_models\Phos7b-RP_654656604
### Configuration
The following YAML configuration was used to produce this model:
```yaml
base_model: evol_merge_storage\input_models\Antler7B_2159541861
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.584107666175788
      weight: 0.47231634419785595
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.9357007414387093
      weight: 0.25531843586626907
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.9750447748820433
      weight: 0.4753247646722287
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.8802238329444649
      weight: 0.4482746205621599
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.5524329574915081
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 1.0
      weight: 0.22634815425570032
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.9921437573982935
      weight: 0.44636209472148164
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.8757091247914811
      weight: 0.15431351637040108
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.8667200206865777
      weight: 0.37827962987746055
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.966615155256828
      weight: 0.5041762338947331
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.22555101554235693
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.7616963147939114
      weight: 0.397020374822854
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Antler7B_2159541861
```