---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---

# Antler 7B Evolve

<img src="https://huggingface.co/Elizezen/Antler-7B/resolve/main/OIG3.UAjshTXCEJU.jpg" alt="drawing" style="width:512px;"/>

## Model Description

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit), using **Evolutionary Model Merging**.

It is generally better than Antler-7B at writing novels, especially at maintaining context, but it can fall short of the original model on erotic content. It also tends to generate EOS tokens quite early, which I am currently working to improve.

## Example

### Input

```
ใใฎๆฅใฎๆผไธใใใ็งใจใใใฏใๆใใฎๆตด่กฃใ่บซใซ็บใใๅฌ็ฅญใใๆฅฝใใใใใซ็บใธใจ็นฐใๅบใใฆใใใใใใใๅฟ่
ใฎ็ด ๆงใ้ ใใใใซ็ๅค่ฃ
ใใฆใใใ |
|
ๆฎๆฎต็ๆ
ฃใใชใๆ่ฃ
ใฎใใๅฐใ
่ฝใก็ใใชใใใฎใฎใๅธฏใ็ทฉใใ ใ็ๅดฉใใใชใใใๆ่ญใใชใใๆญฉใใใใใจใใใๅใใใใซใใใกใชใๆญฉใใฆใใใฎใๅใใฃใใ |
|
ใใใฆ่กไธฆใฟใฏๆดปๆฐใซๆบใกๅงใใ้่กใไบบใ
ใฎ่ณใใใชๅฃฐใ่ใใใฆใใใ |
|
ๅบๅ ดใซๅฐ็ใใใจใใใใฏๅคงๅขใฎไบบใ
ใงใซใใใใ่ฒใจใใฉใใฎๆ็ฏใ่พบใใ็
งใใใฆใใใๆงใ
ใชๅบๅบใไธฆใณใๅคงๅขใฎๅญไพ้ใจใใฎ่ฆชๅพกใใใ้งใๅใฃใฆใใใ |
|
ๅบๅ ดใฎไธญๅคฎไป่ฟใซใใ่ๅฐใงใฏๅฌใ็ฉใ้ๅฌใใใฆใใใๅคช้ผใฎ้ณใซๅใใใฆๆญใฃใฆใใๆผ่
ใใกใใใใ |
|
ใใใ๏ฝใใใใ๏ผใ |
|
็ฎใ่ผใใใฆ่พบใใ่ฆๅใใใใใใใใฆใฟใใจๅนด็ธๅฟใฎๅญใฉใใซ่ฆใใใ |
|
ใใใใๅๆใซ่ตฐใๅใใชใใ |
|
ใใใธใธ๏ฝใใใใชใใใ |
|
ใใใชใใใใชใใใใ้กใฏ็ถปใใงใใๆงๅญใใใๅใใ้ใใๅฝผๅฅณใๆฅฝใใฟใซใใฆใใใฎใฏๆใใใ ใใใ |
|
ใใใๆฅฝใใใใ ใจใ็งใๅฌใใใไธๆ่ญฐใชใใฎใ ใชใไปใพใงใใใชๆฐๆใกใซใชใฃใใใจใฏใชใใฃใใ |
|
ๆใใ็งใพใง็ฌ้กใซใชใฃใฆใใพใใ |
|
ใใใฎใกใใใใใฎใกใใ๏ผ่ฆใฆ๏ผใใฎใกใใใกใๅฏๆใ๏ฝ๏ผใ |
|
ใใใใ |
|
ๅฝผๅฅณใซ่ขใๅผใใใฆ้ฃใใฆ่กใใใใฎใฏใๅฏๆใใใๅ็ฉใฎ็ตตๆใใใใใใใ่ก็ใใฉใใใๅฃฒๅบใๅบใใฆใใใใใใ |
|
ๅฑๅฐใงใฏ |
|
```

### Output

```
ๅฑๅฐใงใฏ็ผใใใฐใ็ผใ้ณฅใชใฉใฎๅฎ็ชใจไธฆใใงใๅฌ้ๅฎใฎ้ๅฐ้ๅบใชใฉใๅบใฆใใใ |
|
ใใใใฏใชใซใ้ฃในใใใใฎใใใ๏ผใ |
|
ใใใฃใจใญ๏ฝใใๅฅฝใฟ็ผใใจใโฆโฆใ |
|
ใใๅฅฝใฟ็ผใ๏ผใ |
|
็งใฏๅใใฆ่ใๆ็ๅใซ้ฆใๅพใใใใใใฏ็ฎใใญใฉใญใฉใใใชใใใใใฎๆ็ใซใคใใฆ่ชฌๆใใฆใใใใ |
|
ๅฐ้บฆ็ฒใฎ็ๅฐใซใญใฃใใใ่ฑ่ใๅคฉใใใใใใฆใๅฅฝใฟ็ผใใฝใผในใใใใฆ็ผใใใๅคง้ชๅ็ฉใฎๆ็ใใใใ |
|
ใใใใฏใพใ้ข็ฝใใใชใใฎใ ใชใใงใฏใใใฎๅฑๅฐใซ่กใฃใฆใฟใใใ |
|
็ง้ใฏ็ฎๆใใคใใๅฑๅฐใธๅใใใใจใซใใใ |
|
ใๅฅฝใฟ็ผใใฎๅฑๅฐใฏใไบๆณไปฅไธใฎ็ๆณใถใใง่กๅใใงใใฆใใใใใใใไธฆใถใใจ30ๅใปใฉใง็งใใกใฎ็ชใใใฃใฆใใใ |
|
ใใใฐใกใใใใใใใ ใใใ |
|
ใใใใ๏ผใกใใฃใจๅพ
ใฃใฆใช๏ผใ |
|
ๅฑๅฐใฎใใฐใกใใใฏๅจๅขใฎใใๅฃฐใง่ฟไบใใใใจใๆๆ
ฃใใๆงๅญใง้ๆฟใฎไธใงใๅฅฝใฟ็ผใใ็ผใไธใใใ |
|
ใใใใใๅฅฝใฟ็ผใใ ใใ |
|
ๅบๆฅไธใใฃใใๅฅฝใฟ็ผใใๆใซใใใใใใใใฃใจใใจใใๆงๅญใงใใ่จใฃใใ |
|
ใใปใใ่ฆใใใใซ็พๅณใใใใ ใ |
|
็งใใใฎ่ฒๅใใซ่ชใใใใใใซใใฆใไธๅฃ้ ฌ |
|
```

### Intended Use

The model is mainly intended for generating novels. It may be less capable at instruction-following tasks.

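Since the model is meant for continuing prose rather than following instructions, one plausible way to use it is plain text continuation with `transformers`. The sketch below is an illustration only, not a tested recipe for this model: the helper names `build_generation_kwargs` and `continue_story` are hypothetical, the sampling values are arbitrary starting points, and `model_id` must be supplied by the caller because this card does not state the repository id. It assumes `torch`, `transformers`, and `accelerate` (for `device_map="auto"`) are installed. Setting `min_new_tokens` is one possible workaround for early EOS tokens, not a confirmed fix.

```python
def build_generation_kwargs(max_new_tokens=512, min_new_tokens=128,
                            temperature=0.7, top_p=0.9,
                            repetition_penalty=1.1):
    """Return keyword arguments for `model.generate`.

    All defaults are illustrative assumptions, not values recommended
    by the model author.
    """
    if min_new_tokens > max_new_tokens:
        raise ValueError("min_new_tokens must not exceed max_new_tokens")
    return {
        "do_sample": True,
        "max_new_tokens": max_new_tokens,
        # Suppresses EOS until this many tokens have been generated.
        "min_new_tokens": min_new_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "repetition_penalty": repetition_penalty,
    }


def continue_story(model_id, prompt, **overrides):
    """Continue `prompt` as raw text (no chat or instruction template)."""
    # Imported lazily so the kwargs helper works without heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches `dtype: bfloat16` in the merge config
        device_map="auto",           # assumption: `accelerate` is installed
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **build_generation_kwargs(**overrides))
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `continue_story("<your-model-id>", opening_paragraphs)` with the start of a scene, as in the example above, returns only the generated continuation.
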
## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with `evol_merge_storage\input_models\Antler7B_2159541861` as the base.

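To give a rough idea of what a DARE TIES merge does, here is a simplified toy sketch on flat lists of numbers. It is not mergekit's implementation: the function names are hypothetical, real merges operate on tensors per layer, and details such as sign election may differ. The idea is that each fine-tuned model contributes a delta from the base, DARE randomly drops a fraction `1 - density` of each delta and rescales the rest, and TIES keeps only the contributions whose sign agrees with the majority direction.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale by 1/density."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, densities, weights, normalize=True, seed=0):
    """Merge flat parameter lists with a simplified DARE + TIES procedure."""
    rng = random.Random(seed)

    # 1. Weighted, pruned task vectors (fine-tuned minus base).
    deltas = []
    for params, density, weight in zip(finetuned, densities, weights):
        delta = [p - b for p, b in zip(params, base)]
        pruned = dare_prune(delta, density, rng)
        deltas.append([weight * d for d in pruned])

    merged = []
    for i, b in enumerate(base):
        column = [delta[i] for delta in deltas]
        # 2. TIES sign election: the sign of the summed weighted deltas wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [(c, w) for c, w in zip(column, weights) if c * sign > 0]
        # 3. Sum the agreeing contributions, optionally normalized by their weights.
        update = sum(c for c, _ in agreeing)
        if normalize:
            total_weight = sum(w for _, w in agreeing)
            if total_weight > 0:
                update /= total_weight
        merged.append(b + update)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0]
    models = [[1.0, 1.0], [1.0, -3.0]]
    print(dare_ties_merge(base, models, densities=[1.0, 1.0], weights=[0.5, 0.5]))
```

With `density=1.0` nothing is dropped, so the toy run is deterministic. In the second position the two models disagree in sign, and only the contribution matching the elected sign survives.
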
### Models Merged

The following models were included in the merge:

* `evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917`
* `evol_merge_storage\input_models\antler-starling-08_4074283220`
* `evol_merge_storage\input_models\Phos7b-RP_654656604`

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: evol_merge_storage\input_models\Antler7B_2159541861
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.584107666175788
      weight: 0.47231634419785595
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.9357007414387093
      weight: 0.25531843586626907
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.9750447748820433
      weight: 0.4753247646722287
  - layer_range: [0, 8]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.8802238329444649
      weight: 0.4482746205621599
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.5524329574915081
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 1.0
      weight: 0.22634815425570032
  - layer_range: [8, 16]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.9921437573982935
      weight: 0.44636209472148164
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 0.8757091247914811
      weight: 0.15431351637040108
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.8667200206865777
      weight: 0.37827962987746055
  - layer_range: [16, 24]
    model: evol_merge_storage\input_models\Antler7B_2159541861
- sources:
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Phos7b-RP_654656604
    parameters:
      density: 0.966615155256828
      weight: 0.5041762338947331
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\chatntq-ja-7b-v1.0-westlake_932715917
    parameters:
      density: 1.0
      weight: 0.22555101554235693
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\antler-starling-08_4074283220
    parameters:
      density: 0.7616963147939114
      weight: 0.397020374822854
  - layer_range: [24, 32]
    model: evol_merge_storage\input_models\Antler7B_2159541861
```
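
The evolved weights in each slice do not sum to one, which is presumably why `normalize` is enabled. As I understand mergekit's `normalize` option, it rescales the weighted sum by the total weight. The small check below copies the weights from the configuration above (shortened model names are used as labels only) and adds them up per layer range. The helper name `weight_sums` is hypothetical.

```python
# Merge weights per layer range, copied from the YAML configuration.
# Keys are shortened labels for the three non-base input models.
WEIGHTS = {
    (0, 8): {
        "Phos7b-RP": 0.47231634419785595,
        "chatntq-ja-7b-v1.0-westlake": 0.25531843586626907,
        "antler-starling-08": 0.4753247646722287,
    },
    (8, 16): {
        "Phos7b-RP": 0.4482746205621599,
        "chatntq-ja-7b-v1.0-westlake": 0.5524329574915081,
        "antler-starling-08": 0.22634815425570032,
    },
    (16, 24): {
        "Phos7b-RP": 0.44636209472148164,
        "chatntq-ja-7b-v1.0-westlake": 0.15431351637040108,
        "antler-starling-08": 0.37827962987746055,
    },
    (24, 32): {
        "Phos7b-RP": 0.5041762338947331,
        "chatntq-ja-7b-v1.0-westlake": 0.22555101554235693,
        "antler-starling-08": 0.397020374822854,
    },
}


def weight_sums(weights):
    """Return the total merge weight for each layer range."""
    return {layers: sum(per_model.values()) for layers, per_model in weights.items()}


if __name__ == "__main__":
    for (start, end), total in weight_sums(WEIGHTS).items():
        print(f"layers {start}-{end}: total weight = {total:.4f}")
```

Three of the four slices total more than one and the `[16, 24]` slice totals slightly less, so without normalization the merged deltas would be scaled differently from slice to slice.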