---
base_model:
- Steelskull/L3.3-Electra-R1-70b
- Nexesenex/Llama_3.x_70b_SmarTracks_V1.01
- NexesMess/Llama_3.3_70b_DoppelGanger_R1
- nbeerbower/Llama3.1-Gutenberg-Doppel-70B
- NexesMess/Llama_3.1_70b_Priestess_V1
- migtissera/Tess-3-Llama-3.1-70B
library_name: transformers
tags:
- mergekit
- merge

---
# about

The base of Hexagon Purple V2, SmarTracks, remains unchanged: a three-level stock merge bringing in DeepSeek Distill R1 (three flavors), Nemotron, and Tulu capabilities.

Hexagon Purple V2 diverges from V1 with the following:
  - Steelskull's Electra R1 replaces Black-Ink-Guild's Pernicious Prophecy, because it's even better. 70Blivion is recovered elsewhere.
  - A Priestess stock merge replaces the Hostess one, bringing 70Blivion in and taking the Lumitron merge out, on top of Tess R1 and Llama Creative Writer.
  - Dobby, Wayfarer, and Drummer's Fallen Llama R1 (already present in a SmarTracks sub-submerge, and now in Electra R1) go out as standalone models, replaced by a stock merge of the three, DoppelGanger R1.
  - Nbeerbower's Doppel Gutenberg goes in as a Llama 3.1 instruct (and novel-writing) stabilizer, working in tandem with the following model.
  - Miguel Tissera's Tess 3 (Llama 3.1 70B) also goes in, as a perplexity dropper.

As usual, abliterated and lorablated versions (thanks Huihui-ai, Maxime Labonne, and of course Failspy) are used systematically when they exist; otherwise, the focus is on very low censorship.

---
# benchs

Benchs are traded for creativity in this merge too, but the numbers improve neatly compared to V1 (a rough way to reproduce the PPL figure is sketched after the list):
- PPL Wikitext Eng 512: 3.43 (good)
- ARC-C: 60.55 (good)
- ARC-E: 81.05 (also good)
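
The card does not state which harness produced these figures, so the following is only a minimal sketch of a sliding-window WikiText-2 perplexity check at a 512-token context, using `transformers` and `datasets`; the repo id is a placeholder, not the published name of this merge.

```python
# Minimal perplexity sketch (WikiText-2, 512-token windows); placeholder repo id.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nexesenex/Llama_3.x_70b_Hexagon_Purple_V2"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

window = 512
nlls = []
for start in range(0, input_ids.size(1) - window, window):
    chunk = input_ids[:, start : start + window].to(model.device)
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean NLL over the window.
        nlls.append(model(chunk, labels=chunk).loss)

print("PPL:", torch.exp(torch.stack(nlls).mean()).item())
```

Numbers obtained this way will not match the card's exactly, since window stride, tokenization, and quantization all shift the result.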

---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Nexesenex/Llama_3.x_70b_SmarTracks_V1.01](https://huggingface.co/Nexesenex/Llama_3.x_70b_SmarTracks_V1.01) as a base.
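
For context, Model Stock picks a per-layer interpolation ratio from the geometry of the fine-tuned weights around the base, then pulls their average back toward the base by that ratio. The sketch below shows the two-model case as described in the paper; it is illustrative only, not mergekit's actual implementation.

```python
# Illustrative per-layer Model Stock interpolation for two fine-tuned models.
# This follows the paper's two-model formula and is not mergekit's actual code.
import torch

def model_stock_layer(w_base: torch.Tensor, w_a: torch.Tensor, w_b: torch.Tensor) -> torch.Tensor:
    # Task vectors: how each fine-tuned model moved away from the base.
    d_a = (w_a - w_base).flatten()
    d_b = (w_b - w_base).flatten()
    cos = torch.dot(d_a, d_b) / (d_a.norm() * d_b.norm() + 1e-8)
    # Interpolation ratio t = 2*cos(theta) / (1 + cos(theta)) for N = 2 models.
    t = 2 * cos / (1 + cos)
    w_avg = (w_a + w_b) / 2
    # Blend the averaged fine-tuned weights back toward the base.
    return t * w_avg + (1 - t) * w_base
```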

### Models Merged

The following models were included in the merge:
* [Steelskull/L3.3-Electra-R1-70b](https://huggingface.co/Steelskull/L3.3-Electra-R1-70b)
* [NexesMess/Llama_3.3_70b_DoppelGanger_R1](https://huggingface.co/NexesMess/Llama_3.3_70b_DoppelGanger_R1)
* [nbeerbower/Llama3.1-Gutenberg-Doppel-70B](https://huggingface.co/nbeerbower/Llama3.1-Gutenberg-Doppel-70B)
* [NexesMess/Llama_3.1_70b_Priestess_V1](https://huggingface.co/NexesMess/Llama_3.1_70b_Priestess_V1)
* [migtissera/Tess-3-Llama-3.1-70B](https://huggingface.co/migtissera/Tess-3-Llama-3.1-70B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: model_stock
models:
  - model: migtissera/Tess-3-Llama-3.1-70B
    parameters:
      weight: 1.0
  - model: nbeerbower/Llama3.1-Gutenberg-Doppel-70B
    parameters:
      weight: 1.0
  - model: NexesMess/Llama_3.1_70b_Priestess_V1
    parameters:
      weight: 1.0
  - model: Steelskull/L3.3-Electra-R1-70b
    parameters:
      weight: 1.0
  - model: NexesMess/Llama_3.3_70b_DoppelGanger_R1
    parameters:
      weight: 1.0
base_model: Nexesenex/Llama_3.x_70b_SmarTracks_V1.01
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
chat_template: auto
tokenizer:
  source: union
```
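
Assuming the merge is published as a regular `transformers` checkpoint (per the `library_name` tag above), it should load like any other Llama 3.x 70B model. The repo id below is a placeholder; `chat_template: auto` means the tokenizer is expected to carry a usable chat template.

```python
# Sketch of loading and prompting the merged model; placeholder repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nexesenex/Llama_3.x_70b_Hexagon_Purple_V2"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write the opening line of a gothic novel."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```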