---
base_model:
- Steelskull/L3.3-Electra-R1-70b
- Nexesenex/Llama_3.x_70b_SmarTracks_V1.01
- NexesMess/Llama_3.3_70b_DoppelGanger_R1
- nbeerbower/Llama3.1-Gutenberg-Doppel-70B
- NexesMess/Llama_3.1_70b_Priestess_V1
- migtissera/Tess-3-Llama-3.1-70B
library_name: transformers
tags:
- mergekit
- merge
---
# about

The base of Hexagon Purple V2, SmarTracks, remains unchanged: a "3 levels" stock merge including DeepSeek Distill R1 (3 flavors), Nemotron, and Tulu capabilities.

Hexagon Purple V2 diverges from V1 as follows:

- Steelskull's Electra R1 replaces Black-Ink-Guild's Pernicious Prophecy, because it's even better. 70Blivion is recovered elsewhere.
- A Priestess stock merge replaces the Hostess one, bringing 70Blivion in and taking the Lumitron merge out, on top of Tess R1 and Llama Creative Writer.
- Dobby, Wayfarer, and Drummer's Fallen Llama R1 (already present in a SmarTracks sub-submerge, and now in Electra R1) go out as standalone models, replaced by a stock merge of these 3, DoppelGanger R1.
- Nbeerbower's Doppel Gutenberg goes in, as a 3.1 instruct (and novel-writing) stabilizer working in tandem with the following model.
- Migel Tissera's Tess 3.0 70B 3.1 goes in also, as a perplexity dropper.

As usual, abliterated and lorablated versions (thanks Huihui-ai, Maxime Labonne, and of course FailSpy) are used systematically when they exist; otherwise, the focus is on very low censorship.

---
# benchs

Benchmarks are traded for creativity in this merge too, but we progress neatly compared to V1:

- PPL Wikitext Eng 512: 3.43 (good)
- ARC-C: 60.55 (good)
- ARC-E: 81.05 (good also)

---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Nexesenex/Llama_3.x_70b_SmarTracks_V1.01](https://huggingface.co/Nexesenex/Llama_3.x_70b_SmarTracks_V1.01) as a base.

### Models Merged

The following models were included in the merge:
* [Steelskull/L3.3-Electra-R1-70b](https://huggingface.co/Steelskull/L3.3-Electra-R1-70b)
* [NexesMess/Llama_3.3_70b_DoppelGanger_R1](https://huggingface.co/NexesMess/Llama_3.3_70b_DoppelGanger_R1)
* [nbeerbower/Llama3.1-Gutenberg-Doppel-70B](https://huggingface.co/nbeerbower/Llama3.1-Gutenberg-Doppel-70B)
* [NexesMess/Llama_3.1_70b_Priestess_V1](https://huggingface.co/NexesMess/Llama_3.1_70b_Priestess_V1)
* [migtissera/Tess-3-Llama-3.1-70B](https://huggingface.co/migtissera/Tess-3-Llama-3.1-70B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: model_stock
models:
  - model: migtissera/Tess-3-Llama-3.1-70B
    parameters:
      weight: 1.0
  - model: nbeerbower/Llama3.1-Gutenberg-Doppel-70B
    parameters:
      weight: 1.0
  - model: NexesMess/Llama_3.1_70b_Priestess_V1
    parameters:
      weight: 1.0
  - model: Steelskull/L3.3-Electra-R1-70b
    parameters:
      weight: 1.0
  - model: NexesMess/Llama_3.3_70b_DoppelGanger_R1
    parameters:
      weight: 1.0
base_model: Nexesenex/Llama_3.x_70b_SmarTracks_V1.01
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
chat_template: auto
tokenizer:
  source: union
```
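
For readers curious about what `model_stock` actually does to the weights, here is a minimal, hedged sketch of the per-tensor rule described in the Model Stock paper (arXiv:2403.19522): the deltas of the merged models relative to the base are averaged, then pulled back toward the base by a factor derived from how well-aligned those deltas are. mergekit's implementation follows the same idea, but exact details (and options like `int8_mask` or `normalize`) may differ; the function name and interface below are illustrative, not mergekit's API.

```python
import torch


def model_stock_tensor(base: torch.Tensor, finetuned: list[torch.Tensor]) -> torch.Tensor:
    """Hedged sketch of the Model Stock rule for a single weight tensor."""
    deltas = [(w - base).flatten() for w in finetuned]
    k = len(deltas)

    # Average pairwise cosine similarity between the "task vectors" (deltas from the base).
    cos_sum, pairs = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            cos_sum += torch.nn.functional.cosine_similarity(deltas[i], deltas[j], dim=0).item()
            pairs += 1
    cos_theta = cos_sum / max(pairs, 1)

    # Interpolation factor from the paper: t = k*cos(theta) / ((k-1)*cos(theta) + 1).
    # The more the deltas agree, the further the merge moves away from the base.
    t = k * cos_theta / ((k - 1) * cos_theta + 1)

    avg = torch.stack(finetuned).mean(dim=0)
    return t * avg + (1 - t) * base
```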
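
To reproduce the merge from the YAML above, mergekit can be driven either through its `mergekit-yaml` CLI or through its Python API. The sketch below follows the usage shown in mergekit's README at the time of writing; the paths are placeholders, and the exact `MergeOptions` fields may vary between mergekit versions.

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "hexagon_purple_v2.yml"                # the YAML configuration above, saved to disk
OUTPUT_PATH = "./Llama_3.x_70b_Hexagon_Purple_V2"   # placeholder output directory

# Parse the YAML into mergekit's configuration object.
with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the model_stock merge; expect heavy disk and RAM usage for six 70B checkpoints.
run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,   # the config uses tokenizer source: union
        lazy_unpickle=True,
        low_cpu_memory=False,
    ),
)
```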
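
Loading the merged model for inference is standard transformers usage; the repo id below is a placeholder for the published Hexagon Purple V2 repository, and bfloat16 plus `device_map="auto"` are the natural choices for a bf16 70B merge.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Nexesenex/Llama_3.x_70b_Hexagon_Purple_V2"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # the merge is emitted in bfloat16
    device_map="auto",            # shard across available GPUs / offload to CPU
)

messages = [{"role": "user", "content": "Write the opening paragraph of a gothic short story."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```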
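
For context on the benchs section: the perplexity figure is wikitext perplexity at a 512-token context, typically measured with llama.cpp's `perplexity` tool on a quantized GGUF. A rough transformers-based equivalent (not the same harness, so absolute numbers will differ) looks like this sketch, which scores non-overlapping 512-token windows of wikitext-2.

```python
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Nexesenex/Llama_3.x_70b_Hexagon_Purple_V2"  # placeholder repo id
CTX = 512  # evaluation context length, matching "PPL Wikitext Eng 512"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

# Concatenate the wikitext-2 test split and tokenize it as one long stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids[0]

# Accumulate token-level negative log-likelihood over fixed 512-token windows.
nll_sum, token_count = 0.0, 0
with torch.no_grad():
    for start in range(0, ids.size(0) - CTX, CTX):
        chunk = ids[start:start + CTX].unsqueeze(0).to(model.device)
        loss = model(chunk, labels=chunk).loss  # mean NLL over CTX - 1 predicted tokens
        nll_sum += loss.item() * (CTX - 1)
        token_count += CTX - 1

print(f"Perplexity @ {CTX}: {math.exp(nll_sum / token_count):.2f}")
```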