Mega-Destroyer-8x7B

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the DARE TIES merge method, with mistralai/Mixtral-8x7B-v0.1 as the base model.
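For intuition: with dare_ties, each fine-tune is reduced to a delta from the base model, DARE randomly drops a fraction (1 - density) of each delta and rescales the survivors, and TIES elects a majority sign per parameter and combines only the deltas that agree with it. Here is a toy pure-Python sketch of that idea (an illustration only, not mergekit's actual implementation):

```python
import random

def dare(delta, density, rng):
    """Drop And REscale: zero each entry with probability (1 - density),
    rescale survivors by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]

def dare_ties(base, deltas, weights, densities, seed=0):
    """Toy per-parameter dare_ties merge over flat lists of floats."""
    rng = random.Random(seed)
    sparse = [dare(d, p, rng) for d, p in zip(deltas, densities)]
    merged = list(base)
    for i in range(len(base)):
        # TIES sign election: sign of the weighted sum of deltas
        total = sum(w * s[i] for w, s in zip(weights, sparse))
        sign = 1.0 if total >= 0 else -1.0
        # keep only the deltas that agree with the elected sign
        keep = [(w, s[i]) for w, s in zip(weights, sparse) if s[i] * sign > 0]
        if keep:
            wsum = sum(w for w, _ in keep)  # roughly what normalize: true does
            merged[i] += sum(w * d for w, d in keep) / wsum
    return merged
```

With density 1.0 nothing is dropped, so two models pushing a parameter up and one pushing it down resolves to the agreeing pair's average.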

Models Merged

The following models were included in the merge:

- mistralai/Mixtral-8x7B-Instruct-v0.1
- rombodawg/Open_Gpt4_8x7B_v0.2
- mistralai/Mixtral-8x7B-v0.1+LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA
- Sao10K/Typhon-Mixtral-v1
- mistralai/Mixtral-8x7B-v0.1+Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      density: 0.6
      weight: 1.0
  - model: rombodawg/Open_Gpt4_8x7B_v0.2
    parameters:
      density: 0.5
      weight: 0.8
  - model: mistralai/Mixtral-8x7B-v0.1+LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA
    parameters:
      density: 0.5
      weight: 0.6        
  - model: Sao10K/Typhon-Mixtral-v1
    parameters:
      density: 0.5
      weight: 0.7
  - model: mistralai/Mixtral-8x7B-v0.1+Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora
    parameters:
      density: 0.5
      weight: 0.4      
merge_method: dare_ties
base_model: mistralai/Mixtral-8x7B-v0.1
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Mega-Destroyer-8x7B
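A quick note on the weights: with normalize: true, mergekit rescales the per-model weights so they sum to 1 before the deltas are combined. The arithmetic for this config looks roughly like this (a sketch of the idea, not mergekit code; in practice the normalization happens per parameter among the models that actually contribute):

```python
# Weights as listed in the YAML config above
weights = {
    "mistralai/Mixtral-8x7B-Instruct-v0.1": 1.0,
    "rombodawg/Open_Gpt4_8x7B_v0.2": 0.8,
    "Air-Striker-Mixtral-8x7B-ZLoss-LoRA": 0.6,
    "Sao10K/Typhon-Mixtral-v1": 0.7,
    "limarp-zloss-mixtral-8x7b-qlora": 0.4,
}
total = sum(weights.values())  # 3.5
# normalize: true -> effective weights sum to 1
normalized = {name: w / total for name, w in weights.items()}
```

So Mixtral Instruct ends up with roughly 2/7 of the total influence, and LimaRP with the smallest share, about 11%.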

High quality GGUF quants available here: https://huggingface.co/Artefact2/Mega-Destroyer-8x7B-GGUF (Thank you, Artefact2, for quantizing it using an imatrix!)

Hello everyone, this is Dampf. You might know me as the creator of Mythical-Destroyer-13B.

This time, I collaborated with MrDragonFox aka FoxEngineAi, harnessing his powerful rig to deliver a merge of multiple high-quality Mixtral 8x7B models. My goal was to beat Bagel-Mistery-Tour V2 by Ycros and create the best Mixtral model to date. Did I succeed? Please try it out and decide for yourself!

Aside from the obvious Mixtral Instruct, included to keep its intelligence, I've merged Rombo's excellent Open_Gpt4_v0.2 model, which consists of Jon Durbin's Bagel-DPO-8x7B and another highly regarded model, smelborp/MixtralOrochi8x7B. Since it combines many different datasets, it should be a good fit for every task you throw at it; it acts as the reasoning part of the merge. On the creative side, we have Air-Striker and LimaRP, which allow for great roleplay in different styles and also greatly enhance the model's writing capabilities.

And finally, I've merged Sao10K/Typhon-Mixtral-v1 to boost the story writing capabilities even further. It includes KoboldAI's latest Holodeck model, as well as a couple of Sao's latest models, combined into one package. My hope is that this will capture the magic of Sao10K/Fimbulvetr-11B-v2, just at the intelligence level of a Mixtral model. Typhon also includes Nous Hermes 2 DPO, a high-quality instruct model that boosts intelligence and acts as a balancer to all the creative ingredients in the merge.

What we have here is a model that should be fantastic at instruct and roleplay/creative tasks alike. So basically a general-purpose model. Perhaps the pinnacle of rock-smashing? Idk xD I just know it includes nearly every dataset under the sun. As a result, it will likely work with every prompt format as well. So feel free to use Alpaca, Vicuna, ChatML, Llama 2 Chat or whatever your heart desires.

A huge thank you to the creators of the fantastic datasets and fine-tunes in the respective merges, namely Jon Durbin, Teknium, Sao10K, MistralAI, LoneStriker, NeverSleep, Suikamelon, Doctor-Shotgun, KoboldAI and more. All credit goes to them. A thank you to the creators of the different merges I've merged (Mergeception!) as well! And of course a thank you to MrDragonFox for lending his compute! Please enjoy :D
