---
license: mit
pipeline_tag: text-generation
tags:
- merge
- mergekit
- mistral
- moe
- conversational
- chicka
---

### Model Description

This model is a Mixture-of-Experts (MoE) merge of three Mistral-based models:

- Base model / conversational expert: **openchat/openchat-3.5-0106**
- Code expert: **beowolx/CodeNinja-1.0-OpenChat-7B**
- Math expert: **meta-math/MetaMath-Mistral-7B**

This is the Mergekit config used for the merge:
```yaml
base_model: openchat/openchat-3.5-0106
experts:
  - source_model: openchat/openchat-3.5-0106
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
    - "I want"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
    - "C#"
    - "C++"
    - "debug"
    - "runtime"
    - "html"
    - "command"
    - "nodejs"
  - source_model: meta-math/MetaMath-Mistral-7B
    positive_prompts:
    - "reason"
    - "math"
    - "mathematics"
    - "solve"
    - "count"
    - "calculate"
    - "arithmetic"
    - "algebra"
```
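A config like this is consumed by mergekit's MoE tooling (the `mergekit-moe` command), which stacks the experts' MLP weights into a Mixtral-style checkpoint and uses the `positive_prompts` to initialize the router gates. As a quick sanity check on the published checkpoint, you can inspect the resulting architecture; the attribute names below assume the standard Hugging Face `MixtralConfig`:

```python
from transformers import AutoConfig

# Inspect the merged checkpoint's architecture without downloading weights.
# Attribute names assume the standard Hugging Face MixtralConfig.
config = AutoConfig.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")

print(config.model_type)           # expected: "mixtral"
print(config.num_local_experts)    # expected: 3, one per source model
print(config.num_experts_per_tok)  # how many experts are routed per token
```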

### Open LLM Leaderboard Results


| **Benchmark**    | **Chicka-Mixtral-3X7B**  | **Mistral-7B-Instruct-v0.2** | **Meta-Llama-3-8B** |
|--------------|----------------------|--------------------------|-----------------|
| **Average**      | **69.19**                |  60.97                   | 62.55           |
| **ARC**          | **64.08**                |  59.98                   | 59.47           |
| **Hellaswag**    | **83.96**                |  83.31                   | 82.09           |
| **MMLU**         | 64.87                |  64.16                   | **66.67**           |
| **TruthfulQA**   | **50.51**                |  42.15                   | 43.95           |
| **Winogrande**   | **81.06**                |  78.37                   | 77.35           |
| **GSM8K**        | **70.66**                |  37.83                   | 45.79           |

### Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Build the prompt with the model's chat template and append the assistant
# prefix so generation continues as the assistant's turn.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

model.to(device)
model_inputs = input_ids.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
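
With three 7B experts sharing attention weights, the merged model has roughly 18–19B parameters, so loading it in full precision can exceed a single consumer GPU. A common workaround (standard `transformers` usage, not specific to this model) is to load the weights in half precision and let `accelerate` handle device placement; this sketch assumes `accelerate` is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Half-precision weights plus automatic device placement via accelerate.
model = AutoModelForCausalLM.from_pretrained(
    "Chickaboo/Chicka-Mixtral-3x7b",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a number is prime."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```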