---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- Nitral-AI/KukulStanta-7B
- AlekseiPravdin/Seamaiiza-7B-v1
---

# KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge

KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge is a merged language model built from [Nitral-AI/KukulStanta-7B](https://huggingface.co/Nitral-AI/KukulStanta-7B) and [AlekseiPravdin/Seamaiiza-7B-v1](https://huggingface.co/AlekseiPravdin/Seamaiiza-7B-v1). The merge was performed with [mergekit](https://github.com/cg123/mergekit), a toolkit for combining the weights of pretrained language models.

## 🧩 Merge Configuration

The models were merged with the Spherical Linear Interpolation (SLERP) method, which blends corresponding weight tensors along an arc on the hypersphere rather than along a straight line, better preserving the scale of the interpolated weights. Nitral-AI/KukulStanta-7B serves as the base model, and the interpolation factor `t` varies by layer and by module type (self-attention vs. MLP), as specified in the configuration below.
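For intuition, SLERP between two weight vectors can be sketched in a few lines of plain Python. This is a toy illustration of the formula, not mergekit's actual implementation:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 and v1 at fraction t in [0, 1]."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the (normalized) vectors, clamped for safety.
    dot = sum((x / norm0) * (y / norm1) for x, y in zip(v0, v1))
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1.0 - eps:
        # Nearly colinear vectors: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(v0, v1)]
    theta = math.acos(dot)          # angle between the two vectors
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * x + w1 * y for x, y in zip(v0, v1)]

mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])  # halfway along the unit arc
```

At `t = 0.5` the result lies on the arc midway between the two inputs and keeps unit norm, which is the property that distinguishes SLERP from a straight average.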

**Configuration:**

```yaml
slices:
  - sources:
      - model: Nitral-AI/KukulStanta-7B
        layer_range: [0, 31]
      - model: AlekseiPravdin/Seamaiiza-7B-v1
        layer_range: [0, 31]
merge_method: slerp
base_model: Nitral-AI/KukulStanta-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: float16
```

## Model Features

This merge combines the generative strengths of Nitral-AI/KukulStanta-7B with the tuning of AlekseiPravdin/Seamaiiza-7B-v1, yielding a general-purpose 7B model for text-generation and conversational tasks. As with any weight merge, improvements over the parent models are not guaranteed and should be verified on the target task.

## Evaluation Results

The benchmark scores below are for the parent model Nitral-AI/KukulStanta-7B; the merged model itself has not been benchmarked here.

### Nitral-AI/KukulStanta-7B

- **AI2 Reasoning Challenge (25-Shot):** 68.43 (normalized accuracy)
- **HellaSwag (10-Shot):** 86.37 (normalized accuracy)
- **MMLU (5-Shot):** 65.00 (accuracy)
- **TruthfulQA (0-shot):** 62.19
- **Winogrande (5-shot):** 80.03 (accuracy)
- **GSM8k (5-shot):** 63.68 (accuracy)

## Limitations

While the merged model benefits from the strengths of both parent models, it may also inherit certain limitations and biases. Users should be aware of potential biases present in the training data of the original models, which could affect the performance and fairness of the merged model in specific applications.