AlekseiPravdin committed
Commit ca99f2e
1 Parent(s): b2bed68

Upload folder using huggingface_hub

Files changed (1):
  1. README.md +66 -1
README.md CHANGED
@@ -33,4 +33,69 @@ parameters:
       value: [1, 0.5, 0.7, 0.3, 0]
     - value: 0.5
 dtype: float16
-```
+ ```
+
+ ---
+ license: apache-2.0
+ tags:
+ - merge
+ - mergekit
+ - lazymergekit
+ - Nitral-AI/KukulStanta-7B
+ - AlekseiPravdin/Seamaiiza-7B-v1
+ ---
+
+ # KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge
+
+ KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge is a language model created by merging two models: [Nitral-AI/KukulStanta-7B](https://huggingface.co/Nitral-AI/KukulStanta-7B) and [AlekseiPravdin/Seamaiiza-7B-v1](https://huggingface.co/AlekseiPravdin/Seamaiiza-7B-v1). The merge was performed with [mergekit](https://github.com/cg123/mergekit), a toolkit for merging pretrained language models.
+
+ ## 🧩 Merge Configuration
+
+ The models were merged using the Spherical Linear Interpolation (SLERP) method, which interpolates smoothly between the two models' weights across all layers. The base model for the merge is [Nitral-AI/KukulStanta-7B](https://huggingface.co/Nitral-AI/KukulStanta-7B), with separate interpolation schedules for the self-attention and MLP sub-layers.
+
+ **Configuration:**
+
+ ```yaml
+ slices:
+   - sources:
+       - model: Nitral-AI/KukulStanta-7B
+         layer_range: [0, 31]
+       - model: AlekseiPravdin/Seamaiiza-7B-v1
+         layer_range: [0, 31]
+ merge_method: slerp
+ base_model: Nitral-AI/KukulStanta-7B
+ parameters:
+   t:
+     - filter: self_attn
+       value: [0, 0.5, 0.3, 0.7, 1]
+     - filter: mlp
+       value: [1, 0.5, 0.7, 0.3, 0]
+     - value: 0.5
+ dtype: float16
+ ```
+
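SLERP interpolates between weight tensors along an arc of the hypersphere rather than a straight line; the `t` values above steer each layer group between the two parents (`t = 0` keeps the base model's weights, `t = 1` takes the other model's). A minimal NumPy sketch of the interpolation idea — illustrative only, not mergekit's actual implementation:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the
    great-circle arc between the normalized tensors.
    """
    # Angle between the two tensors, from their normalized dot product.
    dot = np.dot(v0.ravel() / np.linalg.norm(v0),
                 v1.ravel() / np.linalg.norm(v1))
    omega = np.arccos(np.clip(dot, -1.0, 1.0))
    if abs(np.sin(omega)) < eps:
        # Nearly colinear tensors: fall back to linear interpolation.
        return (1.0 - t) * v0 + t * v1
    return (np.sin((1.0 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)  # halfway along the 90-degree arc between a and b
```

mergekit applies this kind of interpolation per weight tensor, with the per-filter `t` schedules in the config choosing different blend factors for attention and MLP weights at different depths.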
+ ## Model Features
+
+ This fusion model combines the generative capabilities of [Nitral-AI/KukulStanta-7B](https://huggingface.co/Nitral-AI/KukulStanta-7B) with the refined tuning of [AlekseiPravdin/Seamaiiza-7B-v1](https://huggingface.co/AlekseiPravdin/Seamaiiza-7B-v1), yielding a model suited to a variety of text generation tasks, with enhanced context understanding and more nuanced generation drawn from both parents.
+
+ ## Evaluation Results
+
+ ### KukulStanta-7B
+ The evaluation results for [Nitral-AI/KukulStanta-7B](https://huggingface.co/Nitral-AI/KukulStanta-7B) are as follows:
+
+ | Metric                            | Value |
+ |-----------------------------------|-------|
+ | Avg.                              | 70.95 |
+ | AI2 Reasoning Challenge (25-shot) | 68.43 |
+ | HellaSwag (10-shot)               | 86.37 |
+ | MMLU (5-shot)                     | 65.00 |
+ | TruthfulQA (0-shot)               | 62.19 |
+ | Winogrande (5-shot)               | 80.03 |
+ | GSM8k (5-shot)                    | 63.68 |
+
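The Avg. row is consistent with the unweighted mean of the six benchmark scores, which can be checked directly:

```python
# Benchmark scores for Nitral-AI/KukulStanta-7B from the table above.
scores = {
    "AI2 Reasoning Challenge (25-shot)": 68.43,
    "HellaSwag (10-shot)": 86.37,
    "MMLU (5-shot)": 65.00,
    "TruthfulQA (0-shot)": 62.19,
    "Winogrande (5-shot)": 80.03,
    "GSM8k (5-shot)": 63.68,
}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 70.95
```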
+ ### Seamaiiza-7B-v1
+ Evaluation results for [AlekseiPravdin/Seamaiiza-7B-v1](https://huggingface.co/AlekseiPravdin/Seamaiiza-7B-v1) are not yet available; the model nonetheless shapes the behavior and capabilities of the merge.
+
+ ## Limitations
+
+ While KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge inherits the strengths of both parent models, it may also carry over their limitations and biases. Users should be aware of potential biases in the training data, which may affect the model's outputs in some contexts.