AlekseiPravdin committed
Commit: 98000bc
Parent(s): ca99f2e
Upload folder using huggingface_hub

README.md CHANGED
@@ -51,7 +51,7 @@ KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge is an advanced language model created

## 🧩 Merge Configuration

-The models were merged using the Spherical Linear Interpolation (SLERP) method, which ensures smooth interpolation between the two models across all layers. The base model chosen for this process was
+The models were merged using the Spherical Linear Interpolation (SLERP) method, which ensures smooth interpolation between the two models across all layers. The base model chosen for this process was Nitral-AI/KukulStanta-7B, with parameters and configurations meticulously adjusted to harness the strengths of both source models.

**Configuration:**
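For intuition about the SLERP step referenced above: instead of averaging weights along a straight line, SLERP follows the great-circle arc between the two weight vectors, which preserves their norm geometry. A minimal NumPy sketch of the idea (illustrative only; mergekit's actual implementation works tensor-by-tensor with per-layer interpolation factors and additional edge-case handling):

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two weight tensors at fraction t in [0, 1]."""
    a, b = v0.ravel(), v1.ravel()
    # Angle between the two flattened weight vectors.
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if theta < eps:
        # Nearly parallel vectors: fall back to ordinary linear interpolation.
        return (1.0 - t) * v0 + t * v1
    s = np.sin(theta)
    out = (np.sin((1.0 - t) * theta) / s) * a + (np.sin(t * theta) / s) * b
    return out.reshape(v0.shape)
```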
@@ -76,26 +76,19 @@ dtype: float16

## Model Features

-This fusion model combines the robust generative capabilities of
+This fusion model combines the robust generative capabilities of Nitral-AI/KukulStanta-7B with the refined tuning of AlekseiPravdin/Seamaiiza-7B-v1, creating a versatile model suitable for a variety of text generation tasks. Leveraging the strengths of both parent models, KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge provides enhanced context understanding, nuanced text generation, and improved performance across diverse NLP tasks.

## Evaluation Results

-### KukulStanta-7B
-The evaluation results for [Nitral-AI/KukulStanta-7B](https://huggingface.co/Nitral-AI/KukulStanta-7B) are as follows:
-
-| Metric | Value |
-| --- | --- |
-| Average | 70.95 |
-| ARC (25-shot) | 68.43 |
-| HellaSwag (10-shot) | 86.37 |
-| MMLU (5-shot) | 65.00 |
-| TruthfulQA (0-shot) | 62.19 |
-| Winogrande (5-shot) | 80.03 |
-| GSM8k (5-shot) | 63.68 |
-
-### Seamaiiza-7B-v1
-The evaluation results for [AlekseiPravdin/Seamaiiza-7B-v1](https://huggingface.co/AlekseiPravdin/Seamaiiza-7B-v1) are not provided in the input, but it is important to note that this model also contributes to the overall performance and capabilities of the merged model.
+### Nitral-AI/KukulStanta-7B
+
+- **AI2 Reasoning Challenge (25-Shot):** 68.43% normalized accuracy
+- **HellaSwag (10-Shot):** 86.37% normalized accuracy
+- **MMLU (5-Shot):** 65.00% accuracy
+- **TruthfulQA (0-shot):** 62.19% accuracy
+- **Winogrande (5-shot):** 80.03% accuracy
+- **GSM8k (5-shot):** 63.68% accuracy
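These figures match the standard Open LLM Leaderboard task suite. As a hedged sketch of how one such score is typically reproduced with EleutherAI's lm-evaluation-harness (the task name and few-shot count mirror the Winogrande entry above; the exact API and output keys may vary between harness versions):

```python
# Hypothetical reproduction sketch using EleutherAI's lm-evaluation-harness
# (pip install lm-eval); settings here are assumptions, not the card's method.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Nitral-AI/KukulStanta-7B,dtype=float16",
    tasks=["winogrande"],
    num_fewshot=5,
)
print(results["results"]["winogrande"])  # accuracy expected near 80.03
```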

## Limitations

-While
+While the merged model benefits from the strengths of both parent models, it may also inherit certain limitations and biases. Users should be aware of potential biases present in the training data of the original models, which could affect the performance and fairness of the merged model in specific applications.
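For reference, a minimal usage sketch with the standard transformers API; the repository id below is assumed from the merged model's name and may differ:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, inferred from the model name in this card.
model_id = "AlekseiPravdin/KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # the merge config above uses float16
    device_map="auto",   # requires accelerate; remove to load on CPU
)

inputs = tokenizer("Tell me a short story about a lighthouse.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```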