Update README.md

README.md

---
tags:
- lazymergekit
- Locutusque/Hercules-6.1-Llama-3.1-8B
- Sao10K/Llama-3.1-8B-Stheno-v3.4
base_model:
- Locutusque/Hercules-6.1-Llama-3.1-8B
---

# ZeroXClem/Stheno-Hercules-3.1-8B

ZeroXClem/Stheno-Hercules-3.1-8B is a merge of two state-of-the-art Llama-3.1-8B fine-tunes, built with the [mergekit](https://github.com/cg123/mergekit) framework. It blends the two models layer by layer using spherical linear interpolation, bringing together the best of both worlds: **Hercules** and **Stheno**.

## 🚀 Merged Models

This merge incorporates the following models:

- [**Locutusque/Hercules-6.1-Llama-3.1-8B**](https://huggingface.co/Locutusque/Hercules-6.1-Llama-3.1-8B): Serves as the base model for the merge; the interpolation anchors on its weights.
- [**Sao10K/Llama-3.1-8B-Stheno-v3.4**](https://huggingface.co/Sao10K/Llama-3.1-8B-Stheno-v3.4): Complements Hercules, contributing its refined, balanced tuning for added depth and flexibility.

## 🧩 Merge Configuration

The models are merged using **spherical linear interpolation (SLERP)**, which interpolates smoothly between the corresponding weight tensors of the two models, layer by layer. The configuration below shows the relevant parameters:

```yaml
slices:
  # (source model and layer_range entries are not shown in this diff view)
base_model: Locutusque/Hercules-6.1-Llama-3.1-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1] # Controls the blending of self-attention layers
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0] # Adjusts the blending across the MLP layers
    - value: 0.5 # Global merge weight for layers not matched by a filter
dtype: bfloat16 # Compute/storage dtype for the merged weights
```

### Key Parameters

- **Self-Attention Filtering** (`self_attn`): Controls how strongly each model contributes to the self-attention layers, with the interpolation factor varying from one end of the network to the other.
- **MLP Filtering** (`mlp`): Applies the same idea to the MLP (feed-forward) layers, fine-tuning the balance between the two networks.
- **Global Weight** (`t.value`): The interpolation factor for all tensors not matched by a filter, set to 0.5 for an equal contribution from both models.
- **Data Type** (`dtype`): `bfloat16` keeps the merge computationally efficient while preserving sufficient precision.
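
For intuition, the sketch below shows the interpolation that each `t` value drives. This is a simplified, illustrative version of SLERP, not mergekit's internal code; to actually run the merge, a config like the one above is passed to mergekit's `mergekit-yaml` command.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two (flattened) weight tensors."""
    # Normalize copies to measure the angle between the two weight directions.
    v0_u = v0 / (np.linalg.norm(v0) + eps)
    v1_u = v1 / (np.linalg.norm(v1) + eps)
    omega = np.arccos(np.clip(np.dot(v0_u, v1_u), -1.0, 1.0))
    if np.sin(omega) < eps:  # nearly parallel: fall back to plain linear interpolation
        return (1.0 - t) * v0 + t * v1
    # Interpolate along the arc between the tensors rather than the straight line.
    return (np.sin((1.0 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)

# t = 0 keeps the first tensor, t = 1 the second, and t = 0.5 blends them
# equally, matching the global `- value: 0.5` entry in the config above.
a, b = np.random.randn(64), np.random.randn(64)
merged = slerp(0.5, a, b)
```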

## 🎯 Use Case & Applications

**ZeroXClem/Stheno-Hercules-3.1-8B** is built to weave together the **art of roleplay** and the **precision of science**. With the raw power of Hercules fueling generation and Stheno's balance shaping every interaction, the model is aimed at:

- **Immersive storytelling and dynamic roleplaying**: Craft rich, believable characters and worlds with depth, emotional nuance, and narrative flow.
- **Scientific exploration and reasoning**: Complex problem-solving, hypothesis testing, and research-oriented question answering.
- **Blending creativity and logic**: A fusion of heart and intellect for anything from playful creativity to rigorous analytical work.
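
## 💻 Usage

A minimal inference sketch using 🤗 Transformers, assuming the merged model is published on the Hugging Face Hub under this repo id (`device_map="auto"` additionally requires the `accelerate` package):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroXClem/Stheno-Hercules-3.1-8B"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # matches the dtype used for the merge
    device_map="auto",
)

prompt = "Write the opening scene of a mystery set aboard a deep-sea research vessel."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```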

## 📜 License

This model is open-sourced under the **Apache-2.0 License**.

## 💡 Tags

- `merge`
- `mergekit`
- `lazymergekit`
- `Locutusque/Hercules-6.1-Llama-3.1-8B`
- `Sao10K/Llama-3.1-8B-Stheno-v3.4`