---
base_model:
- CultriX/Qwen2.5-14B-Wernickev3
- djuna/Q2.5-Veltha-14B-0.5
- CultriX/Qwenfinity-2.5-14B
- sometimesanotion/Lamarck-14B-v0.6
- CultriX/Qwen2.5-14B-Emerged
- CultriX/Qwen2.5-14B-Broca
- qingy2024/Fusion4-14B-Instruct
- CultriX/SeQwence-14B-EvolMerge
- allknowingroger/QwenSlerp5-14B
library_name: transformers
tags:
- mergekit
- merge
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the della_linear merge method, with [CultriX/Qwen2.5-14B-Wernickev3](https://huggingface.co/CultriX/Qwen2.5-14B-Wernickev3) as the base model.

### Models Merged

The following models were included in the merge:
* [djuna/Q2.5-Veltha-14B-0.5](https://huggingface.co/djuna/Q2.5-Veltha-14B-0.5)
* [CultriX/Qwenfinity-2.5-14B](https://huggingface.co/CultriX/Qwenfinity-2.5-14B)
* [sometimesanotion/Lamarck-14B-v0.6](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.6)
* [CultriX/Qwen2.5-14B-Emerged](https://huggingface.co/CultriX/Qwen2.5-14B-Emerged)
* [CultriX/Qwen2.5-14B-Broca](https://huggingface.co/CultriX/Qwen2.5-14B-Broca)
* [qingy2024/Fusion4-14B-Instruct](https://huggingface.co/qingy2024/Fusion4-14B-Instruct)
* [CultriX/SeQwence-14B-EvolMerge](https://huggingface.co/CultriX/SeQwence-14B-EvolMerge)
* [allknowingroger/QwenSlerp5-14B](https://huggingface.co/allknowingroger/QwenSlerp5-14B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: della_linear
base_model: CultriX/Qwen2.5-14B-Wernickev3
dtype: bfloat16
out_dtype: bfloat16
parameters:
  epsilon: 0.009 # Further reduced for ultra-fine parameter scaling.
  lambda: 1.6 # Increased to emphasize significant model contributions.
  normalize: true # Balances the parameter integration for stability.
  rescale: true # Enabled to align parameter scales across models.
  int8_mask: false # Disabled to allow full-precision computations for enhanced accuracy.
  density: 0.90 # Balanced density for optimal generalization and performance.

adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.6 # Prioritizes logical reasoning improvements.
    tinyHellaswag: 1.5 # Strengthened contextual understanding and consistency.
    tinyMMLU: 1.8 # Enhanced domain knowledge for multitask benchmarks.
    tinyTruthfulQA: 1.9 # Maximized for accurate factual reasoning and QA.
    tinyTruthfulQA_mc1: 1.75 # Increased focus for multiple-choice reasoning.
    tinyWinogrande: 1.75 # Advanced reasoning and contextual prediction improvement.
    IFEval: 2.15 # Enhanced instruction-following tasks boosted by multitask contributors.
    BBH: 1.95 # Further improved for complex reasoning tasks.
    MATH: 2.45 # Highest priority, focusing on mathematical excellence.
    GPQA: 2.1 # Boosted graduate-level QA capabilities.
    MUSR: 1.9 # Nuanced multi-step reasoning strengthened further.
    MMLU-PRO: 1.9 # Maximized domain multitask performance.
  smoothing_factor: 0.035 # Further reduced for precise task-specific blending.

gradient_clipping:
  CultriX/Qwen2.5-14B-Wernickev3: 0.88 # Increased for enhanced stability.
  CultriX/Qwenfinity-2.5-14B: 0.85 # Adjusted for consistent multitask integration.
  djuna/Q2.5-Veltha-14B-0.5: 0.91 # Maintained advanced reasoning contributions.
  CultriX/SeQwence-14B-EvolMerge: 0.88 # Generalist multitask support remains stable.
  qingy2024/Fusion4-14B-Instruct: 0.93 # Mathematically focused tasks maximized.
  CultriX/Qwen2.5-14B-Emerged: 0.88 # Increased for logical reasoning enhancements.
  sometimesanotion/Lamarck-14B-v0.6: 0.89 # Balanced multi-step reasoning contributions.
  allknowingroger/QwenSlerp5-14B: 0.87 # Contextual and logical reasoning integration refined.

models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.30 # Core backbone for multitask reasoning.
      density: 0.75 # Further increased to preserve critical reasoning parameters.
  - model: CultriX/Qwenfinity-2.5-14B
    parameters:
      weight: 0.25 # Comprehensive multitask performer.
      density: 0.65
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters:
      weight: 0.25 # Advanced reasoning support for GPQA and MUSR.
      density: 0.74
  - model: CultriX/SeQwence-14B-EvolMerge
    parameters:
      weight: 0.20 # Enhanced contributions to BBH and MUSR.
      density: 0.55
  - model: qingy2024/Fusion4-14B-Instruct
    parameters:
      weight: 0.19 # Mathematical reasoning priority.
      density: 0.77
  - model: CultriX/Qwen2.5-14B-Emerged
    parameters:
      weight: 0.21 # Maintains overall task performance with balanced strengths.
      density: 0.72 # Increased for better integration.
  - model: CultriX/Qwen2.5-14B-Broca
    parameters:
      weight: 0.16 # Logical reasoning and factual QA enhancements.
      density: 0.68 # Increased to better support Broca's specialized tasks.
  - model: sometimesanotion/Lamarck-14B-v0.6
    parameters:
      weight: 0.15 # Multi-step reasoning tasks contributor.
      density: 0.63 # Slight increase for better integration.
  - model: allknowingroger/QwenSlerp5-14B
    parameters:
      weight: 0.16 # Contextual reasoning improvements.
      density: 0.64 # Increased for enhanced performance.
```
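
Once the merged weights are published to the Hugging Face Hub, they can be loaded like any other Qwen2.5-based `transformers` causal LM. The snippet below is a minimal usage sketch only; the repository id is a placeholder and should be replaced with the actual location of this merge.

```python
# Minimal usage sketch. "CultriX/Qwen2.5-14B-merge" is a placeholder repository id,
# not the confirmed name of this model's Hub repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "CultriX/Qwen2.5-14B-merge"  # placeholder: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the dtype/out_dtype used in the merge config
    device_map="auto",           # requires the accelerate package
)

# Qwen2.5 instruction-tuned models expect chat-formatted prompts.
messages = [{"role": "user", "content": "Explain the della_linear merge method in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```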