---
base_model:
- nvidia/Mistral-NeMo-Minitron-8B-Base
library_name: transformers
tags:
- mergekit
- merge
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
---
# merged

## Use v1!

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:
* [nvidia/Mistral-NeMo-Minitron-8B-Base](https://huggingface.co/nvidia/Mistral-NeMo-Minitron-8B-Base)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 8]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
- sources:
  - layer_range: [8, 16]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
    parameters:
      scale:
      - filter: o_proj
        value: 0.5
      - filter: down_proj
        value: 0.5
      - filter: q_proj
        value: 0.85355339059
      - filter: k_proj
        value: 0.85355339059
      - value: 1.0
- sources:
  - layer_range: [8, 16]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
    parameters:
      scale:
      - filter: q_proj
        value: 0.85355339059
      - filter: k_proj
        value: 0.85355339059
      - value: 1.0
- sources:
  - layer_range: [16, 17]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
- sources:
  - layer_range: [17, 24]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
    parameters:
      scale:
      - filter: o_proj
        value: 0.5
      - filter: down_proj
        value: 0.5
      - filter: q_proj
        value: 0.85355339059
      - filter: k_proj
        value: 0.85355339059
      - value: 1.0
- sources:
  - layer_range: [17, 24]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
    parameters:
      scale:
      - filter: q_proj
        value: 0.85355339059
      - filter: k_proj
        value: 0.85355339059
      - value: 1.0
- sources:
  - layer_range: [24, 25]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
- sources:
  - layer_range: [25, 32]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
    parameters:
      scale:
      - filter: o_proj
        value: 0.5
      - filter: down_proj
        value: 0.5
      - filter: q_proj
        value: 0.85355339059
      - filter: k_proj
        value: 0.85355339059
      - value: 1.0
- sources:
  - layer_range: [25, 32]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
    parameters:
      scale:
      - filter: q_proj
        value: 0.85355339059
      - filter: k_proj
        value: 0.85355339059
      - value: 1.0
- sources:
  - layer_range: [32, 40]
    model: nvidia/Mistral-NeMo-Minitron-8B-Base
```
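
As a rough sanity check on the configuration above, the following sketch tallies how many decoder layers the stacked slices produce and what the combined `q_proj`/`k_proj` scaling does to the attention logits. It is an illustration only, not part of the merge itself: the names `SLICES`, `QK_SCALE`, `total_layers`, and `attention_logit_factor` are hypothetical, each `layer_range` is assumed to be half-open (`[start, end)`), and the logit calculation assumes the projections have no bias terms.

```python
# Layer ranges copied, in order, from the `slices` list in the configuration.
# Assumption: each layer_range is half-open, i.e. [start, end).
SLICES = [
    (0, 8),
    (8, 16), (8, 16),    # duplicated block
    (16, 17),
    (17, 24), (17, 24),  # duplicated block
    (24, 25),
    (25, 32), (25, 32),  # duplicated block
    (32, 40),
]

# Scale applied to both q_proj and k_proj in the duplicated slices.
QK_SCALE = 0.85355339059


def total_layers(slices):
    """Number of decoder layers in the stacked (passthrough) model."""
    return sum(end - start for start, end in slices)


def attention_logit_factor(qk_scale):
    """Factor applied to the pre-softmax attention logits.

    The logits are bilinear in q and k, so scaling both projections by s
    scales the logits by s * s (assuming bias-free projections).
    """
    return qk_scale * qk_scale


if __name__ == "__main__":
    print(total_layers(SLICES))                        # 62
    print(round(attention_logit_factor(QK_SCALE), 4))  # 0.7286
```

Under those assumptions the merged model has 62 layers, compared with the 40 layers covered by the base model's ranges, and the attention logits in the duplicated layers are scaled by roughly 0.7286. The 0.5 scale on `o_proj` and `down_proj` in the first copy of each duplicated block halves what those layers write into the residual stream.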