BenevolenceMessiah/Reasoning-SCE-Coder-v1.0

---
base_model:
- FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview
- Rombo-Org/Rombo-LLM-V3.1-QWQ-32b
- FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview
- FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview
library_name: transformers
tags:
- mergekit
- merge
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
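
Since the card metadata lists `transformers` as the library, the merged checkpoint can presumably be loaded like any other causal language model. The following is a minimal sketch, not a tested recipe: the repository id is assumed from the page header, `load_merged_model` is a hypothetical helper name, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
# Assumption: repository id taken from the page header, not stated in the README body.
REPO_ID = "BenevolenceMessiah/Reasoning-SCE-Coder-v1.0"


def load_merged_model(repo_id: str = REPO_ID):
    """Load the tokenizer and model (hypothetical helper, downloads a 32B checkpoint)."""
    # Imports are kept local so that defining the helper does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches `dtype: bfloat16` in the merge config
        device_map="auto",           # assumption: `accelerate` is available
    )
    return tokenizer, model
```

Calling `load_merged_model()` fetches the full set of weights, so it is only practical on hardware with enough memory for a 32B-parameter model in bfloat16.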

## Merge Details
### Merge Method

This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method, with [Rombo-Org/Rombo-LLM-V3.1-QWQ-32b](https://huggingface.co/Rombo-Org/Rombo-LLM-V3.1-QWQ-32b) as the base model.
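
To give a rough intuition for what an SCE-style merge does, here is a simplified sketch of its three steps (select, calculate, erase) on flat lists of parameters. It is an illustration under stated assumptions, not mergekit's implementation: the function name `sce_merge`, the flat-list representation, and the use of summed squares as the per-model weight are all choices made for this example.

```python
from statistics import pvariance


def sce_merge(base, models, select_topk=0.1):
    """Merge flat parameter lists with a simplified select/calculate/erase pass."""
    n = len(base)
    # Task vectors: how far each fine-tuned model moved away from the base.
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    # Select: keep only the positions whose deltas vary most across models.
    variances = [pvariance([d[i] for d in deltas]) for i in range(n)]
    keep = max(1, int(n * select_topk))
    ranked = sorted(range(n), key=lambda i: variances[i], reverse=True)
    selected = set(ranked[:keep])
    deltas = [[d[i] if i in selected else 0.0 for i in range(n)] for d in deltas]

    # Calculate: weight each model by the energy (sum of squares) of its kept deltas.
    energy = [sum(x * x for x in d) for d in deltas]
    total = sum(energy)
    if total > 0:
        weights = [e / total for e in energy]
    else:
        weights = [1.0 / len(models)] * len(models)

    # Erase: at each kept position, drop deltas that disagree with the majority sign,
    # then take the weighted average of the remaining deltas.
    merged = list(base)
    for i in selected:
        sign = 1.0 if sum(d[i] for d in deltas) >= 0 else -1.0
        agreeing = [(w, d[i]) for w, d in zip(weights, deltas) if d[i] * sign > 0]
        norm = sum(w for w, _ in agreeing)
        if norm > 0:
            merged[i] = base[i] + sum(w * x for w, x in agreeing) / norm
    return merged
```

In this sketch, positions that are not selected fall back to the base model's value, which is why the choice of base and of `select_topk` matters for the result.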

### Models Merged

The following models were included in the merge:
* [FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview)
* [FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview)
* [FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview
    parameters:
      weight: 1.2 # Slightly favor
      density: 0.9 # Sparsified a bit to reduce noise
  - model: FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview
    parameters:
      weight: 1
      density: 0.9
  - model: FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview
    parameters:
      weight: 1
      density: 0.9
merge_method: sce # SCE for adaptive weighting
base_model: Rombo-Org/Rombo-LLM-V3.1-QWQ-32b
parameters:
  normalize: true
  int8_mask: true
  select_topk: 0.1 # Retain the top 10% high-variance elements
tokenizer_source: union # Union to combine vocabularies
dtype: bfloat16
```
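
As a small worked example of the numbers in this configuration, the snippet below shows how the per-model `weight` values would be rescaled if they were normalized to sum to 1, which is the usual meaning of `normalize: true`. Whether the SCE method actually consumes these per-model weights is not stated in this card, so treat this as arithmetic on the config values rather than a description of the merge itself; `normalize_weights` is a hypothetical helper.

```python
# Per-model weights copied from the YAML configuration above.
weights = {
    "FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview": 1.2,
    "FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview": 1.0,
    "FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview": 1.0,
}


def normalize_weights(raw):
    """Rescale weights so they sum to 1 (hypothetical helper for illustration)."""
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


normalized = normalize_weights(weights)
for name, share in normalized.items():
    print(f"{name}: {share:.4f}")
```

Under this reading, a weight of 1.2 against two weights of 1 gives the favored model a share of 1.2 / 3.2 = 0.375, versus 0.3125 for each of the other two.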