twosmoothslateslabs
/

Nemesia-Qwen-2.5-7B-v1.0



	---
	base_model:
	- EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1
	- allura-org/Teleut-7b
	- FourOhFour/Vapor_v2_7B
	library_name: transformers
	tags:
	- mergekit
	- merge

	---

	![](https://i.imgur.com/3rVKAcZ.jpeg)

	## EDIT: MAY NOT WORK FOR GGUFs

	I don't know if it's an issue with me or the model, but I can't seem to make quants of this model. I consistently get
	`llama_model_quantize: failed to quantize: tensor 'blk.24.attn_norm.weight' has invalid data`. My whole setup has so
	many stds and idiosyncrasies that it may just be my system, but I tried redoing the whole thing and the same thing happened.
	At this point it may be an issue with the NuSLERP method or one of the models I'm using in the merge. Not sure, gang. I will
	try swapping out a model or two in the merge and trying again to upload as a v2.0.
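
If anyone wants to dig into this, one possible cause (an assumption, not something confirmed here) is that the merged weights contain NaN or Inf values, which would make the problem the merge output rather than the quantizer. Below is a minimal sketch that reads one tensor straight out of a `.safetensors` shard using only the standard library and counts non-finite entries. It only handles F16 and F32 tensors. The helper names `read_header` and `count_non_finite` are made up for this example.

```python
import json
import math
import struct

# Map safetensors dtype codes to (struct format char, bytes per element).
# Only the two dtypes needed for this sketch are covered (no BF16).
_FORMATS = {"F16": ("e", 2), "F32": ("f", 4)}


def read_header(path):
    """Return (header dict, byte offset where tensor data starts)."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian header length,
        # followed by that many bytes of JSON.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return header, 8 + header_len


def count_non_finite(path, tensor_name):
    """Count NaN/Inf entries in a single F16 or F32 tensor."""
    header, data_start = read_header(path)
    info = header[tensor_name]
    fmt, size = _FORMATS[info["dtype"]]
    # data_offsets are relative to the start of the data section.
    begin, end = info["data_offsets"]
    with open(path, "rb") as f:
        f.seek(data_start + begin)
        raw = f.read(end - begin)
    values = struct.unpack("<%d%s" % (len(raw) // size, fmt), raw)
    return sum(1 for v in values if not math.isfinite(v))
```

Usage would look something like `count_non_finite("model-00002-of-00004.safetensors", "model.layers.24.input_layernorm.weight")`. The shard filename is a placeholder, and the tensor name assumes that the GGUF name `blk.24.attn_norm.weight` corresponds to layer 24's input layernorm in the Hugging Face checkpoint. A non-zero count would suggest the merge itself produced bad values.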

	## EDIT (again): DO NOT USE THIS MODEL

	I tried it four more times, swapping out models, swapping base models and models entirely, swapping params,
	`git pull`ing llama.cpp and mergekit, nothing. Every attempt errors out when making quants. I'm declaring this a lost cause.
	I'm leaving this up in case someone gets it working.

	# info

	Merge using the brand new NuSLERP method. Fresh out of the oven. Performance not guaranteed.

	Uses the slightly-unstable EVA and two other finetunes I found. I also set both of the NuSLERP-exclusive mergekit options (`nuslerp_flatten` and `nuslerp_row_wise`) for fun.

	Named after the nemesia, a temperate shrubby flower. I tried to pick a flower that sounded kind of like NuSLERP.
	It doesn't, but the name still has the '''essence''' of NuSLERP I guess? (it doesn't.) Very pretty flower nonetheless.

	# mergekit

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged using the NuSLERP merge method, with [EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1) as the base.

	### Models Merged

	The following models were included in the merge:
	* [allura-org/Teleut-7b](https://huggingface.co/allura-org/Teleut-7b)
	* [FourOhFour/Vapor_v2_7B](https://huggingface.co/FourOhFour/Vapor_v2_7B)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	  - model: allura-org/Teleut-7b
	    parameters:
	      weight: 0.6
	  - model: FourOhFour/Vapor_v2_7B
	    parameters:
	      weight: 0.2
	  - model: EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1
	    parameters:
	      weight: 1.0
	merge_method: nuslerp
	base_model: EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1
	parameters:
	  normalize: true
	  int8_mask: true
	  nuslerp_flatten: false
	  nuslerp_row_wise: true
	dtype: float16
	```
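
For anyone curious what the config is roughly asking for, here is a minimal pure-Python sketch of spherical interpolation applied to task vectors (each finetune minus the base). It is an illustration under stated assumptions, not mergekit's actual code: it assumes the interpolation factor is the second model's weight divided by the sum of both weights, that the base is subtracted before interpolating and added back after, and it ignores the weight entry on the base model. The function names are made up for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical interpolation: t=0 returns v0, t=1 returns v1."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if math.sin(theta) < eps:
        # Nearly parallel (or antiparallel): fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


def merge_row(base, a, b, weight_a, weight_b):
    """Interpolate the task vectors of a and b, then add the base back.

    Assumption: weights are normalized into a single factor t.
    """
    t = weight_b / (weight_a + weight_b)
    delta_a = [x - y for x, y in zip(a, base)]
    delta_b = [x - y for x, y in zip(b, base)]
    merged = slerp(t, delta_a, delta_b)
    return [y + d for y, d in zip(base, merged)]


def merge_matrix_row_wise(base, a, b, weight_a, weight_b):
    """Apply merge_row to each row of a 2D weight (a list of rows) separately."""
    return [
        merge_row(rb, ra, rbb, weight_a, weight_b)
        for rb, ra, rbb in zip(base, a, b)
    ]
```

With the weights from the config above (0.6 and 0.2), the assumed interpolation factor works out to 0.25, so the result sits closer to Teleut's task vector than to Vapor's. Interpolating each row on its own is meant to mirror `nuslerp_flatten: false` with `nuslerp_row_wise: true`, as opposed to treating the whole tensor as one long vector.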