4x1.8B MoE Qwen Ckpt 50000

This is a Mixture-of-Experts (MoE) model built on the Qwen 1.8B model. In this project, we combined four copies of the original model into a single MoE model and trained it with specialized training methods.

This model is a checkpoint (step 50000) from the continued-pretraining stage.
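The card does not describe the routing mechanism, but a standard MoE forward pass over four experts can be sketched as follows. This is an illustrative sketch only, not the project's actual implementation; the top-2 gating, the single-matrix "experts", and all dimensions are assumptions for the example.

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, top_k=2):
    """Route each token to its top-k experts and mix their outputs.

    x: (tokens, d_model)
    expert_weights: list of (d_model, d_model) matrices standing in
        for full expert MLPs (simplified for illustration)
    gate_weights: (d_model, n_experts) router projection
    """
    logits = x @ gate_weights                        # (tokens, n_experts)
    # Softmax over experts to get routing probabilities.
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    top = np.argsort(-probs, axis=-1)[:, :top_k]     # top-k expert indices
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = probs[t, top[t]]
        weights /= weights.sum()                     # renormalize over chosen experts
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ expert_weights[e])
    return out

rng = np.random.default_rng(0)
d_model, n_experts, tokens = 8, 4, 5
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate = rng.standard_normal((d_model, n_experts))
y = moe_forward(rng.standard_normal((tokens, d_model)), experts, gate)
print(y.shape)  # (5, 8)
```

Each token is processed by only its top-2 of the 4 experts, which is why an MoE model can hold more parameters than it activates per token.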

Evaluations

| Groups            | n-shot | Metric   | Value  | Stderr   |
|-------------------|--------|----------|--------|----------|
| boolq             | 0      | acc      | 0.6508 | ± 0.0083 |
| ceval-valid       | 0      | acc      | 0.5290 | ± 0.1912 |
|                   | 0      | acc_norm | 0.5290 | ± 0.1912 |
| cmmlu             | 0      | acc      | 0.5087 | ± 0.1237 |
|                   | 0      | acc_norm | 0.5087 | ± 0.1237 |
| mathqa            | 0      | acc      | 0.2647 | ± 0.0081 |
|                   | 0      | acc_norm | 0.2693 | ± 0.0081 |
| mmlu              | 0      | acc      | 0.4353 | ± 0.0830 |
| - stem            | 0      | acc      | 0.3809 | ± 0.0659 |
| - social_sciences | 0      | acc      | 0.4959 | ± 0.0708 |
| - other           | 0      | acc      | 0.4844 | ± 0.0744 |
| - humanities      | 0      | acc      | 0.3998 | ± 0.0849 |

Acknowledgements

License Agreement

This project is open-sourced under the Tongyi Qianwen Research License Agreement. You can view the complete license agreement at: https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT

When using this project, please ensure that your use complies with the terms and conditions of the license agreement.

Model size: 4.27B parameters (Safetensors)
Tensor types: F32, BF16
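The reported total of 4.27B parameters is consistent with an architecture in which the four experts share attention and embedding weights and only the feed-forward (SwiGLU MLP) blocks are duplicated. The back-of-the-envelope check below assumes Qwen-1.8B-like dimensions (24 layers, hidden size 2048, FFN size 5504) and a base size of roughly 1.84B parameters; none of these figures are confirmed by this card.

```python
# Back-of-the-envelope parameter count for a 4-expert MoE that shares
# everything except the MLP blocks. All dimensions below are assumed
# Qwen-1.8B-like values, not figures confirmed by the model card.
base_params = 1.84e9                 # approximate size of Qwen 1.8B
layers, d_model, d_ffn = 24, 2048, 5504
mlp_per_layer = 3 * d_model * d_ffn  # gate, up, down projections (SwiGLU)
extra_experts = 3                    # 3 additional copies of every MLP block

total = base_params + extra_experts * layers * mlp_per_layer
print(round(total / 1e9, 2))  # ~4.27
```

Under these assumptions the extra experts add about 2.43B parameters, landing close to the listed 4.27B.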