thomas-yanxin
/

XinYuan-Qwen2-7B

Model card Files Files and versions Community

XinYuan-Qwen2-7B / README.md

thomas-yanxin's picture

Update README.md

c62d83e verified 5 months ago

|

history blame contribute delete

1.24 kB

	---
	license: other
	language:
	- zh
	- en
	datasets:
	- thomas-yanxin/MT-SFT-ShareGPT
	---


	The main purpose of this model is to validate the usability of [thomas-yanxin/MT-SFT-ShareGPT](https://huggingface.co/datasets/thomas-yanxin/MT-SFT-ShareGPT), i.e., the quality of the data is all you need. We found that when we meticulously extract the data through a better data governance approach, the corresponding model results can be vastly improved, even if only through SFT.

	Here are the results from our OpenCompass evaluation：

	\| Classification \| Benchmarks \| Models \|
	\| :------------: \| :--------: \| :--------: \|
	\| \| 名称 \| XinYuan-Qwen2-7B \|
	\| English \| MMLU \| 68.71 \|
	\| \| MMLU-Pro \| 30.56 \|
	\| \| Theorem QA \| 25.3 \|
	\| \| GPQA \| 29.2 \|
	\| \| BBH \| 60.3 \|
	\| \| IFEval (Prompt Strict-Acc.) \| 39.2 \|
	\| \| ARC-C \| 87.5 \|
	\| Math \| GSM8K \| 75.4 \|
	\| \| MATH \| 34.76 \|
	\| Chinese \| C-EVAL \| 82.0 \|
	\| \| CMMLU \| 77.9 \|
	\| Code \| MBPP \| 50.6 \|
	\| \| HumanEval \| 70.1 \|