# Abel-7B-002

We have released Abel-7B-002, a stronger (35% improvement on GSM8K, 126% improvement on MATH) and more generalized model that achieves the best performance among all 7B models (80.44 on GSM8K, 29.46 on MATH).

| Model | GSM8K | MATH | MathQA | SVAMP | SCQ5K-EN | ARC-E | ARC-C | HellaSwag | MMLU |
|---|---|---|---|---|---|---|---|---|---|
| Abel-7B-002 | 80.44 | 29.46 | 69.78 | 77.67 | 55.95 | 77.67 | 55.05 | 77.72 | 61.19 |
| Abel-7B-001 | 59.74 | 13.00 | 1.21 | 57.67 | 9.30 | 53.32 | 38.97 | 63.51 | 40.59 |
| MetaMath-Mistral-7B | 77.70 | 28.20 | 33.94 | 79.33 | 37.60 | 78.48 | 51.93 | 76.44 | 61.93 |
| Qwen-7B | 47.84 | 9.34 | 27.44 | 53.00 | 40.05 | 74.97 | 53.05 | 86.85 | 57.98 |
| Mistral-7B | 37.83 | 9.06 | 25.73 | 63.00 | 39.60 | 76.83 | 53.22 | 76.31 | 64.05 |
| Yi-6B | 32.60 | 5.78 | 26.98 | 55.67 | 35.50 | 73.66 | 49.53 | 68.97 | 64.02 |
| LLaMA2-7B | 12.96 | 2.78 | 11.52 | 44.00 | 28.24 | 71.12 | 46.61 | 71.32 | 46.70 |
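
To try the model, the sketch below loads and prompts it with Hugging Face Transformers. The Hub repository id `GAIR/Abel-7B-002` and the simple Question/Answer prompt format are assumptions inferred from this card and the linked GitHub repository, not verified specifics; adjust them to whatever the repository actually documents.

```python
# Minimal sketch: load Abel-7B-002 and ask a grade-school math question.
# Assumptions: the model lives at "GAIR/Abel-7B-002" on the Hugging Face Hub
# and accepts a plain "Question:\n...\nAnswer:\n" prompt; both are unverified.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GAIR/Abel-7B-002"  # assumed Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps a 7B model on a single modern GPU
    device_map="auto",           # requires the accelerate package
)

prompt = (
    "Question:\n"
    "Natalia sold clips to 48 of her friends in April, and then she sold half as many "
    "clips in May. How many clips did Natalia sell altogether in April and May?\n"
    "Answer:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Print only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```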

Please cite this repository if the model, code, or conclusions here are helpful to you.

@misc{abel,
  author = {Chern, Ethan and Zou, Haoyang and Li, Xuefeng and Hu, Jiewen and Feng, Kehua and Li, Junlong and Liu, Pengfei},
  title = {Generative AI for Math: Abel},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/GAIR-NLP/abel}},
}