OpenRLHF/Mistral-7b-PRM-Math-Shepherd


Last commit: 41d1ad8 (verified), "Update README.md", by chuyi777, 2 months ago


Process Reward Model trained by OpenRLHF

- Dataset: Math-Shepherd (https://huggingface.co/datasets/peiyi9979/Math-Shepherd)
- Learning Rate: 1e-6
- Training Accuracy: 0.922
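
The card does not say how the training accuracy was computed. As a rough illustration only, the sketch below shows one plausible reading: step-level accuracy, where each reasoning step is scored and compared against its label. It assumes Math-Shepherd-style examples in which every step in the `input` text ends with the placeholder `ки` and the `label` text carries `+` (correct) or `-` (incorrect) at the same position. The sample strings, the helper names `step_labels` and `step_accuracy`, and the 0.5 threshold are hypothetical and not taken from the card or from OpenRLHF.

```python
from typing import List

# Assumed step placeholder in the `input` field (an assumption, see lead-in).
STEP_TAG = "ки"


def step_labels(input_text: str, label_text: str) -> List[int]:
    """Return 1 for '+' and 0 for '-' at each step placeholder position."""
    segments = input_text.split(STEP_TAG)
    labels: List[int] = []
    pos = 0
    # Every segment except the last is followed by one placeholder in the input,
    # which corresponds to a single '+' or '-' character in the label text.
    for seg in segments[:-1]:
        pos += len(seg)
        mark = label_text[pos]
        if mark not in ("+", "-"):
            raise ValueError(f"unexpected step label {mark!r} at offset {pos}")
        labels.append(1 if mark == "+" else 0)
        pos += 1
    return labels


def step_accuracy(probs: List[float], labels: List[int], threshold: float = 0.5) -> float:
    """Fraction of steps whose thresholded score matches the step label."""
    if len(probs) != len(labels) or not labels:
        raise ValueError("probs and labels must be non-empty and the same length")
    preds = [1 if p >= threshold else 0 for p in probs]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)


# Hypothetical example in the assumed format (not a real dataset row).
example_input = "Step 1: 2 + 3 = 5 ки\nStep 2: 5 * 2 = 11 ки"
example_label = "Step 1: 2 + 3 = 5 +\nStep 2: 5 * 2 = 11 -"

labels = step_labels(example_input, example_label)
# Made-up per-step scores standing in for the model's "step is correct" probability.
accuracy = step_accuracy([0.9, 0.2], labels)
```

Under this reading, an accuracy of 0.922 would mean that about 92% of training steps were classified in agreement with their labels.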