OpenRLHF/Mistral-7b-PRM-Math-Shepherd

chuyi777 committed on Oct 30, 2024

Commit 41d1ad8 (verified) · 1 parent: 79a5118

Update README.md

Files changed (1)

README.md +4 -4

@@ -1,5 +1,5 @@
 Process Reward Model trained by OpenRLHF
-```
-dataset Math-Shepherd
-Training accuracy 0.922
-```

+- Dataset: Math-Shepherd (https://huggingface.co/datasets/peiyi9979/Math-Shepherd)
+- Learning Rate: 1e-6
+- Training Accuracy: 0.922