prithivMLmods/PRM-Math-7B-Reasoner

prithivMLmods committed on Jan 19

Commit b5d7d03 · verified · 1 Parent(s): 12ea0a7

Update README.md

Files changed (1)

README.md +5 -1

@@ -18,7 +18,11 @@ base_model:
 PRM-Math-7B-Reasoner is a fully reproducible model, fine-tuned on the Qwen2.5-Math-7B-PRM800K dataset, designed to evaluate its ability to identify erroneous steps in mathematical reasoning. The model is used for reward computation, where after each step, a special token "<extra_0>" is inserted. For reward calculation, the probability score of this token being classified as positive is extracted, resulting in a reward value between 0 and 1. It is primarily utilized for solution reformatting in mathematically driven tasks and as a Long Context Full Reasoner.
-# Reformatting Reasoning Intermediate
+# **PROCESSBENCH : PAPER**
+*PROCESSBENCH: Identifying Process Errors in Mathematical Reasoning (arXiv)* : https://arxiv.org/pdf/2412.06559
+# **Reformatting Reasoning Intermediate**
 | **Section**          | **Content**                                                                                                                                                                                                 |
 |-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
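
Note on the reward computation described in the README context line above: the following is a minimal sketch of that step, not the model's actual inference code. It assumes the model emits a two-class logit pair (negative, positive) per token, and that rewards are read only at positions holding the "<extra_0>" step separator. The function name `step_rewards`, the separator id `SEP_ID`, and the toy logits are all hypothetical.

```python
import math

def step_rewards(logits, token_ids, sep_id):
    """Return P(positive) at every position whose token equals sep_id.

    logits: one [negative_logit, positive_logit] pair per token (assumed layout).
    """
    rewards = []
    for tok, (neg, pos) in zip(token_ids, logits):
        if tok != sep_id:
            continue  # only step-separator positions carry a reward
        # Numerically stable two-class softmax; keep the positive-class probability.
        m = max(neg, pos)
        e_neg = math.exp(neg - m)
        e_pos = math.exp(pos - m)
        rewards.append(e_pos / (e_neg + e_pos))
    return rewards

SEP_ID = 99  # hypothetical id standing in for the "<extra_0>" token
token_ids = [5, 6, SEP_ID, 7, 8, SEP_ID]  # two reasoning steps, toy ids
logits = [[0.0, 0.0], [0.0, 0.0], [-2.0, 2.0],
          [0.0, 0.0], [0.0, 0.0], [1.0, -1.0]]
rewards = step_rewards(logits, token_ids, SEP_ID)
print(rewards)  # one value in (0, 1) per step
```

Because each value is a softmax probability, it always falls between 0 and 1, which matches the reward range the README states. A low value at a given step would flag that step as likely erroneous.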