Text Classification
Safetensors
gemma2
Ray2333 committed on
Commit
6589ad1
1 Parent(s): 2f0b839

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ pipeline_tag: text-classification
 This reward model achieves a score of 88.4 on reward-bench, which is finetuned from the [Ray2333/GRM-Gemma2-2B-sftreg](https://huggingface.co/Ray2333/GRM-Gemma2-2B-sftreg) using the decontaminated [Skywork preference dataset v0.2](https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.2).
 We obtain a **SOTA 2B reward model** that can outperform a series of 8B reward models and even surpass gpt4/gemini as a judge.
 
-Check our GRM series at 🤗[hugging face](https://huggingface.co/collections/Ray2333/grm-66882bdf7152951779506c7b) and our paper at [Arxiv](https://arxiv.org/abs/2406.10216).
+Check our GRM series at 🤗[hugging face](https://huggingface.co/collections/Ray2333/grm-66882bdf7152951779506c7b), our paper at [Arxiv](https://arxiv.org/abs/2406.10216), and github repo at [Github](https://github.com/YangRui2015/Generalizable-Reward-Model).
 
 
 