Update pipeline tag to text-ranking and add descriptive tags

#3
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -1,8 +1,14 @@
  ---
- license: apache-2.0
- pipeline_tag: text-generation
  library_name: transformers
  paper: 2507.09104
  ---

  # CompassJudger-2
@@ -113,7 +119,7 @@ CompassJudger-2 sets a new state-of-the-art for judge models, outperforming gene
 
  | Model | JudgerBench V2 | JudgeBench | RMB | RewardBench | Average |
  | :--------------------------------- | :------------: | :--------: | :-------: | :---------: | :-------: |
- | **7B Judge Models** | | | | | |
  | CompassJudger-1-7B-Instruct | 57.96 | 46.00 | 38.18 | 80.74 | 55.72 |
  | Con-J-7B-Instruct | 52.35 | 38.06 | 71.50 | 87.10 | 62.25 |
  | RISE-Judge-Qwen2.5-7B | 46.12 | 40.48 | 72.64 | 88.20 | 61.61 |
@@ -129,7 +135,7 @@ CompassJudger-2 sets a new state-of-the-art for judge models, outperforming gene
  | Qwen3-235B-A22B | 61.40 | 65.97 | 75.59 | 84.68 | 71.91 |
 
 
- For detailed benchmark performance and methodology, please refer to our [📑 Paper](https://arxiv.org/abs/2507.09104).
 
  ## License
 
 
  ---
  library_name: transformers
+ license: apache-2.0
+ pipeline_tag: text-ranking
  paper: 2507.09104
+ language: en
+ tags:
+ - judge-model
+ - evaluation
+ - reward-modeling
+ - text-ranking
  ---

  # CompassJudger-2
 
 
  | Model | JudgerBench V2 | JudgeBench | RMB | RewardBench | Average |
  | :--------------------------------- | :------------: | :--------: | :-------: | :---------: | :-------: |
+ | **7B Judge Models** | | | | | |\
  | CompassJudger-1-7B-Instruct | 57.96 | 46.00 | 38.18 | 80.74 | 55.72 |
  | Con-J-7B-Instruct | 52.35 | 38.06 | 71.50 | 87.10 | 62.25 |
  | RISE-Judge-Qwen2.5-7B | 46.12 | 40.48 | 72.64 | 88.20 | 61.61 |
 
  | Qwen3-235B-A22B | 61.40 | 65.97 | 75.59 | 84.68 | 71.91 |
 
 
+ For detailed benchmark performance and methodology, please refer to our 📑 [Paper](https://arxiv.org/abs/2507.09104).
 
  ## License