Update README.md
README.md
CHANGED
---
license: apache-2.0
base_model:
- google/gemma-2b-it
---

# General Preference Representation Model (GPM)

**Authors** (* indicates equal contribution)

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

**Paper**: [General Preference Modeling with Preference Representations for Aligning Language Models](https://arxiv.org/abs/2410.02197)

**Hugging Face Daily Papers**: [https://huggingface.co/papers/2410.02197](https://huggingface.co/papers/2410.02197)

**Code Repository**: [General-Preference-Model](https://github.com/general-preference/general-preference-model)

**Dataset**: [Skywork/Skywork-Reward-Preference-80K-v0.1](https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.1)

**Base Model**: [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it)

## Overview

The General Preference Representation Model (GPM) improves preference-based reward modeling by embedding responses in a latent space that efficiently captures complex, intransitive human preferences. GPM achieves linear query complexity while retaining an expressive preference representation, and it outperforms traditional Bradley-Terry (BT) reward models, particularly in handling cyclic preferences.

## Key Features

- **Preference Representation Learning**: Embeds responses in a multi-dimensional latent space to model intricate human preferences, including cyclic and intransitive structures.
- **Efficient Querying**: Reduces the cost of scoring K responses to O(K) embedding passes, compared with the O(K²) pairwise comparisons of traditional methods, making GPM scalable to large response sets.
- **General Preference Optimization (GPO)**: Introduces a preference score that integrates with reinforcement learning methods to optimize policy alignment with human preferences; a toy illustration of the scoring idea follows this list.
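
The following is a minimal, self-contained sketch of the idea behind these features, not the GPM implementation: the embeddings are hand-placed and the skew-symmetric operator is fixed, whereas GPM learns response embeddings with its language-model backbone (the exact parameterization and training objective are given in the paper and the code repository). It illustrates how an antisymmetric pairing of per-response embeddings can represent a cyclic preference (A ≻ B ≻ C ≻ A) that no single scalar reward can, while still requiring only one embedding per response.

```python
# Toy illustration only -- NOT the actual GPM implementation. It uses
# hand-placed 2-D embeddings and a fixed skew-symmetric matrix to show how an
# antisymmetric pairing over per-response embeddings can represent a cycle.
import numpy as np

# Hypothetical "preference embeddings" for three responses, placed at
# 120-degree intervals on the unit circle. Only K embeddings are needed to
# score all K*(K-1)/2 pairs, hence the linear query complexity.
angles = {"A": 0.0, "B": 2 * np.pi / 3, "C": 4 * np.pi / 3}
v = {k: np.array([np.cos(a), np.sin(a)]) for k, a in angles.items()}

# Skew-symmetric operator (R.T == -R), so score(i, j) == -score(j, i).
R = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def score(i: str, j: str) -> float:
    """Preference score of response i over response j."""
    return float(v[i] @ R @ v[j])

def prob(i: str, j: str) -> float:
    """P(i preferred over j) as a sigmoid of the score."""
    return 1.0 / (1.0 + np.exp(-score(i, j)))

# A scalar (Bradley-Terry style) reward can never produce a cycle, but this
# pairing can: A > B, B > C, and C > A all hold simultaneously.
for i, j in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(f"P({i} > {j}) = {prob(i, j):.3f}")  # each prints ~0.704
```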

## Evaluation

GPM is evaluated on the [RewardBench](https://github.com/allenai/reward-bench) leaderboard, where it shows significant improvements over the BT model, with a performance margin of up to 5.6%. GPM also excels at modeling cyclic preferences, achieving 100% accuracy on cyclic datasets.

## Usage

To use this model, please refer to the [General Preference Model Code Repository](https://github.com/general-preference/general-preference-model). The repository includes detailed instructions for fine-tuning, evaluation, and integration of the GPM with downstream tasks. Below is an example code snippet:

```python
TODO
```
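
Until the snippet above is filled in, here is a rough, hypothetical sketch of what scoring responses with this checkpoint could look like, assuming it loads through the standard `transformers` `AutoModelForSequenceClassification` interface. The model id placeholder and this loading path are assumptions, not the repository's documented API; the loading and scoring utilities in the GPM code repository are the authoritative way to use the model.

```python
# Hypothetical sketch -- the loading path and output interpretation here are
# assumptions, not the repository's documented API. Prefer the utilities in
# the GPM code repository.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "<this-model-repo-id>"  # placeholder: the Hub id of this model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

prompt = "Explain why the sky is blue."
responses = [
    "Air molecules scatter shorter (blue) wavelengths of sunlight more strongly.",
    "Because blue is the prettiest color.",
]

with torch.no_grad():
    for response in responses:
        # Format the (prompt, response) pair with the tokenizer's chat template
        # (gemma-2b-it style roles).
        messages = [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]
        input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
        outputs = model(input_ids)
        # For GPM, the head outputs a preference representation rather than a
        # single scalar reward; pairs of responses are compared through the
        # preference score defined in the paper.
        print(response[:50], outputs.logits.squeeze().tolist())
```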

## Citation

If you find this work useful for your research, please consider citing:

```
@article{zhang2024general,
  title={General Preference Modeling with Preference Representations for Aligning Language Models},
  author={Zhang, Yifan and Zhang, Ge and Wu, Yue and Xu, Kangping and Gu, Quanquan},
  journal={arXiv preprint arXiv:2410.02197},
  year={2024}
}
```