Commit 5a11484 (verified) by yifAI · 1 parent: 2631853

Update README.md

Files changed (1): README.md (+48 -6)
---
license: apache-2.0
base_model:
- google/gemma-2b-it
---

# General Preference Representation Model (GPM)

**Authors** (* indicates equal contribution): Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

- **Paper**: [General Preference Modeling with Preference Representations for Aligning Language Models](https://arxiv.org/abs/2410.02197)
- **Hugging Face Daily Papers**: [https://huggingface.co/papers/2410.02197](https://huggingface.co/papers/2410.02197)
- **Code Repository**: [general-preference/general-preference-model](https://github.com/general-preference/general-preference-model)
- **Dataset**: [Skywork/Skywork-Reward-Preference-80K-v0.1](https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.1)
- **Base Model**: [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it)

## Overview

The General Preference Representation Model (GPM) improves preference-based reward modeling by embedding responses into a latent space that can capture complex, even intransitive, human preferences. Preference scores are computed from these embeddings with query complexity linear in the number of responses, keeping the representation expressive while staying efficient, and GPM outperforms traditional Bradley-Terry (BT) reward models, particularly on cyclic preferences.
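
To make "linear query complexity" concrete, here is a toy sketch (the dimensions and the dense random operator below are illustrative stand-ins, not the model's learned parameters): each of the K responses is embedded once, after which every pairwise preference score is a cheap bilinear form.

```python
import torch

K, d = 5, 4  # number of responses, embedding dimension (illustrative)

# One model call per response yields K preference embeddings: O(K) queries,
# versus O(K^2) forward passes for a model that scores each pair directly.
V = torch.randn(K, d)

# A skew-symmetric operator (R^T = -R). GPM learns a structured operator;
# a dense random one is used here purely for illustration.
A = torch.randn(d, d)
R = A - A.T

# All K*K pairwise scores at once: S[i, j] = <R v_i, v_j>.
S = V @ R.T @ V.T
assert torch.allclose(S, -S.T, atol=1e-5)  # swapping i and j flips the sign

# P(response i is preferred over response j):
P = torch.sigmoid(S)
```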

## Key Features

- **Preference Representation Learning**: Embeds responses in a multi-dimensional latent space that can model intricate human preference structures, including cyclic and intransitive ones.
- **Efficient Querying**: Scoring K responses takes O(K) model queries (one embedding per response), versus O(K²) for methods that must compare every pair, making GPM scalable to large response sets.
- **General Preference Optimization (GPO)**: A preference score that integrates with reinforcement learning methods to align policies with human preferences; a toy cyclic example is sketched below.
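
As an illustration of a preference structure that no scalar (BT) reward can represent, the sketch below (all values are made up for the example; the real operator and embeddings are learned) places three responses 120 degrees apart in a 2-D latent space, yielding a rock-paper-scissors cycle:

```python
import torch

def preference_score(v1: torch.Tensor, v2: torch.Tensor) -> torch.Tensor:
    """Toy skew-symmetric score on 2-D blocks of the embeddings.

    Applies R = [[0, -1], [1, 0]] blockwise, so that
    score(v1, v2) = <R v1, v2> = -score(v2, v1).
    """
    a, b = v1.view(-1, 2), v2.view(-1, 2)
    return (a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]).sum()

# Three responses embedded 120 degrees apart on the unit circle.
angles = torch.tensor([0.0, 2.0944, 4.1888])  # 0, 120, 240 degrees in radians
emb = torch.stack([angles.cos(), angles.sin()], dim=1)  # shape (3, 2)

# Every response beats the next one, and the last beats the first:
# a cycle that a single scalar reward per response cannot encode.
for i in range(3):
    s = preference_score(emb[i], emb[(i + 1) % 3]).item()
    print(f"score({i} beats {(i + 1) % 3}) = {s:+.3f}")  # all positive
```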

## Evaluation

GPM is evaluated with evaluation code adapted from [RewardBench](https://github.com/allenai/reward-bench), where it improves over the BT reward model by a margin of up to 5.6% on the leaderboard tasks. GPM also excels at modeling cyclic preferences, achieving 100% accuracy on cyclic preference datasets.

## Usage

Please refer to the [general-preference-model](https://github.com/general-preference/general-preference-model) repository for detailed instructions on finetuning, evaluating, and integrating GPM with downstream tasks. An illustrative snippet follows.
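
Until the official snippet is filled in, here is a minimal sketch of scoring a response pair. Everything in it is an assumption rather than the repository's actual API: the checkpoint is loaded with stock `transformers` Auto classes, and the randomly initialized `value_head` stands in for the trained preference-embedding head that the repository's custom classes provide.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "google/gemma-2b-it"  # substitute this GPM checkpoint's repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
backbone = AutoModel.from_pretrained(model_id)

# GPM maps each (prompt, response) pair to a preference embedding through a
# value head on the final hidden state; a random stand-in is used here.
value_head = torch.nn.Linear(backbone.config.hidden_size, 8, bias=False)

@torch.no_grad()
def embed(prompt: str, response: str) -> torch.Tensor:
    inputs = tokenizer(prompt + response, return_tensors="pt")
    last_hidden = backbone(**inputs).last_hidden_state[:, -1]  # last token
    return value_head(last_hidden).squeeze(0)

# Blockwise skew-symmetric score (see the Overview sketch); a positive
# score means the first response is preferred.
v1 = embed("What is 2+2?", " 4.")
v2 = embed("What is 2+2?", " 5.")
a, b = v1.view(-1, 2), v2.view(-1, 2)
score = (a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]).sum()
prob_first_preferred = torch.sigmoid(score)
print(f"P(first preferred) = {prob_first_preferred.item():.3f}")
```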

## Citation

If you find this work useful for your research, please consider citing:

```bibtex
@article{zhang2024general,
  title={General Preference Modeling with Preference Representations for Aligning Language Models},
  author={Zhang, Yifan and Zhang, Ge and Wu, Yue and Xu, Kangping and Gu, Quanquan},
  journal={arXiv preprint arXiv:2410.02197},
  year={2024}
}
```