Update README.md
README.md
CHANGED
---
license: apache-2.0
base_model:
- google/gemma-2b-it
---

# General Preference Representation Model (GPM)

**Authors** (* indicates equal contribution)

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

**Paper**: [General Preference Modeling with Preference Representations for Aligning Language Models](https://arxiv.org/abs/2410.02197)

**Hugging Face Daily Papers**: [https://huggingface.co/papers/2410.02197](https://huggingface.co/papers/2410.02197)

**Code Repository**: [General-Preference-Model](https://github.com/general-preference/general-preference-model)

**Dataset**: [Skywork/Skywork-Reward-Preference-80K-v0.1](https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.1)

**Base Model**: [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it)

## Overview

The General Preference Representation Model (GPM) improves preference-based reward modeling by embedding responses in a latent space that efficiently captures complex, intransitive human preferences. GPM achieves linear query complexity while retaining an expressive preference representation, and it outperforms traditional Bradley-Terry (BT) reward models, particularly in handling cyclic preferences.

## Key Features

- **Preference Representation Learning**: Embeds responses in a multi-dimensional latent space to model intricate human preferences, including cyclic and intransitive structures.
- **Efficient Querying**: Reduces the cost of scoring K responses to O(K) embedding passes, compared with the O(K²) pairwise comparisons of traditional methods, making GPM scalable to large response sets.
- **General Preference Optimization (GPO)**: Introduces a preference score that integrates with reinforcement learning methods to optimize policy alignment with human preferences; a toy illustration of the scoring idea follows this list.
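
The following is a minimal, self-contained sketch of the idea behind these features, not the GPM implementation: the embeddings are hand-placed and the skew-symmetric operator is fixed, whereas GPM learns response embeddings with its language-model backbone (the exact parameterization and training objective are given in the paper and the code repository). It illustrates how an antisymmetric pairing of per-response embeddings can represent a cyclic preference (A ≻ B ≻ C ≻ A) that no single scalar reward can, while still requiring only one embedding per response.

```python
# Toy illustration only -- NOT the actual GPM implementation. It uses
# hand-placed 2-D embeddings and a fixed skew-symmetric matrix to show how an
# antisymmetric pairing over per-response embeddings can represent a cycle.
import numpy as np

# Hypothetical "preference embeddings" for three responses, placed at
# 120-degree intervals on the unit circle. Only K embeddings are needed to
# score all K*(K-1)/2 pairs, hence the linear query complexity.
angles = {"A": 0.0, "B": 2 * np.pi / 3, "C": 4 * np.pi / 3}
v = {k: np.array([np.cos(a), np.sin(a)]) for k, a in angles.items()}

# Skew-symmetric operator (R.T == -R), so score(i, j) == -score(j, i).
R = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def score(i: str, j: str) -> float:
    """Preference score of response i over response j."""
    return float(v[i] @ R @ v[j])

def prob(i: str, j: str) -> float:
    """P(i preferred over j) as a sigmoid of the score."""
    return 1.0 / (1.0 + np.exp(-score(i, j)))

# A scalar (Bradley-Terry style) reward can never produce a cycle, but this
# pairing can: A > B, B > C, and C > A all hold simultaneously.
for i, j in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(f"P({i} > {j}) = {prob(i, j):.3f}")  # each prints ~0.704
```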

## Evaluation

GPM is evaluated on the [RewardBench](https://github.com/allenai/reward-bench) leaderboard, where it shows significant improvements over the BT model, with a performance margin of up to 5.6%. GPM also excels at modeling cyclic preferences, achieving 100% accuracy on cyclic datasets.

## Usage

To use this model, please refer to the [General Preference Model Code Repository](https://github.com/general-preference/general-preference-model). The repository includes detailed instructions for fine-tuning, evaluation, and integration of the GPM with downstream tasks. Below is an example code snippet:

```python
TODO
```
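
Until the snippet above is filled in, here is a rough, hypothetical sketch of what scoring responses with this checkpoint could look like, assuming it loads through the standard `transformers` `AutoModelForSequenceClassification` interface. The model id placeholder and this loading path are assumptions, not the repository's documented API; the loading and scoring utilities in the GPM code repository are the authoritative way to use the model.

```python
# Hypothetical sketch -- the loading path and output interpretation here are
# assumptions, not the repository's documented API. Prefer the utilities in
# the GPM code repository.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "<this-model-repo-id>"  # placeholder: the Hub id of this model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

prompt = "Explain why the sky is blue."
responses = [
    "Air molecules scatter shorter (blue) wavelengths of sunlight more strongly.",
    "Because blue is the prettiest color.",
]

with torch.no_grad():
    for response in responses:
        # Format the (prompt, response) pair with the tokenizer's chat template
        # (gemma-2b-it style roles).
        messages = [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]
        input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
        outputs = model(input_ids)
        # For GPM, the head outputs a preference representation rather than a
        # single scalar reward; pairs of responses are compared through the
        # preference score defined in the paper.
        print(response[:50], outputs.logits.squeeze().tolist())
```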

## Citation

If you find this work useful for your research, please consider citing:

```
@article{zhang2024general,
  title={General Preference Modeling with Preference Representations for Aligning Language Models},
  author={Zhang, Yifan and Zhang, Ge and Wu, Yue and Xu, Kangping and Gu, Quanquan},
  journal={arXiv preprint arXiv:2410.02197},
  year={2024}
}
```