---
license: apache-2.0
---

# Better Implementation for PairRM

## Introduction

This version of PairRM includes several fixes to the training process, which improve the model's performance significantly.

## Minor Fixes

### Longer Context Length (2048 -> 3380)

Thanks to DeBERTa's tokenizer, the original PairRM model already had enough context length.

But the longer, the better :>
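As a rough illustration, the longer limit is mostly a tokenizer configuration change. This is a minimal sketch assuming the Hugging Face `transformers` API; the repo id is an assumption, not code from this card.

```python
from transformers import AutoTokenizer

# Assumed repo id, for illustration only.
MODEL_ID = "maywell/Better-PairRM"

# Raise the tokenizer's limit from the original 2048 tokens to 3380.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, model_max_length=3380)

print(tokenizer.model_max_length)  # 3380
```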


## Major Fixes

### Change Prompt Format

Why use something like

`<Response i + 1> {response}`?

So, I changed it to a prompt format based on Vicuna 1.1.
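For illustration, here is a minimal sketch of a Vicuna-1.1-style template. The exact wording and function used in training are not reproduced in this card, so treat both as assumptions.

```python
# Vicuna 1.1-style system prompt (illustrative only).
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_prompt(instruction: str, response: str) -> str:
    """Format one (instruction, response) pair in Vicuna 1.1 style,
    instead of the old `<Response i + 1> {response}` scheme."""
    return f"{SYSTEM} USER: {instruction} ASSISTANT: {response}"

print(build_prompt("Summarize the article.", "Here is a short summary..."))
```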


### Change Truncation Side

The original process used right-side truncation even on the input. This can cause serious problems when the input exceeds the model's sequence length, because right-side truncation silently drops the end of the input.
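A sketch of the kind of change involved, assuming the Hugging Face tokenizer API (the repo id is again a placeholder): with left-side truncation, an over-long input loses its oldest tokens instead of its most recent ones.

```python
from transformers import AutoTokenizer

# Assumed repo id, for illustration only.
tokenizer = AutoTokenizer.from_pretrained("maywell/Better-PairRM")

# Truncate from the left so that an over-long input drops its beginning
# rather than the most recent part of the conversation.
tokenizer.truncation_side = "left"

encoded = tokenizer(
    "a very long conversation history ...",
    truncation=True,
    max_length=3380,
)
```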


### Dataset Filter

There was a decent amount of empty assistant responses in the original dataset, so I dropped them.
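A minimal sketch of that filtering step using the `datasets` library; the dataset id and column names below are placeholders, since the card does not name them.

```python
from datasets import load_dataset

# Placeholder dataset id and column names, for illustration only.
ds = load_dataset("some-org/pairwise-feedback-data", split="train")

def has_nonempty_responses(example) -> bool:
    """Keep only rows whose assistant responses are non-empty after stripping whitespace."""
    return bool(example["chosen"].strip()) and bool(example["rejected"].strip())

filtered = ds.filter(has_nonempty_responses)
print(f"{len(ds)} -> {len(filtered)} examples after dropping empty responses")
```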