---
license: apache-2.0
---
# Better Implementation for PairRM

## Introduction

This version of PairRM includes several fixes to the training process that significantly improve the model's performance.
## Minor Fixes

### Longer Context Length (2048 -> 3380)

Thanks to deberta's tokenizer, the original PairRM model already had enough context length. But the longer, the better :>
## Major Fixes

### Change Prompt Format
Why use a format like `<Response i + 1> {response}`? Instead, I changed to a prompt format based on Vicuna 1.1.
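As a rough illustration, a Vicuna-1.1-style pairwise input could be built like this. The exact system prompt, separators, and the `<sep>` token between the two candidates are assumptions for the sketch, not the model's confirmed training format:

```python
# Hypothetical Vicuna-1.1-style prompt builder for a pairwise ranker.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_pair_input(instruction: str, response_a: str, response_b: str) -> str:
    """Format both candidate responses in Vicuna 1.1 style and join them."""
    def one(resp: str) -> str:
        return f"{SYSTEM} USER: {instruction} ASSISTANT: {resp}"
    # The joint "<sep>" separator between the two candidates is an assumption.
    return one(response_a) + " <sep> " + one(response_b)

print(build_pair_input("What is 2+2?", "4", "five"))
```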
### Change Truncation Side

The original pipeline truncated on the right side, even for the input. This can cause serious problems when the input exceeds the model's sequence length, because the end of the conversation gets cut off. Truncating the input on the left side keeps the most recent turns instead.
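A minimal sketch of left-side input truncation on token IDs; the IDs and the max length below are illustrative, not the model's real configuration:

```python
# Left-truncation sketch: drop the *oldest* input tokens, never the response.
def truncate_left(input_ids: list[int], response_ids: list[int], max_len: int) -> list[int]:
    """Fit input + response into max_len by trimming the input from the left."""
    budget = max_len - len(response_ids)
    # Keep only the last `budget` input tokens, so the text immediately
    # preceding the response survives truncation.
    return input_ids[-budget:] + response_ids

print(truncate_left(list(range(10)), [100, 101], 6))  # -> [6, 7, 8, 9, 100, 101]
```

With Hugging Face tokenizers, the same behavior comes from setting `tokenizer.truncation_side = "left"` before encoding.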
### Dataset Filtering

The original dataset contained a decent number of examples with empty assistant responses, so I dropped them.
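A sketch of such a filter; the field names (`response_a`, `response_b`) are assumptions about the dataset schema:

```python
# Hypothetical empty-response filter for a pairwise preference dataset.
def keep_example(example: dict) -> bool:
    """Drop rows where either candidate assistant response is empty."""
    return bool(example["response_a"].strip()) and bool(example["response_b"].strip())

data = [
    {"response_a": "Paris.", "response_b": "It is Paris."},
    {"response_a": "", "response_b": "42"},  # empty response -> dropped
]
filtered = [ex for ex in data if keep_example(ex)]
print(len(filtered))  # -> 1
```

With the `datasets` library, the same predicate would plug into `dataset.filter(keep_example)`.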