---
license: apache-2.0
---
# Better Implementation for PairRM

## Introduction

This version of PairRM includes several fixes to the training process that significantly improve the model's performance.
## Minor Fixes

### Longer Context Length (2048 -> 3380)

Thanks to deberta's tokenizer, the original PairRM model already had enough context length. But the longer, the better :>
## Major Fixes

### Change Prompt Format
Why use a format like `<Response i + 1> {response}`? Instead, I changed to a prompt format based on Vicuna 1.1.
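As a rough illustration, a Vicuna-1.1-style pairwise input could be built like this. The exact system prompt, separators, and the `<sep>` token between the two candidates are assumptions for the sketch, not the model's confirmed training format:

```python
# Hypothetical Vicuna-1.1-style prompt builder for a pairwise ranker.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_pair_input(instruction: str, response_a: str, response_b: str) -> str:
    """Format both candidate responses in Vicuna 1.1 style and join them."""
    def one(resp: str) -> str:
        return f"{SYSTEM} USER: {instruction} ASSISTANT: {resp}"
    # The joint "<sep>" separator between the two candidates is an assumption.
    return one(response_a) + " <sep> " + one(response_b)

print(build_pair_input("What is 2+2?", "4", "five"))
```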
### Change Truncation Side

The original pipeline truncated on the right side, even for the input. This can cause serious problems when the input exceeds the model's sequence length, because the end of the conversation gets cut off. Truncating the input on the left side keeps the most recent turns instead.
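A minimal sketch of left-side input truncation on token IDs; the IDs and the max length below are illustrative, not the model's real configuration:

```python
# Left-truncation sketch: drop the *oldest* input tokens, never the response.
def truncate_left(input_ids: list[int], response_ids: list[int], max_len: int) -> list[int]:
    """Fit input + response into max_len by trimming the input from the left."""
    budget = max_len - len(response_ids)
    # Keep only the last `budget` input tokens, so the text immediately
    # preceding the response survives truncation.
    return input_ids[-budget:] + response_ids

print(truncate_left(list(range(10)), [100, 101], 6))  # -> [6, 7, 8, 9, 100, 101]
```

With Hugging Face tokenizers, the same behavior comes from setting `tokenizer.truncation_side = "left"` before encoding.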
### Dataset Filtering

The original dataset contained a decent number of examples with empty assistant responses, so I dropped them.
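A sketch of such a filter; the field names (`response_a`, `response_b`) are assumptions about the dataset schema:

```python
# Hypothetical empty-response filter for a pairwise preference dataset.
def keep_example(example: dict) -> bool:
    """Drop rows where either candidate assistant response is empty."""
    return bool(example["response_a"].strip()) and bool(example["response_b"].strip())

data = [
    {"response_a": "Paris.", "response_b": "It is Paris."},
    {"response_a": "", "response_b": "42"},  # empty response -> dropped
]
filtered = [ex for ex in data if keep_example(ex)]
print(len(filtered))  # -> 1
```

With the `datasets` library, the same predicate would plug into `dataset.filter(keep_example)`.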