More details on training data for reward model

by reign12 - opened Sep 12, 2023


Many thanks for open-sourcing this reward model! I am curious about the details of its training data.
What exactly is oasst_export?
What does the fraction mean in the Datasets section?
And how can we use hellaswag as a comparison dataset?
Thanks in advance for any discussion!
