Great reward model, what dataset did you use to train?

by zolicsaki - opened Jul 25, 2024

Jul 25, 2024

Specifically, I was wondering whether you trained it on the LMSYS Chatbot Arena conversations, because your model performs so well when evaluated on those preferences. Thanks for the help!

https://huggingface.co/datasets/lmsys/chatbot_arena_conversations

zolicsaki

Aug 6, 2024

@RangiLyu

zolicsaki

Aug 6, 2024

@ZwwWayne

RangiLyu

InternLM org Aug 7, 2024

Sorry for the late reply. We did use a portion of this dataset. We performed data cleaning and filtering, including removing toxic and unsafe data, to ensure quality and safety.

zolicsaki

Aug 8, 2024

@RangiLyu Thanks!
