metadata

license: cc-by-nc-4.0
language:
  - en
pipeline_tag: text-classification
tags:
  - pytorch
  - reward_model
  - transformers
  - RLHF

Model Card for Model ID

This model is part of the Chai reward-model series. It uses the GPT-2 architecture with a classification head and is optimised to predict whether a user will accept a completion generated by the base model.
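As a rough illustration of the architecture described here, the sketch below builds a GPT-2 model with a classification head using the `transformers` library and scores one sequence. It is not the released checkpoint: the tiny configuration, the random weights, the two-label setup, and the choice of index 1 as the "accepted" class are all assumptions made so the example runs without downloading anything.

```python
import torch
from transformers import GPT2Config, GPT2ForSequenceClassification

# Assumption: a tiny, randomly initialised config stands in for the real model.
config = GPT2Config(
    vocab_size=1000,
    n_positions=128,
    n_embd=64,
    n_layer=2,
    n_head=2,
    num_labels=2,      # assumption: binary accept / decline head
    pad_token_id=0,    # the classification head needs a pad token id for batching
)
model = GPT2ForSequenceClassification(config)
model.eval()

# Dummy token ids in place of a tokenised conversation plus candidate response.
input_ids = torch.randint(1, config.vocab_size, (1, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits

# Assumption: index 1 is the "user accepts the completion" class.
accept_prob = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"probability of acceptance: {accept_prob:.3f}")
```

With a trained checkpoint, the same forward pass would be run on real tokenised text, and the resulting score could be used to compare candidate completions.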

It was trained on the retry_and_continue_50m_reward_model dataset, which consists purely of user-generated content in which a user can decline a generated response via the retry button or end the conversation.
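One way to picture how such interactions could become classification targets is sketched below. The `Turn` schema, the action names, and the rule that both a retry and an ended conversation count as a decline are hypothetical; the actual dataset format and labelling rule are not documented here.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One generated response and what the user did next (hypothetical schema)."""
    response: str
    user_action: str  # assumed values: "continue", "retry", "end"


def acceptance_label(turn: Turn) -> int:
    """Return 1 if the user kept the response, 0 if they declined it."""
    if turn.user_action == "continue":
        return 1
    if turn.user_action in ("retry", "end"):
        # Assumption: retrying or ending the conversation both count as a decline.
        return 0
    raise ValueError(f"unknown user action: {turn.user_action!r}")


turns = [
    Turn("Sure, tell me more about your day!", "continue"),
    Turn("I don't know.", "retry"),
    Turn("Goodbye.", "end"),
]
labels = [acceptance_label(t) for t in turns]
print(labels)
```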

Model Details

Developed by: Chai Research
Model type: Transformer-based Classification Model
Language: English
License: cc-by-nc-4.0
Contact: to ask questions about this model, join the Chai Discord. For general correspondence: [email protected]