OpenAssistant/reward-model-deberta-v3-large-v2

Text Classification

What score indicates high quality?

#11 opened 4 months ago by

Hyperparameter settings for training

#10 opened over 1 year ago by

How to pre-process the synthetic-instruct-gptj-pairwise pairwise data for training?

#9 opened over 1 year ago by

How to fine-tune this model with the Trainer API?

#8 opened over 1 year ago by

How to score an <instruction, input, output> triple?

#7 opened over 1 year ago by

Validation split indices?

#6 opened over 1 year ago by

np.int deprecation issue

#5 opened almost 2 years ago by

Question about evaluating this reward model on Anthropic/hh-rlhf

#4 opened almost 2 years ago by

Adding `safetensors` variant of this model

#3 opened about 2 years ago by

How to optimize the loss function?

#1 opened about 2 years ago by