weepcat/summarization_sft_reward-model-deberta-v3-large-v2_RM-Gemma-2B_mask_partial_rm_random_length Text Classification • Updated 7 days ago • 437
weepcat/summarization_sft_reward-model-deberta-v3-large-v2 Text Classification • Updated 8 days ago • 455
weepcat/hh_sft_RM-Gemma-2B_RM-Gemma-7B_mask_partial_rm_random_length Text Classification • Updated 22 days ago • 414
weepcat/hh_sft_RM-Gemma-2B_RM-Gemma-7B_mask_partial_rm_token_by_token Text Classification • Updated 27 days ago • 90
weepcat/compute_weights_summarization_partial_reward_model_random_length-2 Viewer • Updated 8 days ago • 302k • 50
weepcat/compute_rewards_summarization_partial_reward_model_random_length-2 Viewer • Updated 8 days ago • 302k • 26
weepcat/compute_weights_hh_partial_reward_model_random_length-3 Viewer • Updated 22 days ago • 338k • 32
weepcat/compute_rewards_hh_partial_reward_model_random_length-3 Viewer • Updated 22 days ago • 338k • 31