GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-7_eta_1e5_bs_128_iter_1_1723117946 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-7_eta_1e4_bs_128_iter_1_1723127350 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e5_bs_128_iter_1_1723067861 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e2_bs_128_iter_1_1723099136 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-7_eta_1e6_bs_128_iter_1_1723108586 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_iter_1_1723089445 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_eps_60000_20k_1723066371 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e4_bs_128_iter_1_1723079513 Text Generation • 8B • Updated Aug 8, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_iter_2_1722906410 Text Generation • 8B • Updated Aug 7, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_iter_2_20k_1722814689 Text Generation • 8B • Updated Aug 5, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_iter_1_1722744283 Text Generation • 8B • Updated Aug 4, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_iter_1_20k_1722744286 Text Generation • 8B • Updated Aug 4, 2024 • 5
GitBag/rebel_multiturn_chat_hybrid_gen_p_1_pairx_wm_0.1_eta_1_batch_size_32_kl_0_lr_3e-7_1722632593 Updated Aug 3, 2024
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_20k_1722618511 Text Generation • 8B • Updated Aug 2, 2024 • 6
GitBag/rebel_multiturn_chat_pairx_last_continue_1000_batch_size_32_kl_0_lr_3e-7_1722453942 Updated Aug 2, 2024
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-7_eta_1e6_bs_128_1722453914 Text Generation • 8B • Updated Aug 1, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-7_eta_1e7_bs_128_1722458326 Text Generation • 8B • Updated Aug 1, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e7_bs_128_1722449463 Text Generation • 8B • Updated Aug 1, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-7_eta_1e5_bs_128_1722449463 Text Generation • 8B • Updated Jul 31, 2024 • 5
GitBag/rebel_multiturn_chat_pairx_continue_1600_batch_size_32_kl_0_lr_3e-7_1722280096 Updated Jul 31, 2024
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_eta_3e5_bs_128_1722330201 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e6_bs_128_1722343397 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e5_bs_128_1722339001 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_eta_1e5_bs_128_1722325807 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_eta_1e4_bs_128_1722334608 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_eta_3e4_bs_128_1722321357 Text Generation • 8B • Updated Jul 30, 2024 • 4
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-5_eta_1e4_bs_128_1722297336 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-5_eta_1e3_bs_128_1722292937 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-6_eta_1e4_bs_128_1722288563 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-6_eta_1e3_bs_128_1722284171 Text Generation • 8B • Updated Jul 30, 2024 • 5
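
The repository names above appear to encode each run's settings as `_key_value` segments. The following is a minimal sketch of how those segments could be pulled out for comparison across runs. The helper `parse_repo_name` is hypothetical, and the field meanings are inferred from the names only, not from any documentation: `lr` is assumed to be a learning rate, `eta` a REBEL hyperparameter, `bs` / `batch_size` a batch size, `iter` an iteration index, and the trailing 10-digit number a Unix timestamp for the run.

```python
import re
from datetime import datetime, timezone

# Number pattern for values such as "3e-7", "1e5", "0.1", or "1".
_NUM = r"([0-9.]+(?:e-?[0-9]+)?)"


def parse_repo_name(repo_id):
    """Extract run settings from a repository id (hypothetical helper).

    Every field meaning here is an assumption inferred from the naming
    pattern in the listing; fields missing from a name are simply omitted.
    """
    name = repo_id.split("/", 1)[-1]  # drop the "GitBag/" owner prefix
    fields = {}

    m = re.search(r"_lr_" + _NUM, name)
    if m:
        fields["lr"] = float(m.group(1))

    m = re.search(r"_eta_" + _NUM, name)
    if m:
        fields["eta"] = float(m.group(1))

    # Both "bs_128" and "batch_size_32" spellings occur in the listing.
    m = re.search(r"_(?:bs|batch_size)_(\d+)", name)
    if m:
        fields["batch_size"] = int(m.group(1))

    m = re.search(r"_iter_(\d+)", name)
    if m:
        fields["iter"] = int(m.group(1))

    # Assumption: the trailing 10-digit suffix is a Unix timestamp (UTC).
    m = re.search(r"_(\d{10})$", name)
    if m:
        fields["timestamp"] = datetime.fromtimestamp(
            int(m.group(1)), tz=timezone.utc
        )

    return fields


if __name__ == "__main__":
    example = (
        "GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full"
        "_lr_1e-7_eta_1e5_bs_128_iter_1_1723117946"
    )
    print(parse_repo_name(example))
```

Searching for each key separately, rather than with one positional pattern, keeps the sketch tolerant of the differing segment order between the `rebel_ultrafeedback_*` and `rebel_multiturn_chat_*` names.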