gx-ai-architect/ultrafeedback-dice-iter1-sft-drsow-first-half-vanilla-router Viewer • Updated Apr 5 • 60.9k • 43
gx-ai-architect/ultrafeedback-qwen-32b-instruct-vanilla-router-alpha-normalize-0.04-bo32-correct-long Viewer • Updated Mar 31 • 52k • 39
gx-ai-architect/ultrafeedback-qwen-32b-instruct-vanilla-router-length-normalize-bo32-correct Viewer • Updated Mar 31 • 52k • 43
gx-ai-architect/ultrafeedback-qwen-32b-instruct-vanilla-router-length-normalize-bo32 Viewer • Updated Mar 31 • 60.9k • 40
gx-ai-architect/ultrafeedback-qwen-32b-instruct-vanilla-router-alpha-normalize-0.04-bo32 Viewer • Updated Mar 31 • 60.9k • 41
gx-ai-architect/ultrafeedback-eurus-7b-classifier-annotation-bo32 Viewer • Updated Mar 30 • 60.8k • 44
gx-ai-architect/ultrafeedback-qwen32b-instruct-vs-base-vanilla-router-filter-minus50-bo32 Viewer • Updated Mar 30 • 57.9k • 59
gx-ai-architect/ultrafeedback-llama-rdpo-vs-sft-dpo-vanilla-router-filter-minus50-bo32 Viewer • Updated Mar 27 • 58.4k • 42
gx-ai-architect/ultrafeedback-mistral-rdpo-vs-base-dpo-vanilla-router-filter-minus50-bo32 Viewer • Updated Mar 27 • 58.4k • 40
gx-ai-architect/ultrafeedback-rdpo-vs-zepher-dpo-vanilla-router-filter-minus50-bo32-updated1 Viewer • Updated Mar 26 • 51.1k • 41
gx-ai-architect/ultrafeedback-rdpo-vs-zepher-dpo-vanilla-router-filter-minus50-bo32-updated Viewer • Updated Mar 26 • 51.1k • 37
gx-ai-architect/ultrafeedback-rdpo-vs-zepher-dpo-vanilla-router-filter-minus50-bo32 Viewer • Updated Mar 26 • 51.1k • 30
gx-ai-architect/numinamath-178k-phi4-bon-verified-dpo-trl-40k-old-r1-format Viewer • Updated Feb 3 • 39k • 33
gx-ai-architect/official_dpo_r1_prompt_bo8_random_rej_balanced_fixed Viewer • Updated Feb 2 • 59.4k • 27