dshin/flan-t5-ppo-user-f-batch-size-8-epoch-1-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 14
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-1-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 12
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-1-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 11
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-2-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 13
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-2-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 27
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-3-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 41
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-2-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 12
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-3-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 12
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-4-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 15
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-3-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 13
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-4-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 12
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-4-use-violation Reinforcement Learning • Updated Mar 13, 2023 • 13