OpenRLHF

community

https://github.com/OpenRLHF

Recent Activity

chuyi777 authored a paper about 2 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

ZhangRC authored a paper 4 months ago

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Longhui98 authored a paper 6 months ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

OpenRLHF's models (10)

OpenRLHF/Llama-3-8b-rm-mixture

8B • Updated Nov 30, 2024 • 105 • 1

OpenRLHF/Llama-2-7b-rm-anthropic_hh-lmsys-oasst-webgpt

7B • Updated Nov 30, 2024 • 3 • 1

OpenRLHF/Llama-3-8b-rm-700k

8B • Updated Nov 30, 2024 • 860 • 3

OpenRLHF/Mistral-7b-PRM-Math-Shepherd

7B • Updated Oct 30, 2024 • 3 • 1

OpenRLHF/Llama-3-8b-iter-dpo-179k

Text Generation • 8B • Updated Jul 28, 2024 • 25

OpenRLHF/Llama-3-8b-rlhf-100k

Text Generation • 8B • Updated Jun 24, 2024 • 660 • 4

OpenRLHF/Llama-3-8b-sft-mixture

Text Generation • 8B • Updated Jun 14, 2024 • 4.21k • 1

OpenRLHF/Llama-2-7b-sft-model-ocra-500k

Text Generation • 7B • Updated Jun 9, 2024 • 9

OpenRLHF/Llama-2-13b-rm-anthropic_hh-lmsys-oasst-webgpt

13B • Updated Jan 24, 2024 • 3

OpenRLHF/Llama-2-13b-sft-model-ocra-500k

Text Generation • 13B • Updated Jan 5, 2024 • 10 • 1