Rui Yang

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

liked a model 2 months ago

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

updated a model 2 months ago

Ray2333/GRM-Llama3-8B-rewardmodel-ft

updated a model 2 months ago

Ray2333/GRM-gemma2-2B-rewardmodel-ft

Ray2333's activity

liked a model 2 months ago

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated Nov 30, 2024 • 135 • 3

updated 4 models 2 months ago

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated Nov 30, 2024 • 1.8k • 1

Ray2333/GRM-gemma2-2B-rewardmodel-ft

Text Classification • Updated Nov 30, 2024 • 1.5k • 6

Ray2333/GRM-Llama3.2-3B-rewardmodel-ft

Text Classification • Updated Nov 30, 2024 • 13.2k • 7

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated Nov 30, 2024 • 135 • 3

updated a collection 2 months ago

GRM

Generalizable Reward Models • 11 items • Updated Nov 25, 2024 • 4

upvoted 2 collections 2 months ago

Papers - Math - Reasoning

11 items • Updated Nov 10, 2024 • 1

Papers - Benchmarks - Math

4 items • Updated Nov 5, 2024 • 1

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 3 months ago

Model Size

#1 opened 3 months ago by

authored a paper 3 months ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published Oct 29, 2024 • 15

commented a paper 3 months ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published Oct 29, 2024 • 15

updated a dataset 3 months ago

DynaMath/DynaMath_Sample

Viewer • Updated Nov 5, 2024 • 5.01k • 269 • 6

upvoted a paper 3 months ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published Oct 29, 2024 • 15

updated a Space 3 months ago

README

liked a dataset 3 months ago

DynaMath/DynaMath_Sample

Viewer • Updated Nov 5, 2024 • 5.01k • 269 • 6

liked a Space 3 months ago

Preference Proxy Evaluations

New activity in Ray2333/GRM-llama3-8B-sftreg 3 months ago

Adding `safetensors` variant of this model

#3 opened 3 months ago by

updated a collection 4 months ago

GRM

Generalizable Reward Models • 11 items • Updated Nov 25, 2024 • 4

liked a model 5 months ago

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated Nov 30, 2024 • 1.8k • 1