tlrm

community

AI & ML interests

None defined yet.

Recent Activity

JW17 authored a paper 16 days ago

AlphaPO -- Reward shape matters for LLM alignment

JW17 authored a paper 16 days ago

Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning

eunkey published a dataset 18 days ago

tlrm/ufc-Qwen2.5-3B-Instruct-seed2938

models 0

None public yet

datasets 2

tlrm/OpenMathInstruct-2-filtered-shard3

Updated Oct 22, 2024 • 324k • 14

tlrm/ufc-Qwen2.5-3B-Instruct-seed2938

Updated Oct 19, 2024 • 10k • 10