Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLM, Code&Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

liked a dataset about 7 hours ago

Alibaba-NLP/WebShaper

Viewer • Updated Jul 22 • 500 • 3.57k • 17

liked a dataset 2 days ago

inclusionAI/ASearcher-train-data

Preview • Updated 10 days ago • 403 • 8

upvoted 2 papers 2 days ago

On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

Paper • 2508.11408 • Published 8 days ago • 6

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Paper • 2508.14444 • Published 3 days ago • 24

upvoted a paper 3 days ago

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published 9 days ago • 16

liked a model 4 days ago

Qwen/Qwen-Image-Edit

Image-to-Image • Updated 5 days ago • 28.2k • 1.12k

liked a model 12 days ago

mistralai/Devstral-Small-2507

24B • Updated 5 days ago • 24.8k • 317

liked a dataset 22 days ago

princeton-nlp/prolong-data-512K

Updated Mar 5 • 9.17k • 10

liked 2 datasets 23 days ago

internlm/SWE-Fixer-Train-110K

Viewer • Updated Mar 17 • 115k • 210 • 11

ByteDance-Seed/Multi-SWE-RL

Updated Jul 23 • 643 • 28

upvoted a paper 29 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published about 1 month ago • 290

upvoted 3 papers about 1 month ago

RAVine: Reality-Aligned Evaluation for Agentic Search

Paper • 2507.16725 • Published Jul 22 • 28

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22 • 62

GR-3 Technical Report

Paper • 2507.15493 • Published Jul 21 • 45

liked 2 datasets about 1 month ago

XenArcAI/MathX-5M

Viewer • Updated 28 days ago • 4.32M • 3.63k • 57

lmsys/lmsys-chat-1m

Viewer • Updated Jul 27, 2024 • 1M • 8.32k • 721

liked a model about 1 month ago

zai-org/GLM-4.1V-9B-Thinking

Image-Text-to-Text • 10B • Updated Jul 8 • 248k • 718

upvoted a paper about 2 months ago

AsyncFlow: An Asynchronous Streaming RL Framework for Efficient LLM Post-Training

Paper • 2507.01663 • Published Jul 2 • 5

upvoted a collection about 2 months ago

Kimina Prover Preview

Collection

State-of-the-Art Models for Formal Mathematical Reasoning • 5 items • Updated Apr 28 • 33

upvoted a paper about 2 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1 • 74
