Jiang Jiwen

jjw0126

AI & ML interests

RL, LLM

jjw0126's activity

liked a dataset 9 days ago

sanbu/tianji-chinese

Viewer • Updated Dec 21, 2024 • 13.1k • 490 • 10

liked 2 datasets 14 days ago

adyen/DABstep

Viewer • Updated 2 days ago • 12.2k • 5.02k • 12

ibm-granite/GneissWeb

Updated 13 days ago • 4.18k • 26

liked 2 datasets 16 days ago

AymanTarig/function-calling-v0.2-with-r1-cot

Viewer • Updated Feb 3 • 58k • 538 • 32

Jofthomas/hermes-function-calling-thinking-V1

Viewer • Updated 25 days ago • 3.57k • 6.62k • 23

liked a model 16 days ago

Salesforce/blip2-opt-2.7b

Image-Text-to-Text • Updated Feb 3 • 364k • 342

upvoted 2 articles 22 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 803

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7

• 70

liked a dataset 24 days ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k

Viewer • Updated 21 days ago • 110k • 7.74k • 522

liked 5 datasets about 1 month ago

upvoted 2 collections about 1 month ago

🧠 Reasoning datasets

Collection

Datasets with reasoning traces for math and code released by the community • 14 items • Updated 2 days ago • 100

Reasoning Datasets

Collection

Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 56

liked 3 datasets about 1 month ago

Aarushhh/Thinking-Preference-7k

Viewer • Updated Jan 29 • 7.12k • 108 • 2

ServiceNow-AI/R1-Distill-SFT

Viewer • Updated Feb 8 • 1.85M • 5.58k • 272

mlfoundations-dev/LIMO

Viewer • Updated Feb 7 • 817 • 98 • 2

liked a model about 1 month ago

simplescaling/s1-32B

Text Generation • Updated 16 days ago • 14.5k • 288