peng

superpeng

AI & ML interests

None yet

Recent Activity

liked a model 6 days ago

baichuan-inc/Baichuan-M2-32B

liked a dataset 12 days ago

Intelligent-Internet/II-Medical-Reasoning-SFT

upvoted a collection 12 days ago

Organizations

None yet

upvoted 2 collections 12 days ago

II-Medical

9 items • Updated Jul 4 • 10

Medical QA Datasets

A collection of medical question answering (QA) datasets • 23 items • Updated Feb 22 • 45

upvoted a paper about 1 month ago

QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Paper • 2506.00711 • Published May 31 • 1

upvoted a paper 6 months ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 74

upvoted 2 collections 6 months ago

Phi-4

Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10 • 177

DeepSeek-R1-ReDistill

Re-distilled DeepSeek R1 models • 4 items • Updated Jan 30 • 14

upvoted a paper 8 months ago

DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought

Paper • 2412.17498 • Published Dec 23, 2024 • 22

upvoted an article 9 months ago

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Article • Aug 19, 2024 • 78

upvoted a collection 10 months ago

Skywork-Reward-Data-Collection

Open-source preference datasets used to train the Skywork reward model series • 17 items • Updated Oct 12, 2024 • 19

upvoted 2 papers about 1 year ago

HelpSteer2: Open-source dataset for training top-performing reward models

Paper • 2406.08673 • Published Jun 12, 2024 • 19

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Paper • 2405.20335 • Published May 30, 2024 • 18

upvoted a collection about 1 year ago

Biomedical NLP papers

Papers posted on @[email protected] (Clinical, Healthcare & Biomedical NLP) • 183 items • Updated Jan 24 • 41

upvoted 2 papers about 1 year ago

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 167

Inference Performance Optimization for Large Language Models on CPUs

Paper • 2407.07304 • Published Jul 10, 2024 • 54

upvoted a collection about 1 year ago

Tulu 2 Llama 3 Update

Llama 3 models trained on the tulu dataset, following https://arxiv.org/abs/2311.10702 (tulu 2) and https://arxiv.org/abs/2406.09279 (tulu 2.5). • 12 items • Updated Mar 4 • 2

upvoted a paper about 1 year ago

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15, 2024 • 89

upvoted 2 papers over 1 year ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 72

Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts

Paper • 2309.07430 • Published Sep 14, 2023 • 27

upvoted 2 articles over 1 year ago

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Article • Apr 19, 2024 • 177

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

Article • Mar 20, 2024 • 100