Chujie Zheng

chujiezheng

https://chujiezheng.github.io/

AI & ML interests

Large Language Models

Recent Activity

liked a Space about 15 hours ago

Qwen/QVQ-72B-preview

upvoted a collection about 15 hours ago

liked a model about 23 hours ago

Qwen/QVQ-72B-Preview

chujiezheng's activity

upvoted a collection about 15 hours ago

QVQ

QVQ: Qwen models for visual reasoning • 4 items • Updated about 22 hours ago • 15

upvoted a paper 6 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 6 days ago • 326

upvoted a paper 14 days ago

Evaluating and Aligning CodeLLMs on Human Preference

Paper • 2412.05210 • Published 19 days ago • 47

upvoted a paper 15 days ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published 16 days ago • 68

upvoted a paper 23 days ago

Yi-Lightning Technical Report

Paper • 2412.01253 • Published 23 days ago • 25

upvoted a collection 28 days ago

QwQ

Qwen with Questions • 2 items • Updated 28 days ago • 48

upvoted an article 2 months ago

Accelerating LLM Inference: Fast Sampling with Gumbel-Max Trick

Article • Oct 24 • 10

upvoted a paper 2 months ago

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

Paper • 2410.13841 • Published Oct 17 • 14

upvoted a paper 3 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18 • 138

upvoted a paper 4 months ago

I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

Paper • 2408.08072 • Published Aug 15 • 32

upvoted a collection 6 months ago

Nemotron 4 340B

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated Nov 2 • 160

upvoted a paper 8 months ago

Weak-to-Strong Extrapolation Expedites Alignment

Paper • 2404.16792 • Published Apr 25 • 11

upvoted 3 collections 8 months ago

Eurus

Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Oct 22 • 24

Weak-to-Strong Extrapolation Expedites Alignment

Better aligned models obtained by weak-to-strong model extrapolation (ExPO) • 25 items • Updated 12 days ago • 16

Model Checkpoints in the ExPO Paper

15 items • Updated May 19 • 2

upvoted 2 collections 11 months ago

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 28 days ago • 205

Quyen

State-of-the-art general LLMs based on Qwen1.5 • 26 items • Updated Feb 13 • 12

upvoted a paper about 1 year ago

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Paper • 2312.04724 • Published Dec 7, 2023 • 20

upvoted a collection about 1 year ago

Pythia Scaling Suite

Pythia is the first LLM suite designed specifically to enable scientific research on LLMs. To learn more see https://github.com/EleutherAI/pythia • 18 items • Updated Nov 21, 2023 • 26