Xiaosen Zheng

xszheng2020

AI & ML interests

Data-Centric AI and AI Safety.

Recent Activity

liked a model 6 days ago

GAIR/LIMO

liked a dataset 6 days ago

GAIR/MathPile

xszheng2020's activity

upvoted a collection 6 days ago

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 11 items • Updated Jan 14 • 80

upvoted a paper 20 days ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 20 days ago • 98

upvoted a collection 23 days ago

OLMo 2 Preview Post-trained Models

These models' tokenizers did not use HF's fast tokenizer, resulting in variations in how pre-tokenization was applied. Resolved in the latest versions. • 6 items • Updated 13 days ago • 4

upvoted a paper about 2 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 214

upvoted a collection about 2 months ago

DeepSeek-R1

8 items • Updated Jan 21 • 593

upvoted 2 collections 3 months ago

NeMo Curator - Classifier Models

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated about 6 hours ago • 16

FastText Model for Pretraining Data Curation

6 items • Updated 10 days ago • 2

upvoted a paper 3 months ago

OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain

Paper • 2412.13018 • Published Dec 17, 2024 • 41

upvoted 2 collections 4 months ago

🔱 Sailor2 Language Models

Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated about 1 month ago • 26

DCLM Pools

Raw pools for use in DCLM competition • 5 items • Updated Jul 17, 2024 • 1

upvoted 2 papers 5 months ago

Sample-Efficient Alignment for LLMs

Paper • 2411.01493 • Published Nov 3, 2024 • 12

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 67

upvoted a collection 5 months ago

MagpieLM

Aligning LMs with Fully Open Recipe + Synthetic Data Generated from Open-Source LMs. • 9 items • Updated Jan 13 • 16

upvoted a paper 5 months ago

Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Paper • 2410.18693 • Published Oct 24, 2024 • 42

upvoted 3 collections 5 months ago

ScaleQuest

We introduce ScaleQuest, a scalable and novel data synthesis method. Project Page: https://scalequest.github.io/ • 11 items • Updated about 11 hours ago • 6

C4AI Aya Expanse

Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated 24 days ago • 38

BGE

23 items • Updated Feb 13 • 99

upvoted an article 5 months ago

SmolLM - blazingly fast and remarkably powerful

Article • Jul 16, 2024 • 344