FsfairX

community

sfairXC's activity

bpucla

authored a paper about 1 month ago

BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation

Paper • 2502.03860 • Published Feb 6 • 24

hendrydong

authored 2 papers about 1 month ago

BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation

Paper • 2502.03860 • Published Feb 6 • 24

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 38

hendrydong

authored a paper 3 months ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38

hendrydong

updated a model 5 months ago

sfairXC/FsfairX-LLaMA3-RM-v0.1

Text Classification • Updated Oct 14, 2024 • 3.75k • 55

hendrydong

authored a paper 5 months ago

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

Paper • 2410.04698 • Published Oct 7, 2024 • 13

hendrydong

updated 2 models 6 months ago

sfairXC/llama-3.1-sft-2ep

Text Generation • Updated Sep 18, 2024 • 15

sfairXC/llama-3.1-sft-1ep

Text Generation • Updated Sep 18, 2024 • 9

hendrydong

updated 2 models 7 months ago

sfairXC/gemma-sft-2ep

Text Generation • Updated Aug 30, 2024 • 12

sfairXC/gemma-sft-1ep

Text Generation • Updated Aug 30, 2024 • 9

hendrydong

authored a paper 8 months ago

ThinK: Thinner Key Cache by Query-Driven Pruning

Paper • 2407.21018 • Published Jul 30, 2024 • 32

hendrydong

updated a model 8 months ago

sfairXC/FsfairX-Gemma2-RM-v0.1

Text Classification • Updated Jul 9, 2024 • 40 • 7

hendrydong

authored 8 papers 10 months ago

Reverse Diffusion Monte Carlo

Paper • 2307.02037 • Published Jul 5, 2023 • 1

Spurious Feature Diversification Improves Out-of-distribution Generalization

Paper • 2309.17230 • Published Sep 29, 2023

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

Paper • 2312.11456 • Published Dec 18, 2023 • 1

Local Augmentation for Graph Neural Networks

Paper • 2109.03856 • Published Sep 8, 2021

Weakly Supervised Disentangled Generative Causal Representation Learning

Paper • 2010.02637 • Published Oct 6, 2020

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2

DetGPT: Detect What You Need via Reasoning

Paper • 2305.14167 • Published May 23, 2023