3 6 10

BinghengWu

wubingheng

https://github.com/wubingheng111

AI & ML interests

I like to fine-tune the small models of the Doge series.

Recent Activity

authored a paper 11 days ago

Trainable Dynamic Mask Sparse Attention

upvoted an article 12 days ago

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

published an article 12 days ago

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

View all activity

Organizations

Articles 1

Article

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

Papers 3

arxiv:2508.02124

arxiv:2505.19716

arxiv:2412.11834

models 3

datasets 15

wubingheng/MixtureOfThoughts-Chinese-tryrun

Viewer • Updated about 1 month ago • 10 • 96

wubingheng/Mixture-of-Thoughts-zh-try-run

Viewer • Updated Jul 17 • 10 • 89

wubingheng/Budget-aware-2048

Viewer • Updated Apr 29 • 25k • 3

wubingheng/Budget-aware-2048-in

Viewer • Updated Apr 29 • 25k • 8

wubingheng/Budget-aware-2048-in-try-run

Viewer • Updated Apr 29 • 2 • 6

wubingheng/Budget-aware-2048-try-run

Viewer • Updated Apr 29 • 2 • 5

wubingheng/L1-2048

Viewer • Updated Apr 28 • 25k • 2

wubingheng/L1-1024

Viewer • Updated Apr 28 • 25k • 4

wubingheng/compressed-openthoughts-50

Viewer • Updated Apr 28 • 25k • 13

wubingheng/compressed-openthoughts-90

Viewer • Updated Apr 28 • 25k • 6

View 15 datasets

BinghengWu

AI & ML interests

Recent Activity

Organizations

Articles 1

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

Papers 3

models 3 Sort: Recently updated

datasets 15 Sort: Recently updated

models 3

datasets 15