LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models Paper • 2411.06839 • Published Nov 11, 2024 • 1
LLM-Neo Collection Model hub for LLM-Neo, including Llama3.1-Neo-1B-100w and Minitron-4B-Depth-Neo-10w. • 3 items • Updated Nov 20, 2024 • 4
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Paper • 2407.13623 • Published Jul 18, 2024 • 53
TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities Paper • 2212.06385 • Published Dec 13, 2022
RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer Paper • 2304.05659 • Published Apr 12, 2023
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast Paper • 2405.14507 • Published May 23, 2024
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models Paper • 2404.02657 • Published Apr 3, 2024
Weight-Inherited Distillation for Task-Agnostic BERT Compression Paper • 2305.09098 • Published May 16, 2023
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Paper • 2406.09961 • Published Jun 14, 2024 • 54