Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 14 days ago • 557
On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis Paper • 2501.04377 • Published Jan 8 • 14
MTEB Leaderboard 🥇 Select benchmarks and languages for text embeddings evaluation • 5.05k
Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning Paper • 2402.15017 • Published Feb 22, 2024
Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models Paper • 2406.14852 • Published Jun 21, 2024
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time Paper • 2408.13233 • Published Aug 23, 2024 • 24
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction Paper • 2409.17422 • Published Sep 25, 2024 • 25
Llama 3.1 Collection This collection hosts the transformers-format and original repos of the Llama 3.1, Llama Guard 3, and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 652