AI & ML interests
LLM
Recent Activity
Open-source weights of the Lorsa modules introduced in "Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition".
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
  Paper • 2502.14837 • Published • 4
- fnlp/Llama-2-7B-MLA-d_kv_16
  Text Generation • 6B • Updated • 48
- fnlp/Llama-2-7B-MLA-d_kv_32
  Text Generation • 6B • Updated • 25
- fnlp/Llama-2-7B-MLA-d_kv_64
  Text Generation • 7B • Updated • 8
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"
- fnlp/SmolLM-135M-MLA-d_kv_8-refactor
  Text Generation • 0.1B • Updated • 9
- fnlp/SmolLM-135M-MLA-d_kv_32-refactor
  Text Generation • 0.1B • Updated • 3
- fnlp/SmolLM-135M-MLA-d_kv_16-refactor
  Text Generation • 0.1B • Updated • 2
- fnlp/SmolLM-360M-MLA-d_kv_8-refactor
  Text Generation • 0.3B • Updated • 2