The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"

OpenMOSS, Fudan NLP, SII (university)
AI & ML interests: LLM
Organization Card
Joint OpenMOSS group from Fudan NLP, SII and MoSi Inc.
Collections: 3
Models: 86

fnlp/llama2-7B-d_kv_64-refactor (Text Generation)
fnlp/llama2-7B-d_kv_32-refactor (Text Generation)
fnlp/smollm1-1B7-d_kv_32-refactor (Text Generation)
fnlp/llama2-7B-d_kv_16-refactor (Text Generation)
fnlp/smollm1-1B7-d_kv_16-refactor (Text Generation)
fnlp/smollm1-1B7-d_kv_8-refactor (Text Generation)
fnlp/SmolLM-360M-MLA-d_kv_32-refactor (Text Generation)
fnlp/SmolLM-360M-MLA-d_kv_16-refactor (Text Generation)
fnlp/SmolLM-360M-MLA-d_kv_8-refactor (Text Generation)
fnlp/SmolLM-135M-MLA-d_kv_32-refactor (Text Generation)

Datasets: 17
fnlp/MHA2MLA-corpus-qwen1.5
fnlp/MHA2MLA-corpus-smollm
fnlp/MHA2MLA-corpus-qwen1_5
fnlp/MHA2MLA-corpus-qwen2
fnlp/MHA2MLA-corpus-mistral-v0_1
fnlp/MHA2MLA-corpus-smollm_v1
fnlp/MHA2MLA-corpus-llama2
fnlp/Ultra-Innerthought
fnlp/case2code-data
fnlp/AnyInstruct-resolution-1024