Directly distilled from Llama, then fine-tuned with DPO
Junxiong Wang
JunxiongWang
AI & ML interests
Attention Free Model / Subquadratic Language Models
Recent Activity
updated a model 1 day ago: JunxiongWang/MambaInLlama3B_DPO2
published a model 1 day ago: JunxiongWang/MambaInLlama3B_DPO2
updated a model 1 day ago: JunxiongWang/MambaInLlama3B_DPO1
Organizations
Collections 7
models 42
JunxiongWang/MambaInLlama3B_DPO2
JunxiongWang/MambaInLlama3B_DPO1
JunxiongWang/MambaInLlama3B_v3.1 • 183
JunxiongWang/MambaInLlama3B_v3 • 127
JunxiongWang/MambaInLlama1B_v3 • 137
JunxiongWang/mamba_0_5_distill • 1
JunxiongWang/Llama3.2-Mamba-3B-dpo • 21
JunxiongWang/Llama3.2-Mamba-3B-distill • 84
JunxiongWang/Llama3.2-Mamba2-3B-distill • 67
JunxiongWang/Llama3.2-Mamba2-3B-dpo • 69
datasets 7
JunxiongWang/test_math • 89.1k • 5
JunxiongWang/FineMathV4 • 6.7M • 50
JunxiongWang/model_revision_max_4_closest_and_random • 530k • 40
JunxiongWang/sftdatasetv3 • 12.4M • 718
JunxiongWang/sftdataset • 11M • 126 • 2
JunxiongWang/llama3-ultrafeedback-armorm • 61.8k • 103 • 1
JunxiongWang/testdataset • 1M • 216