Na0s/Qwen1.5-MoE-A2.7B-Chat-20_experts_Maths_FT_1k_cosine Text Generation • 6B • Updated Dec 19, 2024 • 4
Na0s/Qwen1.5-MoE-A2.7B-Chat-20_experts-L2Norm-Pruning Text Generation • 6B • Updated Dec 18, 2024 • 6
Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-1-expert Text Generation • 41B • Updated Nov 18, 2024 • 5
Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-2-experts Text Generation • 35B • Updated Nov 18, 2024 • 3
Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-3-experts Text Generation • 30B • Updated Nov 18, 2024 • 3
Na0s/Mixtral-8x7B-Instruct-v0.1-exhaustive-LoRA-SFT-pruned-4-experts Text Generation • 24B • Updated Nov 18, 2024 • 3
Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-3-experts Text Generation • 30B • Updated Oct 6, 2024 • 7
Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-4-experts Text Generation • 24B • Updated Oct 6, 2024 • 7
Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-2-experts Text Generation • 35B • Updated Oct 6, 2024 • 6
Na0s/Mixtral-8x7B-v0.1-instruct-pruned-random-1-experts Text Generation • 41B • Updated Oct 6, 2024 • 5
Na0s/Mixtral-8x7B-v0.1-instruct-l2-norm-post-Gates-LoRA-SFT-pruned-1-expert Text Generation • 41B • Updated Oct 6, 2024 • 4
Na0s/Mixtral-8x7B-v0.1-instruct-l2-norm-post-Gates-LoRA-SFT-pruned-3-experts Text Generation • 30B • Updated Oct 6, 2024 • 4
Na0s/Mixtral-8x7B-v0.1-instruct-l2-norm-post-Gates-LoRA-SFT-pruned-4-experts Text Generation • 24B • Updated Oct 6, 2024 • 3
Na0s/Mixtral-8x7B-v0.1-instruct-l2-norm-post-Gates-LoRA-SFT-pruned-2-experts Text Generation • 35B • Updated Oct 5, 2024 • 4