mlfoundations-dev/Qwen-0.5B-Inst_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1280_rope-scal_yarn Updated 4 days ago
mlfoundations-dev/Qwen-1.5B-Inst_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1280_rope-scal_yarn Updated 4 days ago
mlfoundations-dev/packing_False_claude_3_7_20250219_tbench_traces_sharegptv1_cutoff-len_64000_rope-scaling_yarn Updated 4 days ago
mlfoundations-dev/claude_3_7_20250219_tbench_traces_sharegptv1_cutoff-len_64000_rope-scaling_yarn Updated 4 days ago
mlfoundations-dev/claude_3_7_20250219_tbench_traces_sharegptv1 Text Generation • 8B • Updated 9 days ago • 16
mlfoundations-dev/Qwen2.5-7B-Instruct_qwq_mix_qwen3_science Text Generation • 8B • Updated Jun 29 • 7
mlfoundations-dev/Qwen2.5-7B-Instruct_qwq_mix_r1_science Text Generation • 8B • Updated Jun 29 • 17 • 1
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr16e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr2e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr8e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr16e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr8e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr4e5_epochs5 Text Generation • 2B • Updated Jun 24 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr2e5_epochs5 Text Generation • 2B • Updated Jun 24 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz512_lr8e5_epochs5 Text Generation • 2B • Updated Jun 24 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz512_lr4e5_epochs5 Text Generation • 2B • Updated Jun 24 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz512_lr16e5_epochs5 Text Generation • 2B • Updated Jun 24 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr16e5_epochs7 Text Generation • 2B • Updated Jun 24 • 4
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr8e5_epochs7 Text Generation • 2B • Updated Jun 24 • 5