---
license: apache-2.0
language:
- en
tags:
- moe
- olmo
- olmoe
co2_eq_emissions: 1
---

OLMoE Logo.

# Model Summary

This is the supervised finetuned (SFT) version of https://hf.co/OLMoE/OLMoE-1B-7B-0924.

**We strongly recommend using the instruct version at https://hf.co/OLMoE/OLMoE-1B-7B-0924-Instruct instead, which builds on this model with additional DPO (Direct Preference Optimization).**

- Code: https://github.com/allenai/OLMoE
- Paper:
- Logs: https://github.com/allenai/OLMoE/blob/main/logs/olmoe-sft-logs.txt

Branches:
- `main`: Instruction tuned / supervised finetuned (SFT) model of https://hf.co/OLMoE/OLMoE-1B-7B-0924 (`main` branch)
- `load-balancing`: Ablation with load balancing loss during SFT
- `non-annealed`: Ablation starting from the checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/OLMoE/OLMoE-1B-7B-0924) rather than the annealed checkpoint (branch `main` of https://hf.co/OLMoE/OLMoE-1B-7B-0924)

A hedged loading sketch showing how to select these branches appears at the end of this card.

# Citation

```bibtex
TODO
```
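
# Use

A minimal loading sketch, not part of the original card: the repo id below is a hypothetical placeholder, and native OLMoE support in a recent `transformers` version is assumed. The `revision` argument of `from_pretrained` selects one of the branches listed above.

```python
# Hedged sketch: assumes a transformers version with OLMoE support;
# the repo id is a hypothetical placeholder for this card's repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "OLMoE/OLMoE-1B-7B-0924-SFT"  # hypothetical; replace with this card's actual repo id

tokenizer = AutoTokenizer.from_pretrained(REPO)
model = AutoModelForCausalLM.from_pretrained(REPO)  # `main` branch = SFT model

# revision= selects one of the branches listed above,
# e.g. the SFT ablation trained with a load balancing loss:
# model = AutoModelForCausalLM.from_pretrained(REPO, revision="load-balancing")

inputs = tokenizer("Bitcoin is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0]))
```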