--- license: apache-2.0 language: - en tags: - moe - olmo - olmoe co2_eq_emissions: 1 --- OLMoE Logo. # Model Summary > OLMoE-1B-7B-Instruct is a Mixture-of-Experts LLM with 1B active and 7B total parameters released in August 2024 (0824) that has been adapted via SFT and DPO from [OLMoE-1B-7B](https://hf.co/OLMoE/OLMoE-1B-7B-0824). It yields state-of-the-art performance among models with a similar cost (1B) and is competitive with much larger models like Llama2-13B-Chat. OLMoE is 100% open-source. - Code: https://github.com/allenai/OLMoE - Paper: - Logs: https://github.com/allenai/OLMoE/blob/main/logs/olmoe-dpo-logs.txt # Use Install the `transformers` & `torch` libraries and run: ```python from transformers import OlmoeForCausalLM, AutoTokenizer import torch DEVICE = "cuda" if torch.cuda.is_available() else "cpu" # Load different ckpts via passing e.g. `revision=kto` model = OlmoeForCausalLM.from_pretrained("allenai/OLMoE-1B-7B-Instruct").to(DEVICE) tokenizer = AutoTokenizer.from_pretrained("allenai/OLMoE-1B-7B-Instruct") message = [{"role": "user", "content": "Explain to me like I'm five what is Bitcoin."}] inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt") out = model.generate(**inputs, max_length=64) print(tokenizer.decode(out[0])) # > # Bitcoin is a digital currency that is created and held electronically. No one controls it. Bitcoins aren’t printed, like dollars or euros – they’re produced by people and businesses running computers all around the world, using software that solves mathematical ``` Branches: - `main`: Preference tuned via DPO model of https://hf.co/OLMoE/OLMoE-1B-7B-0824-SFT (`main` branch) - `load-balancing`: Ablation with load balancing loss during DPO starting from the `load-balancing` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0824-SFT - `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0824-SFT which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/OLMoE/OLMoE-1B-7B-0824) - `kto`: Ablation using KTO instead of DPO. This branch is the checkpoint after 5,000 steps with the RMS optimizer. The other `kto*` branches correspond to the other checkpoints mentioned in the paper. # Citation ```bibtex TODO ```