Muennighoff committed `1fbe775` (verified, parent: `8cbcd10`): Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -13,7 +13,7 @@ co2_eq_emissions: 1
 
 # Model Summary
 
-**We strongly recommend using the instruct version at https://hf.co/OLMoE/OLMoE-1B-7B-0924-Instruct instead which is based on this model with additional DPO (Direct Preference Optimization).**
+**We strongly recommend using the instruct version at https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct instead which is based on this model with additional DPO (Direct Preference Optimization).**
 
 - Code: https://github.com/allenai/OLMoE
 - Paper:
@@ -21,9 +21,9 @@ co2_eq_emissions: 1
 
 
 Branches:
-- `main`: Instruction tuned / supervised finetuned (SFT) model of https://hf.co/OLMoE/OLMoE-1B-7B-0924 (`main` branch)
+- `main`: Instruction tuned / supervised finetuned (SFT) model of https://hf.co/allenai/OLMoE-1B-7B-0924 (`main` branch)
 - `load-balancing`: Ablation with load balancing loss during SFT
-- `non-annealed`: Ablation starting from the checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/OLMoE/OLMoE-1B-7B-0924) rather than the annealed checkpoint (branch `main` of https://hf.co/OLMoE/OLMoE-1B-7B-0924)
+- `non-annealed`: Ablation starting from the checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/allenai/OLMoE-1B-7B-0924) rather than the annealed checkpoint (branch `main` of https://hf.co/allenai/OLMoE-1B-7B-0924)
 
 # Citation
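The branches listed in the README can each be loaded through the `revision` argument of `from_pretrained` in the `transformers` library, which maps directly to a git branch on the Hub. A minimal sketch, assuming `transformers` is installed; the repo id below is a placeholder assumption, since this commit page does not name the SFT repo itself:

```python
# Sketch: loading one of the README's branches via transformers' `revision`
# argument. DEFAULT_REPO_ID is an assumption -- this commit page does not
# name the SFT repo itself.
DEFAULT_REPO_ID = "allenai/OLMoE-1B-7B-0924-SFT"  # assumed repo id

# Branch names taken from the README diff above.
BRANCHES = ("main", "load-balancing", "non-annealed")


def load_sft_variant(branch: str = "main", repo_id: str = DEFAULT_REPO_ID):
    """Return (tokenizer, model) for the given branch of the repo."""
    if branch not in BRANCHES:
        raise ValueError(f"unknown branch {branch!r}; expected one of {BRANCHES}")
    # Deferred import so branch validation works even without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=branch)
    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=branch)
    return tokenizer, model
```

For example, `load_sft_variant("non-annealed")` would fetch the pre-annealing ablation rather than the default SFT model on `main`.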