Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ co2_eq_emissions: 1
 
 # Model Summary
 
-**We strongly recommend using the instruct version at https://hf.co/
+**We strongly recommend using the instruct version at https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct instead which is based on this model with additional DPO (Direct Preference Optimization).**
 
 - Code: https://github.com/allenai/OLMoE
 - Paper:
@@ -21,9 +21,9 @@ co2_eq_emissions: 1
 
 
 Branches:
-- `main`: Instruction tuned / supervised finetuned (SFT) model of https://hf.co/
+- `main`: Instruction tuned / supervised finetuned (SFT) model of https://hf.co/allenai/OLMoE-1B-7B-0924 (`main` branch)
 - `load-balancing`: Ablation with load balancing loss during SFT
-- `non-annealed`: Ablation starting from the checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/
+- `non-annealed`: Ablation starting from the checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/allenai/OLMoE-1B-7B-0924) rather than the annealed checkpoint (branch `main` of https://hf.co/allenai/OLMoE-1B-7B-0924)
 
 # Citation
 