sail's Collections
💡 DICE
Updated Jul 28
Self-alignment with DPO Implicit Rewards
Bootstrapping Language Models with DPO Implicit Rewards
Paper • 2406.09760 • Published Jun 14 • 38
sail/Llama-3-Base-8B-DICE-Iter1
Text Generation • Updated Jul 11 • 15 • 1
sail/Llama-3-Base-8B-DICE-Iter2
Text Generation • Updated Jul 11 • 18 • 2
sail/Zephyr-7B-DICE-Iter1
Text Generation • Updated Jul 11 • 28
sail/Zephyr-7B-DICE-Iter2
Text Generation • Updated Jul 11 • 15