Update README.md
README.md
CHANGED
@@ -1,11 +1,5 @@
-
-
-
-
-
-- en
----
-Mistral-7b model sft with tldr data from Learning to Summarize from Human Feedback paper
----
-license: mit
----
+Paper: Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
+
+Link: https://arxiv.org/abs/2404.00530
+
+Github: https://github.com/Hritikbansal/dove