Haoxiang-Wang committed
Commit f6bdb40
1 Parent(s): 26db919
Update README.md
README.md CHANGED
@@ -10,7 +10,7 @@ license: llama3
 
 [Haoxiang Wang*](https://haoxiang-wang.github.io/), [Wei Xiong*](https://weixiongust.github.io/WeiXiongUST/index.html), [Tengyang Xie](https://tengyangxie.github.io/), [Han Zhao](https://hanzhaoml.github.io/), [Tong Zhang](https://tongzhang-ml.org/)
 
-+ **Blog**:
++ **Blog**: https://rlhflow.github.io/posts/2024-05-29-multi-objective-reward-modeling/
 + **Tech Report**: To be released in June 2024
 + **Model**: [ArmoRM-Llama3-8B-v0.1](https://huggingface.co/RLHFlow/ArmoRM-Llama3-8B-v0.1)
 + Finetuned from model: [FsfairX-LLaMA3-RM-v0.1](https://huggingface.co/sfairXC/FsfairX-LLaMA3-RM-v0.1)