Junrulu/Reproduced-tulu2-dpo-13b

Junrulu committed on Mar 18

Commit 3b39e12 (1 parent: 6d44b91)

Update README.md

Files changed (1)

README.md +5 -1

@@ -15,7 +15,11 @@ This repository provides a reproduction version of Tulu2-DPO-13B finetuned upon
 ## Performance
-Check more progressive training metrics and final benchmark results in our [code repository](https://github.com/LuJunru/LLM_Finetune/tree/DPO).
+| Model | Size | Alignment | MT-Bench (score) | AlpacaEval 2.0 (win rate %) |
+|-------------|-----|----|---------------|--------------|
+| **Tulu-v2-13b** 🐪 | **13B** | **SFT** | **5.79** | **2.61** |
+| **Tulu-v2-dpo-13b** 🐪 | **13B** | **DPO** | **6.06** | **6.96** |
+| **Reproduced-tulu2-dpo-13b** | **13B** | **DPO** | **6.27** | **6.71** |
 ## Input Format