Junrulu committed on
Commit 0b401b0 · verified · 1 Parent(s): 4c889f0

Update README.md

Files changed (1):
  1. README.md +5 -3
README.md CHANGED
@@ -11,9 +11,7 @@ base_model: allenai/tulu-2-13b
 
 # Model Card for Reproduced Tulu2 DPO 13B
 
-- This repository provides a reproduction version of Tulu2-DPO-13B finetuned upon [Tulu2-13B](https://huggingface.co/allenai/tulu-2-13b) and [Ultrafeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
-- Therefore, we obey all licenses mentioned in Tulu2's work.
-- Check our codes for more details: https://github.com/LuJunru/LLM_Finetune/tree/DPO. The codes are built with [TRL](https://github.com/huggingface/trl/tree/main).
+This repository provides a reproduction version of Tulu2-DPO-13B finetuned upon [Tulu2-13B](https://huggingface.co/allenai/tulu-2-13b) and [Ultrafeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized). Therefore, we obey all licenses mentioned in Tulu2's work. Check our codes for more details: https://github.com/LuJunru/LLM_Finetune/tree/DPO, which is built with [TRL](https://github.com/huggingface/trl/tree/main).
 
 ## Performance
 
@@ -44,3 +42,7 @@ The following hyperparameters were used during DPO training:
 - lr_scheduler_warmup_ratio: 0.1
 - Weight Decay: 0.05
 - num_epochs: 3.0
+
+## Progressive metrics
+
+We present
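For reference, the DPO hyperparameters visible in the diff above (warmup ratio 0.1, weight decay 0.05, 3 epochs) map naturally onto TRL-style training arguments. The sketch below is illustrative only: the argument names follow TRL's `DPOConfig` conventions, and the model/dataset names in the comments are placeholders, not the repository's actual launch script.

```python
# DPO hyperparameters from the model card, collected as keyword arguments.
# The names (warmup_ratio, weight_decay, num_train_epochs) follow TRL's
# DPOConfig/TrainingArguments conventions; this is an assumed mapping,
# not the exact configuration used in the original training run.
dpo_kwargs = {
    "warmup_ratio": 0.1,      # lr_scheduler_warmup_ratio in the card
    "weight_decay": 0.05,     # Weight Decay in the card
    "num_train_epochs": 3.0,  # num_epochs in the card
}

# With TRL installed, these would be passed along these (hypothetical) lines:
#   from trl import DPOConfig, DPOTrainer
#   config = DPOConfig(output_dir="tulu2-dpo-13b-repro", **dpo_kwargs)
#   trainer = DPOTrainer(model=model, args=config, train_dataset=dataset)
#   trainer.train()
print(dpo_kwargs)
```

See the linked LLM_Finetune repository for the authoritative training code.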