Ejafa/phi-3-mini-128k-instruct-simpo-lr-5e-07-gamma-1.5

Text Generation

alignment-handbook

Generated from Trainer


Ejafa committed on Jun 25

Commit d582bc6 • 1 Parent(s): 59a0af6

Update README.md

Files changed (1)

README.md +6 -0


@@ -18,6 +18,12 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 # phi-3-mini-128k-instruct-simpo-lr-5e-07-gamma-1.5

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+## Description
+This model was trained as part of the Reinforcement Learning - 24 project at Peking University, focusing on [simpo].
+## Authors
+- Ejafa Bassam
+- Yaroslav Ponomarenko
 # phi-3-mini-128k-instruct-simpo-lr-5e-07-gamma-1.5