Haleshot/Mathmate-7B-DELLA-ORPO

preference-learning

Haleshot committed on Sep 23

Commit 80cab8f (1 parent: 90127cd)

Update README.md

Files changed (1)

README.md +1 -1

@@ -17,7 +17,7 @@ Mathmate-7B-DELLA-ORPO is a finetuned version of [Haleshot/Mathmate-7B-DELLA](ht
 ## Model Details
 - **Base Model:** [Haleshot/Mathmate-7B-DELLA](https://huggingface.co/Haleshot/Mathmate-7B-DELLA)
-- **Finetuning Method:** ORPO (Offline Ranked Preference Optimization)
+- **Finetuning Method:** ORPO (Odds Ratio Preference Optimization)
 - **Training Dataset:** [argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)
 ## Finetuning