You can see the full process, with step-by-step instructions for creating the model, in the notebook: [Aligning_DPO_phi3.ipynb](https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/P2-MHF/Aligning_DPO_phi3.ipynb)
To create it, we started with the [Phi-3-Mini-4K-Instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) model and applied DPO alignment using the [distilabel-capybara-dpo-7k-binarized dataset](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized).
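The linked notebook contains the exact procedure. Purely as orientation, the core of a DPO run with TRL looks roughly like the sketch below. The hyperparameters, the LoRA settings, and the dataset flattening are illustrative assumptions, not the notebook's values; also note that the `tokenizer=` argument is named `processing_class=` in recent `trl` releases.

```python
# Minimal DPO sketch with TRL. All hyperparameters here are illustrative
# assumptions; see Aligning_DPO_phi3.ipynb for the actual configuration.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "microsoft/Phi-3-mini-4k-instruct"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The capybara dataset stores "chosen"/"rejected" as message lists; flatten
# them into the prompt/chosen/rejected string columns DPOTrainer expects.
# (Multi-turn examples are simplified here: first user turn as the prompt.)
dataset = load_dataset("argilla/distilabel-capybara-dpo-7k-binarized", split="train")

def to_preference_triplet(example):
    return {
        "prompt": example["chosen"][0]["content"],
        "chosen": example["chosen"][-1]["content"],
        "rejected": example["rejected"][-1]["content"],
    }

dataset = dataset.map(to_preference_triplet, remove_columns=dataset.column_names)

# With a PEFT config, DPOTrainer builds the frozen reference model internally
# by disabling the adapters, so no explicit ref_model is needed.
peft_config = LoraConfig(r=16, lora_alpha=16, task_type="CAUSAL_LM")

training_args = DPOConfig(
    output_dir="phi3-dpo-capybara",
    beta=0.1,                      # strength of the preference penalty
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=5e-5,
    max_length=1024,
    max_prompt_length=512,
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```

The `beta` value controls how far the policy may drift from the reference model: lower values let the responses move further toward the preferred answers.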
Phi-3 is a state-of-the-art model with 3.8 billion parameters that has outperformed other models in the 7-billion-parameter class. The DPO alignment produced good results: it modified the model's responses, bringing them closer to the preferred answers in the capybara dataset.
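To check the effect of the alignment yourself, you can generate with the aligned checkpoint (and, for comparison, with the base model) using a standard `transformers` chat-template call. The repo id below is a hypothetical placeholder, not this model's real name; substitute the id shown on this page.

```python
# Hedged inference sketch. "your-user/phi3-dpo-capybara" is a hypothetical
# placeholder repo id; replace it with this model's actual id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-user/phi3-dpo-capybara"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

messages = [{"role": "user", "content": "Explain DPO alignment in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```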