Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ licence: license
 
 Fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) to generate YouTube titles based on my preferences. It was trained using [TRL](https://github.com/huggingface/trl).
 
-Video link
+[Video link](https://youtu.be/bbVoDXoPrPM) <br>
 [Blog link](https://shawhin.medium.com/fine-tuning-llms-on-human-feedback-rlhf-dpo-1c693dbc4cbf) <br>
 [GitHub Repo](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/dpo) <br>
 [Training Dataset](https://huggingface.co/datasets/shawhin/youtube-titles-dpo)