shawhin committed
Commit 4f36e42 · verified · 1 Parent(s): 967eb7c

shawhin/Qwen2.5-0.5B-DPO

Files changed (3)
  1. README.md +2 -14
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -7,20 +7,10 @@ tags:
 - trl
 - dpo
 licence: license
-license: apache-2.0
-datasets:
-- shawhin/youtube-titles-dpo
 ---
 
 # Model Card for Qwen2.5-0.5B-DPO
 
-Model fine-tuned on YouTube title preferences for my [YouTube channel](https://www.youtube.com/@ShawhinTalebi).
-
-[Preference dataset](https://huggingface.co/datasets/shawhin/youtube-titles-dpo) <br>
-YouTube video: coming soon! <br>
-Blog post: coming soon!
-
-
 This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
@@ -29,11 +19,9 @@ It has been trained using [TRL](https://github.com/huggingface/trl).
 ```python
 from transformers import pipeline
 
-video_idea = "intro independent component analysis"
-prompt = f"Given the YouTube video idea write an engaging title.\n\n**Video Idea**: {video_idea}\n\n**Additional Guidance**:\n- Title should be between 30 and 75 characters long\n- Only return the title idea, nothing else!"
-
+question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+
 generator = pipeline("text-generation", model="shawhin/Qwen2.5-0.5B-DPO", device="cuda")
-output = generator([{"role": "user", "content": prompt}], max_new_tokens=128, return_full_text=False)[0]
+output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
 print(output["generated_text"])
 ```
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:51dbfea29e3cea38d7a328e8fddd6dd6cd78a440270447c6884e34bfeef9253a
+oid sha256:7318cf89151cc86f13d147359a9de4bd7f7e969ed5444f20e14d3d63126fa3b4
 size 1976163472
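The binary files in this commit are stored as Git LFS pointers like the one diffed above: plain-text `key value` lines rather than the file contents. A minimal sketch of reading such a pointer, using the new `model.safetensors` pointer from this commit (the `parse_lfs_pointer` helper is my own illustration, not part of git-lfs):

```python
# Parse a Git LFS pointer file (the "key value" lines shown in the diff above)
# into a dict. Helper name and approach are illustrative, not a git-lfs API.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # split on the first space only
        fields[key] = value
    return fields

# The post-commit pointer for model.safetensors, copied from the diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7318cf89151cc86f13d147359a9de4bd7f7e969ed5444f20e14d3d63126fa3b4
size 1976163472"""

info = parse_lfs_pointer(pointer)
print(info["oid"])   # object id changed by this commit
print(info["size"])  # byte size is unchanged (1976163472)
```

Note that only the `oid` line changes in the diff: the retrained weights have new contents but the same serialized size.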
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:aabc18474bd1b8c9bf88d3adcd839df3993e8b72a7078246f3492509481ed691
+oid sha256:d4d587a432b77e2326c9248fcae91211e3e6ef9ebe45f4b04b5c0feb55616e22
 size 6200
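The `oid sha256:` field in each pointer is the SHA-256 digest of the actual file contents, so a downloaded file can be checked against its pointer. A minimal sketch with `hashlib`, assuming stand-in bytes rather than the real `training_args.bin` (the helper name and sample data are illustrative):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, the same form as an LFS oid without its prefix."""
    return hashlib.sha256(data).hexdigest()

# New training_args.bin oid from the diff above.
expected_oid = "d4d587a432b77e2326c9248fcae91211e3e6ef9ebe45f4b04b5c0feb55616e22"

# Stand-in contents; in practice this would be open("training_args.bin", "rb").read().
data = b"example bytes"
matches = sha256_hex(data) == expected_oid
print(matches)  # False for the stand-in bytes, not the real 6200-byte file
```

This is the same check git-lfs itself performs when materializing a pointer into a file.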