shawhin committed
Commit 4f36e42 · verified · 1 Parent(s): 967eb7c

shawhin/Qwen2.5-0.5B-DPO

Files changed (3)
  1. README.md +2 -14
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -7,20 +7,10 @@ tags:
 - trl
 - dpo
 licence: license
-license: apache-2.0
-datasets:
-- shawhin/youtube-titles-dpo
 ---
 
 # Model Card for Qwen2.5-0.5B-DPO
 
-Model fine-tuned on YouTube title preferences for my [YouTube channel](https://www.youtube.com/@ShawhinTalebi).
-
-[Preference dataset](https://huggingface.co/datasets/shawhin/youtube-titles-dpo) <br>
-YouTube video: coming soon! <br>
-Blog post: coming soon!
-
-
 This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
@@ -29,11 +19,9 @@ It has been trained using [TRL](https://github.com/huggingface/trl).
 ```python
 from transformers import pipeline
 
-video_idea = "intro independent component analysis"
-prompt = f"Given the YouTube video idea write an engaging title.\n\n**Video Idea**: {video_idea}\n\n**Additional Guidance**:\n- Title should be between 30 and 75 characters long\n- Only return the title idea, nothing else!"
-
+question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+
 generator = pipeline("text-generation", model="shawhin/Qwen2.5-0.5B-DPO", device="cuda")
-output = generator([{"role": "user", "content": prompt}], max_new_tokens=128, return_full_text=False)[0]
+output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
 print(output["generated_text"])
 ```
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:51dbfea29e3cea38d7a328e8fddd6dd6cd78a440270447c6884e34bfeef9253a
+oid sha256:7318cf89151cc86f13d147359a9de4bd7f7e969ed5444f20e14d3d63126fa3b4
 size 1976163472
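The binary files in this commit are stored as Git LFS pointers like the one diffed above: plain-text `key value` lines rather than the file contents. A minimal sketch of reading such a pointer, using the new `model.safetensors` pointer from this commit (the `parse_lfs_pointer` helper is my own illustration, not part of git-lfs):

```python
# Parse a Git LFS pointer file (the "key value" lines shown in the diff above)
# into a dict. Helper name and approach are illustrative, not a git-lfs API.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # split on the first space only
        fields[key] = value
    return fields

# The post-commit pointer for model.safetensors, copied from the diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7318cf89151cc86f13d147359a9de4bd7f7e969ed5444f20e14d3d63126fa3b4
size 1976163472"""

info = parse_lfs_pointer(pointer)
print(info["oid"])   # object id changed by this commit
print(info["size"])  # byte size is unchanged (1976163472)
```

Note that only the `oid` line changes in the diff: the retrained weights have new contents but the same serialized size.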
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:aabc18474bd1b8c9bf88d3adcd839df3993e8b72a7078246f3492509481ed691
+oid sha256:d4d587a432b77e2326c9248fcae91211e3e6ef9ebe45f4b04b5c0feb55616e22
 size 6200
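The `oid sha256:` field in each pointer is the SHA-256 digest of the actual file contents, so a downloaded file can be checked against its pointer. A minimal sketch with `hashlib`, assuming stand-in bytes rather than the real `training_args.bin` (the helper name and sample data are illustrative):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, the same form as an LFS oid without its prefix."""
    return hashlib.sha256(data).hexdigest()

# New training_args.bin oid from the diff above.
expected_oid = "d4d587a432b77e2326c9248fcae91211e3e6ef9ebe45f4b04b5c0feb55616e22"

# Stand-in contents; in practice this would be open("training_args.bin", "rb").read().
data = b"example bytes"
matches = sha256_hex(data) == expected_oid
print(matches)  # False for the stand-in bytes, not the real 6200-byte file
```

This is the same check git-lfs itself performs when materializing a pointer into a file.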