Nguyễn Minh Phúc
DatPySci
1 follower · 1 following
AI & ML interests
Reinforcement learning, NLP
Organizations
DatPySci's datasets (57)
Sort: Recently updated
DatPySci/Llama-3.1-8B-rm-anthropic-hh • Viewer • Updated Feb 10 • 140k • 25
DatPySci/Llama-3.1-8B-rm-tldr-pref • Viewer • Updated Feb 10 • 177k • 26
DatPySci/tldr_pythia-6.9b_pref • Viewer • Updated Feb 6 • 94.9k • 39
DatPySci/tldr_synthetic_llama3_3b_32 • Viewer • Updated Jan 24 • 5.47k • 24
DatPySci/llama3_3b_sft_tldr_synthetic • Viewer • Updated Jan 19 • 5.47k • 21
DatPySci/weak_gpt2_large_dpo_hh • Viewer • Updated Jan 9 • 8k • 17
DatPySci/weak_gpt2_medium_dpo_hh • Viewer • Updated Jan 9 • 8k • 20
DatPySci/weak_gpt2_dpo_hh • Viewer • Updated Jan 9 • 8k • 14
DatPySci/Llama-3.2-3B_refine_gpt2-large_tldr • Viewer • Updated Jan 8 • 8k • 20
DatPySci/Llama-3.2-3B_refine_gpt2-medium_tldr • Viewer • Updated Jan 8 • 8k • 21
DatPySci/Llama-3.2-3B_refine_gpt2_tldr • Viewer • Updated Jan 8 • 8k • 17
DatPySci/Llama-3.2-1B_refine_gpt2-large_tldr • Viewer • Updated Jan 8 • 8k • 19
DatPySci/Llama-3.2-1B_refine_gpt2-medium_tldr • Viewer • Updated Jan 8 • 8k • 20
DatPySci/Llama-3.2-1B_refine_gpt2_tldr • Viewer • Updated Jan 8 • 8k • 19
DatPySci/hh_gpt2_large_w2s_feedback • Viewer • Updated Jan 4 • 53.8k • 18
DatPySci/hh_gpt2_medium_w2s_feedback • Viewer • Updated Jan 4 • 53.8k • 17
DatPySci/hh_gpt2_w2s_feedback • Viewer • Updated Jan 4 • 53.8k • 22
DatPySci/tldr_gpt2_large_w2s_feedback • Viewer • Updated Jan 4 • 46.4k • 15
DatPySci/tldr_gpt2_medium_w2s_feedback • Viewer • Updated Jan 4 • 46.4k • 15
DatPySci/tldr_gpt2_w2s_feedback • Viewer • Updated Jan 4 • 46.4k • 19
DatPySci/gpt2-medium_dpo_tldr_temp_1_2 • Viewer • Updated Jan 2 • 8k • 13
DatPySci/gpt2_dpo_tldr_temp_1_0 • Viewer • Updated Jan 2 • 3.88k • 20
DatPySci/gpt2-large_dpo_tldr_temp_1_0 • Viewer • Updated Jan 2 • 3.88k • 15
DatPySci/gpt2-medium_dpo_tldr_temp_1_0 • Viewer • Updated Jan 2 • 3.88k • 17
DatPySci/weak_to_strong_reward_tldr • Viewer • Updated Dec 30, 2024 • 94.8k • 15
DatPySci/weak_to_strong_reward_hh • Viewer • Updated Dec 30, 2024 • 110k • 24
DatPySci/gpt2_dpo_anthropic_hh_pref • Viewer • Updated Dec 28, 2024 • 128k • 19
DatPySci/base_llama3-1b_anthropic_hh • Viewer • Updated Dec 26, 2024 • 8k • 15
DatPySci/gpt2-large_dpo_anthropic_hh • Viewer • Updated Dec 25, 2024 • 8k • 15
DatPySci/gpt2-medium_dpo_anthropic_hh • Viewer • Updated Dec 25, 2024 • 8k • 14