Nguyễn Minh Phúc
DatPySci
1 follower · 1 following
AI & ML interests
Reinforcement learning, NLP
Recent Activity
Updated a model about 2 months ago: DatPySci/Qwen-2.5-7B-Simple-RL
Published a model about 2 months ago: DatPySci/Qwen-2.5-7B-Simple-RL
Published a model about 2 months ago: DatPySci/Llama-3.2-3B-sft-mixture
Organizations
DatPySci's models (90)
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.05_steps_sft__tldr · Updated Sep 21, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.03_steps_sft__tldr · Updated Sep 20, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.01_steps_sft__tldr · Updated Sep 20, 2024
DatPySci/EleutherAI_pythia-1b-deduped__clipped_pythia-1b_beta-0.1__tldr · Updated Sep 20, 2024
DatPySci/EleutherAI_pythia-1b-deduped__ppo__tldr · Updated Sep 19, 2024
DatPySci/EleutherAI_pythia-1b-deduped__clip_pythia-1b_beta-0.1__tldr · Updated Sep 18, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_pythia-1b_beta-0.01__tldr · Updated Sep 17, 2024
DatPySci/EleutherAI_pythia-1b-deduped__length_IS_pythia-1b_beta-0.01__tldr · Updated Sep 17, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_pythia-1b_beta-0.1__tldr · Updated Sep 16, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_pythia-1b_beta-0.05__tldr · Updated Sep 16, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_pythia-1b_beta-0.01__tldr · Updated Sep 16, 2024
DatPySci/EleutherAI_pythia-1b-deduped__offline_rl__tldr · Updated Sep 8, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_10_sft__tldr · Updated Sep 7, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_3_sft__tldr · Updated Sep 7, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_1.0_sft__tldr · Updated Sep 7, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.3_sft__tldr · Updated Sep 7, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.1_sft__tldr · Updated Sep 7, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.05_sft__tldr · Updated Sep 7, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.03_sft__tldr · Updated Sep 6, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.01_steps_46800__tldr · Updated Sep 6, 2024
DatPySci/EleutherAI_pythia-1b-deduped__dpo_shift_beta_0.01_sft__tldr · Updated Sep 6, 2024
DatPySci/EleutherAI_pythia-1b-deduped__pref_shift__tldr · Updated Sep 4, 2024
DatPySci/EleutherAI_pythia-1b-deduped__off_policy__tldr · Updated Sep 3, 2024
DatPySci/EleutherAI_pythia-2.8b-deduped__reward__tldr · Updated Sep 1, 2024
DatPySci/EleutherAI_pythia-1b-deduped__on_policy__tldr · Updated Aug 31, 2024
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo__tldr · Updated Aug 28, 2024
DatPySci/EleutherAI_pythia-410m-deduped__sft__tldr · Updated Aug 27, 2024
DatPySci/EleutherAI_pythia-1b-deduped__off_rm__tldr · Updated Aug 25, 2024
DatPySci/EleutherAI_pythia-1b-deduped__off_exp__tldr · Updated Aug 24, 2024
DatPySci/EleutherAI_pythia-1b-deduped__SNIS_off_policy_0.05_1e-6__tldr · Updated Aug 24, 2024