Models: 561
Sort: Trending
Active filters: reward-trainer
vwxyzjn/rm_zephyr_new2 • Text Classification • Updated May 6, 2024 • 5
MahmoudMohamed/Reward_Model • Text Classification • Updated May 8, 2024 • 11
Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1e-05_bs2_g4 • Updated May 9, 2024
Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1.41e-05_bs2_g4 • Updated May 11, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR32_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4 • Updated May 12, 2024 • 1
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12, 2024
thorirhrafn/gpt1B_reward_model3 • Updated May 13, 2024 • 2
vwxyzjn/rm • Text Classification • Updated Jun 20, 2024 • 6
vwxyzjn/rm1 • Text Classification • Updated May 21, 2024 • 4
calkp/reward_model • Text Classification • Updated May 22, 2024 • 7
ianmiller314/results • Text Classification • Updated May 24, 2024 • 4
mnoukhov/pythia410m-rm-tldr • Text Classification • Updated Jun 2, 2024 • 4
damienbenveniste/HW2-reward • Text Classification • Updated Jun 14, 2024 • 12
DownwardSpiral33/2c2-reward • Text Classification • Updated Jun 7, 2024 • 4
DownwardSpiral33/2c6-d6-reward • Text Classification • Updated Jun 7, 2024 • 4
DownwardSpiral33/2c2-reward-medium • Text Classification • Updated Jun 7, 2024 • 5
DownwardSpiral33/2c6-reward • Text Classification • Updated Jun 7, 2024 • 4
gsdas/temp_model • Text Classification • Updated Jun 8, 2024 • 4
SiMajid/working • Updated Jul 21, 2024 • 4
RCODI/deberta-v3-large-reward-model • Text Classification • Updated Jun 12, 2024 • 6
just1nseo/reward_modeling_openchat • Updated Jun 12, 2024
santiviquez/reward_modeling_anthropic_hh • Text Classification • Updated Jun 13, 2024 • 18
mnoukhov/pythia160m-rm-tldr • Text Classification • Updated Jun 18, 2024 • 5