Hugging Face
Models: 317
Sort: Trending
Active filters: reward-trainer
HFXM/RM_HHRLHF_Rule1 • Text Classification • Updated 21 days ago • 11
HFXM/RM_HHRLHF_Rule2 • Text Classification • Updated 21 days ago • 11
arqa39/Qwen2-0.5B-Reward • Updated 14 days ago
RLHF-And-Friends/Pythia-70M-Reward • Updated 8 days ago
blakenp/gpt-Reward • Text Classification • Updated 14 days ago • 34
blakenp/Qwen2.5-1.5B-Reward • Text Classification • Updated 13 days ago • 14
blakenp/Qwen2-0.5B-Reward • Text Classification • Updated 13 days ago • 16
ZHIYII/Qwen2.5-7B-Reward • Text Classification • Updated 12 days ago • 15
eth-dl-rewards/internlm2-7b-reward-code-30k • Updated 13 days ago
eth-dl-rewards/internlm2-7b-reward-code-100k • Updated 13 days ago
eth-dl-rewards/internlm2-7b-reward-code-60k • Updated 13 days ago
eth-dl-rewards/internlm2-7b-reward-math-30k • Updated 12 days ago
eth-dl-rewards/internlm2-7b-reward-math-60k • Updated 12 days ago
eth-dl-rewards/internlm2-7b-reward-math-100k • Updated 12 days ago
eth-dl-rewards/internlm2-7b-reward-math-100k-scratch • Updated 11 days ago
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward • Updated about 5 hours ago
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-4r • Updated about 1 hour ago