Models: 317

Active filters: reward-trainer

HFXM/RM_HHRLHF_Rule1

Text Classification • Updated 21 days ago • 11

HFXM/RM_HHRLHF_Rule2

Text Classification • Updated 21 days ago • 11

arqa39/Qwen2-0.5B-Reward

Updated 14 days ago

RLHF-And-Friends/Pythia-70M-Reward

Updated 8 days ago

blakenp/gpt-Reward

Text Classification • Updated 14 days ago • 34

blakenp/Qwen2.5-1.5B-Reward

Text Classification • Updated 13 days ago • 14

blakenp/Qwen2-0.5B-Reward

Text Classification • Updated 13 days ago • 16

ZHIYII/Qwen2.5-7B-Reward

Text Classification • Updated 12 days ago • 15

eth-dl-rewards/internlm2-7b-reward-code-30k

Updated 13 days ago

eth-dl-rewards/internlm2-7b-reward-code-100k

Updated 13 days ago

eth-dl-rewards/internlm2-7b-reward-code-60k

Updated 13 days ago

eth-dl-rewards/internlm2-7b-reward-math-30k

Updated 12 days ago

eth-dl-rewards/internlm2-7b-reward-math-60k

Updated 12 days ago

eth-dl-rewards/internlm2-7b-reward-math-100k

Updated 12 days ago

eth-dl-rewards/internlm2-7b-reward-math-100k-scratch

Updated 11 days ago

RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward

Updated about 5 hours ago

RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-4r

Updated about 1 hour ago