Models: 561

Active filters: reward-trainer

jasonbillion/RM_HHRLHF_Rule0_Seed2027

Text Classification • Updated Jan 11 • 5

jasonbillion/RM_HHRLHF_Rule1_Seed2027

Text Classification • Updated Jan 11 • 5

davidgaofc/b_POISON_RM_base

Text Classification • Updated Jan 12 • 6

davidgaofc/b_RM_base

Text Classification • Updated Jan 12 • 6

davidgaofc/c_POISON_RM_base

Text Classification • Updated Jan 12 • 5

davidgaofc/d_POISON_RM_base

Text Classification • Updated Jan 13 • 5

lblaoke/llama2-7b-rm-human

Text Classification • Updated Jan 14 • 5

lblaoke/llama2-7b-rm-self

Text Classification • Updated Jan 14 • 8

lblaoke/llama2-7b-rm-self-human

Text Classification • Updated Jan 13 • 6

sahandrez/pairwise-reward-Qwen2.5-1.5B-sft-uf

Text Classification • Updated Jan 13 • 6

lblaoke/mistral-v0.1-7b-rm-human

Text Classification • Updated Jan 14 • 5

lblaoke/mistral-v0.1-7b-rm-self

Text Classification • Updated Jan 14 • 5

lblaoke/mistral-v0.1-7b-rm-self-human

Text Classification • Updated Jan 14 • 6

lblaoke/mistral-v0.3-7b-rm-human

Text Classification • Updated Jan 14 • 6

lblaoke/mistral-v0.3-7b-rm-self

Text Classification • Updated Jan 14 • 5

lblaoke/mistral-v0.3-7b-rm-self-human

Text Classification • Updated Jan 14 • 7

joaoluislins/trained_rwmodel

gagan3012/Qwen-2.5-reasoning-verifier

Text Generation • Updated Jan 25 • 24

mradermacher/Qwen-2.5-reasoning-verifier-GGUF

Updated Jan 26 • 1.05k

Mithilhf01/mistral-reward

Text Classification • Updated Jan 31 • 4

AsphodelRem/test-reward-model

Text Classification • Updated Feb 2 • 6

MilyaShams/SmolLM2-135M-Instruct-Reward

Text Classification • Updated Feb 1 • 15

fjxdaisy/RM_HHRLHF_Rule6_Seed2027

Text Classification • Updated Feb 1 • 6

fjxdaisy/RM_HHRLHF_Rule6_Seed2029

Text Classification • Updated Feb 1 • 6

fjxdaisy/RM_HHRLHF_Rule6_Seed2028

Text Classification • Updated Feb 1 • 5

fjxdaisy/RM_HHRLHF_Rule7_Seed2026

Text Classification • Updated Feb 2 • 5

AsphodelRem/test-custom-reward-model

Text Classification • Updated Feb 3 • 4

HFXM/RM_HHRLHF_Rule2_Seed2025

Text Classification • Updated Feb 2 • 5

HFXM/RM_HHRLHF_Rule2_Seed2028

Text Classification • Updated Feb 2 • 4

HFXM/RM_HHRLHF_Rule2_Seed2027

Text Classification • Updated Feb 2 • 4