ContextualAI/Contextual_KTO_Mistral_PairRM
Text Generation
Transformers
Safetensors
snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
English
mistral
human feedback
rlhf
preferences
alignment
HALO
halos
dpo
rl
conversational
text-generation-inference
Inference Endpoints
arxiv: 2402.01306
License: apache-2.0
main / Contextual_KTO_Mistral_PairRM / README.md
Commit History
Update README.md (98bee13, verified), xwinxu committed on Apr 26, 2024
Update README.md (bdf7fe0, verified), xwinxu committed on Mar 7, 2024
Update README.md (31efc9a, verified), xwinxu committed on Mar 7, 2024
Update README.md (8d0fec9, verified), xwinxu committed on Mar 7, 2024
Update README.md (8b7e5cc, verified), xwinxu committed on Mar 7, 2024
Update README.md (06fc6e3, verified), xwinxu committed on Mar 5, 2024
Upload tokenizer (eb151d5, verified), Muennighoff committed on Mar 5, 2024
Upload README.md with huggingface_hub (927e33a, verified), Muennighoff committed on Mar 5, 2024