ContextualAI/Contextual_KTO_Mistral_PairRM
Tags: Text Generation, Transformers, Safetensors, snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset, English, mistral, human feedback, rlhf, preferences, alignment, HALO, halos, dpo, rl, conversational, text-generation-inference, Inference Endpoints
arxiv: 2402.01306
License: apache-2.0
Branch: main
3 contributors
History: 25 commits
Latest commit: xwinxu, "Update README.md" (98bee13, verified), 8 months ago
| File | Scan | Size | LFS | Last commit | Updated |
| --- | --- | --- | --- | --- | --- |
| .gitattributes | Safe | 1.52 kB | | initial commit | 10 months ago |
| README.md | Safe | 2.23 kB | | Update README.md | 8 months ago |
| added_tokens.json | Safe | 42 Bytes | | Upload tokenizer | 10 months ago |
| config.json | Safe | 653 Bytes | | Upload MistralForCausalLM | 10 months ago |
| generation_config.json | Safe | 111 Bytes | | Upload MistralForCausalLM | 10 months ago |
| model-00001-of-00003.safetensors | Safe | 4.94 GB | LFS | Upload MistralForCausalLM | 10 months ago |
| model-00002-of-00003.safetensors | Safe | 5 GB | LFS | Upload MistralForCausalLM | 10 months ago |
| model-00003-of-00003.safetensors | Safe | 4.54 GB | LFS | Upload MistralForCausalLM | 10 months ago |
| model.safetensors.index.json | Safe | 24 kB | | Upload MistralForCausalLM | 10 months ago |
| special_tokens_map.json | Safe | 550 Bytes | | Fix tokenizer chat template | 10 months ago |
| tokenizer.json | Safe | 1.8 MB | | Upload tokenizer | 10 months ago |
| tokenizer.model | Safe | 493 kB | LFS | Upload tokenizer | 10 months ago |
| tokenizer_config.json | Safe | 1.59 kB | | Fix tokenizer chat template | 10 months ago |