Hugging Face
minionKP/reward_model_output
PEFT
Safetensors
llama
trl
reward-trainer
Generated from Trainer
License: llama3
reward_model_output/README.md (branch: main)
Commit History
f725802 (verified): End of training, minionKP committed on Aug 27
76b3da1 (verified): End of training, minionKP committed on Aug 26
6c61887 (verified): End of training, minionKP committed on Aug 20