jiazhengli/Meta-Llama-3-8B-QLoRA-Assessment-Rationale-dpo
PEFT
Safetensors
jiazhengli/Rationale_MCTS
jiazhengli/Synthetic_Rationale
English
llama-factory
lora
Generated from Trainer
arxiv: 2406.19949
License: other
main
Commit History
Update README.md
50e66b0
verified
jiazhengli
committed on
Oct 14, 2024
init push
a57f764
Jiazheng Li
committed on
Jul 6, 2024
initial commit
36102ca
verified
jiazhengli
committed on
Jul 6, 2024