rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter2
PyTorch
llama
Adding `safetensors` variant of this model
#1 opened 7 months ago by SFconvertbot