RL LLM AGENT

community
https://www.sanjibanchoudhury.com/

Team members: Paloma Sodhi, Sanjiban Choudhury

rl-llm-agent's models (13)

rl-llm-agent/Llama-3.2-3B-Instruct-sft-alfworld-leap-iter1

Text Generation • 3B • Updated Feb 12 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iqlearn-iter1

Updated Jan 20 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-exploration-aflworld-iter0-checkpoint-50

Updated Jan 16 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iter2-70k

Updated Jan 16 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-shaped-iter0

Updated Jan 14 • 3

rl-llm-agent/Llama-3.2-3B-Instruct-value-alfworld-8b-sft

Updated Jan 13 • 3

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iqlearn-iter0

Updated Jan 13 • 3

rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iqlearn-iter0

Updated Jan 13 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter2

Updated Jan 11 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter1

Text Generation • 3B • Updated Jan 10 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter0

Updated Jan 8 • 2

rl-llm-agent/Llama-3.2-3B-Instruct-sft-alfworld-iter0

Text Generation • 3B • Updated Jan 4 • 11

rl-llm-agent/Llama-3.1-8B-Instruct-sft-alfworld-iter0

Text Generation • 8B • Updated Jan 3 • 2