Train RL in Simulation
This guide explains how to use the `gym_hil` simulation environments as an alternative to real robots when working with the LeRobot framework for Human-In-the-Loop (HIL) reinforcement learning.

`gym_hil` is a package that provides Gymnasium-compatible simulation environments specifically designed for Human-In-the-Loop reinforcement learning. These environments allow you to:
- Train policies in simulation to test the RL stack before training on real robots
- Collect demonstrations in simulation using external devices like gamepads or keyboards
- Perform human interventions during policy learning
Currently, the main environment is a Franka Panda robot simulation based on MuJoCo, with tasks like picking up a cube.
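As a quick orientation, the snippet below drives one of these environments directly through the standard Gymnasium API. This is a minimal sketch; the exact environment ID used at registration is an assumption, so check the IDs exposed by your `gym_hil` install.

```python
# Minimal sketch: stepping a gym_hil environment through the plain Gymnasium API.
# The environment ID below is an assumption; check the IDs registered by your
# gym_hil install (they correspond to the tasks listed later in this guide).
import gymnasium as gym
import gym_hil  # noqa: F401  (importing registers the environments)

env = gym.make("gym_hil/PandaPickCubeBase-v0")
obs, info = env.reset()
for _ in range(200):
    action = env.action_space.sample()  # random actions, just to exercise the loop
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```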
Installation
First, install the `gym_hil` package within the LeRobot environment:
```bash
pip install -e ".[hilserl]"
```
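If the install succeeded, the package should be importable:

```bash
python -c "import gym_hil"
```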
What do I need?
- A gamepad or keyboard to control the robot
- An NVIDIA GPU
Configuration
To use `gym_hil` with LeRobot, you need to create a configuration file. An example is provided here. Key configuration sections include:
Environment Type and Task
```json
{
  "type": "hil",
  "name": "franka_sim",
  "task": "PandaPickCubeGamepad-v0",
  "device": "cuda"
}
```
Available tasks:

- `PandaPickCubeBase-v0`: Basic environment
- `PandaPickCubeGamepad-v0`: With gamepad control
- `PandaPickCubeKeyboard-v0`: With keyboard control
Gym Wrappers Configuration
"wrapper": {
"gripper_penalty": -0.02,
"control_time_s": 15.0,
"use_gripper": true,
"fixed_reset_joint_positions": [0.0, 0.195, 0.0, -2.43, 0.0, 2.62, 0.785],
"end_effector_step_sizes": {
"x": 0.025,
"y": 0.025,
"z": 0.025
},
"control_mode": "gamepad"
}
Important parameters:

- `gripper_penalty`: Penalty for excessive gripper movement (illustrated in the sketch after this list)
- `use_gripper`: Whether to enable gripper control
- `end_effector_step_sizes`: Size of the steps along the x, y, and z axes of the end-effector
- `control_mode`: Set to `"gamepad"` to use a gamepad controller
Running HIL RL with LeRobot
Basic Usage
To run the environment, set `mode` to `null` in the configuration file:
```bash
python lerobot/scripts/rl/gym_manipulator.py --config_path path/to/gym_hil_env.json
```
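The `mode` field sits at the top level of the environment configuration; a minimal sketch with all other fields omitted:

```json
{
  "mode": null
}
```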
Recording a Dataset
To collect a dataset, set `mode` to `record` and define the `repo_id` and the number of episodes to record:
```bash
python lerobot/scripts/rl/gym_manipulator.py --config_path path/to/gym_hil_env.json
```
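For reference, the recording-related fields in the environment config might look like the following. The field names `repo_id` and `num_episodes` and their values are assumptions here; match them against the example configuration linked above.

```json
{
  "mode": "record",
  "repo_id": "username/franka_sim_pick_cube",
  "num_episodes": 30
}
```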
Training a Policy
To train a policy, check out the configuration example available here and run the actor and learner servers. First, start the actor server:
```bash
python lerobot/scripts/rl/actor.py --config_path path/to/train_gym_hil_env.json
```
In a different terminal, run the learner server:
```bash
python lerobot/scripts/rl/learner.py --config_path path/to/train_gym_hil_env.json
```
The simulation environment provides a safe and repeatable way to develop and test your Human-In-the-Loop reinforcement learning components before deploying to real robots.
Congrats 🎉, you have finished this tutorial!
If you have any questions or need help, please reach out on Discord.
Paper citation:
```bibtex
@article{luo2024precise,
  title={Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning},
  author={Luo, Jianlan and Xu, Charles and Wu, Jeffrey and Levine, Sergey},
  journal={arXiv preprint arXiv:2410.21845},
  year={2024}
}
```