Imitation Learning in Sim
This tutorial will explain how to train a neural network to control a robot in simulation with imitation learning.
You’ll learn:
- How to record a dataset in simulation with gym-hil and visualize the dataset.
- How to train a policy using your data.
- How to evaluate your policy in simulation and visualize the results.
For the simulation environment, we use the same repo that is also used by the Human-In-the-Loop (HIL) reinforcement learning algorithm. This environment is based on MuJoCo and allows you to record datasets in the LeRobotDataset format. Teleoperation is easiest with a controller like the Logitech F710, but you can also use your keyboard if you are up for the challenge.
Installation
First, install the gym_hil package within the LeRobot environment. Go to your LeRobot folder and run this command:
pip install -e ".[hilserl]"
Teleoperate and Record a Dataset
To use gym_hil with LeRobot, you need to use a configuration file. An example config file can be found here.
To teleoperate and collect a dataset, we need to modify this config file: add your repo_id, e.g. "repo_id": "il_gym", set "num_episodes": 30, and make sure mode is set to record, i.e. "mode": "record".
If you do not have an Nvidia GPU, also change the "device": "cuda" parameter in the config file (for example to mps for macOS).
By default the config file assumes you use a controller. To use your keyboard, change the environment specified at "task" in the config file and set it to "PandaPickCubeKeyboard-v0".
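Putting these edits together, the relevant entries of your copy of the config file would look roughly like the sketch below. Only the keys discussed above are shown; their exact placement and the remaining fields follow the example config linked above, and the task value shown is the keyboard variant (keep the controller environment from the example config if you use a gamepad):
"mode": "record",
"repo_id": "il_gym",
"num_episodes": 30,
"device": "cuda",
"task": "PandaPickCubeKeyboard-v0",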
Then we can run this command to start:
python lerobot/scripts/rl/gym_manipulator.py --config_path path/to/env_config_gym_hil_il.json
Once the scene is rendered, you can teleoperate the robot with the gamepad or keyboard; the gamepad and keyboard controls are listed below.
Note that to teleoperate the robot you have to hold the "Human Take Over Pause Policy" button (RB) to enable control!
Gamepad Controls
Gamepad button mapping for robot control and episode management
Keyboard Controls
For keyboard control, use the spacebar to enable control and the following keys to move the robot:
- Arrow keys: Move in X-Y plane
- Shift and Shift_R: Move in Z axis
- Right Ctrl and Left Ctrl: Open and close gripper
- ESC: Exit
Visualize a dataset
If you uploaded your dataset to the Hub, you can visualize it online by copy-pasting your repo id into the Dataset Visualizer.
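You can also inspect the recorded episodes locally with LeRobot's dataset visualization script. A minimal sketch, assuming the script and flags of your LeRobot version match (check python lerobot/scripts/visualize_dataset.py --help); replace ${HF_USER}/il_gym with the repo id you recorded the dataset under:
python lerobot/scripts/visualize_dataset.py \
    --repo-id ${HF_USER}/il_gym \
    --episode-index 0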
Train a policy
To train a policy to control your robot, use the python lerobot/scripts/train.py
script. A few arguments are required. Here is an example command:
python lerobot/scripts/train.py \
--dataset.repo_id=${HF_USER}/il_gym \
--policy.type=act \
--output_dir=outputs/train/il_sim_test \
--job_name=il_sim_test \
--policy.device=cuda \
--wandb.enable=true
Let's explain the command:
- We provided the dataset as argument with --dataset.repo_id=${HF_USER}/il_gym (see the note after this list on setting ${HF_USER}).
- We provided the policy with policy.type=act. This loads configurations from configuration_act.py. Importantly, this policy will automatically adapt to the number of motor states, motor actions and cameras of your robot (e.g. laptop and phone) which have been saved in your dataset.
- We provided policy.device=cuda since we are training on an Nvidia GPU, but you could use policy.device=mps to train on Apple silicon.
- We provided wandb.enable=true to use Weights and Biases for visualizing training plots. This is optional, but if you use it, make sure you are logged in by running wandb login.
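In the command above, ${HF_USER} is assumed to be a shell variable holding your Hugging Face username. If you are logged in with the Hugging Face CLI, one way to set it is:
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER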
Training should take several hours; 100k steps (the default) will take about 1 hour on an Nvidia A100. You will find checkpoints in outputs/train/il_sim_test/checkpoints.
Train using Colab
If your local computer doesn't have a powerful GPU, you can use Google Colab to train your model by following the ACT training notebook.
Upload policy checkpoints
Once training is done, upload the latest checkpoint with:
huggingface-cli upload ${HF_USER}/il_sim_test \
outputs/train/il_sim_test/checkpoints/last/pretrained_model
You can also upload intermediate checkpoints with:
CKPT=010000
huggingface-cli upload ${HF_USER}/il_sim_test${CKPT} \
outputs/train/il_sim_test/checkpoints/${CKPT}/pretrained_model
Evaluate your policy in Sim
To evaluate your policy, we have to use the config file that can be found here. Make sure to replace the repo_id with the dataset you trained on (for example pepijn223/il_sim_dataset) and the pretrained_policy_name_or_path with your model id (for example pepijn223/il_sim_model).
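The two entries in that config would then look roughly like this (a sketch using the example ids above; the rest of the file stays as in the linked example config):
"repo_id": "pepijn223/il_sim_dataset",
"pretrained_policy_name_or_path": "pepijn223/il_sim_model",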
Then you can run this command to visualize your trained policy:
python lerobot/scripts/rl/eval_policy.py --config_path=path/to/eval_config_gym_hil.json
While the main workflow of training ACT in simulation is straightforward, there is significant room for exploring how to set up the task, define the initial state of the environment, and determine the type of data required during collection to learn the most effective policy. If your trained policy doesn’t perform well, investigate the quality of the dataset it was trained on using our visualizers, as well as the action values and various hyperparameters related to ACT and the simulation.
Congrats 🎉, you have finished this tutorial. If you want to continue using LeRobot in simulation, follow this Tutorial on reinforcement learning in sim with HIL-SERL.
If you have any questions or need help, please reach out on Discord.