Model Checkpoints for ManiSkill-HAB

RL (SAC, PPO) and IL (BC, DP) baselines for ManiSkill-HAB. Each checkpoint includes a torch checkpoint, policy.pt (model, optimizer/scheduler state, and other trainable parameters), and a training config, config.yml, with hyperparameters and env kwargs.
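
As a rough sketch of how a single checkpoint could be inspected, the snippet below reads policy.pt with torch.load and config.yml with yaml.safe_load. The helper names are hypothetical, the key names inside both files are not documented here, and PyYAML is assumed to be installed alongside torch.

```python
from pathlib import Path


def checkpoint_files(ckpt_dir):
    """Return the (policy.pt, config.yml) paths inside one checkpoint directory."""
    ckpt_dir = Path(ckpt_dir)
    return ckpt_dir / "policy.pt", ckpt_dir / "config.yml"


def load_checkpoint(ckpt_dir, device="cpu"):
    """Load the raw torch checkpoint and the parsed training config."""
    # Imported lazily so the path helper works without torch or PyYAML installed.
    import torch
    import yaml

    policy_path, config_path = checkpoint_files(ckpt_dir)
    # map_location lets tensors saved on a GPU load on a CPU-only machine.
    # Recent torch releases restrict unpickling by default (weights_only=True);
    # if loading fails, pass weights_only=False only for files you trust.
    ckpt = torch.load(policy_path, map_location=device)
    with open(config_path) as f:
        config = yaml.safe_load(f)
    return ckpt, config
```

Printing the top-level keys of the returned checkpoint (if it is a dict) and the parsed config is a reasonable first step before rebuilding the agent with the ManiSkill-HAB training code.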

RL Pick/Place policies are trained with SAC because it performs better on these tasks, while Open/Close policies are trained with PPO for wall-time efficiency (see Appendix A.4.3). All-object RL policies are stored under all/ directories, and per-object policies are stored under directories named after the object. IL policies do not have per-object Pick/Place variants.
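
To see which variants are present in a local copy, one option is to walk the download directory and label each checkpoint by the name of the folder that holds policy.pt. This is a minimal sketch: find_policies is a hypothetical helper, and it assumes only that the folder directly containing policy.pt is either all or an object name. The levels above that folder are not described here, so they are reported as-is.

```python
from pathlib import Path


def find_policies(root):
    """Map each checkpoint directory (relative to root) to a variant label."""
    root = Path(root)
    found = {}
    for policy in sorted(root.rglob("policy.pt")):
        ckpt_dir = policy.parent
        # Assumption: the folder holding policy.pt is "all" or an object name.
        if ckpt_dir.name == "all":
            label = "all-object"
        else:
            label = f"per-object: {ckpt_dir.name}"
        found[ckpt_dir.relative_to(root).as_posix()] = label
    return found


if __name__ == "__main__":
    # "mshab_checkpoints" matches the --local-dir used in the download command.
    for rel_dir, label in find_policies("mshab_checkpoints").items():
        print(f"{rel_dir}: {label}")
```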

To download these policies, run the following:

huggingface-cli download arth-shukla/mshab_checkpoints --local-dir mshab_checkpoints
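
The same download can also be scripted from Python. The sketch below uses snapshot_download from the huggingface_hub package; the wrapper function download_checkpoints and its defaults are illustrative rather than part of ManiSkill-HAB.

```python
REPO_ID = "arth-shukla/mshab_checkpoints"


def download_checkpoints(local_dir="mshab_checkpoints", allow_patterns=None):
    """Download the checkpoint repo (or a filtered subset) and return the local path."""
    # Imported lazily; requires the huggingface_hub package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        allow_patterns=allow_patterns,  # None downloads every file
    )
```

For example, download_checkpoints(allow_patterns=["*config.yml"]) should fetch only the config files, which is a cheap way to look at the directory layout before pulling the weights.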

If you use ManiSkill-HAB in your work, please consider citing the following:

@inproceedings{shukla2025maniskillhab,
  author       = {Arth Shukla and
                  Stone Tao and
                  Hao Su},
  title        = {ManiSkill-HAB: {A} Benchmark for Low-Level Manipulation in Home Rearrangement
                  Tasks},
  booktitle    = {The Thirteenth International Conference on Learning Representations,
                  {ICLR} 2025, Singapore, April 24-28, 2025},
  publisher    = {OpenReview.net},
  year         = {2025},
  url          = {https://openreview.net/forum?id=6bKEWevgSd},
  timestamp    = {Thu, 15 May 2025 17:19:05 +0200},
  biburl       = {https://dblp.org/rec/conf/iclr/ShuklaTS25.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}