Model Card

This model is released with the paper "LLaRA: Supercharging Robot Learning Data for Vision-Language Policy".

Xiang Li¹, Cristina Mata¹, Jongwoo Park¹, Kumara Kahatapitiya¹, Yoo Sung Jang¹, Jinghuan Shang¹, Kanchana Ranasinghe¹, Ryan Burgert¹, Mu Cai², Yong Jae Lee², and Michael S. Ryoo¹

¹Stony Brook University, ²University of Wisconsin-Madison

Model details

Model type: This repository contains three models, each trained on one of three data subsets converted from VIMA-Data. For the conversion code, see convert_vima.ipynb.
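Because the repository holds three models, it can help to check how its files are grouped before downloading anything. The following is a minimal sketch, not an official loading recipe. It assumes the `huggingface_hub` package is installed, that the repository ID is `variante/llara-maskrcnn`, and that each model sits in its own top-level folder (the folder layout is an assumption and is not confirmed by this card). `group_by_top_level` and `list_model_folders` are hypothetical helper names.

```python
from collections import defaultdict

# Repository ID as it appears on this model card.
REPO_ID = "variante/llara-maskrcnn"


def group_by_top_level(paths):
    """Group repo file paths by their first path component (hypothetical helper)."""
    groups = defaultdict(list)
    for path in paths:
        head, sep, _ = path.partition("/")
        # Files at the repo root contain no "/" and are grouped under ".".
        groups[head if sep else "."].append(path)
    return dict(groups)


def list_model_folders(repo_id=REPO_ID):
    """Fetch the repo's file list from the Hub and group it by folder.

    Requires network access and the huggingface_hub package.
    """
    from huggingface_hub import list_repo_files  # imported lazily

    return group_by_top_level(list_repo_files(repo_id))
```

Calling `list_model_folders()` should return a dictionary from folder name to file paths. If the three models are in separate folders, one of them could then be fetched with `huggingface_hub.snapshot_download(repo_id=REPO_ID, allow_patterns="<folder>/*")`, where `<folder>` is a placeholder for whichever name the listing shows.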

Paper or resources for more information: https://github.com/LostXine/LLaRA

Where to send questions or comments about the model: https://github.com/LostXine/LLaRA/issues


Dataset used to train variante/llara-maskrcnn

Collection including variante/llara-maskrcnn