---
license: mit
tags:
- world-model
- open-x-embodiment
- robotic-manipulation
- video-generation
- video-prediction
- gpt
---

# iVideoGPT (Pre-trained on Open X-Embodiment, 256x256 resolution, action-free)

See https://github.com/thuml/iVideoGPT for examples of how to use this model.

## Citation

```bibtex
@inproceedings{wu2024ivideogpt,
    title={iVideoGPT: Interactive VideoGPTs are Scalable World Models},
    author={Jialong Wu and Shaofeng Yin and Ningya Feng and Xu He and Dong Li and Jianye Hao and Mingsheng Long},
    booktitle={Advances in Neural Information Processing Systems},
    year={2024}
}
```