Aether: Geometric-Aware Unified World Modeling


This repository contains the model used in the paper Aether: Geometric-Aware Unified World Modeling.

Aether addresses a fundamental challenge in AI: integrating geometric reconstruction with generative modeling for human-like spatial reasoning. Our framework unifies three core capabilities: (1) 4D dynamic reconstruction, (2) action-conditioned video prediction, and (3) goal-conditioned visual planning. Trained entirely on synthetic data, Aether achieves strong zero-shot generalization to real-world scenarios.

Find the code at https://github.com/OpenRobotLab/Aether.

📝 Citation

If you find this work useful in your research, please consider citing:

@article{aether,
  title     = {Aether: Geometric-Aware Unified World Modeling},
  author    = {Aether Team and Haoyi Zhu and Yifan Wang and Jianjun Zhou and Wenzheng Chang and Yang Zhou and Zizun Li and Junyi Chen and Chunhua Shen and Jiangmiao Pang and Tong He},
  journal   = {arXiv preprint arXiv:2503.18945},
  year      = {2025}
}

⚖️ License

This repository is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgements

Our work is primarily built upon Accelerate, Diffusers, CogVideoX, Finetrainers, DepthAnyVideo, CUT3R, MonST3R, VBench, GST, SPA, DroidCalib, Grounded-SAM-2, ceres-solver, etc. We extend our gratitude to all these authors for their generously open-sourced code and their significant contributions to the community.
