---
license: apache-2.0
extra_gated_prompt: "You agree to not use the model to conduct experiments that cause harm to human subjects."
extra_gated_fields:
  Name: text
  Company/Organization: text
  Country: text
  E-Mail: text
---

# Model Card for InternVideo2

This model card provides information about the model introduced in 'InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding'.

## Model Details

### Model Sources

- **Repository:** [InternVideo2](https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo2)
- **Paper:** [2403.15377](https://arxiv.org/abs/2403.15377)
- **Point of Contact:** mailto:[InternVideo Group]([email protected])

## Citation

If you find this work useful for your research, please consider citing InternVideo2. Your acknowledgement helps us continue contributing resources to the research community.

```
@article{wang2024internvideo2,
  title={InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding},
  author={Wang, Yi and Li, Kunchang and Li, Xinhao and Yu, Jiashuo and He, Yinan and Chen, Guo and Pei, Baoqi and Zheng, Rongkun and Xu, Jilan and Wang, Zun and others},
  journal={arXiv preprint arXiv:2403.15377},
  year={2024}
}
```