Evaluate and generate text based on images and videos
Wan: Open and Advanced Large-Scale Video Generative Models