This repository contains the model described in Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis.
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
π
Ask for provider support
HF Inference deployability: The model has no library tag.