Sound Generation with AudioLDM2 and OpenVINO™
AudioLDM 2 is a latent text-to-audio diffusion model capable of generating realistic audio samples given any text input.
AudioLDM 2 was proposed in the paper "AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining" by Haohe Liu et al.
The model takes a text prompt as input and predicts the corresponding audio. It can generate text-conditional sound effects, human speech and music.
In this tutorial, we will try out the pipeline, convert the models behind it one by one, and run an interactive app with Gradio.
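To make the text-to-audio call described above concrete, here is a minimal sketch rather than the notebook's actual code. The helper name `generate_audio` and its default values are hypothetical. The keyword arguments `num_inference_steps` and `audio_length_in_s` and the `.audios` field on the result follow the `AudioLDM2Pipeline` interface in Diffusers. The checkpoint name in the usage comment is an assumption, since the text above does not name one.

```python
def generate_audio(pipe, prompt, steps=50, seconds=5.0):
    """Run a text-to-audio pipeline and return the first generated waveform.

    `pipe` is expected to behave like diffusers' AudioLDM2Pipeline:
    callable with a prompt and returning an object with an `audios` field.
    """
    result = pipe(
        prompt,
        num_inference_steps=steps,   # number of denoising steps
        audio_length_in_s=seconds,   # requested clip length in seconds
    )
    return result.audios[0]


# Usage with the real model (downloads weights; needs `diffusers` and `torch`).
# The checkpoint name below is an assumption, not taken from this document:
#   from diffusers import AudioLDM2Pipeline
#   pipe = AudioLDM2Pipeline.from_pretrained("cvssp/audioldm2")
#   waveform = generate_audio(pipe, "birds singing in a forest")
```

Keeping the pipeline as a parameter means the same helper works unchanged once the PyTorch models are swapped for OpenVINO ones, as long as the replacement object keeps the same call interface.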
Notebook Contents
This notebook demonstrates how to convert and run AudioLDM 2 using OpenVINO.
The notebook contains the following steps:
- Create a pipeline with PyTorch models using the Diffusers library.
- Convert the PyTorch models to OpenVINO IR format using the model conversion API.
- Run the AudioLDM 2 pipeline with OpenVINO.
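The conversion and run steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's implementation. The helper names (`ir_paths`, `convert_to_ir`, `compile_ir`) and the `COMPONENTS` list are hypothetical; the component names are assumed to match the sub-model attributes of `AudioLDM2Pipeline` in Diffusers. The OpenVINO calls used are `ov.convert_model`, `ov.save_model`, and `ov.Core().compile_model`. In practice, some sub-models may need wrapper modules or specific example inputs before they convert cleanly.

```python
from pathlib import Path

# Assumed sub-model attribute names of diffusers' AudioLDM2Pipeline.
COMPONENTS = [
    "text_encoder",
    "text_encoder_2",
    "projection_model",
    "language_model",
    "unet",
    "vae",
    "vocoder",
]


def ir_paths(model_dir):
    """Map each component name to the IR file it would be saved to."""
    model_dir = Path(model_dir)
    return {name: model_dir / f"{name}.xml" for name in COMPONENTS}


def convert_to_ir(torch_module, example_input, ir_path):
    """Convert one PyTorch module to OpenVINO IR, skipping existing files."""
    ir_path = Path(ir_path)
    if not ir_path.exists():
        import openvino as ov  # imported lazily; only needed for conversion

        ov_model = ov.convert_model(torch_module, example_input=example_input)
        ov.save_model(ov_model, ir_path)
    return ir_path


def compile_ir(ir_path, device="CPU"):
    """Load a saved IR file and compile it for the given device."""
    import openvino as ov

    return ov.Core().compile_model(str(ir_path), device)
```

Converting each component to its own IR file lets every sub-model be converted, cached, and compiled independently, which matches the "one by one" approach the tutorial describes.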
Installation Instructions
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the Installation Guide.