---
license: bigscience-openrail-m
datasets:
- thelou1s/AudioSet
- Chr0my/freesound.org
language:
- en
library_name: diffusers
tags:
- music
- art
---

# Model Card for AudioLDM

AudioLDM is a latent diffusion model that generates audio from text prompts.

# Model Details

## Model Description

- **Developed by:** Haohe Liu
- **License:** CC-BY-NC-SA 4.0

## Model Sources

- **Repository:** https://github.com/haoheliu/AudioLDM
- **Paper:** https://arxiv.org/abs/2301.12503
- **Demo:** https://audioldm.github.io/

## Direct Use

The model can be tried without any setup in the hosted demo Space: https://huggingface.co/spaces/haoheliu/audioldm-text-to-audio-generation

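For local use, the following is a minimal sketch rather than an official recipe. It assumes the weights load with the `AudioLDMPipeline` class from `diffusers`, as the card's `library_name` suggests. The checkpoint ID `cvssp/audioldm-s-full-v2` is an assumption and may need to be replaced with this repository's ID. The helper names `write_wav` and `generate` are hypothetical, introduced only for illustration.

```python
import array
import sys
import wave

SAMPLE_RATE = 16000  # AudioLDM produces 16 kHz mono audio

# Assumption: a diffusers-format AudioLDM checkpoint; replace with this
# repository's ID if it differs.
MODEL_ID = "cvssp/audioldm-s-full-v2"


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file."""
    # Clip each sample to [-1, 1], then scale to the signed 16-bit range.
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples)
    )
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
    return len(pcm)


def generate(prompt, seconds=5.0, steps=10):
    """Generate a waveform from a text prompt (requires torch and diffusers)."""
    # Imported lazily so write_wav stays usable without the heavy dependencies.
    import torch
    from diffusers import AudioLDMPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AudioLDMPipeline.from_pretrained(MODEL_ID, torch_dtype=dtype).to(device)
    result = pipe(prompt, num_inference_steps=steps, audio_length_in_s=seconds)
    return result.audios[0]  # 1-D NumPy array of float samples


if __name__ == "__main__":
    audio = generate("A hammer hitting a wooden surface")
    write_wav("output.wav", audio)
```

Running the script downloads the checkpoint on first use and writes `output.wav` to the working directory. Raising `steps` generally trades generation speed for audio quality.
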
# Bias, Risks, and Limitations

TODO

# Training Details

## Training Data

TODO

# Evaluation

TODO

## Testing Data, Factors & Metrics

### Testing Data

TODO

### Metrics

TODO

## Results

TODO

**BibTeX:**

```bibtex
@article{liu2023audioldm,
  title={AudioLDM: Text-to-Audio Generation with Latent Diffusion Models},
  author={Liu, Haohe and Chen, Zehua and Yuan, Yi and Mei, Xinhao and Liu, Xubo and Mandic, Danilo and Wang, Wenwu and Plumbley, Mark D},
  journal={arXiv preprint arXiv:2301.12503},
  year={2023}
}
```