FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
Abstract
Generative adversarial network (GAN) based vocoders have attracted significant attention in speech synthesis for their high quality and fast inference speed. However, they still produce noticeable spectral artifacts that degrade the quality of the synthesized speech. In this work, we present a novel GAN-based vocoder designed for few artifacts and high fidelity, called FA-GAN. To suppress the aliasing artifacts that non-ideal upsampling layers cause in high-frequency components, we introduce an anti-aliased twin deconvolution module in the generator. To alleviate blurring artifacts and enrich the reconstruction of spectral details, we propose a novel fine-grained multi-resolution real and imaginary loss that assists in modeling phase information. Experimental results show that FA-GAN outperforms the compared approaches in improving audio quality and reducing spectral artifacts, and that it also performs better on unseen speakers.
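The abstract does not give the exact form of the multi-resolution real and imaginary loss, so the following is only a rough sketch of the general idea, not the authors' formulation. It assumes an L1 distance between the real parts and between the imaginary parts of STFTs computed at several resolutions. The function names, the Hann window, and the (FFT size, hop) pairs are all illustrative assumptions.

```python
import numpy as np


def stft(signal, n_fft, hop):
    """Hann-windowed STFT; returns a complex array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)


def multi_resolution_ri_loss(
    reference, generated, resolutions=((256, 64), (512, 128), (1024, 256))
):
    """Mean L1 distance between real and imaginary STFT parts, averaged over resolutions.

    The resolutions are hypothetical (n_fft, hop) pairs, not values from the paper.
    """
    total = 0.0
    for n_fft, hop in resolutions:
        ref = stft(reference, n_fft, hop)
        gen = stft(generated, n_fft, hop)
        # Comparing real and imaginary parts separately makes the loss phase-sensitive,
        # unlike a loss on magnitudes alone.
        total += np.mean(np.abs(ref.real - gen.real))
        total += np.mean(np.abs(ref.imag - gen.imag))
    return total / len(resolutions)


if __name__ == "__main__":
    t = np.arange(4096) / 16000.0
    clean = np.sin(2 * np.pi * 440.0 * t)
    shifted = np.sin(2 * np.pi * 440.0 * t + np.pi / 2)  # same tone, different phase
    print(multi_resolution_ri_loss(clean, clean))
    print(multi_resolution_ri_loss(clean, shifted))
```

In this sketch, a phase-shifted copy of a tone yields a nonzero loss even though its magnitude spectrum is nearly unchanged, which is the kind of phase sensitivity the abstract attributes to the proposed loss.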
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data (2024)
- Speech Bandwidth Expansion Via High Fidelity Generative Adversarial Networks (2024)
- FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis (2024)
- Universal Score-based Speech Enhancement with High Content Preservation (2024)
- DSP-informed bandwidth extension using locally-conditioned excitation and linear time-varying filter subnetworks (2024)