	Speech Synthesis (S^2)
	===
	[https://arxiv.org/abs/2109.06912](https://arxiv.org/abs/2109.06912)

	Speech synthesis with fairseq.

	## Features

	- Autoregressive and non-autoregressive models
	- Multi-speaker synthesis
	- Audio preprocessing (denoising, VAD, etc.) for less curated data
	- Automatic metrics for model development
	- Data configuration similar to that of [S2T](../speech_to_text/README.md)
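
To make the preprocessing item above more concrete, here is a minimal sketch of energy-based voice activity detection (VAD) in plain Python. It is an illustration only and is not the method this toolkit uses; the function name `frame_energy_vad`, the frame length, and the threshold are all assumptions chosen for the example.

```python
import math


def frame_energy_vad(samples, frame_len=400, threshold_db=-40.0):
    """Return one boolean per full frame: True where the frame's RMS level
    (in dB relative to full scale) exceeds threshold_db.

    samples: sequence of floats in [-1.0, 1.0].
    All names and defaults are illustrative assumptions, not fairseq parameters.
    """
    flags = []
    # Step through non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Clamp to avoid log10(0) on digital silence.
        level_db = 20.0 * math.log10(max(rms, 1e-10))
        flags.append(level_db > threshold_db)
    return flags


# One frame of silence followed by one frame of a 440 Hz tone sampled at 16 kHz.
silence = [0.0] * 400
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(400)]
print(frame_energy_vad(silence + tone))  # prints [False, True]
```

A fixed energy threshold like this is only a rough baseline; noisy or less curated recordings generally need a more robust detector.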


	## Examples
	- [Single-speaker synthesis on LJSpeech](docs/ljspeech_example.md)
	- [Multi-speaker synthesis on VCTK](docs/vctk_example.md)
	- [Multi-speaker synthesis on Common Voice](docs/common_voice_example.md)


	## Citation
	Please cite as:
	```
	@article{wang2021fairseqs2,
	title={fairseq S$^2$: A Scalable and Integrable Speech Synthesis Toolkit},
	author={Wang, Changhan and Hsu, Wei-Ning and Adi, Yossi and Polyak, Adam and Lee, Ann and Chen, Peng-Jen and Gu, Jiatao and Pino, Juan},
	journal={arXiv preprint arXiv:2109.06912},
	year={2021}
	}

	@inproceedings{ott2019fairseq,
	title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
	author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
	booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
	year = {2019},
	}
	```