Commit f8d5c98 by cdminix (verified), 1 parent: 3f277ef

Update README.md

  1. README.md +17 -26
README.md CHANGED
@@ -20,29 +20,20 @@ More details can be found in our paper [*TTSDS -- Text-to-Speech Distribution Sc
  ## Reproducibility
  To reproduce our results, check out our repository [here](https://github.com/ttsds/ttsds).
 
- ## Credits
-
-
- This benchmark is inspired by [TTS Arena](https://huggingface.co/spaces/TTS-AGI/TTS-Arena) which instead focuses on the subjective evaluation of TTS models.
- Our benchmark would not be possible without the many open-source TTS models on Hugging Face and GitHub.
- Additionally, our benchmark uses the following datasets:
- - [LJSpeech](https://keithito.com/LJ-Speech-Dataset/h)
- - [LibriTTS](https://www.openslr.org/60/)
- - [VCTK](https://datashare.ed.ac.uk/handle/10283/2950)
- - [Common Voice](https://commonvoice.mozilla.org/)
- - [ESC-50](https://github.com/karolpiczak/ESC-50)
- And the following metrics/representations/tools:
- - [Wav2Vec2](https://arxiv.org/abs/2006.11477)
- - [Hubert](https://arxiv.org/abs/2006.11477)
- - [WavLM](https://arxiv.org/abs/2110.13900)
- - [PESQ](https://en.wikipedia.org/wiki/Perceptual_Evaluation_of_Speech_Quality)
- - [VoiceFixer](https://arxiv.org/abs/2204.05841)
- - [WADA SNR](https://www.cs.cmu.edu/~robust/Papers/KimSternIS08.pdf)
- - [Whisper](https://arxiv.org/abs/2212.04356)
- - [Masked Prosody Model](https://huggingface.co/cdminix/masked_prosody_model)
- - [PyWorld](https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder)
- - [WeSpeaker](https://arxiv.org/abs/2210.17016)
- - [D-Vector](https://github.com/yistLin/dvector)
-
- Authors: Christoph Minixhofer, Ondřej Klejch, and Peter Bell
- of the University of Edinburgh.
+ Authors:
+ Christoph Minixhofer, Ondřej Klejch, and Peter Bell
+ The University of Edinburgh.
+
+ ## Citation
+
+ ```
+ @misc{minixhofer2024ttsdstexttospeechdistribution,
+ title={TTSDS -- Text-to-Speech Distribution Score},
+ author={Christoph Minixhofer and Ondřej Klejch and Peter Bell},
+ year={2024},
+ eprint={2407.12707},
+ archivePrefix={arXiv},
+ primaryClass={eess.AS},
+ url={https://arxiv.org/abs/2407.12707},
+ }
+ ```