Organization Card

TTSDS Benchmark

Recent Text-to-Speech (TTS) models show that synthetic audio can come close to real human speech, and traditional TTS evaluation methods have not kept pace with these developments. Our TTSDS benchmark assesses the quality of synthetic speech through factors such as prosody, speaker identity, and intelligibility. Each factor is compared against both real speech and noise datasets, which shows how close the synthetic speech is to real speech relative to noise.
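The idea of scoring a factor against both a real-speech reference and a noise reference can be sketched in a few lines. This is an illustration only, not the benchmark's implementation: the function names, the use of a one-dimensional 2-Wasserstein distance, the normalisation to a 0-100 range, and the sample "feature" values are all assumptions made for this example. The paper and repository are the authoritative sources for the actual computation.

```python
import numpy as np


def wasserstein2_1d(a, b):
    """2-Wasserstein distance between two equal-sized 1-D samples.

    For equal-sized empirical distributions this reduces to the root mean
    squared difference between the sorted samples.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch requires samples of equal length")
    return float(np.sqrt(np.mean((a - b) ** 2)))


def distribution_score(synthetic, real, noise):
    """Hypothetical score in [0, 100]: higher means closer to real speech.

    Assumption for illustration: the score is the distance to the noise
    reference divided by the sum of the distances to both references.
    """
    d_real = wasserstein2_1d(synthetic, real)
    d_noise = wasserstein2_1d(synthetic, noise)
    total = d_real + d_noise
    if total == 0.0:
        raise ValueError("real and noise references are indistinguishable")
    return 100.0 * d_noise / total


# Made-up per-utterance feature values (for example, a pitch statistic).
rng = np.random.default_rng(0)
real_ref = rng.normal(5.0, 1.0, size=1000)      # stands in for real speech
noise_ref = rng.uniform(0.0, 20.0, size=1000)   # stands in for a noise dataset
system_a = rng.normal(5.1, 1.0, size=1000)      # resembles the real reference
system_b = rng.uniform(0.0, 20.0, size=1000)    # resembles the noise reference

score_a = distribution_score(system_a, real_ref, noise_ref)
score_b = distribution_score(system_b, real_ref, noise_ref)
print(f"system A: {score_a:.1f}, system B: {score_b:.1f}")
```

Under these assumptions, a system whose feature distribution matches the real reference scores near 100 and one that matches the noise reference scores near 0. A real evaluation would work with learned or signal-level features per factor rather than a single scalar per utterance.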

More information

More details can be found in our paper, TTSDS -- Text-to-Speech Distribution Score (https://arxiv.org/abs/2407.12707).

Reproducibility

To reproduce our results, check out our repository here.

Citation

@misc{minixhofer2024ttsds,
      title={TTSDS -- Text-to-Speech Distribution Score}, 
      author={Christoph Minixhofer and Ondřej Klejch and Peter Bell},
      year={2024},
      eprint={2407.12707},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2407.12707}, 
}

spaces 1

TTSDS Benchmark and Leaderboard

Text-To-Speech (TTS) Evaluation using objective metrics.

models

None public yet

TTS Distribution Score

AI & ML interests

Recent Activity

datasets 6

ttsds/v2_data

ttsds/results

ttsds/requests

ttsds/noise-reference

ttsds/reference

ttsds/speaker_text_pairs

Team members 1
