metadata
license: cc-by-nc-4.0
datasets:
  - amphion/Emilia-Dataset
  - mozilla-foundation/common_voice_12_0
language:
  - el
  - en
base_model:
  - SWivid/F5-TTS
pipeline_tag: text-to-speech

F5-TTS-Greek

F5-TTS model finetuned to speak Greek

(This work is under development and is currently in beta.)

Finetuned on Greek speech datasets plus a small part of the Emilia-EN dataset, included to prevent catastrophic forgetting of English.

The model can synthesize Greek text with Greek reference speech, English text with English reference speech, and text that mixes Greek and English. Quality on mixed text still needs improvement, and several runs may be needed to get good results.
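Because the reference speech should match the language of the text being synthesized, it can help to check which script a prompt uses before picking a reference clip. The following is a minimal sketch using only the Python standard library. `classify_prompt` is a hypothetical helper, not part of F5-TTS or this repository, and it treats the Unicode script of each letter as a rough proxy for language.

```python
import unicodedata


def classify_prompt(text: str) -> str:
    """Return "el", "en", "mixed", or "unknown" based on the scripts in `text`.

    Hypothetical helper: script detection is only a rough proxy for language.
    """
    has_greek = False
    has_latin = False
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and whitespace
        name = unicodedata.name(ch, "")
        if name.startswith("GREEK"):
            has_greek = True
        elif name.startswith("LATIN"):
            has_latin = True
    if has_greek and has_latin:
        return "mixed"
    if has_greek:
        return "el"
    if has_latin:
        return "en"
    return "unknown"


if __name__ == "__main__":
    for prompt in ["Καλημέρα κόσμε", "Good morning", "Καλημέρα world"]:
        print(prompt, "->", classify_prompt(prompt))
```

A prompt classified as "mixed" falls into the case where quality is weakest, so it may be worth generating several candidates and keeping the best one.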

Datasets used:

Training arguments

- Learning Rate: 0.00001
- Batch Size per GPU: 3200
- Max Samples: 64
- Gradient Accumulation Steps: 1
- Max Gradient Norm: 1
- Epochs: 277
- Warmup Updates: 1274
- Save per Updates: 25000
- Last per Steps: 1000
- mixed_precision: fp16
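To make these numbers more concrete, the sketch below converts the per-GPU batch size into seconds of audio and evaluates a linear warmup. It rests on assumptions that this card does not state: that the batch size is counted in mel frames, that audio is sampled at 24 kHz with a hop length of 256 samples (the upstream F5-TTS defaults), and that the learning rate ramps linearly from zero over the warmup updates. The function names are illustrative only, and any decay after warmup is not modeled.

```python
SAMPLE_RATE = 24_000          # assumption: upstream F5-TTS default, not stated in this card
HOP_LENGTH = 256              # assumption: audio samples per mel frame
BATCH_FRAMES_PER_GPU = 3200   # "Batch Size per GPU", assumed to be counted in frames
LEARNING_RATE = 1e-5          # "Learning Rate"
WARMUP_UPDATES = 1274         # "Warmup Updates"


def frames_to_seconds(frames: int) -> float:
    """Convert a number of mel frames to seconds of audio."""
    return frames * HOP_LENGTH / SAMPLE_RATE


def warmup_lr(update: int) -> float:
    """Linear warmup to the peak learning rate (assumed schedule shape)."""
    if update >= WARMUP_UPDATES:
        return LEARNING_RATE  # decay after warmup is intentionally not modeled
    return LEARNING_RATE * update / WARMUP_UPDATES


if __name__ == "__main__":
    print(f"audio per GPU batch: {frames_to_seconds(BATCH_FRAMES_PER_GPU):.1f} s")
    print(f"lr halfway through warmup: {warmup_lr(WARMUP_UPDATES // 2):.2e}")
```

Under these assumptions, each GPU sees roughly 34 seconds of audio per batch, spread across at most 64 samples.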

Links:

GitHub: https://github.com/SWivid/F5-TTS

Paper: F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching (https://arxiv.org/abs/2410.06885)