---
datasets:
- facebook/multilingual_librispeech
language:
- it
base_model:
- SWivid/F5-TTS
pipeline_tag: text-to-speech
license: cc-by-4.0
---

This is a test of how to fine-tune F5-TTS for Italian.

Trained on 247+ hours of the "train" split of the facebook/multilingual_librispeech dataset, at 6700 steps per epoch. Results so far:
- catastrophic forgetting (the model forgot English)
- Italian pronunciation is not perfect


The run.py file is an example of how to extract the WAV files and produce the metadata.csv used for training.
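
For readers who want a feel for that extraction step, here is a minimal sketch. It is not the contents of run.py. It assumes each dataset row looks like `{"audio": {"array": ..., "sampling_rate": ...}, "transcript": ...}` (the layout the `datasets` library typically yields for audio data; the field names are assumptions), and that the training tooling accepts a pipe-delimited metadata.csv with an `audio_file|text` header and paths relative to the output folder. The helper names `write_wav` and `export_samples` are hypothetical. Only the Python standard library is used.

```python
import struct
import wave
from pathlib import Path


def write_wav(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against out-of-range values
        pcm += struct.pack("<h", int(clipped * 32767))
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))


def export_samples(samples, out_dir):
    """Write one WAV per row plus a metadata.csv; return the number of rows kept.

    Assumed (hypothetical) row layout:
    {"audio": {"array": [...], "sampling_rate": int}, "transcript": str}
    """
    out_dir = Path(out_dir)
    wav_dir = out_dir / "wavs"
    wav_dir.mkdir(parents=True, exist_ok=True)
    kept = 0
    with open(out_dir / "metadata.csv", "w", encoding="utf-8", newline="") as meta:
        meta.write("audio_file|text\n")  # assumed header format
        for i, row in enumerate(samples):
            # Collapse whitespace and drop the delimiter from the transcript.
            text = " ".join(row["transcript"].replace("|", " ").split())
            if not text:
                continue  # skip rows without a usable transcript
            name = f"sample_{i:06d}.wav"
            audio = row["audio"]
            write_wav(wav_dir / name, audio["array"], audio["sampling_rate"])
            meta.write(f"wavs/{name}|{text}\n")
            kept += 1
    return kept
```

Calling `export_samples(rows, "data/italian")` on an iterable of such rows would produce `data/italian/wavs/*.wav` and `data/italian/metadata.csv`. For a dataset of this size, streaming the rows instead of loading them all into memory is probably the safer choice.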