---
datasets:
- facebook/multilingual_librispeech
language:
- it
base_model:
- SWivid/F5-TTS
pipeline_tag: text-to-speech
license: cc-by-4.0
---
This is an experiment in fine-tuning F5-TTS for Italian.
Trained on 247+ hours of the "train" split of the facebook/multilingual_librispeech dataset, at 6700 steps per epoch. Known issues:
- catastrophic forgetting (the model forgot English)
- Italian pronunciation is not perfect
## Folder structure
```
| - italian_59kh
| | - checkpoints
```
### italian_59kh
Contains the model weights saved at specific steps; the higher the number, the further into training the weights were saved.
Weights in this folder can't be used to resume training; use the ones in checkpoints instead.
### italian_59kh/checkpoints
Contains the checkpoints saved at specific steps; the higher the number, the further into training the checkpoint was saved.
Weights in this folder can be used as a starting point to continue training.
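
Since a higher step number means a later point in training, the most recent checkpoint can be picked by parsing the number out of each file name. The sketch below is only an illustration: the `model_<step>.pt` naming used in the usage note and the helper name `latest_by_step` are assumptions, not something this repository guarantees, so adjust the pattern to the actual file names.

```python
import re
from pathlib import Path
from typing import Optional


def latest_by_step(folder: Path, pattern: str = r"_(\d+)\.\w+$") -> Optional[Path]:
    """Return the file in `folder` whose name carries the largest step number.

    The default pattern assumes names that end in `_<step>.<ext>`
    (hypothetical convention). Files without a step number are ignored.
    """
    best: Optional[Path] = None
    best_step = -1
    for path in folder.iterdir():
        if not path.is_file():
            continue
        match = re.search(pattern, path.name)
        if match is None:
            continue  # e.g. a file with no numeric step in its name
        step = int(match.group(1))
        if step > best_step:
            best_step = step
            best = path
    return best
```

For example, `latest_by_step(Path("italian_59kh/checkpoints"))` would return the highest-numbered file in that folder, or `None` if no file name matches the pattern.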
The run.py file is an example of how to extract the wav files and produce the metadata.csv used for training.
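
The script itself is not reproduced here. As a rough illustration of the metadata step only, the sketch below writes a pipe-delimited metadata.csv from (wav path, transcript) pairs. The `audio_file|text` header, the relative `wavs/...` paths, and the helper name `write_metadata` are assumptions made for this example; check them against run.py and the F5-TTS data preparation scripts before relying on them.

```python
from pathlib import Path
from typing import Iterable, Tuple, Union


def write_metadata(rows: Iterable[Tuple[str, str]], out_path: Union[str, Path]) -> int:
    """Write a pipe-delimited metadata file and return the number of rows written.

    Each row is (wav path, transcript). The header `audio_file|text`
    is an assumed layout, not confirmed by this repository.
    """
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with out_path.open("w", encoding="utf-8", newline="") as f:
        f.write("audio_file|text\n")
        for wav, text in rows:
            # The delimiter must not appear in the transcript, so replace it,
            # then collapse runs of whitespace (including newlines).
            clean = " ".join(text.replace("|", " ").split())
            if not clean:
                continue  # skip clips with an empty transcript
            f.write(f"{wav}|{clean}\n")
            written += 1
    return written
```

Called as `write_metadata([("wavs/0001.wav", "Buongiorno a tutti")], "data/metadata.csv")`, it produces a two-line file: the header followed by one row. The file names here are placeholders.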