sinhprous committed on
Commit
cbef2ba
·
verified ·
1 Parent(s): 59eed48

Update README.md

  1. README.md +53 -3
README.md CHANGED
@@ -1,3 +1,53 @@
1
- ---
2
- license: cc-by-nc-4.0
3
- ---
1
+ ---
2
+ license: cc-by-nc-sa-4.0
3
+ datasets:
4
+ - mozilla-foundation/common_voice_17_0
5
+ - bond005/sberdevices_golos_10h_crowd
6
+ - bond005/sova_rudevices
7
+ - Aniemore/resd_annotated
8
+ language:
9
+ - ru
10
+ base_model:
11
+ - SWivid/F5-TTS
12
+ ---
13
+ ## Overview
14
+ This F5-TTS model is fine-tuned on the LJSpeech dataset with an emphasis on stability: the goal is to avoid choppiness, mispronunciations, repetitions, and skipped words.
15
+ Difference from the original model: phoneme alignments were used during training, whereas a duration predictor supplies the durations during inference.
16
+
17
+ ## License
18
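+ This model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license (CC BY-NC-SA 4.0), which permits non-commercial use, modification, and distribution, provided that credit is given and derivative works are shared under the same license.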
19
+
20
+ ## Model Information
21
+ **Base Model:** SWivid/F5-TTS
22
+ **Total Training Duration:** 250,000 steps
23
+
24
+ **Training Configuration:**
25
+ ```json
26
+ "exp_name": "F5TTS_Base",
27
+ "learning_rate": 1e-05,
28
+ "batch_size_per_gpu": 4500,
29
+ "batch_size_type": "frame",
30
+ "max_samples": 64,
31
+ "grad_accumulation_steps": 1,
32
+ "max_grad_norm": 1,
33
+ "epochs": 144,
34
+ "num_warmup_updates": 5838,
35
+ "save_per_updates": 11676,
36
+ "last_per_steps": 2918,
37
+ "finetune": true,
38
+ "file_checkpoint_train": "",
39
+ "tokenizer_type": "char",
40
+ "tokenizer_file": "",
41
+ "mixed_precision": "fp16",
42
+ "logger": "wandb",
43
+ "bnb_optimizer": true
44
+ ```
45
+
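The configuration above is a fragment (it has no enclosing braces), and `batch_size_type: "frame"` means the batch size is counted in mel frames rather than in samples. As a rough sketch of how to read it, the Python snippet below wraps the fragment in braces, parses it, and converts the per-GPU frame budget into seconds of audio. The 24 kHz sample rate and hop length of 256 are assumptions (they are not stated in this README), and `warmup_lr` is a hypothetical helper showing a plain linear warmup, which may differ from the schedule the trainer actually uses.

```python
import json

# Config fragment copied from the README above.
CONFIG_FRAGMENT = """
"exp_name": "F5TTS_Base",
"learning_rate": 1e-05,
"batch_size_per_gpu": 4500,
"batch_size_type": "frame",
"max_samples": 64,
"grad_accumulation_steps": 1,
"max_grad_norm": 1,
"epochs": 144,
"num_warmup_updates": 5838,
"save_per_updates": 11676,
"last_per_steps": 2918,
"finetune": true,
"file_checkpoint_train": "",
"tokenizer_type": "char",
"tokenizer_file": "",
"mixed_precision": "fp16",
"logger": "wandb",
"bnb_optimizer": true
"""

# Assumptions, not taken from the README: audio sample rate and mel hop length.
ASSUMED_SAMPLE_RATE = 24_000
ASSUMED_HOP_LENGTH = 256


def load_config(fragment: str) -> dict:
    """Wrap the brace-less fragment in braces so it parses as a JSON object."""
    return json.loads("{" + fragment + "}")


def frames_to_seconds(frames: int,
                      sample_rate: int = ASSUMED_SAMPLE_RATE,
                      hop_length: int = ASSUMED_HOP_LENGTH) -> float:
    """Convert a number of mel frames into seconds of audio."""
    return frames * hop_length / sample_rate


def warmup_lr(step: int, peak_lr: float, warmup_updates: int) -> float:
    """Hypothetical linear warmup: ramp from 0 to peak_lr, then hold."""
    return peak_lr * min(1.0, step / warmup_updates)


config = load_config(CONFIG_FRAGMENT)
batch_seconds = frames_to_seconds(config["batch_size_per_gpu"])
print(f"Per-GPU batch: {config['batch_size_per_gpu']} frames, "
      f"about {batch_seconds:.1f} s of audio under the assumed settings")
print("LR halfway through warmup:",
      warmup_lr(config["num_warmup_updates"] // 2,
                config["learning_rate"],
                config["num_warmup_updates"]))
```

Under these assumed settings, 4500 frames correspond to 48 seconds of audio per GPU per step; a different sample rate or hop length changes that figure proportionally.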
46
+ ## Usage Instructions
47
+ See the [base repository](https://github.com/SWivid/F5-TTS) for installation and inference instructions.
48
+
49
+ ## To do
50
+ - Multi-speaker model
51
+
52
+ ## Other links
53
+ - [GitHub repo](https://github.com/sinhprous/F5-TTS)