vumichien committed on
Commit
66f2348
·
1 Parent(s): ee9a4c4

Update README.md

Files changed (1)
  1. README.md +23 -5
README.md CHANGED
@@ -4,17 +4,35 @@ library_name: keras
4
 
5
  ## Model description
6
 
7
- More information needed
 
 
 
 
 
 
 
 
8
 
9
  ## Intended uses & limitations
10
 
11
- More information needed
 
 
 
 
 
 
 
 
 
 
12
 
13
- ## Training and evaluation data
14
 
15
- More information needed
16
 
17
- ## Model Plot
 
18
 
19
  <details>
20
  <summary>View Model Plot</summary>
 
4
 
5
  ## Model description
6
 
7
+ This repo contains the model and the notebook for the Keras example [MelGAN-based spectrogram inversion using feature matching](https://keras.io/examples/audio/melgan_spectrogram_inversion/), which uses MelGAN to invert mel spectrograms back into audio waveforms.
8
+
9
+ Full credits go to [Darshan Deshpande](https://twitter.com/getdarshan)
10
+
11
+ Reproduced by [Vu Minh Chien](https://www.linkedin.com/in/vumichien/)
12
+
13
+ Motivation: Autoregressive vocoders have been ubiquitous for the majority of the history of speech processing, but for most of their existence they have lacked parallelism. MelGAN is a non-autoregressive, fully convolutional vocoder architecture used for purposes ranging from spectral inversion and speech enhancement to present-day state-of-the-art speech synthesis when used as a decoder with models like Tacotron2 or FastSpeech that convert text to mel spectrograms.
14
+
15
+ The LJSpeech dataset was used in this tutorial. It is primarily used for text-to-speech and consists of 13,100 discrete speech samples taken from 7 non-fiction books, with a total length of approximately 24 hours.
16
 
17
  ## Intended uses & limitations
18
 
19
+ The MelGAN implemented in this tutorial is similar to the original implementation. The only difference is the padding method for convolutions: this version uses 'same' padding instead of reflection padding.
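To make the padding difference concrete, here is a minimal pure-Python sketch of the two schemes on a toy 1-D signal. The helper names `pad_zero` and `pad_reflect` are hypothetical and are not part of the notebook; the sketch assumes that 'same' padding fills the borders with zeros (as Keras convolution layers do) and that reflection padding mirrors the signal around its edge samples without repeating them.

```python
def pad_zero(signal, pad):
    """Zero padding, the border fill that 'same' padding uses."""
    return [0.0] * pad + list(signal) + [0.0] * pad


def pad_reflect(signal, pad):
    """Reflection padding: mirror around the edge samples (edges not repeated).

    Assumes 0 < pad < len(signal).
    """
    signal = list(signal)
    left = signal[1:pad + 1][::-1]      # samples after the first, reversed
    right = signal[-pad - 1:-1][::-1]   # samples before the last, reversed
    return left + signal + right


toy = [1.0, 2.0, 3.0, 4.0]
zero_padded = pad_zero(toy, 2)
reflect_padded = pad_reflect(toy, 2)
print(zero_padded)
print(reflect_padded)
```

Zero padding introduces artificial silence at the borders, while reflection padding continues the signal with mirrored samples; both keep the convolution output the same length as the input when the stride is 1.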
20
+
21
+
22
+ ### Training hyperparameters
23
+
24
+ The following hyperparameters were used during training:
25
+ - generator_learning_rate: 1e-5
26
+ - discriminator_learning_rate: 1e-6
27
+ - train_batch_size: 16
28
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
29
+ - num_epochs: 20
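As a rough sketch, the hyperparameters listed above could be wired into Keras optimizers as follows. The names `HPARAMS`, `adam_kwargs`, and `build_optimizers` are hypothetical and chosen for illustration; the notebook's actual variable names and setup may differ. The Keras import is kept inside the function so the configuration part runs even without TensorFlow installed.

```python
# Values copied from the hyperparameter list above.
HPARAMS = {
    "generator_learning_rate": 1e-5,
    "discriminator_learning_rate": 1e-6,
    "train_batch_size": 16,
    "num_epochs": 20,
}


def adam_kwargs(learning_rate):
    """Keyword arguments for keras.optimizers.Adam matching the listed settings."""
    return {
        "learning_rate": learning_rate,
        "beta_1": 0.9,
        "beta_2": 0.999,
        "epsilon": 1e-8,  # set explicitly; Keras defaults to 1e-7
    }


def build_optimizers():
    """Create generator and discriminator optimizers (requires TensorFlow)."""
    from tensorflow import keras  # imported lazily; assumption: TF 2.x is installed

    gen_opt = keras.optimizers.Adam(**adam_kwargs(HPARAMS["generator_learning_rate"]))
    disc_opt = keras.optimizers.Adam(**adam_kwargs(HPARAMS["discriminator_learning_rate"]))
    return gen_opt, disc_opt
```

Using separate optimizers lets the generator and discriminator train at the different learning rates listed above.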
30
 
 
31
 
32
+ ## Model Plot
33
 
34
+ ### View Model Demo
35
+ ![Model Demo](./demo.png)
36
 
37
  <details>
38
  <summary>View Model Plot</summary>