Update README.md
README.md
CHANGED
@@ -17,41 +17,44 @@ ConFit is a pioneering organisation dedicated to advancing the fields of speech

 Audio classification:

-| Dataset | Classes | Task |
-| :---: | :---: | :---: | :---: | :---: |
-| WMMS | 32 | Multi-class | 1697 | 10.42 |
-| MSWC (English) | 271 | Multi-class | 33726 | 0.99 |
-| MSWC (Spanish) | 146 | Multi-class | 11759 | 0.99 |
-| MSWC (Indian) | 14 | Multi-class | 739 | 0.99 |
-| ESC50 | 50 | Multi-class | 2000 | 5.00 |
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
+| Dataset | Split Method | Classes | Task | # Clips | Average Duration | Sampling Rate |
+| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| WMMS | TT | 32 | Multi-class | 1697 | 10.42 | 16000 |
+| MSWC (English) | TVT | 271 | Multi-class | 33726 | 0.99 | 16000 |
+| MSWC (Spanish) | TVT | 146 | Multi-class | 11759 | 0.99 | 16000 |
+| MSWC (Indian) | TVT | 14 | Multi-class | 739 | 0.99 | 16000 |
+| ESC50 | 5-fold | 50 | Multi-class | 2000 | 5.00 | 44100 |
+| UrbanSound8K | | 10 | Multi-class | | | |
+| AudioSet | | 527 | Multi-label | | | |
+| MagnaTagATune | | | Multi-label | | | |
+| Medley-solos-DB | | 8 | Multi-class | | | 44100 |
+| Pianos | TVT | 8 | Multi-class | 668 | 4.86 | 16000 |
+| FSD-Kaggle-2019 (curated) | TT | 80 | Multi-label | 9451 | 8.93 | 44100 |
+| GTZAN | TVT | 10 | Multi-class | 930 | 30.02 | 22050 |
+| Nsynth (instrument) | TVT | 11 | Multi-class | 305979 | 4.00 | 16000 |
+| Nsynth (pitch) | TVT | 112 | Multi-class | 305979 | 4.00 | 16000 |
+| CREMA-D | TVT | 6 | Multi-class | 7442 | 2.54 | 16000 |
+| IEMOCAP | 5-fold | 4 | Multi-class | 5531 | 4.52 | 16000 |
+| EmoDB | TT | 7 | Multi-class | 535 | 2.77 | 16000 |
+| EMOVO | 6-fold | 7 | Multi-class | 588 | 3.12 | 48000 |
+| IRMAS | TT | 11 | Multi-label | 9579 | 7.16 | 44100 |
+| RAVDESS | 5-fold | 8 | Multi-class | 2880 | 3.70 | 48000 |
+| TIMIT | TVT | 630 | Multi-class | 6300 | 3.07 | 16000 |
+| LibriSpeech | TT | 2484 | Multi-class | 21933 | 3.75 | 16000 |

 Automated audio captioning:

-| Dataset |
-| :---: | :---: | :---: |
-| Music4All | | |
+| Dataset | # Clips | Average Duration | Sampling Rate |
+| :---: | :---: | :---: | :---: |
+| Music4All | | | |

 Music, speech, and noise:

-| Dataset |
-| :---: | :---: | :---: |
-| MUSAN | | |
-| RIR-Noise | | |
-| ARCA23K | | |
+| Dataset | # Clips | Average Duration | Sampling Rate |
+| :---: | :---: | :---: | :---: |
+| MUSAN | 2016 | 195.16 | 16000 |
+| RIR-Noise | 61260 | 1.54 | 16000 |
+| ARCA23K | | | |

 ## Contact Us
