Update README.md
README.md
CHANGED
@@ -25,7 +25,12 @@ tags:
Frame-VAD Multilingual MarbleNet v2.0 is a convolutional neural network for voice activity detection (VAD) that serves as the first step for Speech Recognition and Speaker Diarization. It is a frame-based model that outputs a speech probability for each 20-millisecond frame of the input audio. The model has 91.5K parameters, making it lightweight and efficient for real-time applications. <br>
To reduce false positive errors (cases where the model incorrectly detects speech when none is present), the model was trained with white noise and real-world noise perturbations. The audio volume was also varied during training. Additionally, the training data includes non-speech audio samples to help the model distinguish speech from non-speech sounds such as coughing, laughter, and breathing. <br>
+
+**Key Features**
+- Lightweight model with only 91.5K parameters
+- Robust against false positive errors
+- Outputs speech probability for each 20 ms audio frame
+- Multilingual support: Chinese, German, Russian, English, Spanish, and French

This model is ready for commercial use. <br>
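
The per-frame output described above (one speech probability per 20 ms frame) is usually turned into speech segments by a post-processing step. The following is a minimal sketch of one way that could look, not the model's or any library's actual post-processing: the function name `frames_to_segments` and the 0.5 threshold are assumptions made for illustration, and only the 20 ms frame length comes from the description.

```python
FRAME_SEC = 0.02  # 20 ms per frame, as stated in the model description


def frames_to_segments(probs, threshold=0.5, frame_sec=FRAME_SEC):
    """Merge consecutive frames whose probability meets the threshold
    into (start_sec, end_sec) speech segments.

    `threshold` is an illustrative assumption, not a documented default.
    """
    segments = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # speech begins at this frame
        elif p < threshold and start is not None:
            # speech ended at the previous frame; close the segment
            segments.append((round(start * frame_sec, 3), round(i * frame_sec, 3)))
            start = None
    if start is not None:
        # the audio ended while a speech segment was still open
        segments.append((round(start * frame_sec, 3), round(len(probs) * frame_sec, 3)))
    return segments


# Hypothetical probabilities for nine consecutive frames (180 ms of audio).
probs = [0.1, 0.2, 0.9, 0.95, 0.8, 0.3, 0.1, 0.7, 0.9]
print(frames_to_segments(probs))
```

In practice, smoothing the probabilities or enforcing a minimum segment duration before thresholding tends to reduce spurious short segments; those steps are omitted here to keep the sketch small.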