nianlong committed on
Commit
18ab89f
1 Parent(s): b298633

Update README.md

  1. README.md +18 -1
README.md CHANGED
@@ -4,7 +4,7 @@ license: apache-2.0
  # Positive Transfer Of The Whisper Speech Transformer To Human And Animal Voice Activity Detection
  We proposed WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for both human and animal Voice Activity Detection (VAD). For more details, please refer to our paper
 
- >
+
  > [Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection](https://doi.org/10.1101/2023.09.30.560270)
  >
  > Nianlong Gu, Kanghwi Lee, Maris Basha, Sumit Kumar Ram, Guanghao You, Richard H. R. Hahnloser <br>
@@ -57,5 +57,22 @@ spec_viewer.visualize( audio = audio, sr = sr, min_frequency= min_frequency, pre
  Run it in Google Colab: <a href="https://colab.research.google.com/github/nianlonggu/WhisperSeg/blob/master/docs/WhisperSeg_Voice_Activity_Detection_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  For more details, please refer to the GitHub repository: https://github.com/nianlonggu/WhisperSeg
 
+ ## Citation
+ When using our code or models for your work, please cite the following paper:
+ ```
+ @article {Gu2023.09.30.560270,
+ author = {Nianlong Gu and Kanghwi Lee and Maris Basha and Sumit Kumar Ram and Guanghao You and Richard Hahnloser},
+ title = {Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
+ elocation-id = {2023.09.30.560270},
+ year = {2023},
+ doi = {10.1101/2023.09.30.560270},
+ publisher = {Cold Spring Harbor Laboratory},
+ abstract = {This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful threshold selection, WhisperSeg processes entire spectrograms of long audio and generates plain text representations of onset, offset, and type of voice activity. Processing a longer audio context with a larger network greatly improves detection accuracy from few labeled examples. We further demonstrate a positive transfer of detection performance to new animal species, making our approach viable in the data-scarce multi-species setting. Competing Interest Statement: The authors have declared no competing interest.},
+ URL = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270},
+ eprint = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270.full.pdf},
+ journal = {bioRxiv}
+ }
+ ```
+
  ## Contact
 