Committed by Lakoc
Commit 643ae12 · 1 Parent(s): 8529a0b
Added more info about this space.
app.py CHANGED
@@ -153,7 +153,9 @@ yt_transcribe = gr.Interface(
     outputs=["html", "text"],
     title="Transcribe YouTube",
     description=(
-        "
+        """
+        ### *Currently only works on local instances of this space, as youtube-dl does not function from Hugging Face servers.*
+        Transcribe long-form YouTube videos with the click of a button! Select a model from the dropdown."""
     ),
     allow_flagging="never",
 )
@@ -161,21 +163,41 @@ yt_transcribe = gr.Interface(
 with demo:
     gr.TabbedInterface([mf_transcribe, file_transcribe, yt_transcribe], ["Microphone", "Audio file", "YouTube"])
 
-    gr.Markdown(
-        "Disclaimer: This space currently runs on basic CPU hardware, so generation might take a bit longer. "
-        "You can clone the repository and run it locally for better performance. "
-        "Please refer to the [Hugging Face documentation](https://huggingface.co/docs/hub/spaces-overview#clone-the-repository) "
-        "on how to clone the repository and run it locally. "
-        "The model is not perfect and may make errors, so please use responsibly."
-    )
-
     gr.Markdown(
         """
-
+        ## Overview
+        This space demonstrates the performance of **DeCRED** (**De**coder-**C**entric **R**egularization in **E**ncoder-**D**ecoder) for automatic speech recognition (ASR).
+        DeCRED enhances model robustness and generalization, particularly in out-of-domain scenarios, by introducing auxiliary classifiers in the decoder layers of encoder-decoder ASR architectures.
+
+        ## Key Features
+        - **Auxiliary Classifiers**: DeCRED integrates auxiliary classifiers in the decoder module to regularize training, improving the model’s ability to generalize across domains.
+        - **Enhanced Decoding**: It proposes two new decoding strategies that leverage auxiliary classifiers to re-estimate token probabilities, resulting in more accurate ASR predictions.
+        - **Strong Baseline**: Built on the **E-branchformer** architecture, DeCRED achieves competitive word error rates (WER) compared to Whisper-medium and OWSM v3 while requiring significantly less training data and a smaller model size.
+        - **Out-of-Domain Performance**: DeCRED demonstrates strong generalization, reducing WERs by 2.7 and 2.9 points on the AMI and Gigaspeech datasets, respectively.
+
+        ## Disclaimer
+        This space currently runs on basic CPU hardware, so generation might take a bit longer (approximately four times the length of the audio).
+        You can clone the repository and run it locally for better performance.
+        Please refer to the [Hugging Face documentation](https://huggingface.co/docs/hub/spaces-overview#clone-the-repository)
+        for instructions on how to clone the repository and run it locally.
+        The model is not perfect and may make errors, so please use it responsibly.
+
+        ## Explore the Models
         - [DeCRED Base](https://huggingface.co/BUT-FIT/DeCRED-base)
         - [DeCRED Small](https://huggingface.co/BUT-FIT/DeCRED-small)
         - [ED Base](https://huggingface.co/BUT-FIT/ED-base)
         - [ED Small](https://huggingface.co/BUT-FIT/ED-small)
+
+        ## Citation
+        If you use DeCRED in your research, please cite the following paper:
+
+        ```bibtex
+        @misc{polok_2024_decred,
+              title={Improving Automatic Speech Recognition with Decoder-Centric Regularization in Encoder-Decoder Models},
+              author={Alexander Polok, Santosh Kesiraju, Karel Beneš, Lukáš Burget, Jan Černocký},
+              year={2024},
+        }
+        ```
         """
     )
 
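The Key Features text added in this commit compares models by word error rate (WER) and reports reductions of 2.7 and 2.9 "points", presumably absolute percentage points. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration. This is not code from `app.py`, and it is not necessarily the scoring pipeline used for the reported numbers, which would typically also apply text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper: tokens are obtained by splitting on whitespace,
    with no case folding or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. A drop of 2.7 points under this reading would mean, for example, going from a WER of 0.200 to 0.173.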