Lakoc committed on
Commit 643ae12 · 1 Parent(s): 8529a0b

Added more info about this space.

Files changed (1)
  1. app.py +32 -10
app.py CHANGED
@@ -153,7 +153,9 @@ yt_transcribe = gr.Interface(
     outputs=["html", "text"],
     title="Transcribe YouTube",
     description=(
-        "Transcribe long-form YouTube videos with the click of a button! Select a model from the dropdown."
+        """
+        ### *Currently only works on local instances of this space, as youtube-dl does not function from Hugging Face servers.*
+        Transcribe long-form YouTube videos with the click of a button! Select a model from the dropdown."""
     ),
     allow_flagging="never",
 )
@@ -161,21 +163,41 @@ yt_transcribe = gr.Interface(
 with demo:
     gr.TabbedInterface([mf_transcribe, file_transcribe, yt_transcribe], ["Microphone", "Audio file", "YouTube"])
 
-    gr.Markdown(
-        "Disclaimer: This space currently runs on basic CPU hardware, so generation might take a bit longer. "
-        "You can clone the repository and run it locally for better performance. "
-        "Please refer to the [Hugging Face documentation](https://huggingface.co/docs/hub/spaces-overview#clone-the-repository) "
-        "on how to clone the repository and run it locally. "
-        "The model is not perfect and may make errors, so please use responsibly."
-    )
-
     gr.Markdown(
         """
-        ### Explore the Models:
+        ## Overview
+        This space demonstrates the performance of **DeCRED** (**De**coder-**C**entric **R**egularization in **E**ncoder-**D**ecoder) for automatic speech recognition (ASR).
+        DeCRED enhances model robustness and generalization, particularly in out-of-domain scenarios, by introducing auxiliary classifiers in the decoder layers of encoder-decoder ASR architectures.
+
+        ## Key Features
+        - **Auxiliary Classifiers**: DeCRED integrates auxiliary classifiers in the decoder module to regularize training, improving the model’s ability to generalize across domains.
+        - **Enhanced Decoding**: It proposes two new decoding strategies that leverage auxiliary classifiers to re-estimate token probabilities, resulting in more accurate ASR predictions.
+        - **Strong Baseline**: Built on the **E-branchformer** architecture, DeCRED achieves competitive word error rates (WER) compared to Whisper-medium and OWSM v3 while requiring significantly less training data and a smaller model size.
+        - **Out-of-Domain Performance**: DeCRED demonstrates strong generalization, reducing WERs by 2.7 and 2.9 points on the AMI and Gigaspeech datasets, respectively.
+
+        ## Disclaimer
+        This space currently runs on basic CPU hardware, so generation might take a bit longer (approximately four times the length of the audio).
+        You can clone the repository and run it locally for better performance.
+        Please refer to the [Hugging Face documentation](https://huggingface.co/docs/hub/spaces-overview#clone-the-repository)
+        for instructions on how to clone the repository and run it locally.
+        The model is not perfect and may make errors, so please use it responsibly.
+
+        ## Explore the Models
         - [DeCRED Base](https://huggingface.co/BUT-FIT/DeCRED-base)
         - [DeCRED Small](https://huggingface.co/BUT-FIT/DeCRED-small)
         - [ED Base](https://huggingface.co/BUT-FIT/ED-base)
         - [ED Small](https://huggingface.co/BUT-FIT/ED-small)
+
+        ## Citation
+        If you use DeCRED in your research, please cite the following paper:
+
+        ```bibtex
+        @misc{polok_2024_decred,
+            title={Improving Automatic Speech Recognition with Decoder-Centric Regularization in Encoder-Decoder Models},
+            author={Alexander Polok, Santosh Kesiraju, Karel Beneš, Lukáš Burget, Jan Černocký},
+            year={2024},
+        }
+        ```
         """
     )
 
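
The new disclaimer says generation on the basic CPU hardware takes approximately four times the length of the audio. As a rough illustration of what that means for a user, the sketch below turns the stated factor into a wait-time estimate. The function name `estimate_cpu_wait_seconds` and the `slowdown_factor` parameter are hypothetical and not part of app.py; the default of 4.0 is taken from the disclaimer text and is only an approximation.

```python
def estimate_cpu_wait_seconds(audio_seconds: float, slowdown_factor: float = 4.0) -> float:
    """Rough wait time for transcribing a clip on the basic CPU hardware.

    `slowdown_factor` is an assumption taken from the disclaimer
    ("approximately four times the length of the audio"); it is not measured here.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds * slowdown_factor


if __name__ == "__main__":
    # A 90-second clip would take roughly 6 minutes under the 4x assumption.
    print(estimate_cpu_wait_seconds(90))  # 360.0
```

Running the Space locally, as the disclaimer suggests, would correspond to a smaller `slowdown_factor`, though the actual value depends on the hardware used.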