Update app.py
app.py
CHANGED
@@ -164,7 +164,7 @@ with gr.Blocks(theme=theme, css=css) as demo:
 gr.Markdown(
 f"""
 ## 🎶YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
-
+## Model card:
 - Model name: `{model_name}`
 - Encoder backbone: Perceiver-TF + Mixture of Experts (2/8)
 - Decoder backbone: Multi-channel T5-small
@@ -173,7 +173,7 @@ with gr.Blocks(theme=theme, css=css) as demo:
 - Augmentation strategy: Intra-/Cross dataset stem augment, No Pitch-shifting
 - FP Precision: BF16-mixed for training, FP16 for inference
 
-
+## Caution:
 - Currently running on CPU, and it takes longer than 3 minutes for a 30-second input. Please try [GPU-HuggingFace-demo](mimbres/YourMT3) for fast inference.
 - For acadmic reproduction purpose, we strongly recommend to use [Colab Demo](https://colab.research.google.com/drive/1AgOVEBfZknDkjmSRA7leoa81a2vrnhBG?usp=sharing) with multiple checkpoints.
 
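The change replaces two blank lines in the f-string passed to `gr.Markdown` with the headings `## Model card:` and `## Caution:`. The following is a minimal sketch of that pattern, not the actual `app.py`: the helper names `build_header` and `build_demo` are hypothetical, the bullet list is shortened, and the use of `textwrap.dedent` is an assumption made here to keep the indented string readable.

```python
import textwrap


def build_header(model_name: str) -> str:
    """Return the demo header as Markdown, including the two section headings."""
    # Hypothetical helper: the bullets are a shortened copy of those in the diff.
    text = f"""
    ## 🎶YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation

    ## Model card:
    - Model name: `{model_name}`
    - Encoder backbone: Perceiver-TF + Mixture of Experts (2/8)
    - Decoder backbone: Multi-channel T5-small

    ## Caution:
    - Currently running on CPU, and it takes longer than 3 minutes for a 30-second input.
    """
    # Remove the common leading indentation so no line starts with stray spaces.
    return textwrap.dedent(text).strip()


def build_demo(model_name: str):
    """Wrap the header in a Gradio Blocks app (requires the `gradio` package)."""
    import gradio as gr  # imported lazily so build_header works without Gradio

    with gr.Blocks() as demo:
        gr.Markdown(build_header(model_name))
    return demo
```

With Gradio installed, `build_demo("example-model").launch()` would serve the page; `"example-model"` is a placeholder, not a real checkpoint name.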