Marianoleiras committed on
Commit a794f21 · verified · 1 Parent(s): dde4de1

Update README.md

Files changed (1)
  1. README.md +3 -8
README.md CHANGED
@@ -22,15 +22,10 @@ should probably proofread and complete it, then remove this comment. -->
  # whisper-small-es-ja
 
  ## Model Overview
- This model is a fine-tuned version of OpenAI's Whisper-small, specifically trained on the **Marianoleiras/voxpopuli_es-ja** dataset for Spanish-to-Japanese speech-to-text (STT) tasks.
- It employs the Whisper architecture, which is known for its robustness in multilingual speech recognition and translation scenarios.
+ This model was developed as part of a workshop organized by Yasmin Moslem, focusing on **speech-to-text pipelines**.
+ The workshop's primary goal was to enable accurate transcription and translation of spoken source languages into written target languages while learning about end-to-end and cascaded approaches in the process.
 
- The primary goal of this model is to enable accurate end-to-end transcription and translation of spoken Spanish into written Japanese.
- It was developed as part of a **three-week workshop organized by Yasmin Moslem**, focusing on speech-to-text pipelines.
- The workshop involved:
- 1. **Dataset creation** during the first week.
- 2. **Model training and optimization** during the second week.
- 3. **In-depth exploration and evaluation** in the third week.
+ This model represents an **end-to-end solution** for Spanish-to-Japanese speech-to-text (STT) tasks and is a fine-tuned version of OpenAI's Whisper-small, specifically trained on the **[Marianoleiras/voxpopuli_es-ja](https://huggingface.co/datasets/Marianoleiras/voxpopuli_es-ja)** dataset for Spanish-to-Japanese speech-to-text (STT) tasks.
 
  The model achieves competitive performance metrics on the provided dataset:
 