stp99 committed on
Commit 8012980 · verified · 1 Parent(s): 469e9fd

Update README.md

Files changed (1)
  1. README.md +26 -26
README.md CHANGED
@@ -80,38 +80,38 @@ Important: before you can use this model, you must follow these steps:
  3. Download the Auto-AVSR video encoder weights from [here](https://drive.google.com/file/d/1shcWXUK2iauRhW9NbwCc25FjU1CoMm8i/view?usp=sharing) and put them in `path/to/some_directory_2`
  4. Go to `config.json` and change the `video_encoder._name_or_path` to `path/to/some_directory_2/vsr_trlrs3vox2_base.pth`

- ## Training data
+ ## Training Data

  ### Monolingual

- | TASK | Task name | Dataset | License | Metric(s) |
- | -------- | ---------------------------- | ------------------ | --------------- | ------------------------- |
- | **ASR** | Automatic Speech Recognition | **LibriHeavy** | CC-BY-4.0 | WER |
- | | | **CommonVoice** | Apache-2.0 | |
- | | | **LibriTTS** | CC BY 4.0 | |
- | | | **Spoken SQUAD** | CC-BY-SA-4.0 | |
- | | | **Speech-Massive** | CC-BY-NC-SA-4.0 | |
- | **VSR** | Visual Speech Recognition | **LRS2-BBC** | Custom | WER |
- | **SSUM** | Speech Summarization | **AMI** | CC-BY-4.0 | Rouge-1, Rouge-2, Rouge-L |
- | | | **ICSI** | CC-BY-4.0 | |
- | **SQA** | Spoken Question Answering | **Spoken SQUAD** | CC-BY-SA-4.0 | Accuracy, Exact Match, F1 |
+ | TASK | Task name | Dataset | License |
+ | -------- | ---------------------------- | ------------------ | --------------- |
+ | **ASR** | Automatic Speech Recognition | **LibriHeavy** | CC-BY-4.0 |
+ | | | **CommonVoice** | Apache-2.0 |
+ | | | **LibriTTS** | CC BY 4.0 |
+ | | | **Spoken SQUAD** | CC-BY-SA-4.0 |
+ | | | **Speech-Massive** | CC-BY-NC-SA-4.0 |
+ | **VSR** | Visual Speech Recognition | **LRS2-BBC** | Custom |
+ | **SSUM** | Speech Summarization | **AMI** | CC-BY-4.0 |
+ | | | **ICSI** | CC-BY-4.0 |
+ | **SQA** | Spoken Question Answering | **Spoken SQUAD** | CC-BY-SA-4.0 |

  ### Multilingual

- | TASK | Task name | Dataset | License | Metric(s) |
- | ---------------- | ----------------------------- | ------------------------------------ | ------------------------------------------ | ------------------- |
- | **ST** | Speech-to-text Translation | **CoVoST2** | CC0 | BLEU, COMET, BLEURT |
- | | | **FLEURS** | CC-BY-4.0 | |
- | | | **EuroParl-ST** | CC-BY-NC-4.0 | |
- | | | **ACL 60/60** | CC-BY-4.0 | |
- | **MT** | Machine Translation | **FLORES** | CC-BY-SA-4.0 | BLEU, COMET, BLEURT |
- | | | **ACL 60/60** | CC-BY-4.0 | |
- | | | **EuroParl-ST** | CC-BY-NC-4.0 | |
- | **TextInstruct** | Text Instruction Following | **Everything_Instruct_Multilingual** | Apache-2.0 | MMLU |
- | **SLU** | Spoken Language Understanding | **Speech-Massive** | CC-BY-NC-SA-4.0 | Intent Accuracy |
- | | | **SLURP** | CC BY 4.0 (text) <br> CC BY-NC 4.0 (audio) | |
-
- ## Evaluation data
+ | TASK | Task name | Dataset | License |
+ | ---------------- | ----------------------------- | ------------------------------------ | ------------------------------------------ |
+ | **ST** | Speech-to-text Translation | **CoVoST2** | CC0 |
+ | | | **FLEURS** | CC-BY-4.0 |
+ | | | **EuroParl-ST** | CC-BY-NC-4.0 |
+ | | | **ACL 60/60** | CC-BY-4.0 |
+ | **MT** | Machine Translation | **FLORES** | CC-BY-SA-4.0 |
+ | | | **ACL 60/60** | CC-BY-4.0 |
+ | | | **EuroParl-ST** | CC-BY-NC-4.0 |
+ | **TextInstruct** | Text Instruction Following | **Everything_Instruct_Multilingual** | Apache-2.0 |
+ | **SLU** | Spoken Language Understanding | **Speech-Massive** | CC-BY-NC-SA-4.0 |
+ | | | **SLURP** | CC BY 4.0 (text) <br> CC BY-NC 4.0 (audio) |
+
+ ## Evaluation Results
  coming soon...

  ## Framework versions
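
Step 4 in the diff context above asks for a manual edit of `config.json`. As a minimal sketch, the same change could be scripted. This assumes `video_encoder` is a nested JSON object holding a `_name_or_path` key, which the README's dotted name suggests but this diff does not confirm. The helper name `set_video_encoder_path` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def set_video_encoder_path(config_file, weights_file):
    """Point video_encoder._name_or_path in a config.json at the given weights file."""
    config_file = Path(config_file)
    config = json.loads(config_file.read_text(encoding="utf-8"))

    # Assumption: "video_encoder" is a nested object; create it if it is missing.
    video_encoder = config.setdefault("video_encoder", {})
    video_encoder["_name_or_path"] = str(weights_file)

    # Write the file back, leaving every other key untouched.
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

It would be called as `set_video_encoder_path("config.json", "path/to/some_directory_2/vsr_trlrs3vox2_base.pth")`, using the placeholder path from the README.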