Update README.md
README.md
CHANGED
@@ -80,38 +80,38 @@ Important: before you can use this model, you must follow these steps:
 3. Download the Auto-AVSR video encoder weights from [here](https://drive.google.com/file/d/1shcWXUK2iauRhW9NbwCc25FjU1CoMm8i/view?usp=sharing) and put them in `path/to/some_directory_2`
 4. Go to `config.json` and change the `video_encoder._name_or_path` to `path/to/some_directory_2/vsr_trlrs3vox2_base.pth`

-## Training
+## Training Data

 ### Monolingual

-| TASK | Task name | Dataset | License |
-| -------- | ---------------------------- | ------------------ | --------------- |
-| **ASR** | Automatic Speech Recognition | **LibriHeavy** | CC-BY-4.0 |
-| | | **CommonVoice** | Apache-2.0 |
-| | | **LibriTTS** | CC BY 4.0 |
-| | | **Spoken SQUAD** | CC-BY-SA-4.0 |
-| | | **Speech-Massive** | CC-BY-NC-SA-4.0 |
-| **VSR** | Visual Speech Recognition | **LRS2-BBC** | Custom |
-| **SSUM** | Speech Summarization | **AMI** | CC-BY-4.0 |
-| | | **ICSI** | CC-BY-4.0 |
-| **SQA** | Spoken Question Answering | **Spoken SQUAD** | CC-BY-SA-4.0 |
+| TASK | Task name | Dataset | License |
+| -------- | ---------------------------- | ------------------ | --------------- |
+| **ASR** | Automatic Speech Recognition | **LibriHeavy** | CC-BY-4.0 |
+| | | **CommonVoice** | Apache-2.0 |
+| | | **LibriTTS** | CC BY 4.0 |
+| | | **Spoken SQUAD** | CC-BY-SA-4.0 |
+| | | **Speech-Massive** | CC-BY-NC-SA-4.0 |
+| **VSR** | Visual Speech Recognition | **LRS2-BBC** | Custom |
+| **SSUM** | Speech Summarization | **AMI** | CC-BY-4.0 |
+| | | **ICSI** | CC-BY-4.0 |
+| **SQA** | Spoken Question Answering | **Spoken SQUAD** | CC-BY-SA-4.0 |

 ### Multilingual

-| TASK | Task name | Dataset | License |
-| ---------------- | ----------------------------- | ------------------------------------ | ------------------------------------------ |
-| **ST** | Speech-to-text Translation | **CoVoST2** | CC0 |
-| | | **FLEURS** | CC-BY-4.0 |
-| | | **EuroParl-ST** | CC-BY-NC-4.0 |
-| | | **ACL 60/60** | CC-BY-4.0 |
-| **MT** | Machine Translation | **FLORES** | CC-BY-SA-4.0 |
-| | | **ACL 60/60** | CC-BY-4.0 |
-| | | **EuroParl-ST** | CC-BY-NC-4.0 |
-| **TextInstruct** | Text Instruction Following | **Everything_Instruct_Multilingual** | Apache-2.0 |
-| **SLU** | Spoken Language Understanding | **Speech-Massive** | CC-BY-NC-SA-4.0 |
-| | | **SLURP** | CC BY 4.0 (text) <br> CC BY-NC 4.0 (audio) |
-
-## Evaluation
+| TASK | Task name | Dataset | License |
+| ---------------- | ----------------------------- | ------------------------------------ | ------------------------------------------ |
+| **ST** | Speech-to-text Translation | **CoVoST2** | CC0 |
+| | | **FLEURS** | CC-BY-4.0 |
+| | | **EuroParl-ST** | CC-BY-NC-4.0 |
+| | | **ACL 60/60** | CC-BY-4.0 |
+| **MT** | Machine Translation | **FLORES** | CC-BY-SA-4.0 |
+| | | **ACL 60/60** | CC-BY-4.0 |
+| | | **EuroParl-ST** | CC-BY-NC-4.0 |
+| **TextInstruct** | Text Instruction Following | **Everything_Instruct_Multilingual** | Apache-2.0 |
+| **SLU** | Spoken Language Understanding | **Speech-Massive** | CC-BY-NC-SA-4.0 |
+| | | **SLURP** | CC BY 4.0 (text) <br> CC BY-NC 4.0 (audio) |
+
+## Evaluation Results
 coming soon...

 ## Framework versions
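
Step 4 in the context lines of this hunk (pointing `video_encoder._name_or_path` at the downloaded weights) could also be scripted rather than edited by hand. The following is only a minimal sketch: it assumes `config.json` is plain JSON and that `video_encoder` is a nested object holding a `_name_or_path` key, as the dotted name in the README suggests. The helper name `set_video_encoder_path` is hypothetical and not part of the model repository.

```python
import json
from pathlib import Path


def set_video_encoder_path(config_path, weights_path):
    """Set video_encoder._name_or_path in a JSON config file (hypothetical helper)."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # Assumption: "video_encoder" is a nested object in config.json.
    # setdefault creates it if it is missing, leaving all other keys untouched.
    config.setdefault("video_encoder", {})["_name_or_path"] = str(weights_path)

    # Write the updated config back in place, pretty-printed.
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Under those assumptions, a call such as `set_video_encoder_path("config.json", "path/to/some_directory_2/vsr_trlrs3vox2_base.pth")` would rewrite the file in place and return the updated dictionary.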