speechbrainteam TanelAlumae committed on
Commit
0253049
1 Parent(s): 080588e

Update README.md (#11)


- Update README.md (1483d2d5c668d7921c757fc7cd014fa2cbe7dc5a)


Co-authored-by: Tanel Alumäe <[email protected]>

Files changed (1): README.md (+3 −3)
README.md CHANGED

@@ -133,7 +133,7 @@ widget:
 
 ## Model description
 
-This is a spoken language recognition model trained on the VoxLingua107 dataset using SpeechBrain.
+This is a spoken language recognition model trained on the [VoxLingua107 dataset](https://cs.taltech.ee/staff/tanel.alumae/data/voxlingua107/) using SpeechBrain.
 The model uses the ECAPA-TDNN architecture that has previously been used for speaker recognition. However, it uses
 more fully connected hidden layers after the embedding layer, and cross-entropy loss was used for training.
 We observed that this improved the performance of extracted utterance embeddings for downstream tasks.

@@ -259,7 +259,7 @@ The model has two uses:
 - use as an utterance-level feature (embedding) extractor, for creating a dedicated language ID model on your own data
 
 The model is trained on automatically collected YouTube data. For more
-information about the dataset, see [here](http://bark.phon.ioc.ee/voxlingua107/).
+information about the dataset, see [here](https://cs.taltech.ee/staff/tanel.alumae/data/voxlingua107/).
 
 
 #### How to use

@@ -330,7 +330,7 @@ Since the model is trained on VoxLingua107, it has many limitations and biases,
 
 ## Training data
 
-The model is trained on [VoxLingua107](http://bark.phon.ioc.ee/voxlingua107/).
+The model is trained on [VoxLingua107](https://cs.taltech.ee/staff/tanel.alumae/data/voxlingua107/).
 
 VoxLingua107 is a speech dataset for training spoken language identification models.
 The dataset consists of short speech segments automatically extracted from YouTube videos and labeled according to the language of the video title and description, with some post-processing steps to filter out false positives.