pre-trained model
#1
by
sjhu
- opened
Hi, the model card intro says "Wav2Vec2-Conformer with relative position embeddings, pre-trained and fine-tuned on 960 hours of Librispeech on 16kHz sampled speech audio", but the GitHub README says the pre-training dataset is the 60k-hour Libri-Light.
Which one is correct?
Thanks a lot!