Overview
We present CLSRIL-23 (Cross Lingual Speech Representations on Indic Languages), a self-supervised audio pre-trained model that learns cross-lingual speech representations from raw audio across 23 Indic languages. It is built on top of wav2vec 2.0, which is trained by solving a contrastive task over masked latent speech representations while jointly learning a quantization of the latents that is shared across all languages.
The original repo contains the models in fairseq format.
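As a rough illustration of the contrastive task mentioned above, the following is a minimal pure-Python sketch for a single masked time step. It is not the fairseq implementation: the function names, the toy 2-dimensional vectors, and the temperature value are assumptions chosen for illustration. The wav2vec 2.0 objective also includes a codebook diversity term, which is omitted here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(context, target, distractors, temperature=0.1):
    """Negative log-probability of picking the true quantized target
    among the target plus distractors, scored by cosine similarity.

    context:     context-network output at a masked time step
    target:      quantized latent for that same time step
    distractors: quantized latents sampled from other masked steps
    temperature: illustrative value (an assumption of this sketch)
    """
    candidates = [target] + list(distractors)
    logits = [cosine(context, q) / temperature for q in candidates]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]

# Toy example: the context vector points at the true target.
good = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
# Toy example: the context vector points at a distractor instead.
bad = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
print(f"aligned loss: {good:.5f}, misaligned loss: {bad:.5f}")
```

The loss is small when the context vector is closest to its own quantized target and large when a distractor is closer, which is the signal that drives pre-training.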
Languages in the pretraining dataset
Language | Data (hours) |
---|---|
Assamese | 254.9 |
Bengali | 331.3 |
Bodo | 26.9 |
Dogri | 17.1 |
English | 819.7 |
Gujarati | 336.7 |
Hindi | 4563.7 |
Kannada | 451.8 |
Kashmiri | 67.8 |
Konkani | 36.8 |
Maithili | 113.8 |
Malayalam | 297.7 |
Manipuri | 171.9 |
Marathi | 458.2 |
Nepali | 31.6 |
Odia | 131.4 |
Punjabi | 486.05 |
Sanskrit | 58.8 |
Santali | 6.56 |
Sindhi | 16 |
Tamil | 542.6 |
Telugu | 302.8 |
Urdu | 259.68 |
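To get a sense of how unevenly the pretraining data is distributed, the table above can be summarized with a short script. This is only a convenience sketch: the dictionary is transcribed from the table, and the variable names are illustrative.

```python
# Hours of pretraining audio per language, transcribed from the table above.
hours = {
    "Assamese": 254.9, "Bengali": 331.3, "Bodo": 26.9, "Dogri": 17.1,
    "English": 819.7, "Gujarati": 336.7, "Hindi": 4563.7, "Kannada": 451.8,
    "Kashmiri": 67.8, "Konkani": 36.8, "Maithili": 113.8, "Malayalam": 297.7,
    "Manipuri": 171.9, "Marathi": 458.2, "Nepali": 31.6, "Odia": 131.4,
    "Punjabi": 486.05, "Sanskrit": 58.8, "Santali": 6.56, "Sindhi": 16.0,
    "Tamil": 542.6, "Telugu": 302.8, "Urdu": 259.68,
}

total_hours = sum(hours.values())
# Share of the total contributed by each language, largest first.
shares = sorted(
    ((lang, h / total_hours) for lang, h in hours.items()),
    key=lambda item: item[1],
    reverse=True,
)

print(f"{len(hours)} languages, {total_hours:.2f} hours in total")
for lang, share in shares[:5]:
    print(f"{lang:<10} {share:6.1%}")
```

Summing the table gives roughly 9,784 hours, with Hindi alone accounting for a little under half of the total and several languages (Santali, Sindhi, Dogri) contributing fewer than 20 hours each.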
Repo for training:
Experimentation platform built on top of fairseq.