
W2v-BERT 2.0 speech encoder fine-tuned for Galician and Spanish

A fine-tuned version of the Conformer-based W2v-BERT 2.0 speech encoder, as described in Section 3.2.1 of the Seamless paper, which is at the core of the Seamless models.

The base model was pre-trained on 4.5M hours of unlabeled audio data covering more than 143 languages. It requires fine-tuning to be used for downstream tasks such as automatic speech recognition (ASR) or audio classification.

| Model Name    | #params | Checkpoint |
|---------------|---------|------------|
| W2v-BERT 2.0  | 600M    | checkpoint |

This model and its training are supported by 🤗 Transformers; see the docs for more details.

🤗 Transformers usage

This is a bare checkpoint without any modeling head, and thus requires fine-tuning to be used for downstream tasks such as ASR. You can, however, use it to extract audio embeddings from the top layer with this code snippet:

from transformers import AutoProcessor, Wav2Vec2BertModel
import torch
from datasets import load_dataset

dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
dataset = dataset.sort("id")
sampling_rate = dataset.features["audio"].sampling_rate

processor = AutoProcessor.from_pretrained("andrespm/w2v-bert-2.0-multi-gl-es-v1.0")
model = Wav2Vec2BertModel.from_pretrained("andrespm/w2v-bert-2.0-multi-gl-es-v1.0")

# audio file is decoded on the fly
inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
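The encoder returns frame-level hidden states in `outputs.last_hidden_state`, with shape `(batch, frames, hidden_size)` (hidden size 1024 for this 600M checkpoint). If you need a single vector per utterance, e.g. for audio classification, one common recipe is attention-mask-aware mean pooling. A minimal sketch on a dummy tensor (the `mean_pool` helper is illustrative, not part of this model card):

```python
import torch

def mean_pool(hidden_states, attention_mask):
    # hidden_states: (batch, frames, hidden), attention_mask: (batch, frames)
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
    summed = (hidden_states * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1)  # avoid division by zero
    return summed / counts

# Dummy stand-in for outputs.last_hidden_state with two utterances,
# the second padded after frame 60.
hidden = torch.randn(2, 100, 1024)
mask = torch.ones(2, 100, dtype=torch.long)
mask[1, 60:] = 0

emb = mean_pool(hidden, mask)
print(emb.shape)  # torch.Size([2, 1024])
```

In real use you would pass `outputs.last_hidden_state` together with the attention mask returned by the processor, so padded frames do not dilute the embedding.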


