Model Description

The Wav2Vec2 base model facebook/wav2vec2-base-960h, fine-tuned on a phoneme recognition task for the Dutch language.

Usage

To transcribe audio files into phonemes, the model can be used as a standalone acoustic model as follows:

 from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
 from datasets import load_dataset, Audio
 import torch
 
 # load the model and its processor (feature extractor + tokenizer)
 processor = Wav2Vec2Processor.from_pretrained("Clementapa/wav2vec2-base-960h-phoneme-reco-dutch")
 model = Wav2Vec2ForCTC.from_pretrained("Clementapa/wav2vec2-base-960h-phoneme-reco-dutch")
 
 # load a sample dataset; the base model expects 16 kHz audio, so resample on read
 ds = load_dataset("common_voice", "nl", split="validation")
 ds = ds.cast_column("audio", Audio(sampling_rate=16_000))
 
 # extract input features
 input_values = processor(ds[0]["audio"]["array"], sampling_rate=16_000, return_tensors="pt", padding="longest").input_values  # batch size 1
 
 # retrieve logits (no gradients are needed for inference)
 with torch.no_grad():
     logits = model(input_values).logits
 
 # take argmax and decode into a list of phoneme strings
 predicted_ids = torch.argmax(logits, dim=-1)
 transcription = processor.batch_decode(predicted_ids)
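
The last two steps above (argmax, then decode) amount to greedy CTC decoding: pick the most likely token for each audio frame, merge consecutive repeats, and drop the blank token. The sketch below illustrates that collapse in plain Python. The toy vocabulary, the frame ids, and the blank id of 0 are hypothetical values chosen for illustration; the model's real vocabulary and blank (padding) token come from its processor.

```python
from itertools import groupby


def ctc_greedy_collapse(ids, blank_id=0):
    """Merge consecutive repeated ids, then drop the blank id."""
    collapsed = [key for key, _ in groupby(ids)]
    return [i for i in collapsed if i != blank_id]


# Hypothetical toy vocabulary, not the model's real one.
toy_vocab = {1: "h", 2: "ɑ", 3: "l", 4: "oː"}

# Hypothetical per-frame argmax output; 0 stands for the CTC blank.
frame_ids = [0, 1, 1, 0, 2, 2, 2, 3, 0, 3, 4, 4, 0]

phoneme_ids = ctc_greedy_collapse(frame_ids)
phonemes = [toy_vocab[i] for i in phoneme_ids]
```

Note that the blank between the two runs of id 3 keeps them as two separate phonemes, whereas the repeated frames of ids 1, 2, and 4 each collapse to one.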