This model is released under the Apache License, Version 2.0.
imprt/kushinada-hubert-large
This is a Japanese HuBERT Large model pre-trained on 62,215 hours of audio, extracted by voice activity detection from large-scale Japanese TV broadcast recordings.
This model was trained using code from the official repository.
```python
import soundfile as sf
import torch
from transformers import AutoFeatureExtractor, AutoModel

model_id = "imprt/kushinada-hubert-large"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Load a 16 kHz mono waveform
audio_file = "/path/to/16k_audio_file"
audio_input, sr = sf.read(audio_file)

# Convert the waveform into model inputs and extract frame-level features
inputs = feature_extractor(audio_input, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state
```
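The snippet above expects audio already at 16 kHz. If your recordings use a different sampling rate, they can be resampled first. A minimal sketch using SciPy's polyphase resampler (the `to_16k` helper and the choice of `scipy.signal.resample_poly` are illustrative assumptions, not part of the original card):

```python
import numpy as np
from scipy.signal import resample_poly

def to_16k(audio: np.ndarray, sr: int, target_sr: int = 16000) -> np.ndarray:
    """Resample a mono waveform to 16 kHz using polyphase filtering."""
    if sr == target_sr:
        return audio
    g = int(np.gcd(sr, target_sr))
    return resample_poly(audio, target_sr // g, sr // g)

# Example: a 1-second 44.1 kHz sine wave becomes a 1-second 16 kHz waveform
sr = 44100
t = np.linspace(0, 1, sr, endpoint=False)
wave = np.sin(2 * np.pi * 440 * t)
wave_16k = to_16k(wave, sr)  # 16000 samples
```

Polyphase resampling applies an anti-aliasing filter before decimation, which matters here because aliasing artifacts would distort the features the model sees.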
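A HuBERT Large encoder emits one 1024-dimensional vector per ~20 ms audio frame. For utterance-level downstream tasks (e.g. speaker or language identification), a common practice, not something prescribed by this card, is to mean-pool those frame vectors into a single fixed-size embedding. A sketch with dummy NumPy data standing in for the model's output:

```python
import numpy as np

# Hypothetical frame-level features shaped (num_frames, hidden_size),
# as a HuBERT Large forward pass would produce (hidden_size = 1024)
num_frames, hidden_size = 49, 1024
hidden_states = np.random.randn(num_frames, hidden_size).astype(np.float32)

# Mean-pool over the time axis to get one fixed-size utterance embedding
utterance_embedding = hidden_states.mean(axis=0)  # shape: (1024,)
```

Mean-pooling discards temporal order, so for sequence-labeling tasks (e.g. ASR) the per-frame features should be used directly instead.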
```bibtex
@article{journals/corr/abs-2106-07447,
  author  = {Hsu, Wei-Ning and Bolte, Benjamin and Tsai, Yao-Hung Hubert and Lakhotia, Kushal and Salakhutdinov, Ruslan and Mohamed, Abdelrahman},
  title   = {{HuBERT}: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units},
  journal = {CoRR},
  volume  = {abs/2106.07447},
  year    = {2021},
  url     = {https://arxiv.org/abs/2106.07447}
}
```