
metadata

title: README
emoji: 🏃
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false

diarizers-community aims to promote speaker diarization on the Hugging Face hub. It contains:

A collection of multilingual speaker diarization datasets that are compatible with the diarizers library. They have been processed using diarizers scripts.

The available datasets are CallHome (Japanese, Chinese, German, Spanish, English), AMI Corpus (English), VoxConverse (English), and Simsamu (French). We aim to add more datasets in the future to better support speaker diarization on the Hub.
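As a minimal sketch of how one of these datasets might be loaded with the Hugging Face `datasets` library: the repository id `diarizers-community/callhome`, the three-letter configuration names, and the helper names `callhome_config` and `load_callhome` are assumptions made for illustration, so check the dataset pages on the Hub for the exact identifiers.

```python
# Assumed mapping from CallHome language to dataset configuration name.
# These codes are illustrative; verify them on the dataset page.
CALLHOME_CONFIGS = {
    "Japanese": "jpn",
    "Chinese": "zho",
    "German": "deu",
    "Spanish": "spa",
    "English": "eng",
}

CALLHOME_REPO = "diarizers-community/callhome"  # assumed repository id


def callhome_config(language: str) -> str:
    """Return the (assumed) configuration name for a CallHome language."""
    try:
        return CALLHOME_CONFIGS[language]
    except KeyError:
        raise ValueError(f"Unsupported CallHome language: {language!r}") from None


def load_callhome(language: str):
    """Download one CallHome language subset from the Hub."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(CALLHOME_REPO, callhome_config(language))
```

Calling `load_callhome("Japanese")` would then download the Japanese subset, provided the identifiers above match what is published on the Hub.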

A collection of multilingual fine-tuned segmentation model baselines compatible with pyannote.

Each model has been fine-tuned on a specific CallHome language subset. They achieve better performance on multilingual data than pyannote's pre-trained segmentation-3.0 model:

First Header	Second Header
Content Cell	Content Cell
Content Cell	Content Cell

Note: Results have been obtained using the test script from diarizers.
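The sketch below shows one way a fine-tuned segmentation model could be plugged into a pyannote diarization pipeline. It is modelled on the usage pattern in the diarizers documentation, but the model id pattern, the base pipeline id `pyannote/speaker-diarization-3.1`, and the helper names `fine_tuned_model_id` and `build_pipeline` are assumptions here, and the base pipeline may require accepting its terms and authenticating with the Hub.

```python
BASE_PIPELINE = "pyannote/speaker-diarization-3.1"  # assumed base pipeline id


def fine_tuned_model_id(config: str) -> str:
    """Build the (assumed) Hub id of the model fine-tuned on one CallHome subset."""
    return f"diarizers-community/speaker-segmentation-fine-tuned-callhome-{config}"


def build_pipeline(config: str):
    """Return a pyannote pipeline whose segmentation model is the fine-tuned one."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without torch, diarizers, or pyannote.audio installed.
    import torch
    from diarizers import SegmentationModel
    from pyannote.audio import Pipeline

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Load the pre-trained pipeline (Hub access must already be configured).
    pipeline = Pipeline.from_pretrained(BASE_PIPELINE)
    pipeline.to(device)

    # Swap in the fine-tuned segmentation model, converted to pyannote format.
    segmentation = SegmentationModel().from_pretrained(fine_tuned_model_id(config))
    pipeline._segmentation.model = segmentation.to_pyannote_model().to(device)
    return pipeline
```

Replacing only the segmentation model keeps the rest of the pre-trained pipeline (embedding and clustering) unchanged, which is why the fine-tuned models can be described as compatible with pyannote.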

Together with diarizers-community, we release:

diarizers, a library for fine-tuning pyannote speaker diarization models using the Hugging Face ecosystem.
A Google Colab notebook with a step-by-step guide on how to use diarizers.
