Update README.md
Committed by kamilakesbi
Commit 40c310c, 1 parent: 59a0a12
Changed file: README.md
@@ -7,11 +7,20 @@ sdk: static
 pinned: false
 ---
 
-[diarizers-community](https://huggingface.co/diarizers-community) aims to promote speaker diarization on the Hugging Face hub. It comes with [diarizers](https://github.com/kamilakesbi/diarizers), a library for fine-tuning pyannote models that is compatible with the Hugging Face ecosystem.
+[diarizers-community](https://huggingface.co/diarizers-community) aims to promote speaker diarization on the Hugging Face hub. It comes with [diarizers](https://github.com/kamilakesbi/diarizers), a library for fine-tuning pyannote speaker diarization models that is compatible with the Hugging Face ecosystem.
 
-
+This organization contains:
+
+- A collection of [multilingual speaker diarization datasets](https://huggingface.co/collections/diarizers-community/speaker-diarization-datasets-66261b8d571552066e003788) that are compatible with diarizers. They have been processed using [diarizers scripts](https://github.com/kamilakesbi/diarizers/blob/main/datasets/README.md).
+
+The currently available datasets are CallHome (Japanese, Chinese, German, Spanish, English), the AMI Corpus (English), VoxConverse (English) and Simsamu (French). We aim to add more datasets in the future to support speaker diarization on the Hub.
+
+- A collection of [5 fine-tuned segmentation models](https://huggingface.co/collections/diarizers-community/models-66261d0f9277b825c807ff2a) that serve as baselines and can be used in a pyannote speaker diarization pipeline.
+
+- Each model has been fine-tuned on a specific language of the CallHome dataset. Compared to the pretrained [pyannote segmentation model](https://huggingface.co/pyannote/segmentation-3.0), they reach better performance on each of the CallHome test sets:
+
+** ADD BENCHMARK **
 
-- A collection of speaker diarization datasets that are compatible with diarizers. They have been generated using [diarizers scripts](https://github.com/kamilakesbi/diarizers/blob/main/datasets/README.md). Each of these datasets comes with the following features:
 
 
 
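The README change says the fine-tuned segmentation baselines can be used in a pyannote speaker diarization pipeline, but does not show how. The sketch below is one possible way to do that, not the author's documented method. It assumes pyannote.audio 3.x and the `diarizers` library are installed, that `pyannote/speaker-diarization-3.1` is the base pipeline (a gated model, so access must be granted on the Hub first), and that `diarizers-community/speaker-segmentation-fine-tuned-callhome-jpn` is one of the five baselines. Neither identifier appears in the diff itself, and `audio.wav` in the note that follows is a hypothetical file.

```python
"""Sketch: swap a fine-tuned segmentation baseline into a pyannote pipeline."""

# Assumed Hub identifiers; they are not stated in the README diff.
BASE_PIPELINE = "pyannote/speaker-diarization-3.1"
FINETUNED_MODEL = "diarizers-community/speaker-segmentation-fine-tuned-callhome-jpn"


def build_pipeline(finetuned_model=FINETUNED_MODEL, base_pipeline=BASE_PIPELINE):
    """Load the base pipeline and replace its segmentation model."""
    # Heavy dependencies are imported lazily so that this module can be
    # imported on a machine where they are not installed.
    import torch
    from diarizers import SegmentationModel
    from pyannote.audio import Pipeline

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Pretrained pipeline (requires accepted terms and a Hub login).
    pipeline = Pipeline.from_pretrained(base_pipeline)
    pipeline.to(device)

    # Load the fine-tuned checkpoint and convert it to a pyannote model.
    model = SegmentationModel().from_pretrained(finetuned_model)
    pipeline._segmentation.model = model.to_pyannote_model().to(device)
    return pipeline


def format_turns(turns):
    """Render (start, end, speaker) tuples as readable lines."""
    return [f"{start:6.2f}s - {end:6.2f}s  {speaker}" for start, end, speaker in turns]


def diarize(audio_path, pipeline=None):
    """Run diarization on one audio file and return formatted speaker turns."""
    pipeline = pipeline or build_pipeline()
    annotation = pipeline(audio_path)
    turns = [
        (segment.start, segment.end, speaker)
        for segment, _, speaker in annotation.itertracks(yield_label=True)
    ]
    return format_turns(turns)
```

Calling `diarize("audio.wav")` would return one line per speaker turn. The swap relies on the private attribute `pipeline._segmentation`, so it may break with other pyannote.audio versions.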