Commit 2cbde18 by kamilakesbi (1 parent: f20db24)
Update README.md
README.md CHANGED
@@ -13,11 +13,19 @@ pinned: false
 
 The available datasets are the CallHome (Japanese, Chinese, German, Spanish, English), AMI Corpus (English), VoxConverse (English) and Simsamu (French). We aim to add more datasets in the future to better support speaker diarization on the Hub.
 
-- A collection of multilingual [fine-tuned segmentation model](https://huggingface.co/collections/diarizers-community/models-66261d0f9277b825c807ff2a) baselines compatible with pyannote.
+- A collection of multilingual [fine-tuned segmentation model](https://huggingface.co/collections/diarizers-community/models-66261d0f9277b825c807ff2a) baselines compatible with pyannote.
 
-Each model has been fine-tuned on a specific Callhome language subset. They achieve better performance on multilingual data compared to pyannote's pre-trained [segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) model
+Each model has been fine-tuned on a specific Callhome language subset. They achieve better performance on multilingual data compared to pyannote's pre-trained [segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) model (see benchmark for more details on model performance).
+
+Together with diarizers-community, we release:
+
+- [diarizers](https://github.com/kamilakesbi/diarizers/tree/main), a library for fine-tuning pyannote speaker diarization models using the Hugging Face ecosystem.
+
+- A Google Colab [notebook](https://colab.research.google.com/github/kamilakesbi/notebooks/blob/main/fine_tune_pyannote.ipynb), with a step-by-step guide on how to use diarizers.
 
 
+**Benchmark:**
+
 | [Callhome](https://huggingface.co/datasets/diarizers-community/callhome) test dataset subset | Model | DER | False alarm | Missed detection | Confusion |
 | ------------------------| ------------- | ------------- | ------------- | --------------- | ------------- |
 | Japanese | [Pretrained](https://huggingface.co/pyannote/segmentation-3.0) | 25.44 | **2.30** | 17.45 | 5.69 |
@@ -33,11 +41,6 @@ Each model has been fine-tuned on a specific Callhome language subset. They achi
 
 Results are in %. They have been obtained using the [test script](https://github.com/kamilakesbi/diarizers/blob/main/test_segmentation.py) from diarizers.
 
-Together with diarizers-community, we release:
-
-- [diarizers](https://github.com/kamilakesbi/diarizers/tree/main), a library for fine-tuning pyannote speaker diarization models using the Hugging Face ecosystem.
-
-- A Google Colab [notebook](https://colab.research.google.com/github/kamilakesbi/notebooks/blob/main/fine_tune_pyannote.ipynb), with a step-by-step guide on how to use diarizers.
 
 
 Edit this `README.md` markdown file to author your organization card.
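As a reading aid for the benchmark table in this README, the sketch below shows how its columns relate. It assumes the standard decomposition of the diarization error rate (DER) into false alarm, missed detection, and speaker confusion, each expressed as a percentage of reference speech time. The function name `diarization_error_rate` is hypothetical and is not taken from diarizers or pyannote; the numbers are the Japanese / Pretrained row quoted in the diff.

```python
# Minimal sketch: DER as the sum of its three error components.
# Assumption: all values are percentages of total reference speech time,
# as stated in the README ("Results are in %").


def diarization_error_rate(false_alarm: float,
                           missed_detection: float,
                           confusion: float) -> float:
    """Return DER (in %) as false alarm + missed detection + confusion."""
    return false_alarm + missed_detection + confusion


# Japanese subset, pretrained segmentation-3.0 row from the benchmark table.
false_alarm = 2.30
missed_detection = 17.45
confusion = 5.69

der = diarization_error_rate(false_alarm, missed_detection, confusion)
print(f"DER = {der:.2f}%")  # DER = 25.44%
```

The result matches the DER column of that row (25.44), which suggests the table follows this decomposition. A lower false alarm rate alone therefore does not imply a lower DER.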