evie-8 committed on
Commit
00c581f
1 Parent(s): 56fe7f6

Update README.md

  1. README.md +46 -1
README.md CHANGED
@@ -26,8 +26,53 @@ It achieves the following results on the evaluation set:
  - Confusion: 0.0529

  ## Model description

- More information needed
+ This segmentation model has been trained on English data (Callhome) using [diarizers](https://github.com/huggingface/diarizers/tree/main).
+ It can be loaded with two lines of code:
+
+ ```python
+ from diarizers import SegmentationModel
+
+ segmentation_model = SegmentationModel.from_pretrained('evie-8/speaker-segmentation-fine-tuned-callhome-eng')
+ ```
+
+ To use it within a pyannote speaker diarization pipeline, load the [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) pipeline and convert the model to a pyannote-compatible format:
+
+ ```python
+ from pyannote.audio import Pipeline
+ import torch
+
+ device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
+
+ # load the pre-trained pyannote pipeline
+ # (this pipeline is gated on the Hugging Face Hub, so an access token may be required)
+ pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
+ pipeline.to(device)
+
+ # replace the segmentation model with your fine-tuned one
+ model = segmentation_model.to_pyannote_model()
+ pipeline._segmentation.model = model.to(device)
+ ```
+
+ You can now use the pipeline on audio examples:
+
+ ```python
+ from datasets import load_dataset
+
+ # load dataset example
+ dataset = load_dataset("diarizers-community/callhome", "eng", split="data")
+ sample = dataset[0]["audio"]
+
+ # pre-process inputs
+ sample["waveform"] = torch.from_numpy(sample.pop("array")[None, :]).to(device, dtype=model.dtype)
+ sample["sample_rate"] = sample.pop("sampling_rate")
+
+ # perform inference
+ diarization = pipeline(sample)
+
+ # dump the diarization output to disk using RTTM format
+ with open("audio.rttm", "w") as rttm:
+     diarization.write_rttm(rttm)
+ ```
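
An RTTM file like the one written in the last step can be read back with standard-library Python alone. The following is a minimal sketch, not part of the committed README, and it assumes the usual ten-field RTTM `SPEAKER` record layout (turn duration in the fifth field, speaker label in the eighth). The helper name `speaking_time_per_speaker` and the example lines are hypothetical and are not provided by diarizers or pyannote.

```python
from collections import defaultdict


def speaking_time_per_speaker(rttm_lines):
    """Sum the duration (in seconds) of each speaker's turns from RTTM lines."""
    totals = defaultdict(float)
    for line in rttm_lines:
        fields = line.split()
        # Skip blank lines and anything that is not a SPEAKER record.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        duration = float(fields[4])  # fifth field: turn duration in seconds
        speaker = fields[7]          # eighth field: speaker label
        totals[speaker] += duration
    return dict(totals)


# Illustrative RTTM lines (made-up values, not output of this model).
example = [
    "SPEAKER audio 1 0.000 2.500 <NA> <NA> SPEAKER_00 <NA> <NA>",
    "SPEAKER audio 1 2.500 1.250 <NA> <NA> SPEAKER_01 <NA> <NA>",
    "SPEAKER audio 1 4.000 0.500 <NA> <NA> SPEAKER_00 <NA> <NA>",
]
print(speaking_time_per_speaker(example))
```

Because the function accepts any iterable of lines, an open file handle such as `open("audio.rttm")` can be passed to it directly.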
 
 
  ## Intended uses & limitations