JSWOOK committed on
Commit 8e63e81 · verified · 1 Parent(s): 263bb37

Upload 4 files

Files changed (4)
  1. README.md +73 -0
  2. config.json +18 -0
  3. model.safetensors +3 -0
  4. training_args.bin +3 -0
README.md ADDED
@@ -0,0 +1,73 @@
+ ---
+ library_name: transformers
+ language:
+ - en
+ license: mit
+ base_model: pyannote/speaker-diarization-3.1
+ tags:
+ - speaker-diarization
+ - speaker-segmentation
+ - generated_from_trainer
+ datasets:
+ - diarizers-community/voxconverse
+ model-index:
+ - name: JSWOOK/pyannote_finetuning
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # JSWOOK/pyannote_finetuning
+
+ This model is a fine-tuned version of [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) on the diarizers-community/voxconverse dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1283
+ - Model Preparation Time: 0.0036
+ - Der: 0.0490
+ - False Alarm: 0.0309
+ - Missed Detection: 0.0091
+ - Confusion: 0.0090
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.001
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - num_epochs: 5
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Der | False Alarm | Missed Detection | Confusion |
+ |:-------------:|:-----:|:----:|:---------------:|:----------------------:|:------:|:-----------:|:----------------:|:---------:|
+ | No log | 1.0 | 21 | 0.1258 | 0.0036 | 0.0485 | 0.0287 | 0.0105 | 0.0093 |
+ | 0.228 | 2.0 | 42 | 0.1327 | 0.0036 | 0.0509 | 0.0300 | 0.0098 | 0.0112 |
+ | 0.1873 | 3.0 | 63 | 0.1280 | 0.0036 | 0.0496 | 0.0307 | 0.0092 | 0.0097 |
+ | 0.166 | 4.0 | 84 | 0.1280 | 0.0036 | 0.0487 | 0.0307 | 0.0091 | 0.0090 |
+ | 0.152 | 5.0 | 105 | 0.1283 | 0.0036 | 0.0490 | 0.0309 | 0.0091 | 0.0090 |
+
+
+ ### Framework versions
+
+ - Transformers 4.44.2
+ - Pytorch 2.5.0+cu121
+ - Datasets 3.1.0
+ - Tokenizers 0.19.1
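
Reviewer note on the README.md hunk above: the diarization error rate is conventionally the sum of the false alarm, missed detection, and speaker confusion rates, so the table can be sanity-checked against that identity. The sketch below is not part of the commit. It copies the five rows from the "Training results" table, checks the identity within rounding error, and reports which epoch has the lowest DER. The names `ROWS`, `TOLERANCE`, `der_residuals`, and `best_epoch` are illustrative only.

```python
# Rows copied from the "Training results" table in the README.md hunk:
# (epoch, DER, false alarm, missed detection, confusion)
ROWS = [
    (1, 0.0485, 0.0287, 0.0105, 0.0093),
    (2, 0.0509, 0.0300, 0.0098, 0.0112),
    (3, 0.0496, 0.0307, 0.0092, 0.0097),
    (4, 0.0487, 0.0307, 0.0091, 0.0090),
    (5, 0.0490, 0.0309, 0.0091, 0.0090),
]

# All four values are rounded to 4 decimals, so the reported DER can differ
# from the sum of its components by up to 4 * 0.00005 = 0.0002. The tolerance
# adds a small margin for floating-point noise.
TOLERANCE = 2.5e-4


def der_residuals(rows):
    """Return {epoch: reported DER minus (false alarm + missed + confusion)}."""
    return {epoch: der - (fa + md + conf) for epoch, der, fa, md, conf in rows}


def best_epoch(rows):
    """Return the epoch whose reported DER is lowest."""
    return min(rows, key=lambda row: row[1])[0]


if __name__ == "__main__":
    for epoch, residual in der_residuals(ROWS).items():
        status = "ok" if abs(residual) <= TOLERANCE else "MISMATCH"
        print(f"epoch {epoch}: residual {residual:+.4f} ({status})")
    print("lowest DER at epoch", best_epoch(ROWS))
```

Every row satisfies the identity within rounding. The lowest DER in the table is at epoch 1 (0.0485), while the headline metrics in the card are those of the final epoch (0.0490).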
config.json ADDED
@@ -0,0 +1,18 @@
+ {
+   "architectures": [
+     "SegmentationModel"
+   ],
+   "chunk_duration": 10.0,
+   "max_speakers_per_chunk": 3,
+   "max_speakers_per_frame": 2,
+   "min_duration": null,
+   "model_type": "pyannet",
+   "sample_rate": 16000,
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.2",
+   "warm_up": [
+     0.0,
+     0.0
+   ],
+   "weigh_by_cardinality": false
+ }
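
Reviewer note on the config.json hunk above: two quantities follow directly from these fields. The sketch below is not part of the commit. It parses the same JSON, derives the number of audio samples per chunk from `chunk_duration` and `sample_rate`, and counts the output classes under the assumption that the model uses a powerset encoding, as the pyannote segmentation models do. The config itself does not state the encoding, so treat the class count as an assumption. `CONFIG_TEXT` and `powerset_classes` are illustrative names.

```python
import json
from math import comb

# The config.json content added by this commit, reproduced verbatim.
CONFIG_TEXT = """
{
  "architectures": ["SegmentationModel"],
  "chunk_duration": 10.0,
  "max_speakers_per_chunk": 3,
  "max_speakers_per_frame": 2,
  "min_duration": null,
  "model_type": "pyannet",
  "sample_rate": 16000,
  "torch_dtype": "float32",
  "transformers_version": "4.44.2",
  "warm_up": [0.0, 0.0],
  "weigh_by_cardinality": false
}
"""

config = json.loads(CONFIG_TEXT)

# A 10-second chunk at 16 kHz is the length of the waveform the model sees.
samples_per_chunk = int(config["chunk_duration"] * config["sample_rate"])


def powerset_classes(max_speakers_per_chunk, max_speakers_per_frame):
    """Count speaker subsets of size 0..max_speakers_per_frame.

    Assumption: powerset encoding, where each class is one set of
    simultaneously active speakers (including the empty set for silence).
    """
    return sum(
        comb(max_speakers_per_chunk, k)
        for k in range(max_speakers_per_frame + 1)
    )


num_classes = powerset_classes(
    config["max_speakers_per_chunk"], config["max_speakers_per_frame"]
)

if __name__ == "__main__":
    print("samples per chunk:", samples_per_chunk)
    print("powerset classes (assumed encoding):", num_classes)
```

With 3 speakers per chunk and at most 2 active per frame, the count is 1 (silence) + 3 (single speakers) + 3 (pairs) = 7.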
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:597593778c2ec74b8f3a2339f22f20c90af7f2eaf263ea2e32b9c0c91fd0e303
+ size 5899124
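
Reviewer note on the model.safetensors hunk above: the diff shows a Git LFS pointer, not the weights themselves. The pointer records the SHA-256 digest and byte size of the real file, so a downloaded copy can be checked against it. The sketch below is not part of the commit. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local path in the usage note is a placeholder.

```python
import hashlib

# The pointer content added by this commit, reproduced verbatim.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:597593778c2ec74b8f3a2339f22f20c90af7f2eaf263ea2e32b9c0c91fd0e303
size 5899124
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest, and size."""
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"], "bytes")
```

After downloading the weights, `matches_pointer("model.safetensors", parse_lfs_pointer(POINTER_TEXT))` should return `True` for an intact copy, where the path is wherever the file was saved locally.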
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0dfb89a2e36124f2c7f50bd15b387aa40de6243edfecb1aead5268a780fa3ded
+ size 5240
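
Reviewer note on the training_args.bin hunk above: by Trainer convention this file holds the serialized training arguments, and the README.md hunk lists the ones that matter for the learning-rate schedule: `learning_rate: 0.001`, `lr_scheduler_type: cosine`, and 105 optimizer steps over 5 epochs. The sketch below is not part of the commit and does not read the binary file. It reproduces a half-cosine decay from those three values, assuming zero warmup steps because the card lists none. `cosine_lr` is an illustrative name.

```python
import math

BASE_LR = 1e-3      # learning_rate from the model card
TOTAL_STEPS = 105   # final step in the training results table (5 epochs x 21)
WARMUP_STEPS = 0    # assumption: the card does not list any warmup


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` for linear warmup followed by half-cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the rate at the end of each epoch (every 21 steps).
    for step in range(0, TOTAL_STEPS + 1, 21):
        print(f"step {step:3d}: lr = {cosine_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.001, decays monotonically, and reaches zero at step 105, so the last epoch trains with a very small learning rate.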