Leo1212 committed on
Commit d209592 · verified · 1 Parent(s): 2104f5d

Add new SentenceTransformer model.

Files changed (2)
  1. README.md +34 -31
  2. model.safetensors +1 -1
README.md CHANGED
@@ -420,13 +420,19 @@ You can finetune this model on your own dataset.
  ```
  
  ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `overwrite_output_dir`: True
+ - `eval_strategy`: steps
+ - `num_train_epochs`: 5
+ - `load_best_model_at_end`: True
  
  #### All Hyperparameters
  <details><summary>Click to expand</summary>
  
- - `overwrite_output_dir`: False
+ - `overwrite_output_dir`: True
  - `do_predict`: False
- - `eval_strategy`: no
+ - `eval_strategy`: steps
  - `prediction_loss_only`: True
  - `per_device_train_batch_size`: 8
  - `per_device_eval_batch_size`: 8
@@ -441,7 +447,7 @@ You can finetune this model on your own dataset.
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
- - `num_train_epochs`: 3.0
+ - `num_train_epochs`: 5
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
@@ -481,7 +487,7 @@ You can finetune this model on your own dataset.
  - `disable_tqdm`: False
  - `remove_unused_columns`: True
  - `label_names`: None
- - `load_best_model_at_end`: False
+ - `load_best_model_at_end`: True
  - `ignore_data_skip`: False
  - `fsdp`: []
  - `fsdp_min_num_params`: 0
@@ -540,33 +546,30 @@ You can finetune this model on your own dataset.
  </details>
  
  ### Training Logs
- | Epoch | Step | Training Loss |
- |:------:|:-----:|:-------------:|
- | 0.1217 | 500 | 2.0816 |
- | 0.2433 | 1000 | 1.8989 |
- | 0.3650 | 1500 | 1.7863 |
- | 0.4866 | 2000 | 1.6893 |
- | 0.6083 | 2500 | 1.7278 |
- | 0.7299 | 3000 | 1.6332 |
- | 0.8516 | 3500 | 1.5289 |
- | 0.9732 | 4000 | 1.6122 |
- | 1.0949 | 4500 | 1.5243 |
- | 1.2165 | 5000 | 1.4054 |
- | 1.3382 | 5500 | 1.5066 |
- | 1.4599 | 6000 | 1.2831 |
- | 1.5815 | 6500 | 1.4375 |
- | 1.7032 | 7000 | 1.3062 |
- | 1.8248 | 7500 | 1.3748 |
- | 1.9465 | 8000 | 1.1605 |
- | 2.0681 | 8500 | 1.2467 |
- | 2.1898 | 9000 | 1.1417 |
- | 2.3114 | 9500 | 1.26 |
- | 2.4331 | 10000 | 1.0447 |
- | 2.5547 | 10500 | 1.159 |
- | 2.6764 | 11000 | 0.9982 |
- | 2.7981 | 11500 | 1.0904 |
- | 2.9197 | 12000 | 0.9434 |
-
+ | Epoch | Step | Training Loss | all-nli-triplet loss | stsb loss | natural-questions loss | quora loss |
+ |:----------:|:--------:|:-------------:|:--------------------:|:----------:|:----------------------:|:----------:|
+ | 0.0487 | 200 | 2.0928 | - | - | - | - |
+ | 0.0973 | 400 | 2.2013 | - | - | - | - |
+ | 0.1460 | 600 | 1.7404 | - | - | - | - |
+ | 0.1946 | 800 | 1.9134 | - | - | - | - |
+ | **0.2433** | **1000** | **2.043** | **0.5161** | **6.2815** | **0.1172** | **0.0192** |
+ | 0.2920 | 1200 | 1.8817 | - | - | - | - |
+ | 0.3406 | 1400 | 1.7734 | - | - | - | - |
+ | 0.3893 | 1600 | 1.5935 | - | - | - | - |
+ | 0.4380 | 1800 | 1.6762 | - | - | - | - |
+ | 0.4866 | 2000 | 1.7031 | 0.4555 | 6.3907 | 0.0726 | 0.0198 |
+ | 0.5353 | 2200 | 1.8561 | - | - | - | - |
+ | 0.5839 | 2400 | 1.6742 | - | - | - | - |
+ | 0.6326 | 2600 | 1.456 | - | - | - | - |
+ | 0.6813 | 2800 | 1.6122 | - | - | - | - |
+ | 0.7299 | 3000 | 1.8851 | 0.4975 | 6.1758 | 0.0841 | 0.0208 |
+ | 0.7786 | 3200 | 1.5684 | - | - | - | - |
+ | 0.8273 | 3400 | 1.6535 | - | - | - | - |
+ | 0.8759 | 3600 | 1.5043 | - | - | - | - |
+ | 0.9246 | 3800 | 1.4768 | - | - | - | - |
+ | 0.9732 | 4000 | 1.686 | 0.4912 | 6.1600 | 0.0795 | 0.0170 |
+
+ * The bold row denotes the saved checkpoint.
  
  ### Framework Versions
  - Python: 3.11.9
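
For readers who want to reproduce the settings this README change documents, the following is a minimal sketch of how the four non-default hyperparameters might be passed to `SentenceTransformerTrainingArguments`. It assumes an installed sentence-transformers release that exports that class and accepts `eval_strategy`; the helper names `build_training_args` and `steps_per_epoch` and the `output_dir` argument are illustrative and do not come from the commit. The second helper only does arithmetic on the Epoch and Step columns of the Training Logs table.

```python
# Non-default values exactly as listed in the README diff.
NON_DEFAULT_ARGS = {
    "overwrite_output_dir": True,
    "eval_strategy": "steps",
    "num_train_epochs": 5,
    "load_best_model_at_end": True,
}


def steps_per_epoch(step: int, epoch: float) -> int:
    """Estimate optimizer steps per epoch from one (epoch, step) log row."""
    return round(step / epoch)


def build_training_args(output_dir: str):
    """Hypothetical helper: build training arguments from NON_DEFAULT_ARGS.

    Returns None when sentence-transformers is not installed. Every other
    argument keeps the library default, which may differ from the original run.
    """
    try:
        from sentence_transformers import SentenceTransformerTrainingArguments
    except ImportError:
        return None
    return SentenceTransformerTrainingArguments(
        output_dir=output_dir, **NON_DEFAULT_ARGS
    )
```

Using the logged rows, `steps_per_epoch(4000, 0.9732)` gives 4110, so five epochs would come to roughly 20,550 steps if the schedule ran to completion.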
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d07d8769dae10deca2cf5f0aa5e64c226fd57b086435e42dfcefee4b3bfa43f8
+ oid sha256:bbc5e57f3e543b2aa7f15a158a3a5bb351bb99a79235706212199447b9614a3e
  size 594668880
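
The `model.safetensors` entry is a Git LFS pointer: `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. Only the digest changes in this commit while the size stays the same. As a rough sketch, a downloaded copy of the weights could be checked against the new pointer as shown here; the function names and the `NEW_POINTER` constant are illustrative, not part of the repository.

```python
import hashlib
from pathlib import Path

# New pointer contents, copied from the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:bbc5e57f3e543b2aa7f15a158a3a5bb351bb99a79235706212199447b9614a3e
size 594668880
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    if pointer["algo"] != "sha256" or path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so a large weights file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

A call such as `matches_pointer(Path("model.safetensors"), parse_lfs_pointer(NEW_POINTER))` would return `True` only for a file whose bytes match the new commit's weights.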