Sentence Similarity · Safetensors · English · bert
SalmanFaroz committed
Commit 77a399a · verified · 1 Parent(s): 5411734

Update README.md

Files changed (1)

1. README.md +1 -321
README.md CHANGED
@@ -82,22 +82,6 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [B
  - **Language:** en
  - **License:** mit

- ### Model Sources
-
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-
- ### Full Model Architecture
-
- ```
- SentenceTransformer(
-   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
-   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
-   (2): Normalize()
- )
- ```
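For reference, a printout like the one above can be rebuilt by hand from sentence-transformers modules. The sketch below is a reconstruction under assumptions (the base checkpoint name is a placeholder, not the card's actual base model):

```python
# Hedged sketch: assembling the listed architecture from sentence-transformers
# building blocks. The checkpoint name is a placeholder; the card's actual
# BERT base model produces 384-dim hidden states.
from sentence_transformers import SentenceTransformer, models

word_embedding = models.Transformer(
    "google-bert/bert-base-uncased",  # placeholder base checkpoint
    max_seq_length=512,
    do_lower_case=True,
)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_cls_token=True,    # CLS-token pooling, per the printout
    pooling_mode_mean_tokens=False,
)
normalize = models.Normalize()      # L2-normalize the final embeddings

model = SentenceTransformer(modules=[word_embedding, pooling, normalize])
print(model)  # should mirror the architecture printed above
```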
-
  ## Usage

  ### Direct Usage (Sentence Transformers)
@@ -112,7 +96,6 @@ Then you can load this model and run inference.
  ```python
  from sentence_transformers import SentenceTransformer

- # Download from the 🤗 Hub
  model = SentenceTransformer("SalmanFaroz/DisEmbed-v1")
  # Run inference
  sentences = [
@@ -130,291 +113,6 @@ print(similarities.shape)
  # [3, 3]
  ```
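The diff truncates the snippet above, so a complete, self-contained version is sketched here; the three example sentences are illustrative placeholders in the style of the training pairs, not the card's original inputs:

```python
# Hedged sketch: full inference flow for the model, with placeholder
# sentences (the card's original example sentences are truncated in the diff).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("SalmanFaroz/DisEmbed-v1")

sentences = [
    "Disease Name : Cyst, renal",
    "Renal cysts can lead to flank pain and hematuria.",
    "Lymphadenitis due to diphtheria causes swollen, tender lymph nodes.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384): 384-dim embeddings, per the card

# Similarity matrix between all sentence pairs
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```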

- <!--
- ### Direct Usage (Transformers)
-
- <details><summary>Click to see the direct usage in Transformers</summary>
-
- </details>
- -->
-
- <!--
- ### Downstream Usage (Sentence Transformers)
-
- You can finetune this model on your own dataset.
-
- <details><summary>Click to expand</summary>
-
- </details>
- -->
-
- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Training Dataset
-
- #### Unnamed Dataset
-
- * Size: 225,245 training samples
- * Columns: <code>0</code> and <code>1</code>
- * Approximate statistics based on the first 1000 samples:
-   | | 0 | 1 |
-   |:--------|:-------|:-------|
-   | type | string | string |
-   | details | <ul><li>min: 6 tokens</li><li>mean: 35.19 tokens</li><li>max: 347 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 34.92 tokens</li><li>max: 281 tokens</li></ul> |
- * Samples:
-   | 0 | 1 |
-   |:-----|:-----|
-   | <code>Disease Name : Lymphadenitis, due to, diphtheria</code> | <code>This condition involves lymphadenitis due to diphtheria infection, leading to symptoms such as swelling, tenderness, and potential pain in the lymph nodes. Patients may experience systemic symptoms like fever and malaise, indicating an underlying issue that requires attention. Complications can arise if the condition is not managed properly.</code> |
-   | <code>nephropathy: kidney damage or disease; proteinuria: presence of excess protein in urine; edema: swelling due to fluid retention; ...</code> | <code>Disease Name : Nephropathy, phosphate-losing</code> |
-   | <code>Disease Name : Cyst, renal</code> | <code>Renal cysts can lead to symptoms such as flank pain, hematuria, and potential urinary obstruction. If these cysts become infected, they may present with fever, chills, and significant discomfort in the abdominal or back regions.</code> |
- * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
-   ```json
-   {
-       "scale": 20.0,
-       "similarity_fct": "cos_sim"
-   }
-   ```
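The two-column pair format and the loss configuration above can be sketched with the `datasets` and sentence-transformers APIs; the sample rows below are illustrative stand-ins, not the actual 225,245-pair training set:

```python
# Hedged sketch: pairs as two string columns ("0", "1") plus
# MultipleNegativesRankingLoss with the parameters listed above.
from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses
from sentence_transformers.util import cos_sim

# Illustrative rows only; the real dataset has 225,245 pairs.
train_dataset = Dataset.from_dict({
    "0": ["Disease Name : Cyst, renal"],
    "1": ["Renal cysts can lead to flank pain, hematuria, and urinary obstruction."],
})

model = SentenceTransformer("SalmanFaroz/DisEmbed-v1")
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)
```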
-
- ### Training Hyperparameters
- #### Non-Default Hyperparameters
-
- - `per_device_train_batch_size`: 122
- - `per_device_eval_batch_size`: 122
- - `learning_rate`: 2e-05
- - `num_train_epochs`: 1
- - `warmup_ratio`: 0.1
- - `fp16`: True
- - `batch_sampler`: no_duplicates
-
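Expressed through the sentence-transformers trainer API, these non-default settings would look roughly like this; `model`, `train_dataset`, and `loss` are the names from the sketches above, and `output_dir` is a placeholder:

```python
# Hedged sketch: the non-default hyperparameters above via the v3 trainer API.
# `model`, `train_dataset`, and `loss` come from the earlier sketches;
# output_dir is a placeholder path.
from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="disembed-v1",                   # placeholder
    per_device_train_batch_size=122,
    per_device_eval_batch_size=122,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # `no_duplicates`
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```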
- #### All Hyperparameters
- <details><summary>Click to expand</summary>
-
- - `overwrite_output_dir`: False
- - `do_predict`: False
- - `eval_strategy`: no
- - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 122
- - `per_device_eval_batch_size`: 122
- - `per_gpu_train_batch_size`: None
- - `per_gpu_eval_batch_size`: None
- - `gradient_accumulation_steps`: 1
- - `eval_accumulation_steps`: None
- - `torch_empty_cache_steps`: None
- - `learning_rate`: 2e-05
- - `weight_decay`: 0.0
- - `adam_beta1`: 0.9
- - `adam_beta2`: 0.999
- - `adam_epsilon`: 1e-08
- - `max_grad_norm`: 1.0
- - `num_train_epochs`: 1
- - `max_steps`: -1
- - `lr_scheduler_type`: linear
- - `lr_scheduler_kwargs`: {}
- - `warmup_ratio`: 0.1
- - `warmup_steps`: 0
- - `log_level`: passive
- - `log_level_replica`: warning
- - `log_on_each_node`: True
- - `logging_nan_inf_filter`: True
- - `save_safetensors`: True
- - `save_on_each_node`: False
- - `save_only_model`: False
- - `restore_callback_states_from_checkpoint`: False
- - `no_cuda`: False
- - `use_cpu`: False
- - `use_mps_device`: False
- - `seed`: 42
- - `data_seed`: None
- - `jit_mode_eval`: False
- - `use_ipex`: False
- - `bf16`: False
- - `fp16`: True
- - `fp16_opt_level`: O1
- - `half_precision_backend`: auto
- - `bf16_full_eval`: False
- - `fp16_full_eval`: False
- - `tf32`: None
- - `local_rank`: 0
- - `ddp_backend`: None
- - `tpu_num_cores`: None
- - `tpu_metrics_debug`: False
- - `debug`: []
- - `dataloader_drop_last`: False
- - `dataloader_num_workers`: 0
- - `dataloader_prefetch_factor`: None
- - `past_index`: -1
- - `disable_tqdm`: False
- - `remove_unused_columns`: True
- - `label_names`: None
- - `load_best_model_at_end`: False
- - `ignore_data_skip`: False
- - `fsdp`: []
- - `fsdp_min_num_params`: 0
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- - `fsdp_transformer_layer_cls_to_wrap`: None
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- - `deepspeed`: None
- - `label_smoothing_factor`: 0.0
- - `optim`: adamw_torch
- - `optim_args`: None
- - `adafactor`: False
- - `group_by_length`: False
- - `length_column_name`: length
- - `ddp_find_unused_parameters`: None
- - `ddp_bucket_cap_mb`: None
- - `ddp_broadcast_buffers`: False
- - `dataloader_pin_memory`: True
- - `dataloader_persistent_workers`: False
- - `skip_memory_metrics`: True
- - `use_legacy_prediction_loop`: False
- - `push_to_hub`: False
- - `resume_from_checkpoint`: None
- - `hub_model_id`: None
- - `hub_strategy`: every_save
- - `hub_private_repo`: None
- - `hub_always_push`: False
- - `gradient_checkpointing`: False
- - `gradient_checkpointing_kwargs`: None
- - `include_inputs_for_metrics`: False
- - `include_for_metrics`: []
- - `eval_do_concat_batches`: True
- - `fp16_backend`: auto
- - `push_to_hub_model_id`: None
- - `push_to_hub_organization`: None
- - `mp_parameters`:
- - `auto_find_batch_size`: False
- - `full_determinism`: False
- - `torchdynamo`: None
- - `ray_scope`: last
- - `ddp_timeout`: 1800
- - `torch_compile`: False
- - `torch_compile_backend`: None
- - `torch_compile_mode`: None
- - `dispatch_batches`: None
- - `split_batches`: None
- - `include_tokens_per_second`: False
- - `include_num_input_tokens_seen`: False
- - `neftune_noise_alpha`: None
- - `optim_target_modules`: None
- - `batch_eval_metrics`: False
- - `eval_on_start`: False
- - `use_liger_kernel`: False
- - `eval_use_gather_object`: False
- - `average_tokens_across_devices`: False
- - `prompts`: None
- - `batch_sampler`: no_duplicates
- - `multi_dataset_batch_sampler`: proportional
-
- </details>
-
- ### Training Logs
- | Epoch | Step | Training Loss |
- |:------:|:----:|:-------------:|
- | 0.0541 | 100 | 2.5621 |
- | 0.1083 | 200 | 1.3308 |
- | 0.1624 | 300 | 1.1403 |
- | 0.2166 | 400 | 1.0506 |
- | 0.2707 | 500 | 1.0135 |
- | 0.3249 | 600 | 0.9443 |
- | 0.3790 | 700 | 0.9412 |
- | 0.4331 | 800 | 0.9095 |
- | 0.4873 | 900 | 0.8945 |
- | 0.5414 | 1000 | 0.8533 |
- | 0.5956 | 1100 | 0.8601 |
- | 0.6497 | 1200 | 0.8425 |
- | 0.7038 | 1300 | 0.2919 |
- | 0.7580 | 1400 | 0.0249 |
- | 0.8121 | 1500 | 0.0231 |
- | 0.8663 | 1600 | 0.0182 |
- | 0.9204 | 1700 | 0.0206 |
- | 0.9746 | 1800 | 0.0206 |
- | 0.0541 | 100 | 0.8606 |
- | 0.1083 | 200 | 0.7361 |
- | 0.1624 | 300 | 0.6648 |
- | 0.2166 | 400 | 0.6506 |
- | 0.2707 | 500 | 0.6502 |
- | 0.3249 | 600 | 0.6249 |
- | 0.3790 | 700 | 0.6473 |
- | 0.4331 | 800 | 0.6391 |
- | 0.4873 | 900 | 0.6474 |
- | 0.5414 | 1000 | 0.6316 |
- | 0.5956 | 1100 | 0.6543 |
- | 0.6497 | 1200 | 0.6493 |
- | 0.7038 | 1300 | 0.2173 |
- | 0.7580 | 1400 | 0.0135 |
- | 0.8121 | 1500 | 0.0149 |
- | 0.8663 | 1600 | 0.0128 |
- | 0.9204 | 1700 | 0.0158 |
- | 0.9746 | 1800 | 0.0169 |
- | 0.0541 | 100 | 0.6698 |
- | 0.1083 | 200 | 0.5107 |
- | 0.1624 | 300 | 0.4378 |
- | 0.2166 | 400 | 0.4408 |
- | 0.2707 | 500 | 0.4452 |
- | 0.3249 | 600 | 0.4391 |
- | 0.3790 | 700 | 0.4672 |
- | 0.4331 | 800 | 0.4712 |
- | 0.4873 | 900 | 0.489 |
- | 0.5414 | 1000 | 0.4878 |
- | 0.5956 | 1100 | 0.5196 |
- | 0.6497 | 1200 | 0.5245 |
- | 0.7038 | 1300 | 0.1768 |
- | 0.7580 | 1400 | 0.0091 |
- | 0.8121 | 1500 | 0.0107 |
- | 0.8663 | 1600 | 0.0099 |
- | 0.9204 | 1700 | 0.0127 |
- | 0.9746 | 1800 | 0.0147 |
- | 0.0541 | 100 | 0.5605 |
- | 0.1083 | 200 | 0.3476 |
- | 0.1624 | 300 | 0.2772 |
- | 0.2166 | 400 | 0.2862 |
- | 0.2707 | 500 | 0.2937 |
- | 0.3249 | 600 | 0.2983 |
- | 0.3790 | 700 | 0.3293 |
- | 0.4331 | 800 | 0.3421 |
- | 0.4873 | 900 | 0.3634 |
- | 0.5414 | 1000 | 0.3732 |
- | 0.5956 | 1100 | 0.4125 |
- | 0.6497 | 1200 | 0.4266 |
- | 0.7038 | 1300 | 0.1474 |
- | 0.7580 | 1400 | 0.007 |
- | 0.8121 | 1500 | 0.0081 |
- | 0.8663 | 1600 | 0.0079 |
- | 0.9204 | 1700 | 0.0104 |
- | 0.9746 | 1800 | 0.0132 |
-
-
- ### Framework Versions
407
- - Python: 3.10.12
408
- - Sentence Transformers: 3.3.1
409
- - Transformers: 4.47.0
410
- - PyTorch: 2.1.0+cu118
411
- - Accelerate: 1.2.1
412
- - Datasets: 3.2.0
413
- - Tokenizers: 0.21.0
414
-
415
- ## Citation
416
-
417
- ### BibTeX
418
 
419
  #### Sentence Transformers
420
  ```bibtex
@@ -439,22 +137,4 @@ You can finetune this model on your own dataset.
  archivePrefix={arXiv},
  primaryClass={cs.CL}
  }
- ```
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
+ ```
 