metadata

base_model: dbourget/pb-small-10e-tsdae6e-philsim-cosine-6e-beatai-cosine-50e
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - dot_accuracy
  - manhattan_accuracy
  - euclidean_accuracy
  - max_accuracy
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:9504
  - loss:TripletLoss
widget:
  - source_sentence: cap product
    sentences:
      - >-
        method of adjoining a chain of degree p with a co-chain of degree q,
        where q is less than or equal to p, to form a composite chain of degree
        p-q
      - 'Ontology '
      - hat commodity
  - source_sentence: cognitivism
    sentences:
      - supporting cognitive science
      - >-
        study of changes in organisms caused by modification of gene expression
        rather than alteration of the genetic code
      - 'the idea that mind works like an algorithmic symbol manipulation '
  - source_sentence: doxastic voluntarism
    sentences:
      - Land surrounded by water
      - belief one is free
      - the ability to will beliefs
  - source_sentence: conceptual role
    sentences:
      - concept
      - inferential role
      - 'Theory of knowledge '
  - source_sentence: scientific revolutions
    sentences:
      - scientific realism
      - Universal moral principles govern legal systems
      - paradigm shifts
model-index:
  - name: >-
      SentenceTransformer based on
      dbourget/pb-small-10e-tsdae6e-philsim-cosine-6e-beatai-cosine-50e
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: beatai dev
          type: beatai-dev
        metrics:
          - type: cosine_accuracy
            value: 0.813973063973064
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.22727272727272727
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.8198653198653199
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.8156565656565656
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.8198653198653199
            name: Max Accuracy

SentenceTransformer based on dbourget/pb-small-10e-tsdae6e-philsim-cosine-6e-beatai-cosine-50e

This is a sentence-transformers model fine-tuned from dbourget/pb-small-10e-tsdae6e-philsim-cosine-6e-beatai-cosine-50e. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: dbourget/pb-small-10e-tsdae6e-philsim-cosine-6e-beatai-cosine-50e
Maximum Sequence Length: 512 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
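
The Pooling module above has only pooling_mode_cls_token enabled, so each sentence embedding is the hidden state of the first ([CLS]) token. A minimal sketch of that step, using random NumPy data in place of real BertModel output (the helper name cls_pool and the dummy shapes are illustrative assumptions, not part of the library):

```python
import numpy as np

def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Return the first-token ([CLS]) vector of each sequence.

    token_embeddings has shape (batch, seq_len, hidden_dim).
    """
    return token_embeddings[:, 0, :]

# Dummy stand-in for transformer output: 2 sequences, 5 tokens, 1024 dims.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 1024))

sentence_embeddings = cls_pool(token_embeddings)
print(sentence_embeddings.shape)  # (2, 1024)
```

Unlike mean pooling, this ignores all token positions except the first, so the attention mask plays no role in the pooling step itself.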

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("dbourget/pb-small-10e-tsdae6e-philsim-cosine-6e-beatai-cosine-80e")
# Run inference
sentences = [
    'scientific revolutions',
    'paradigm shifts',
    'scientific realism',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
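
Since the card lists cosine similarity as the similarity function, the pairwise scores can be reproduced by hand. The sketch below uses small made-up vectors rather than real model output, and the function name cosine_similarity_matrix is an illustrative assumption:

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

# Toy 2-dimensional "embeddings"; real ones would have 1024 dimensions.
toy_embeddings = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
sims = cosine_similarity_matrix(toy_embeddings, toy_embeddings)
print(sims.shape)  # (3, 3)
```

Rows 0 and 1 point in the same direction, so their similarity is 1 despite the different lengths; row 2 is orthogonal to both, giving 0.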

Evaluation

Metrics

Triplet

Dataset: beatai-dev
Evaluated with TripletEvaluator

Metric	Value
cosine_accuracy	0.814
dot_accuracy	0.2273
manhattan_accuracy	0.8199
euclidean_accuracy	0.8157
max_accuracy	0.8199
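
As a rough illustration of what these numbers mean: a triplet counts as correct when the anchor is closer to the positive than to the negative under the given measure, and max_accuracy is the best of the four scores. The sketch below shows the cosine variant on toy vectors; the helper name and data are assumptions for illustration, not the evaluator's own code.

```python
import numpy as np

def cosine_triplet_accuracy(anchors, positives, negatives) -> float:
    """Fraction of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    def row_cosine(u, v):
        return np.sum(u * v, axis=1) / (
            np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
        )
    correct = row_cosine(anchors, positives) > row_cosine(anchors, negatives)
    return float(np.mean(correct))

# Two toy triplets: the first is ranked correctly, the second is not.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[0.9, 0.1], [1.0, 0.0]])
negatives = np.array([[0.0, 1.0], [0.1, 0.9]])
toy_accuracy = cosine_triplet_accuracy(anchors, positives, negatives)
print(toy_accuracy)  # 0.5

# max_accuracy is the largest of the four reported values.
reported = {"cosine": 0.814, "dot": 0.2273, "manhattan": 0.8199, "euclidean": 0.8157}
best_metric = max(reported, key=reported.get)
print(best_metric, reported[best_metric])  # manhattan 0.8199
```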

Training Details

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 138
per_device_eval_batch_size: 138
learning_rate: 5e-07
weight_decay: 0.01
num_train_epochs: 30
lr_scheduler_type: constant
bf16: True
dataloader_drop_last: True
resume_from_checkpoint: True
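
These settings account for the step counts in the training logs. Assuming a single device (so the per-device batch size is the global batch size) and the dataset_size of 9504 from the tags, dataloader_drop_last discards the final partial batch of each epoch:

```python
dataset_size = 9504   # from the dataset_size tag
batch_size = 138      # per_device_train_batch_size, assuming one device
num_epochs = 30       # num_train_epochs

# dataloader_drop_last=True drops the incomplete final batch, hence floor division.
steps_per_epoch = dataset_size // batch_size
dropped_per_epoch = dataset_size - steps_per_epoch * batch_size
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, dropped_per_epoch, total_steps)  # 68 120 2040

# Fractional epoch shown in the log at step 10.
epoch_at_step_10 = round(10 / steps_per_epoch, 4)
print(epoch_at_step_10)  # 0.1471
```

The result matches the final log row (step 2040 at epoch 30.0) and the first logged row (step 10 at epoch 0.1471).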

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 138
per_device_eval_batch_size: 138
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-07
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 30
max_steps: -1
lr_scheduler_type: constant
lr_scheduler_kwargs: {}
warmup_ratio: 0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: True
dataloader_num_workers: 0
dataloader_prefetch_factor: 2
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: True
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	loss	beatai-dev_cosine_accuracy
0	0	-	-	0.7904
0.1471	10	0.0721	-	-
0.2941	20	0.0708	-	-
0.4412	30	0.0736	-	-
0.5882	40	0.0704	-	-
0.7353	50	0.0732	0.0971	0.7929
0.8824	60	0.0716	-	-
1.0294	70	0.0665	-	-
1.1765	80	0.0698	-	-
1.3235	90	0.0699	-	-
1.4706	100	0.0691	0.0968	0.7912
1.6176	110	0.0687	-	-
1.7647	120	0.0701	-	-
1.9118	130	0.0689	-	-
2.0588	140	0.0696	-	-
2.2059	150	0.071	0.0966	0.7929
2.3529	160	0.078	-	-
2.5	170	0.0675	-	-
2.6471	180	0.065	-	-
2.7941	190	0.0684	-	-
2.9412	200	0.0689	0.0963	0.7938
3.0882	210	0.0736	-	-
3.2353	220	0.0684	-	-
3.3824	230	0.0669	-	-
3.5294	240	0.0688	-	-
3.6765	250	0.0678	0.0959	0.7963
3.8235	260	0.0682	-	-
3.9706	270	0.0678	-	-
4.1176	280	0.0686	-	-
4.2647	290	0.0664	-	-
4.4118	300	0.0703	0.0957	0.7980
4.5588	310	0.065	-	-
4.7059	320	0.0719	-	-
4.8529	330	0.0685	-	-
5.0	340	0.0639	-	-
5.1471	350	0.0667	0.0957	0.7971
5.2941	360	0.0661	-	-
5.4412	370	0.0678	-	-
5.5882	380	0.0725	-	-
5.7353	390	0.0655	-	-
5.8824	400	0.0649	0.0953	0.7980
6.0294	410	0.0661	-	-
6.1765	420	0.0662	-	-
6.3235	430	0.0671	-	-
6.4706	440	0.0698	-	-
6.6176	450	0.0636	0.0951	0.7980
6.7647	460	0.0644	-	-
6.9118	470	0.0633	-	-
7.0588	480	0.0679	-	-
7.2059	490	0.067	-	-
7.3529	500	0.0713	0.0948	0.7963
7.5	510	0.0677	-	-
7.6471	520	0.0666	-	-
7.7941	530	0.065	-	-
7.9412	540	0.0665	-	-
8.0882	550	0.0656	0.0946	0.7963
8.2353	560	0.0649	-	-
8.3824	570	0.0649	-	-
8.5294	580	0.0653	-	-
8.6765	590	0.0648	-	-
8.8235	600	0.0622	0.0944	0.7946
8.9706	610	0.0689	-	-
9.1176	620	0.0711	-	-
9.2647	630	0.0611	-	-
9.4118	640	0.0697	-	-
9.5588	650	0.0645	0.0942	0.7963
9.7059	660	0.0639	-	-
9.8529	670	0.0643	-	-
10.0	680	0.0644	-	-
10.1471	690	0.0599	-	-
10.2941	700	0.0723	0.0940	0.7955
10.4412	710	0.0652	-	-
10.5882	720	0.0646	-	-
10.7353	730	0.0602	-	-
10.8824	740	0.0644	-	-
11.0294	750	0.066	0.0938	0.7971
11.1765	760	0.0624	-	-
11.3235	770	0.0652	-	-
11.4706	780	0.0649	-	-
11.6176	790	0.0624	-	-
11.7647	800	0.0626	0.0937	0.7988
11.9118	810	0.0635	-	-
12.0588	820	0.0643	-	-
12.2059	830	0.0663	-	-
12.3529	840	0.0641	-	-
12.5	850	0.0614	0.0933	0.8005
12.6471	860	0.0613	-	-
12.7941	870	0.0648	-	-
12.9412	880	0.065	-	-
13.0882	890	0.0589	-	-
13.2353	900	0.0632	0.0931	0.7997
13.3824	910	0.0649	-	-
13.5294	920	0.0612	-	-
13.6765	930	0.0634	-	-
13.8235	940	0.0637	-	-
13.9706	950	0.0626	0.0930	0.7997
14.1176	960	0.0593	-	-
14.2647	970	0.0662	-	-
14.4118	980	0.0644	-	-
14.5588	990	0.0582	-	-
14.7059	1000	0.0626	0.0927	0.8013
14.8529	1010	0.0605	-	-
15.0	1020	0.0615	-	-
15.1471	1030	0.0676	-	-
15.2941	1040	0.0633	-	-
15.4412	1050	0.06	0.0927	0.8047
15.5882	1060	0.0572	-	-
15.7353	1070	0.0579	-	-
15.8824	1080	0.0594	-	-
16.0294	1090	0.063	-	-
16.1765	1100	0.0581	0.0927	0.8030
16.3235	1110	0.0564	-	-
16.4706	1120	0.0632	-	-
16.6176	1130	0.065	-	-
16.7647	1140	0.0602	-	-
16.9118	1150	0.0581	0.0926	0.8039
17.0588	1160	0.0623	-	-
17.2059	1170	0.06	-	-
17.3529	1180	0.0562	-	-
17.5	1190	0.0627	-	-
17.6471	1200	0.056	0.0924	0.8013
17.7941	1210	0.0586	-	-
17.9412	1220	0.0576	-	-
18.0882	1230	0.056	-	-
18.2353	1240	0.0611	-	-
18.3824	1250	0.0551	0.0922	0.8047
18.5294	1260	0.058	-	-
18.6765	1270	0.0571	-	-
18.8235	1280	0.0616	-	-
18.9706	1290	0.0599	-	-
19.1176	1300	0.0604	0.0920	0.8081
19.2647	1310	0.0633	-	-
19.4118	1320	0.0573	-	-
19.5588	1330	0.0549	-	-
19.7059	1340	0.0591	-	-
19.8529	1350	0.0585	0.0918	0.8089
20.0	1360	0.057	-	-
20.1471	1370	0.057	-	-
20.2941	1380	0.0625	-	-
20.4412	1390	0.0589	-	-
20.5882	1400	0.0577	0.0918	0.8098
20.7353	1410	0.0583	-	-
20.8824	1420	0.0567	-	-
21.0294	1430	0.0619	-	-
21.1765	1440	0.0572	-	-
21.3235	1450	0.0594	0.0917	0.8123
21.4706	1460	0.0567	-	-
21.6176	1470	0.0611	-	-
21.7647	1480	0.0533	-	-
21.9118	1490	0.0595	-	-
22.0588	1500	0.0521	0.0913	0.8114
22.2059	1510	0.0586	-	-
22.3529	1520	0.0603	-	-
22.5	1530	0.0601	-	-
22.6471	1540	0.0567	-	-
22.7941	1550	0.0551	0.0911	0.8114
22.9412	1560	0.0542	-	-
23.0882	1570	0.057	-	-
23.2353	1580	0.0541	-	-
23.3824	1590	0.0586	-	-
23.5294	1600	0.0573	0.0912	0.8106
23.6765	1610	0.0543	-	-
23.8235	1620	0.0578	-	-
23.9706	1630	0.0563	-	-
24.1176	1640	0.0549	-	-
24.2647	1650	0.0549	0.0909	0.8140
24.4118	1660	0.056	-	-
24.5588	1670	0.0599	-	-
24.7059	1680	0.0543	-	-
24.8529	1690	0.0547	-	-
25.0	1700	0.0575	0.0906	0.8114
25.1471	1710	0.0544	-	-
25.2941	1720	0.0574	-	-
25.4412	1730	0.0565	-	-
25.5882	1740	0.0587	-	-
25.7353	1750	0.0559	0.0905	0.8157
25.8824	1760	0.0551	-	-
26.0294	1770	0.0569	-	-
26.1765	1780	0.0516	-	-
26.3235	1790	0.0561	-	-
26.4706	1800	0.0567	0.0906	0.8165
26.6176	1810	0.0599	-	-
26.7647	1820	0.0577	-	-
26.9118	1830	0.0532	-	-
27.0588	1840	0.0554	-	-
27.2059	1850	0.0579	0.0906	0.8123
27.3529	1860	0.0532	-	-
27.5	1870	0.0493	-	-
27.6471	1880	0.0552	-	-
27.7941	1890	0.0532	-	-
27.9412	1900	0.0569	0.0904	0.8089
28.0882	1910	0.0568	-	-
28.2353	1920	0.052	-	-
28.3824	1930	0.0555	-	-
28.5294	1940	0.0563	-	-
28.6765	1950	0.0555	0.0903	0.8140
28.8235	1960	0.0535	-	-
28.9706	1970	0.0525	-	-
29.1176	1980	0.0566	-	-
29.2647	1990	0.0562	-	-
29.4118	2000	0.0547	0.0902	0.8140
29.5588	2010	0.0495	-	-
29.7059	2020	0.0532	-	-
29.8529	2030	0.0553	-	-
30.0	2040	0.0544	-	-

Framework Versions

Python: 3.8.18
Sentence Transformers: 3.1.1
Transformers: 4.45.1
PyTorch: 1.13.1+cu117
Accelerate: 0.34.2
Datasets: 3.0.0
Tokenizers: 0.20.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
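
For readers unfamiliar with the triplet loss cited above, its general form is max(d(a, p) - d(a, n) + margin, 0), averaged over the batch. This card does not state the distance metric or margin used for this model, so the Euclidean distance and margin of 1.0 below are purely illustrative assumptions:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin: float = 1.0) -> float:
    """Mean hinge loss over triplets, using Euclidean distance (an assumption)."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.0, 1.0]])

# Negative far enough away: the margin is satisfied and the loss is zero.
easy_loss = triplet_loss(anchor, positive, np.array([[0.0, 3.0]]))
# Negative only slightly farther than the positive: the loss is positive.
hard_loss = triplet_loss(anchor, positive, np.array([[0.0, 1.5]]))
print(easy_loss, hard_loss)  # 0.0 0.5
```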