---
base_model: Snowflake/snowflake-arctic-embed-m
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:600
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: >-
What is the purpose of the Artificial Intelligence Ethics for the
Intelligence Community as mentioned in the context?
sentences:
- |-
You should be able to opt out, where appropriate, and
have access to a person who can quickly consider and
remedy problems you encounter. You should be able to opt
out from automated systems in favor of a human alternative, where
appropriate. Appropriateness should be determined based on rea
sonable expectations in a given context and with a focus on ensuring
broad accessibility and protecting the public from especially harm
ful impacts. In some cases, a human or other alternative may be re
quired by law. You should have access to timely human consider
ation and remedy by a fallback and escalation process if an automat
ed system fails, it produces an error, or you would like to appeal or
contest its impacts on you. Human consideration and fallback
should be accessible, equitable, effective, maintained, accompanied
by appropriate operator training, and should not impose an unrea
sonable burden on the public. Automated systems with an intended
- >-
points to numerous examples of effective and proactive stakeholder
engagement, including the Community-
Based Participatory Research Program developed by the National
Institutes of Health and the participatory
technology assessments developed by the National Oceanic and Atmospheric
Administration.18
The National Institute of Standards and Technology (NIST) is developing
a risk
management framework to better manage risks posed to individuals,
organizations, and
society by AI.19 The NIST AI Risk Management Framework, as mandated by
Congress, is intended for
voluntary use to help incorporate trustworthiness considerations into
the design, development, use, and
evaluation of AI products, services, and systems. The NIST framework is
being developed through a consensus-
driven, open, transparent, and collaborative process that includes
workshops and other opportunities to provide
input. The NIST framework aims to foster the development of innovative
approaches to address
- >-
of Artificial Intelligence Ethics for the Intelligence Community to
guide personnel on whether and how to
develop and use AI in furtherance of the IC's mission, as well as an AI
Ethics Framework to help implement
these principles.22
The National Science Foundation (NSF) funds extensive research to help
foster the
development of automated systems that adhere to and advance their
safety, security and
effectiveness. Multiple NSF programs support research that directly
addresses many of these principles:
the National AI Research Institutes23 support research on all aspects of
safe, trustworthy, fair, and explainable
AI algorithms and systems; the Cyber Physical Systems24 program supports
research on developing safe
autonomous and cyber physical systems with AI components; the Secure and
Trustworthy Cyberspace25
program supports research on cybersecurity and privacy enhancing
technologies in automated systems; the
- source_sentence: >-
How does the Department of Defense's approach to AI ethics differ from
that of the Department of Energy?
sentences:
- >-
NOTICE &
EXPLANATION
WHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS
The expectations for automated systems are meant to serve as a blueprint
for the development of additional
technical standards and practices that are tailored for particular
sectors and contexts.
Tailored to the level of risk. An assessment should be done to determine
the level of risk of the auto
mated system. In settings where the consequences are high as determined
by a risk assessment, or extensive
oversight is expected (e.g., in criminal justice or some public sector
settings), explanatory mechanisms should
be built into the system design so that the system’s full behavior can
be explained in advance (i.e., only fully
transparent models should be used), rather than as an after-the-decision
interpretation. In other settings, the
extent of explanation provided should be tailored to the risk level.
- >-
SAFE AND EFFECTIVE
SYSTEMS
HOW THESE PRINCIPLES CAN MOVE INTO PRACTICE
Real-life examples of how these principles can become reality, through
laws, policies, and practical
technical and sociotechnical approaches to protecting rights,
opportunities, and access.
Some U.S government agencies have developed specific frameworks for
ethical use of AI
systems. The Department of Energy (DOE) has activated the AI Advancement
Council that oversees coordina-
tion and advises on implementation of the DOE AI Strategy and addresses
issues and/or escalations on the
ethical use and development of AI systems.20 The Department of Defense
has adopted Artificial Intelligence
Ethical Principles, and tenets for Responsible Artificial Intelligence
specifically tailored to its national
security and defense activities.21 Similarly, the U.S. Intelligence
Community (IC) has developed the Principles
- >-
Formal Methods in the Field26 program supports research on rigorous
formal verification and analysis of
automated systems and machine learning, and the Designing Accountable
Software Systems27 program supports
research on rigorous and reproducible methodologies for developing
software systems with legal and regulatory
compliance in mind.
Some state legislatures have placed strong transparency and validity
requirements on
the use of pretrial risk assessments. The use of algorithmic pretrial
risk assessments has been a
cause of concern for civil rights groups.28 Idaho Code Section 19-1910,
enacted in 2019,29 requires that any
pretrial risk assessment, before use in the state, first be "shown to be
free of bias against any class of
individuals protected from discrimination by state or federal law", that
any locality using a pretrial risk
assessment must first formally validate the claim of its being free of
bias, that "all documents, records, and
- source_sentence: >-
What are the expectations for automated systems intended to serve as a
blueprint for?
sentences:
- >-
help to mitigate biases and potential harms.
Guarding against proxies. Directly using demographic information in the
design, development, or
deployment of an automated system (for purposes other than evaluating a
system for discrimination or using
a system to counter discrimination) runs a high risk of leading to
algorithmic discrimination and should be
avoided. In many cases, attributes that are highly correlated with
demographic features, known as proxies, can
contribute to algorithmic discrimination. In cases where use of the
demographic features themselves would
lead to illegal algorithmic discrimination, reliance on such proxies in
decision-making (such as that facilitated
by an algorithm) may also be prohibited by law. Proactive testing should
be performed to identify proxies by
testing for correlation between demographic information and attributes
in any data used as part of system
- >-
describes three broad challenges for mitigating bias – datasets, testing
and evaluation, and human factors – and
introduces preliminary guidance for addressing them. Throughout, the
special publication takes a socio-
technical perspective to identifying and managing AI bias.
29
Algorithmic
Discrimination
Protections
- >-
SAFE AND EFFECTIVE
SYSTEMS
WHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS
The expectations for automated systems are meant to serve as a blueprint
for the development of additional
technical standards and practices that are tailored for particular
sectors and contexts.
Derived data sources tracked and reviewed carefully. Data that is
derived from other data through
the use of algorithms, such as data derived or inferred from prior model
outputs, should be identified and
tracked, e.g., via a specialized type in a data schema. Derived data
should be viewed as potentially high-risk
inputs that may lead to feedback loops, compounded harm, or inaccurate
results. Such sources should be care
fully validated against the risk of collateral consequences.
Data reuse limits in sensitive domains. Data reuse, and especially data
reuse in a new context, can result
in the spreading and scaling of harms. Data from some domains, including
criminal justice data and data indi
- source_sentence: >-
What should individuals have access to regarding their data decisions and
the impact of surveillance technologies?
sentences:
- >-
•
Searches for “Black girls,” “Asian girls,” or “Latina girls” return
predominantly39 sexualized content, rather
than role models, toys, or activities.40 Some search engines have been
working to reduce the prevalence of
these results, but the problem remains.41
•
Advertisement delivery systems that predict who is most likely to click
on a job advertisement end up deliv-
ering ads in ways that reinforce racial and gender stereotypes, such as
overwhelmingly directing supermar-
ket cashier ads to women and jobs with taxi companies to primarily Black
people.42
•
Body scanners, used by TSA at airport checkpoints, require the operator
to select a “male” or “female”
scanning setting based on the passenger’s sex, but the setting is chosen
based on the operator’s perception of
the passenger’s gender identity. These scanners are more likely to flag
transgender travelers as requiring
extra screening done by a person. Transgender travelers have described
degrading experiences associated
- >-
information used to build or validate the risk assessment shall be open
to public inspection," and that assertions
of trade secrets cannot be used "to quash discovery in a criminal matter
by a party to a criminal case."
22
- >-
tect privacy and civil liberties. Continuous surveillance and
monitoring
should not be used in education, work, housing, or in other contexts
where the
use of such surveillance technologies is likely to limit rights,
opportunities, or
access. Whenever possible, you should have access to reporting that
confirms
your data decisions have been respected and provides an assessment of
the
potential impact of surveillance technologies on your rights,
opportunities, or
access.
DATA PRIVACY
30
- source_sentence: >-
What are the implications of the digital divide highlighted in Andrew
Kenney's article regarding unemployment benefits?
sentences:
- >-
cating adverse outcomes in domains such as finance, employment, and
housing, is especially sensitive, and in
some cases its reuse is limited by law. Accordingly, such data should be
subject to extra oversight to ensure
safety and efficacy. Data reuse of sensitive domain data in other
contexts (e.g., criminal data reuse for civil legal
matters or private sector use) should only occur where use of such data
is legally authorized and, after examina
tion, has benefits for those impacted by the system that outweigh
identified risks and, as appropriate, reason
able measures have been implemented to mitigate the identified risks.
Such data should be clearly labeled to
identify contexts for limited reuse based on sensitivity. Where
possible, aggregated datasets may be useful for
replacing individual-level sensitive data.
Demonstrate the safety and effectiveness of the system
Independent evaluation. Automated systems should be designed to allow
for independent evaluation (e.g.,
- >-
5. Environmental Impacts: Impacts due to high compute resource
utilization in training or
operating GAI models, and related outcomes that may adversely impact
ecosystems.
6. Harmful Bias or Homogenization: Amplification and exacerbation of
historical, societal, and
systemic biases; performance disparities8 between sub-groups or
languages, possibly due to
non-representative training data, that result in discrimination,
amplification of biases, or
incorrect presumptions about performance; undesired homogeneity that
skews system or model
outputs, which may be erroneous, lead to ill-founded decision-making, or
amplify harmful
biases.
7. Human-AI Configuration: Arrangements of or interactions between a
human and an AI system
which can result in the human inappropriately anthropomorphizing GAI
systems or experiencing
algorithmic aversion, automation bias, over-reliance, or emotional
entanglement with GAI
systems.
- >-
https://bipartisanpolicy.org/blog/the-low-down-on-ballot-curing/
101. Andrew Kenney. 'I'm shocked that they need to have a smartphone':
System for unemployment
benefits exposes digital divide. USA Today. May 2, 2021.
https://www.usatoday.com/story/tech/news/2021/05/02/unemployment-benefits-system-leaving
people-behind/4915248001/
102. Allie Gross. UIA lawsuit shows how the state criminalizes the
unemployed. Detroit Metro-Times.
Sep. 18, 2015.
https://www.metrotimes.com/news/uia-lawsuit-shows-how-the-state-criminalizes-the
unemployed-2369412
103. Maia Szalavitz. The Pain Was Unbearable. So Why Did Doctors Turn
Her Away? Wired. Aug. 11,
2021.
https://www.wired.com/story/opioid-drug-addiction-algorithm-chronic-pain/
104. Spencer Soper. Fired by Bot at Amazon: "It's You Against the
Machine". Bloomberg, Jun. 28, 2021.
https://www.bloomberg.com/news/features/2021-06-28/fired-by-bot-amazon-turns-to-machine
managers-and-workers-are-losing-out
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.73
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.935
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.96
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.73
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.3
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.187
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.096
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.73
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.935
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.96
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8511693160760204
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8155396825396827
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8172228277187864
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.73
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.9
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.935
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.96
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.73
name: Dot Precision@1
- type: dot_precision@3
value: 0.3
name: Dot Precision@3
- type: dot_precision@5
value: 0.187
name: Dot Precision@5
- type: dot_precision@10
value: 0.096
name: Dot Precision@10
- type: dot_recall@1
value: 0.73
name: Dot Recall@1
- type: dot_recall@3
value: 0.9
name: Dot Recall@3
- type: dot_recall@5
value: 0.935
name: Dot Recall@5
- type: dot_recall@10
value: 0.96
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.8511693160760204
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.8155396825396827
name: Dot Mrr@10
- type: dot_map@100
value: 0.8172228277187864
name: Dot Map@100
---

# SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description

- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
### Model Sources

- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
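The architecture is plain CLS-token pooling over a BERT encoder followed by L2 normalization, so the `SentenceTransformer` wrapper can be reproduced with the `transformers` library if needed. The following is a minimal sketch, assuming the Hub checkpoint loads through the usual `AutoTokenizer`/`AutoModel` interfaces; it is illustrative, not an officially documented alternative loading path.

```python
# Sketch: reproduce the Transformer -> CLS pooling -> Normalize pipeline by hand.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

model_id = "ldldld/snowflake-arctic-embed-m-finetuned"  # repo id from the Usage section
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

texts = ["What should individuals have access to regarding their data decisions?"]
batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    # Pooling is configured with pooling_mode_cls_token=True, i.e. take the [CLS] embedding.
    cls_embeddings = model(**batch).last_hidden_state[:, 0]

# Normalize() corresponds to L2 normalization, so cosine similarity equals the dot product.
embeddings = F.normalize(cls_embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```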
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ldldld/snowflake-arctic-embed-m-finetuned")
# Run inference
sentences = [
"What are the implications of the digital divide highlighted in Andrew Kenney's article regarding unemployment benefits?",
'https://bipartisanpolicy.org/blog/the-low-down-on-ballot-curing/\n101. Andrew Kenney. \'I\'m shocked that they need to have a smartphone\': System for unemployment\nbenefits exposes digital divide. USA Today. May 2, 2021.\nhttps://www.usatoday.com/story/tech/news/2021/05/02/unemployment-benefits-system-leaving\xad\npeople-behind/4915248001/\n102. Allie Gross. UIA lawsuit shows how the state criminalizes the unemployed. Detroit Metro-Times.\nSep. 18, 2015.\nhttps://www.metrotimes.com/news/uia-lawsuit-shows-how-the-state-criminalizes-the\xad\nunemployed-2369412\n103. Maia Szalavitz. The Pain Was Unbearable. So Why Did Doctors Turn Her Away? Wired. Aug. 11,\n2021. https://www.wired.com/story/opioid-drug-addiction-algorithm-chronic-pain/\n104. Spencer Soper. Fired by Bot at Amazon: "It\'s You Against the Machine". Bloomberg, Jun. 28, 2021.\nhttps://www.bloomberg.com/news/features/2021-06-28/fired-by-bot-amazon-turns-to-machine\xad\nmanagers-and-workers-are-losing-out',
'5. Environmental Impacts: Impacts due to high compute resource utilization in training or \noperating GAI models, and related outcomes that may adversely impact ecosystems. \n6. Harmful Bias or Homogenization: Amplification and exacerbation of historical, societal, and \nsystemic biases; performance disparities8 between sub-groups or languages, possibly due to \nnon-representative training data, that result in discrimination, amplification of biases, or \nincorrect presumptions about performance; undesired homogeneity that skews system or model \noutputs, which may be erroneous, lead to ill-founded decision-making, or amplify harmful \nbiases. \n7. Human-AI Configuration: Arrangements of or interactions between a human and an AI system \nwhich can result in the human inappropriately anthropomorphizing GAI systems or experiencing \nalgorithmic aversion, automation bias, over-reliance, or emotional entanglement with GAI \nsystems.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
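Since the model was trained on question–passage pairs with an in-batch negatives objective, a natural follow-up is retrieval over a small corpus. The sketch below uses `sentence_transformers.util.semantic_search`; the example queries and passages are placeholders, not samples from the training data.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("ldldld/snowflake-arctic-embed-m-finetuned")

# Placeholder corpus and query, loosely paraphrasing topics covered in the training data.
corpus = [
    "The NIST AI Risk Management Framework is intended for voluntary use.",
    "Continuous surveillance should not be used in education, work, or housing.",
    "Derived data should be tracked and validated against collateral consequences.",
]
queries = ["What does the NIST AI Risk Management Framework aim to do?"]

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embeddings = model.encode(queries, convert_to_tensor=True)

# Embeddings are L2-normalized, so cosine similarity and dot product give the same ranking.
hits = util.semantic_search(query_embeddings, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```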
## Evaluation

### Metrics

#### Information Retrieval

- Evaluated with `InformationRetrievalEvaluator` (a reproduction sketch follows the metrics table below)
Metric | Value |
---|---|
cosine_accuracy@1 | 0.73 |
cosine_accuracy@3 | 0.9 |
cosine_accuracy@5 | 0.935 |
cosine_accuracy@10 | 0.96 |
cosine_precision@1 | 0.73 |
cosine_precision@3 | 0.3 |
cosine_precision@5 | 0.187 |
cosine_precision@10 | 0.096 |
cosine_recall@1 | 0.73 |
cosine_recall@3 | 0.9 |
cosine_recall@5 | 0.935 |
cosine_recall@10 | 0.96 |
cosine_ndcg@10 | 0.8512 |
cosine_mrr@10 | 0.8155 |
cosine_map@100 | 0.8172 |
dot_accuracy@1 | 0.73 |
dot_accuracy@3 | 0.9 |
dot_accuracy@5 | 0.935 |
dot_accuracy@10 | 0.96 |
dot_precision@1 | 0.73 |
dot_precision@3 | 0.3 |
dot_precision@5 | 0.187 |
dot_precision@10 | 0.096 |
dot_recall@1 | 0.73 |
dot_recall@3 | 0.9 |
dot_recall@5 | 0.935 |
dot_recall@10 | 0.96 |
dot_ndcg@10 | 0.8512 |
dot_mrr@10 | 0.8155 |
dot_map@100 | 0.8172 |
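The held-out queries and corpus behind these numbers are not published with this card, so the metrics cannot be reproduced exactly from the card alone. The sketch below only shows how the `InformationRetrievalEvaluator` is wired up, using hypothetical evaluation data in place of the actual split.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("ldldld/snowflake-arctic-embed-m-finetuned")

# Hypothetical evaluation data; the actual held-out split is not included in this card.
queries = {"q1": "What does the NIST AI Risk Management Framework aim to do?"}
corpus = {
    "d1": "The NIST AI Risk Management Framework is intended for voluntary use.",
    "d2": "Continuous surveillance should not be used in education, work, or housing.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="example-ir-eval",
)
results = evaluator(model)
print(results)  # accuracy@k, precision@k, recall@k, NDCG@10, MRR@10, MAP@100 per similarity function
```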
## Training Details

### Training Dataset

#### Unnamed Dataset

- Size: 600 training samples
- Columns: `sentence_0` and `sentence_1`
- Approximate statistics based on the first 600 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string     | string     |
  | details | <ul><li>min: 12 tokens</li><li>mean: 20.66 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 165.88 tokens</li><li>max: 512 tokens</li></ul> |
- Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>What is the main purpose of the "Blueprint for an AI Bill of Rights" as indicated in the context?</code> | <code>BLUEPRINT FOR AN<br>AI BILL OF<br>RIGHTS<br>MAKING AUTOMATED<br>SYSTEMS WORK FOR<br>THE AMERICAN PEOPLE<br>OCTOBER 2022</code> |
  | <code>When was the "Blueprint for an AI Bill of Rights" created?</code> | <code>BLUEPRINT FOR AN<br>AI BILL OF<br>RIGHTS<br>MAKING AUTOMATED<br>SYSTEMS WORK FOR<br>THE AMERICAN PEOPLE<br>OCTOBER 2022</code> |
  | <code>What was the purpose of the Blueprint for an AI Bill of Rights published by the White House Office of Science and Technology Policy in October 2022?</code> | <code>About this Document<br>The Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People was<br>published by the White House Office of Science and Technology Policy in October 2022. This framework was<br>released one year after OSTP announced the launch of a process to develop “a bill of rights for an AI-powered<br>world.” Its release follows a year of public engagement to inform this initiative. The framework is available<br>online at: https://www.whitehouse.gov/ostp/ai-bill-of-rights<br>About the Office of Science and Technology Policy<br>The Office of Science and Technology Policy (OSTP) was established by the National Science and Technology<br>Policy, Organization, and Priorities Act of 1976 to provide the President and others within the Executive Office<br>of the President with advice on the scientific, engineering, and technological aspects of the economy, national</code> |
- Loss: `MatryoshkaLoss` with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
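For reference, the loss configuration above corresponds to wrapping `MultipleNegativesRankingLoss` in `MatryoshkaLoss`, so the same in-batch negatives objective is applied at several truncated embedding sizes with equal weight. A minimal sketch of that wiring with the Sentence Transformers losses API:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# In-batch negatives objective, applied at each Matryoshka dimension with weight 1.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```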
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 20
- `per_device_eval_batch_size`: 20
- `num_train_epochs`: 5
- `multi_dataset_batch_sampler`: round_robin
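The non-default values above map directly onto the Sentence Transformers 3.x trainer API. The sketch below shows one plausible training setup under that assumption; the dataset construction and `output_dir` are placeholders, the evaluation split is stubbed with the training data, and every other hyperparameter is left at the defaults listed in the next subsection.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import MultiDatasetBatchSamplers

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# Placeholder (question, passage) pairs standing in for the 600 training samples.
train_dataset = Dataset.from_dict({
    "sentence_0": ["What does the NIST AI Risk Management Framework aim to do?"],
    "sentence_1": ["The NIST AI Risk Management Framework is intended for voluntary use."],
})

loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="snowflake-arctic-embed-m-finetuned",  # arbitrary output path
    eval_strategy="steps",
    per_device_train_batch_size=20,
    per_device_eval_batch_size=20,
    num_train_epochs=5,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; use a held-out split in practice
    loss=loss,
)
trainer.train()
```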
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 20
- `per_device_eval_batch_size`: 20
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>
### Training Logs
Epoch | Step | cosine_map@100 |
---|---|---|
1.0 | 30 | 0.7953 |
1.6667 | 50 | 0.8326 |
2.0 | 60 | 0.8277 |
3.0 | 90 | 0.8250 |
3.3333 | 100 | 0.8284 |
4.0 | 120 | 0.8200 |
5.0 | 150 | 0.8172 |
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.0
- Tokenizers: 0.19.1
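To approximate the original environment, the listed versions can be pinned directly. This is a convenience suggestion rather than a published requirements file; adjust the PyTorch install for your CUDA setup as needed.

```bash
# Pin the framework versions reported above (CUDA builds of torch may need a platform-specific --index-url).
pip install "sentence-transformers==3.1.1" "transformers==4.44.2" "torch==2.4.1" \
    "accelerate==0.34.2" "datasets==3.0.0" "tokenizers==0.19.1"
```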
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```