|
--- |
|
language: |
|
- en |
|
tags: |
|
- sentence-transformers |
|
- sentence-similarity |
|
- feature-extraction |
|
- generated_from_trainer |
|
- dataset_size:14737 |
|
- loss:MultipleNegativesRankingLoss |
|
base_model: BAAI/bge-large-en-v1.5 |
|
widget: |
|
- source_sentence: >- |
|
Represent this sentence for searching relevant passages: What are some best |
|
practices for ensuring images in horizontal cards are visually appealing |
|
despite being cropped to fit a square format? |
|
sentences: |
|
- > |
|
Tree view |
|
|
|
Usage guidelines |
|
|
|
Horizontal scrolling: If you have a layout that doesn't allow for users to |
|
adjust the width of the container for a tree view, allow them to |
|
horizontally scroll in order to see the full depth of the hierarchy. |
|
|
|
Do: Allow horizontal scrolling in a fixed layout. |
|
- >- |
|
Cards |
|
|
|
Options |
|
|
|
Vertical or horizontal : Standard cards can be laid out vertically |
|
(components are organized in a column) or horizontally (components are |
|
organized in a row). |
|
|
|
|
|
Horizontal cards always have a square preview, and the image is cropped to |
|
fit inside the square. These can only be laid out in a tile grid where every |
|
card is the same size. |
|
- >- |
|
Alert dialog |
|
|
|
Behaviors |
|
|
|
Button group overflow: An alert dialog can have up to 3 buttons. When |
|
horizontal space is limited, button groups stack vertically. They should |
|
appear in ascending order based on importance, with the most critical action |
|
at the bottom. |
|
- source_sentence: >- |
|
Represent this sentence for searching relevant passages: Are there any |
|
guidelines for the timing and smoothness of the fading effect when hovering |
|
over a segment in a donut chart? |
|
sentences: |
|
- >- |
|
Color for data visualization |
|
|
|
Usage guidelines |
|
|
|
Categorical colors are not ordered. Use these for categorical scales. Do not |
|
use these for ordinal, interval, or ratio scales. |
|
|
|
Sequential colors are ordered. Use these for ordinal and interval scales. |
|
It’s also acceptable to use these for ratio scales. Do not use these for |
|
categorical scales. |
|
|
|
Diverging colors are ordered. Use these for ordinal and ratio scales, |
|
especially when there is a meaningful middle value. These may also be used |
|
for interval scales. Do not use these for categorical scales. |
|
- >- |
|
Action group |
|
|
|
Options |
|
|
|
Density: Action groups come in 2 densities: regular and compact. The compact |
|
density retains the same font and icon sizes, but has tighter spacing. The |
|
action buttons also become connected for non-quiet action groups. |
|
- >- |
|
Donut chart |
|
|
|
Behaviors |
|
|
|
Hover: Hovering over a segment of a donut chart causes all other segments to |
|
fade back from the view. A tooltip displays the segment name, percentage of |
|
total, and metric value. |
|
- source_sentence: >- |
|
Represent this sentence for searching relevant passages: Why is it important |
|
to orient the legend to match the chart whenever possible? |
|
sentences: |
|
- >- |
|
Breadcrumbs |
|
|
|
Options |
|
|
|
Multiline: The multiline variation places emphasis on the selected |
|
breadcrumb item as a page title, helping a user to more clearly identify |
|
their current location. |
|
- >- |
|
Cards |
|
|
|
Layout |
|
|
|
Card width: Cards are laid out in either a fluid card grid or have fixed |
|
widths. Most cards can be organized within a grid where the width of each |
|
card is fluid depending on the nature of the grid. In rare cases where cards |
|
can’t be laid out in a card grid, they’ll have a fixed width that is defined |
|
manually. |
|
- >- |
|
Legend |
|
|
|
Options |
|
|
|
Orientation: Legends can have horizontal or vertical orientation. Whenever |
|
possible, orient the legend to match the chart. |
|
- source_sentence: >- |
|
Represent this sentence for searching relevant passages: What is the primary |
|
use case for radio buttons according to the Adobe Spectrum Design |
|
Documentation? |
|
sentences: |
|
- >+ |
|
Radio group |
|
|
|
Usage guidelines |
|
|
|
Use radio buttons for mutually exclusive options: Radio buttons and |
|
[checkboxes](/page/checkbox) are not interchangeable. Radio buttons are best |
|
used for selecting a single option from a list of mutually exclusive |
|
options. Checkboxes are best used for selecting multiple options at once (or |
|
no options). |
|
|
|
- > |
|
Additional resources: - [Human Interface Guidelines: iOS Tab |
|
Bars](https://developer.apple.com/design/human-interface-guidelines/ios/bars/tab-bars/) |
|
|
|
- [Human Interface Guidelines: |
|
Accessibility](https://developer.apple.com/design/human-interface-guidelines/accessibility/overview/introduction/) |
|
- >- |
|
Picker |
|
|
|
Options |
|
|
|
Label position: Labels can be placed either on top or on the side. Top |
|
labels are the default and are recommended because they work better with |
|
long copy, localization, and responsive layouts. Side labels are most useful |
|
when vertical space is limited. |
|
- source_sentence: >- |
|
Represent this sentence for searching relevant passages: How can a designer |
|
balance the need for clear text links and the need for emphasized text in a |
|
user interface? |
|
sentences: |
|
- >- |
|
Meter |
|
|
|
Options |
|
|
|
Positive variant: The positive variant has a green fill to show the value. |
|
This can be used to represent a positive semantic value, such as when |
|
there’s a lot of space remaining. |
|
- >- |
|
Badge |
|
|
|
Options |
|
|
|
Size: Badges come in four different sizes: small, medium, large, and |
|
extra-large. The small size is the default and most frequently used option. |
|
Use the other sizes sparingly to create a hierarchy of importance on a page. |
|
- >+ |
|
Typography |
|
|
|
Usage guidelines |
|
|
|
Don't use underlines for adding emphasis: Underlines are reserved for text |
|
links only. They should not be used as a way for adding emphasis to words. |
|
|
|
datasets: |
|
- JianLiao/spectrum-design-docs |
|
pipeline_tag: sentence-similarity |
|
library_name: sentence-transformers |
|
metrics: |
|
- cosine_accuracy@1 |
|
- cosine_accuracy@3 |
|
- cosine_accuracy@5 |
|
- cosine_accuracy@10 |
|
- cosine_precision@1 |
|
- cosine_precision@3 |
|
- cosine_precision@5 |
|
- cosine_precision@10 |
|
- cosine_recall@1 |
|
- cosine_recall@3 |
|
- cosine_recall@5 |
|
- cosine_recall@10 |
|
- cosine_ndcg@10 |
|
- cosine_mrr@10 |
|
- cosine_map@100 |
|
model-index: |
|
- name: SentenceTransformer based on BAAI/bge-large-en-v1.5 |
|
results: |
|
- task: |
|
type: information-retrieval |
|
name: Information Retrieval |
|
dataset: |
|
name: sds |
|
type: sds |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.007462686567164179 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.015603799185888738 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.04748982360922659 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 0.7815468113975577 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.007462686567164179 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.005201266395296246 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.009497964721845319 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.07815468113975575 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.007462686567164179 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.015603799185888738 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.04748982360922659 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.7815468113975577 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.25440066233238845 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.10778547737502948 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.11639203259428242 |
|
name: Cosine Map@100 |
|
license: mit |
|
--- |
|
|
|
# SentenceTransformer based on BAAI/bge-large-en-v1.5 |
|
|
|
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) on the [spectrum-design-docs](https://huggingface.co/datasets/JianLiao/spectrum-design-docs) dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
- **Model Type:** Sentence Transformer |
|
- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) <!-- at revision d4aa6901d3a41ba39fb536a557fa166f842b0e09 --> |
|
- **Maximum Sequence Length:** 512 tokens |
|
- **Output Dimensionality:** 1024 dimensions |
|
- **Similarity Function:** Cosine Similarity |
|
- **Training Dataset:** |
|
- [spectrum-design-docs](https://huggingface.co/datasets/JianLiao/spectrum-design-docs) |
|
- **Language:** en |
|
- **License:** MIT
|
|
|
### Model Sources |
|
|
|
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net) |
|
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers) |
|
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers) |
|
|
|
### Full Model Architecture |
|
|
|
``` |
|
SentenceTransformer( |
|
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel |
|
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True}) |
|
(2): Normalize() |
|
) |
|
``` |
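
This configuration can also be checked on the loaded model; a minimal sketch:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("JianLiao/spectrum-doc-fine-tuned")

# The three modules above are reflected in the loaded object
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 1024
print(model[1].pooling_mode_cls_token)           # True -> CLS-token pooling
```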
|
|
|
## Usage |
|
|
|
### Direct Usage (Sentence Transformers) |
|
|
|
First install the Sentence Transformers library: |
|
|
|
```bash |
|
pip install -U sentence-transformers |
|
``` |
|
|
|
Then you can load this model and run inference. |
|
```python |
|
from sentence_transformers import SentenceTransformer |
|
|
|
# Download from the 🤗 Hub |
|
model = SentenceTransformer("JianLiao/spectrum-doc-fine-tuned") |
|
# Run inference |
|
sentences = [ |
|
'Represent this sentence for searching relevant passages: How can a designer balance the need for clear text links and the need for emphasized text in a user interface?', |
|
"Typography\nUsage guidelines\nDon't use underlines for adding emphasis: Underlines are reserved for text links only. They should not be used as a way for adding emphasis to words.\n\n", |
|
'Meter\nOptions\nPositive variant: The positive variant has a green fill to show the value. This can be used to represent a positive semantic value, such as when there’s a lot of space remaining.', |
|
] |
|
embeddings = model.encode(sentences) |
|
print(embeddings.shape) |
|
# (3, 1024)
|
|
|
# Get the similarity scores for the embeddings |
|
similarities = model.similarity(embeddings, embeddings) |
|
print(similarities.shape) |
|
# torch.Size([3, 3])
|
``` |
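
The model was trained with the prompt `Represent this sentence for searching relevant passages: ` applied to queries (anchors) only, so retrieval-style usage should prefix queries, but not passages, with that prompt. A minimal semantic-search sketch; the passages below are illustrative placeholders, not the indexed documentation:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("JianLiao/spectrum-doc-fine-tuned")

prompt = "Represent this sentence for searching relevant passages: "

# Illustrative passages; in practice these would be Spectrum documentation chunks
passages = [
    "Radio buttons are best used for selecting a single option from a list of mutually exclusive options.",
    "Underlines are reserved for text links only. They should not be used as a way for adding emphasis to words.",
]
passage_embeddings = model.encode(passages)  # the Normalize() module L2-normalizes these

query = "When should I use radio buttons instead of checkboxes?"
query_embedding = model.encode(prompt + query)

# Rank passages by cosine similarity to the query
hits = util.semantic_search(query_embedding, passage_embeddings, top_k=2)
for hit in hits[0]:
    print(round(hit["score"], 4), passages[hit["corpus_id"]])
```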
|
|
|
<!-- |
|
### Direct Usage (Transformers) |
|
|
|
<details><summary>Click to see the direct usage in Transformers</summary> |
|
|
|
</details> |
|
--> |
|
|
|
<!-- |
|
### Downstream Usage (Sentence Transformers) |
|
|
|
You can finetune this model on your own dataset. |
|
|
|
<details><summary>Click to expand</summary> |
|
|
|
</details> |
|
--> |
|
|
|
<!-- |
|
### Out-of-Scope Use |
|
|
|
*List how the model may foreseeably be misused and address what users ought not to do with the model.* |
|
--> |
|
|
|
## Evaluation |
|
|
|
### Metrics |
|
|
|
#### Information Retrieval |
|
|
|
* Dataset: `sds` |
|
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) |
|
|
|
| Metric | Value | |
|
|:--------------------|:-----------| |
|
| cosine_accuracy@1 | 0.0075 | |
|
| cosine_accuracy@3 | 0.0156 | |
|
| cosine_accuracy@5 | 0.0475 | |
|
| cosine_accuracy@10 | 0.7815 | |
|
| cosine_precision@1 | 0.0075 | |
|
| cosine_precision@3 | 0.0052 | |
|
| cosine_precision@5 | 0.0095 | |
|
| cosine_precision@10 | 0.0782 | |
|
| cosine_recall@1 | 0.0075 | |
|
| cosine_recall@3 | 0.0156 | |
|
| cosine_recall@5 | 0.0475 | |
|
| cosine_recall@10 | 0.7815 | |
|
| **cosine_ndcg@10** | **0.2544** | |
|
| cosine_mrr@10 | 0.1078 | |
|
| cosine_map@100 | 0.1164 | |
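
These values come from the `InformationRetrievalEvaluator` run named `sds`. A sketch of how a comparable evaluation could be set up; the queries, corpus, and relevance judgments below are placeholders, not the actual evaluation split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("JianLiao/spectrum-doc-fine-tuned")

prompt = "Represent this sentence for searching relevant passages: "

# Placeholder evaluation data keyed by string ids
queries = {"q1": prompt + "Why orient the legend to match the chart?"}
corpus = {
    "d1": "Legend\nOptions\nOrientation: Legends can have horizontal or vertical "
          "orientation. Whenever possible, orient the legend to match the chart.",
    "d2": "Badge\nOptions\nSize: Badges come in four different sizes.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="sds",
)
results = evaluator(model)
print(results["sds_cosine_ndcg@10"])
```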
|
|
|
<!-- |
|
## Bias, Risks and Limitations |
|
|
|
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.* |
|
--> |
|
|
|
<!-- |
|
### Recommendations |
|
|
|
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.* |
|
--> |
|
|
|
## Training Details |
|
|
|
### Training Dataset |
|
|
|
#### spectrum-design-docs |
|
|
|
* Dataset: [spectrum-design-docs](https://huggingface.co/datasets/JianLiao/spectrum-design-docs) at [23f5565](https://huggingface.co/datasets/JianLiao/spectrum-design-docs/tree/23f5565f9fc1cfe31d1245ca9e5368f00fcaec00) |
|
* Size: 14,737 training samples |
|
* Columns: <code>anchor</code> and <code>positive</code> |
|
* Approximate statistics based on the first 1000 samples: |
|
| | anchor | positive | |
|
|:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------| |
|
| type | string | string | |
|
| details | <ul><li>min: 20 tokens</li><li>mean: 30.87 tokens</li><li>max: 47 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 97.17 tokens</li><li>max: 512 tokens</li></ul> | |
|
* Samples: |
|
| anchor | positive | |
|
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |
|
| <code>Represent this sentence for searching relevant passages: Are there any specific guidelines or best practices provided by the Spectrum team for integrating Spectrum CSS into a new or existing project?</code> | <code>Spectrum CSS: An open source CSS-only implementation of Spectrum, maintained by the Spectrum team. <br><div class="well-box">Dependency chain: Spectrum DNA → Spectrum CSS</div><br><br>[GitHub repository](https://github.com/adobe/spectrum-css/) <br>[Website](https://opensource.adobe.com/spectrum-css/) <br>[#spectrum_css](https://adobe.slack.com/archives/C5N154FEY)</code> | |
|
| <code>Represent this sentence for searching relevant passages: How does the default setting for progress circles affect their behavior in a UI?</code> | <code>Progress circle<br>Options<br>Indeterminate: A progress circle can be either determinate or indeterminate. By default, progress circles are determinate. Use a determinate progress circle when progress can be calculated against a specific goal (e.g., downloading a file of a known size). Use an indeterminate progress circle when progress is happening but the time or effort to completion can’t be determined (e.g., attempting to reconnect to a server).</code> | |
|
| <code>Represent this sentence for searching relevant passages: What tools or methods can designers use to test the effectiveness of wrapped legends in their designs?</code> | <code>Legend<br>Behaviors<br>Wrapping: When there isn’t enough space, wrap legends to ensure that dimension values are shown.</code> | |
|
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters: |
|
```json |
|
{ |
|
"scale": 20.0, |
|
"similarity_fct": "cos_sim" |
|
} |
|
``` |
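
In code, this corresponds to the following instantiation (a sketch; the in-batch negatives come from the other positives in each batch):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# Scaled cosine similarity with in-batch negatives, using the parameters listed above
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)
```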
|
|
|
### Training Hyperparameters |
|
#### Non-Default Hyperparameters |
|
|
|
- `eval_strategy`: epoch |
|
- `per_device_train_batch_size`: 22 |
|
- `per_device_eval_batch_size`: 16 |
|
- `gradient_accumulation_steps`: 16 |
|
- `learning_rate`: 2e-05 |
|
- `num_train_epochs`: 100 |
|
- `lr_scheduler_type`: cosine |
|
- `warmup_ratio`: 0.1 |
|
- `bf16`: True |
|
- `tf32`: True |
|
- `load_best_model_at_end`: True |
|
- `optim`: adamw_torch_fused |
|
- `prompts`: {'anchor': 'Represent this sentence for searching relevant passages: '} |
|
- `batch_sampler`: no_duplicates |
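
A sketch of how these non-default values map onto `SentenceTransformerTrainingArguments`. The output directory is a placeholder, `save_strategy="epoch"` is an assumption (needed so `load_best_model_at_end` can restore a matching checkpoint), and `bf16`/`tf32` assume an Ampere-class or newer GPU:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/spectrum-doc-fine-tuned",  # placeholder path
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed; required for load_best_model_at_end
    per_device_train_batch_size=22,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=100,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,   # assumes bfloat16-capable hardware
    tf32=True,   # assumes an Ampere-class (or newer) GPU
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    prompts={"anchor": "Represent this sentence for searching relevant passages: "},
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```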
|
|
|
#### All Hyperparameters |
|
<details><summary>Click to expand</summary> |
|
|
|
- `overwrite_output_dir`: False |
|
- `do_predict`: False |
|
- `eval_strategy`: epoch |
|
- `prediction_loss_only`: True |
|
- `per_device_train_batch_size`: 22 |
|
- `per_device_eval_batch_size`: 16 |
|
- `per_gpu_train_batch_size`: None |
|
- `per_gpu_eval_batch_size`: None |
|
- `gradient_accumulation_steps`: 16 |
|
- `eval_accumulation_steps`: None |
|
- `torch_empty_cache_steps`: None |
|
- `learning_rate`: 2e-05 |
|
- `weight_decay`: 0.0 |
|
- `adam_beta1`: 0.9 |
|
- `adam_beta2`: 0.999 |
|
- `adam_epsilon`: 1e-08 |
|
- `max_grad_norm`: 1.0 |
|
- `num_train_epochs`: 100 |
|
- `max_steps`: -1 |
|
- `lr_scheduler_type`: cosine |
|
- `lr_scheduler_kwargs`: {} |
|
- `warmup_ratio`: 0.1 |
|
- `warmup_steps`: 0 |
|
- `log_level`: passive |
|
- `log_level_replica`: warning |
|
- `log_on_each_node`: True |
|
- `logging_nan_inf_filter`: True |
|
- `save_safetensors`: True |
|
- `save_on_each_node`: False |
|
- `save_only_model`: False |
|
- `restore_callback_states_from_checkpoint`: False |
|
- `no_cuda`: False |
|
- `use_cpu`: False |
|
- `use_mps_device`: False |
|
- `seed`: 42 |
|
- `data_seed`: None |
|
- `jit_mode_eval`: False |
|
- `use_ipex`: False |
|
- `bf16`: True |
|
- `fp16`: False |
|
- `fp16_opt_level`: O1 |
|
- `half_precision_backend`: auto |
|
- `bf16_full_eval`: False |
|
- `fp16_full_eval`: False |
|
- `tf32`: True |
|
- `local_rank`: 0 |
|
- `ddp_backend`: None |
|
- `tpu_num_cores`: None |
|
- `tpu_metrics_debug`: False |
|
- `debug`: [] |
|
- `dataloader_drop_last`: True |
|
- `dataloader_num_workers`: 0 |
|
- `dataloader_prefetch_factor`: None |
|
- `past_index`: -1 |
|
- `disable_tqdm`: False |
|
- `remove_unused_columns`: True |
|
- `label_names`: None |
|
- `load_best_model_at_end`: True |
|
- `ignore_data_skip`: False |
|
- `fsdp`: [] |
|
- `fsdp_min_num_params`: 0 |
|
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
|
- `fsdp_transformer_layer_cls_to_wrap`: None |
|
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} |
|
- `deepspeed`: None |
|
- `label_smoothing_factor`: 0.0 |
|
- `optim`: adamw_torch_fused |
|
- `optim_args`: None |
|
- `adafactor`: False |
|
- `group_by_length`: False |
|
- `length_column_name`: length |
|
- `ddp_find_unused_parameters`: None |
|
- `ddp_bucket_cap_mb`: None |
|
- `ddp_broadcast_buffers`: False |
|
- `dataloader_pin_memory`: True |
|
- `dataloader_persistent_workers`: False |
|
- `skip_memory_metrics`: True |
|
- `use_legacy_prediction_loop`: False |
|
- `push_to_hub`: False |
|
- `resume_from_checkpoint`: None |
|
- `hub_model_id`: None |
|
- `hub_strategy`: every_save |
|
- `hub_private_repo`: None |
|
- `hub_always_push`: False |
|
- `gradient_checkpointing`: False |
|
- `gradient_checkpointing_kwargs`: None |
|
- `include_inputs_for_metrics`: False |
|
- `include_for_metrics`: [] |
|
- `eval_do_concat_batches`: True |
|
- `fp16_backend`: auto |
|
- `push_to_hub_model_id`: None |
|
- `push_to_hub_organization`: None |
|
- `mp_parameters`: |
|
- `auto_find_batch_size`: False |
|
- `full_determinism`: False |
|
- `torchdynamo`: None |
|
- `ray_scope`: last |
|
- `ddp_timeout`: 1800 |
|
- `torch_compile`: False |
|
- `torch_compile_backend`: None |
|
- `torch_compile_mode`: None |
|
- `dispatch_batches`: None |
|
- `split_batches`: None |
|
- `include_tokens_per_second`: False |
|
- `include_num_input_tokens_seen`: False |
|
- `neftune_noise_alpha`: None |
|
- `optim_target_modules`: None |
|
- `batch_eval_metrics`: False |
|
- `eval_on_start`: False |
|
- `use_liger_kernel`: False |
|
- `eval_use_gather_object`: False |
|
- `average_tokens_across_devices`: False |
|
- `prompts`: {'anchor': 'Represent this sentence for searching relevant passages: '} |
|
- `batch_sampler`: no_duplicates |
|
- `multi_dataset_batch_sampler`: proportional |
|
|
|
</details> |
|
|
|
### Training Logs |
|
<details><summary>Click to expand</summary> |
|
|
|
| Epoch | Step | Training Loss | sds_cosine_ndcg@10 | |
|
|:--------:|:-------:|:-------------:|:------------------:| |
|
| 1.0 | 7 | - | 0.2255 | |
|
| 1.48 | 10 | 0.2646 | - | |
|
| 2.0 | 14 | - | 0.2282 | |
|
| 2.96 | 20 | 0.1412 | - | |
|
| 3.0 | 21 | - | 0.2358 | |
|
| 4.0 | 28 | - | 0.2397 | |
|
| 4.32 | 30 | 0.0638 | - | |
|
| 5.0 | 35 | - | 0.2430 | |
|
| 5.8 | 40 | 0.0425 | - | |
|
| 6.0 | 42 | - | 0.2449 | |
|
| 7.0 | 49 | - | 0.2462 | |
|
| 7.16 | 50 | 0.0237 | - | |
|
| 8.0 | 56 | - | 0.2428 | |
|
| 8.64 | 60 | 0.015 | - | |
|
| 9.0 | 63 | - | 0.2456 | |
|
| 10.0 | 70 | 0.0082 | 0.2456 | |
|
| 11.0 | 77 | - | 0.2498 | |
|
| 11.48 | 80 | 0.0052 | - | |
|
| 12.0 | 84 | - | 0.2474 | |
|
| 12.96 | 90 | 0.0035 | - | |
|
| 13.0 | 91 | - | 0.2455 | |
|
| 14.0 | 98 | - | 0.2475 | |
|
| 14.32 | 100 | 0.0022 | - | |
|
| 15.0 | 105 | - | 0.2472 | |
|
| 15.8 | 110 | 0.002 | - | |
|
| 16.0 | 112 | - | 0.2486 | |
|
| 17.0 | 119 | - | 0.2506 | |
|
| 17.16 | 120 | 0.0015 | - | |
|
| 18.0 | 126 | - | 0.2490 | |
|
| 18.64 | 130 | 0.0013 | - | |
|
| 19.0 | 133 | - | 0.2489 | |
|
| 20.0 | 140 | 0.0012 | 0.2491 | |
|
| 21.0 | 147 | - | 0.2493 | |
|
| 21.48 | 150 | 0.0011 | - | |
|
| 22.0 | 154 | - | 0.2487 | |
|
| 22.96 | 160 | 0.001 | - | |
|
| 23.0 | 161 | - | 0.2486 | |
|
| 24.0 | 168 | - | 0.2490 | |
|
| 24.32 | 170 | 0.0008 | - | |
|
| 25.0 | 175 | - | 0.2502 | |
|
| 25.8 | 180 | 0.0008 | - | |
|
| 26.0 | 182 | - | 0.2505 | |
|
| 27.0 | 189 | - | 0.2523 | |
|
| 27.16 | 190 | 0.0008 | - | |
|
| 28.0 | 196 | - | 0.2516 | |
|
| 28.64 | 200 | 0.0007 | - | |
|
| 29.0 | 203 | - | 0.2509 | |
|
| 30.0 | 210 | 0.0007 | 0.2522 | |
|
| 31.0 | 217 | - | 0.2522 | |
|
| 31.48 | 220 | 0.0006 | - | |
|
| 32.0 | 224 | - | 0.2534 | |
|
| 32.96 | 230 | 0.0007 | - | |
|
| 33.0 | 231 | - | 0.2523 | |
|
| 34.0 | 238 | - | 0.2524 | |
|
| 34.32 | 240 | 0.0006 | - | |
|
| 35.0 | 245 | - | 0.2518 | |
|
| 35.8 | 250 | 0.0006 | - | |
|
| 36.0 | 252 | - | 0.2529 | |
|
| 37.0 | 259 | - | 0.2524 | |
|
| 37.16 | 260 | 0.0006 | - | |
|
| 38.0 | 266 | - | 0.2530 | |
|
| 38.64 | 270 | 0.0005 | - | |
|
| 39.0 | 273 | - | 0.2526 | |
|
| 40.0 | 280 | 0.0006 | 0.2539 | |
|
| 41.0 | 287 | - | 0.2529 | |
|
| 41.48 | 290 | 0.0005 | - | |
|
| 42.0 | 294 | - | 0.2545 | |
|
| 42.96 | 300 | 0.0006 | - | |
|
| 43.0 | 301 | - | 0.2534 | |
|
| 44.0 | 308 | - | 0.2536 | |
|
| 44.32 | 310 | 0.0004 | - | |
|
| 45.0 | 315 | - | 0.2521 | |
|
| 45.8 | 320 | 0.0005 | - | |
|
| 46.0 | 322 | - | 0.2532 | |
|
| 47.0 | 329 | - | 0.2519 | |
|
| 47.16 | 330 | 0.0005 | - | |
|
| 48.0 | 336 | - | 0.2525 | |
|
| 48.64 | 340 | 0.0004 | - | |
|
| 49.0 | 343 | - | 0.2535 | |
|
| 50.0 | 350 | 0.0005 | 0.2542 | |
|
| 51.0 | 357 | - | 0.2540 | |
|
| 51.48 | 360 | 0.0004 | - | |
|
| 52.0 | 364 | - | 0.2542 | |
|
| 52.96 | 370 | 0.0005 | - | |
|
| 53.0 | 371 | - | 0.2538 | |
|
| 54.0 | 378 | - | 0.2533 | |
|
| 54.32 | 380 | 0.0004 | - | |
|
| 55.0 | 385 | - | 0.2544 | |
|
| 55.8 | 390 | 0.0004 | - | |
|
| 56.0 | 392 | - | 0.2539 | |
|
| 57.0 | 399 | - | 0.2541 | |
|
| 57.16 | 400 | 0.0005 | - | |
|
| 58.0 | 406 | - | 0.2532 | |
|
| 58.64 | 410 | 0.0004 | - | |
|
| 59.0 | 413 | - | 0.2543 | |
|
| 60.0 | 420 | 0.0004 | 0.2532 | |
|
| 61.0 | 427 | - | 0.2541 | |
|
| 61.48 | 430 | 0.0004 | - | |
|
| 62.0 | 434 | - | 0.2542 | |
|
| 62.96 | 440 | 0.0005 | - | |
|
| 63.0 | 441 | - | 0.2546 | |
|
| 64.0 | 448 | - | 0.2549 | |
|
| 64.32 | 450 | 0.0003 | - | |
|
| **65.0** | **455** | **-** | **0.2557** | |
|
| 65.8 | 460 | 0.0004 | - | |
|
| 66.0 | 462 | - | 0.2557 | |
|
| 67.0 | 469 | - | 0.2539 | |
|
| 67.16 | 470 | 0.0004 | - | |
|
| 68.0 | 476 | - | 0.2538 | |
|
| 68.64 | 480 | 0.0004 | - | |
|
| 69.0 | 483 | - | 0.2538 | |
|
| 70.0 | 490 | 0.0004 | 0.2542 | |
|
| 71.0 | 497 | - | 0.2532 | |
|
| 71.48 | 500 | 0.0004 | - | |
|
| 72.0 | 504 | - | 0.2538 | |
|
| 72.96 | 510 | 0.0004 | - | |
|
| 73.0 | 511 | - | 0.2545 | |
|
| 74.0 | 518 | - | 0.2531 | |
|
| 74.32 | 520 | 0.0003 | - | |
|
| 75.0 | 525 | - | 0.2534 | |
|
| 75.8 | 530 | 0.0004 | - | |
|
| 76.0 | 532 | - | 0.2541 | |
|
| 77.0 | 539 | - | 0.2545 | |
|
| 77.16 | 540 | 0.0004 | - | |
|
| 78.0 | 546 | - | 0.2536 | |
|
| 78.64 | 550 | 0.0004 | - | |
|
| 79.0 | 553 | - | 0.2545 | |
|
| 80.0 | 560 | 0.0004 | 0.2540 | |
|
| 81.0 | 567 | - | 0.2545 | |
|
| 81.48 | 570 | 0.0004 | - | |
|
| 82.0 | 574 | - | 0.2541 | |
|
| 82.96 | 580 | 0.0004 | - | |
|
| 83.0 | 581 | - | 0.2545 | |
|
| 84.0 | 588 | - | 0.2538 | |
|
| 84.32 | 590 | 0.0004 | - | |
|
| 85.0 | 595 | - | 0.2546 | |
|
| 85.8 | 600 | 0.0004 | 0.2544 | |
|
|
|
* The bold row denotes the saved checkpoint. |
|
</details> |
|
|
|
### Framework Versions |
|
- Python: 3.12.8 |
|
- Sentence Transformers: 3.3.1 |
|
- Transformers: 4.47.1 |
|
- PyTorch: 2.5.1+cu124 |
|
- Accelerate: 1.2.1 |
|
- Datasets: 3.2.0 |
|
- Tokenizers: 0.21.0 |
|
|
|
## Citation |
|
|
|
### BibTeX |
|
|
|
#### Sentence Transformers |
|
```bibtex |
|
@inproceedings{reimers-2019-sentence-bert, |
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
month = "11", |
|
year = "2019", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://arxiv.org/abs/1908.10084", |
|
} |
|
``` |
|
|
|
#### MultipleNegativesRankingLoss |
|
```bibtex |
|
@misc{henderson2017efficient, |
|
title={Efficient Natural Language Response Suggestion for Smart Reply}, |
|
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil}, |
|
year={2017}, |
|
eprint={1705.00652}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL} |
|
} |
|
``` |
|
|
|
<!-- |
|
## Glossary |
|
|
|
*Clearly define terms in order to be accessible across audiences.* |
|
--> |
|
|
|
<!-- |
|
## Model Card Authors |
|
|
|
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.* |
|
--> |
|
|
|
<!-- |
|
## Model Card Contact |
|
|
|
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.* |
|
--> |