language:
  - en
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:14737
  - loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-large-en-v1.5
widget:
  - source_sentence: >-
      Represent this sentence for searching relevant passages: What are some
      best practices for ensuring images in horizontal cards are visually
      appealing despite being cropped to fit a square format?
    sentences:
      - >
        Tree view

        Usage guidelines

        Horizontal scrolling: If you have a layout that doesn't allow for users
        to adjust the width of the container for a tree view, allow them to
        horizontally scroll in order to see the full depth of the hierarchy.

        Do: Allow horizontal scrolling in a fixed layout.
      - >-
        Cards

        Options

        Vertical or horizontal : Standard cards can be laid out vertically
        (components are organized in a column) or horizontally (components are
        organized in a row).


        Horizontal cards always have a square preview, and the image is cropped
        to fit inside the square. These can only be laid out in a tile grid
        where every card is the same size.
      - >-
        Alert dialog

        Behaviors

        Button group overflow: An alert dialog can have up to 3 buttons. When
        horizontal space is limited, button groups stack vertically. They should
        appear in ascending order based on importance, with the most critical
        action at the bottom.
  - source_sentence: >-
      Represent this sentence for searching relevant passages: Are there any
      guidelines for the timing and smoothness of the fading effect when
      hovering over a segment in a donut chart?
    sentences:
      - >-
        Color for data visualization

        Usage guidelines

        Categorical colors are not ordered. Use these for categorical scales. Do
        not use these for ordinal, interval, or ratio scales.

        Sequential colors are ordered. Use these for ordinal and interval
        scales. It’s also acceptable to use these for ratio scales. Do not use
        these for categorical scales.

        Diverging colors are ordered. Use these for ordinal and ratio scales,
        especially when there is a meaningful middle value. These may also be
        used for interval scales. Do not use these for categorical scales.
      - >-
        Action group

        Options

        Density: Action groups come in 2 densities: regular and compact. The
        compact density retains the same font and icon sizes, but has tighter
        spacing. The action buttons also become connected for non-quiet action
        groups.
      - >-
        Donut chart

        Behaviors

        Hover: Hovering over a segment of a donut chart causes all other
        segments to fade back from the view. A tooltip displays the segment
        name, percentage of total, and metric value.
  - source_sentence: >-
      Represent this sentence for searching relevant passages: Why is it
      important to orient the legend to match the chart whenever possible?
    sentences:
      - >-
        Breadcrumbs

        Options

        Multiline: The multiline variation places emphasis on the selected
        breadcrumb item as a page title, helping a user to more clearly identify
        their current location.
      - >-
        Cards

        Layout

        Card width: Cards are laid out in either a fluid card grid or have fixed
        widths. Most cards can be organized within a grid where the width of
        each card is fluid depending on the nature of the grid. In rare cases
        where cards can’t be laid out in a card grid, they’ll have a fixed width
        that is defined manually.
      - >-
        Legend

        Options

        Orientation: Legends can have horizontal or vertical orientation.
        Whenever possible, orient the legend to match the chart.
  - source_sentence: >-
      Represent this sentence for searching relevant passages: What is the
      primary use case for radio buttons according to the Adobe Spectrum Design
      Documentation?
    sentences:
      - >+
        Radio group

        Usage guidelines

        Use radio buttons for mutually exclusive options: Radio buttons and
        [checkboxes](/page/checkbox) are not interchangeable. Radio buttons are
        best used for selecting a single option from a list of mutually
        exclusive options. Checkboxes are best used for selecting multiple
        options at once (or no options).

      - >
        Additional resources: - [Human Interface Guidelines: iOS Tab
        Bars](https://developer.apple.com/design/human-interface-guidelines/ios/bars/tab-bars/)

        - [Human Interface Guidelines:
        Accessibility](https://developer.apple.com/design/human-interface-guidelines/accessibility/overview/introduction/)
      - >-
        Picker

        Options

        Label position: Labels can be placed  either on top or on the side. Top
        labels are the default and are recommended because they work better with
        long copy, localization, and responsive layouts. Side labels are most
        useful when vertical space is limited.
  - source_sentence: >-
      Represent this sentence for searching relevant passages: How can a
      designer balance the need for clear text links and the need for emphasized
      text in a user interface?
    sentences:
      - >-
        Meter

        Options

        Positive variant: The positive variant has a green fill to show the
        value. This can be used to represent a positive semantic value, such as
        when there’s a lot of space remaining.
      - >-
        Badge

        Options

        Size: Badges come in four different sizes: small, medium, large, and
        extra-large. The small size is the default and most frequently used
        option. Use the other sizes sparingly to create a hierarchy of
        importance on a page.
      - >+
        Typography

        Usage guidelines

        Don't use underlines for adding emphasis: Underlines are reserved for
        text links only. They should not be used as a way for adding emphasis to
        words.

datasets:
  - JianLiao/spectrum-design-docs
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on BAAI/bge-large-en-v1.5
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: sds
          type: sds
        metrics:
          - type: cosine_accuracy@1
            value: 0.007462686567164179
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.015603799185888738
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.04748982360922659
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.7815468113975577
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.007462686567164179
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.005201266395296246
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.009497964721845319
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.07815468113975575
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.007462686567164179
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.015603799185888738
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.04748982360922659
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.7815468113975577
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.25440066233238845
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.10778547737502948
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.11639203259428242
            name: Cosine Map@100
license: mit

SentenceTransformer based on BAAI/bge-large-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-large-en-v1.5 on the spectrum-design-docs dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-large-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: spectrum-design-docs
  • Language: en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
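
The last two stages of the architecture above are CLS pooling followed by L2 normalization. The following is a minimal pure-Python sketch of those two steps, using toy token vectors in place of real BertModel output; the function names `cls_pool` and `l2_normalize` are hypothetical and not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the first ([CLS]) token vector of each sequence."""
    return [sequence[0] for sequence in token_embeddings]

def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy batch: 2 sequences, 3 tokens each, 4 dimensions instead of 1024.
toy_batch = [
    [[3.0, 4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5]],
    [[0.0, 0.0, 6.0, 8.0], [2.0, 2.0, 2.0, 2.0], [9.0, 9.0, 9.0, 9.0]],
]
sentence_embeddings = [l2_normalize(v) for v in cls_pool(toy_batch)]
print(sentence_embeddings)
```

Because the output vectors have unit length, cosine similarity between two embeddings reduces to their dot product.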

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("JianLiao/spectrum-doc-fine-tuned")
# Run inference
sentences = [
    'Represent this sentence for searching relevant passages: How can a designer balance the need for clear text links and the need for emphasized text in a user interface?',
    "Typography\nUsage guidelines\nDon't use underlines for adding emphasis: Underlines are reserved for text links only. They should not be used as a way for adding emphasis to words.\n\n",
    'Meter\nOptions\nPositive variant: The positive variant has a green fill to show the value. This can be used to represent a positive semantic value, such as when there’s a lot of space remaining.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
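
For retrieval, the training configuration applies the prompt only to the anchor (query) column, so a reasonable reading is that queries carry the prefix and passages do not. The sketch below illustrates that pattern and a simple ranking step. The helpers `with_query_prompt` and `rank_passages` are hypothetical, and toy unit vectors stand in for real embeddings so the example runs without downloading the model.

```python
QUERY_PROMPT = "Represent this sentence for searching relevant passages: "

def with_query_prompt(question):
    """Prepend the training-time prompt to a raw question (queries only)."""
    return QUERY_PROMPT + question

def rank_passages(query_embedding, passage_embeddings):
    """Return passage indices sorted by descending dot product, plus scores.

    For unit-length vectors the dot product equals cosine similarity.
    """
    scores = [
        sum(q * p for q, p in zip(query_embedding, passage))
        for passage in passage_embeddings
    ]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# With the real model, the embeddings would come from something like:
#   query_embedding = model.encode(with_query_prompt("..."))
#   passage_embeddings = model.encode(passages)
# Toy unit vectors are used here instead (an assumption for illustration).
query_embedding = [1.0, 0.0]
passage_embeddings = [[0.0, 1.0], [0.6, 0.8], [0.8, 0.6]]
order, scores = rank_passages(query_embedding, passage_embeddings)
print(order)
```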

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.0075
cosine_accuracy@3 0.0156
cosine_accuracy@5 0.0475
cosine_accuracy@10 0.7815
cosine_precision@1 0.0075
cosine_precision@3 0.0052
cosine_precision@5 0.0095
cosine_precision@10 0.0782
cosine_recall@1 0.0075
cosine_recall@3 0.0156
cosine_recall@5 0.0475
cosine_recall@10 0.7815
cosine_ndcg@10 0.2544
cosine_mrr@10 0.1078
cosine_map@100 0.1164
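
The values above are consistent with each query having exactly one relevant passage: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.7815 / 10 ≈ 0.0782). Under that assumption, the metrics can be sketched as follows. This is an illustrative pure-Python version with hypothetical function names and toy data, not the evaluator used to produce the table.

```python
import math

def rank_of_relevant(ranked_ids, relevant_id, k):
    """1-based rank of the relevant id within the top k, or None if absent."""
    top_k = ranked_ids[:k]
    return top_k.index(relevant_id) + 1 if relevant_id in top_k else None

def retrieval_metrics(rankings, relevant_ids, k):
    """Average accuracy@k, precision@k, recall@k, MRR@k and NDCG@k."""
    hits, reciprocal_ranks, ndcgs = 0, 0.0, 0.0
    for ranked_ids, relevant_id in zip(rankings, relevant_ids):
        rank = rank_of_relevant(ranked_ids, relevant_id, k)
        if rank is not None:
            hits += 1
            reciprocal_ranks += 1.0 / rank
            # Ideal DCG is 1 for a single relevant item, so NDCG equals DCG.
            ndcgs += 1.0 / math.log2(rank + 1)
    n = len(rankings)
    accuracy = hits / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,
        "recall": accuracy,
        "mrr": reciprocal_ranks / n,
        "ndcg": ndcgs / n,
    }

# Toy data: three queries, relevant passage at rank 1, at rank 3, and missing.
rankings = [["a", "b", "c"], ["x", "y", "b"], ["p", "q", "r"]]
relevant_ids = ["a", "b", "z"]
metrics = retrieval_metrics(rankings, relevant_ids, k=3)
print(metrics)
```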

Training Details

Training Dataset

spectrum-design-docs

  • Dataset: spectrum-design-docs at 23f5565
  • Size: 14,737 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min tokens  mean tokens  max tokens
    anchor    string  20          30.87        47
    positive  string  18          97.17        512
  • Samples:
    anchor: Represent this sentence for searching relevant passages: Are there any specific guidelines or best practices provided by the Spectrum team for integrating Spectrum CSS into a new or existing project?
    positive: Spectrum CSS: An open source CSS-only implementation of Spectrum, maintained by the Spectrum team.
    Dependency chain: Spectrum DNA → Spectrum CSS
    GitHub repository
    Website
    #spectrum_css

    anchor: Represent this sentence for searching relevant passages: How does the default setting for progress circles affect their behavior in a UI?
    positive: Progress circle
    Options
    Indeterminate: A progress circle can be either determinate or indeterminate. By default, progress circles are determinate. Use a determinate progress circle when progress can be calculated against a specific goal (e.g., downloading a file of a known size). Use an indeterminate progress circle when progress is happening but the time or effort to completion can’t be determined (e.g., attempting to reconnect to a server).

    anchor: Represent this sentence for searching relevant passages: What tools or methods can designers use to test the effectiveness of wrapped legends in their designs?
    positive: Legend
    Behaviors
    Wrapping: When there isn’t enough space, wrap legends to ensure that dimension values are shown.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
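
MultipleNegativesRankingLoss treats each anchor's own positive as the correct answer and the other positives in the batch as negatives, then applies cross-entropy to the scaled similarities. The sketch below illustrates that idea in pure Python with the scale of 20.0 and cosine similarity from the parameters above. It is a simplified illustration on toy 2-D vectors, not the library's implementation, and the names `cos_sim` and `mnr_loss` are chosen here for the example.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where anchor i should select positive i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(anchors)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]   # positive i is close to anchor i
swapped = [[0.1, 1.0], [1.0, 0.1]]   # positive i is close to the other anchor
loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

With per_device_train_batch_size set to 22, each anchor in this formulation is scored against 22 candidates per forward pass: its own positive and 21 in-batch negatives. Gradient accumulation over 16 steps raises the samples per optimizer update to 352 per device but, under this reading of the loss, does not add negatives.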
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 22
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 100
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • prompts: {'anchor': 'Represent this sentence for searching relevant passages: '}
  • batch_sampler: no_duplicates
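
The combination of warmup_ratio 0.1 and a cosine scheduler can be pictured as a linear ramp to the peak learning rate followed by cosine decay toward zero. The sketch below shows that shape in pure Python. The total of 700 optimizer steps is an assumption (7 steps per epoch, as the training logs suggest, times 100 epochs), and `lr_at_step` is a hypothetical helper rather than a library function.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate under linear warmup followed by cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 700  # assumption: 7 steps per epoch x 100 epochs
schedule = {step: lr_at_step(step, TOTAL_STEPS) for step in (0, 35, 70, 385, 700)}
for step, lr in schedule.items():
    print(step, lr)
```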

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 22
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 100
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: {'anchor': 'Represent this sentence for searching relevant passages: '}
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss sds_cosine_ndcg@10
1.0 7 - 0.2255
1.48 10 0.2646 -
2.0 14 - 0.2282
2.96 20 0.1412 -
3.0 21 - 0.2358
4.0 28 - 0.2397
4.32 30 0.0638 -
5.0 35 - 0.2430
5.8 40 0.0425 -
6.0 42 - 0.2449
7.0 49 - 0.2462
7.16 50 0.0237 -
8.0 56 - 0.2428
8.64 60 0.015 -
9.0 63 - 0.2456
10.0 70 0.0082 0.2456
11.0 77 - 0.2498
11.48 80 0.0052 -
12.0 84 - 0.2474
12.96 90 0.0035 -
13.0 91 - 0.2455
14.0 98 - 0.2475
14.32 100 0.0022 -
15.0 105 - 0.2472
15.8 110 0.002 -
16.0 112 - 0.2486
17.0 119 - 0.2506
17.16 120 0.0015 -
18.0 126 - 0.2490
18.64 130 0.0013 -
19.0 133 - 0.2489
20.0 140 0.0012 0.2491
21.0 147 - 0.2493
21.48 150 0.0011 -
22.0 154 - 0.2487
22.96 160 0.001 -
23.0 161 - 0.2486
24.0 168 - 0.2490
24.32 170 0.0008 -
25.0 175 - 0.2502
25.8 180 0.0008 -
26.0 182 - 0.2505
27.0 189 - 0.2523
27.16 190 0.0008 -
28.0 196 - 0.2516
28.64 200 0.0007 -
29.0 203 - 0.2509
30.0 210 0.0007 0.2522
31.0 217 - 0.2522
31.48 220 0.0006 -
32.0 224 - 0.2534
32.96 230 0.0007 -
33.0 231 - 0.2523
34.0 238 - 0.2524
34.32 240 0.0006 -
35.0 245 - 0.2518
35.8 250 0.0006 -
36.0 252 - 0.2529
37.0 259 - 0.2524
37.16 260 0.0006 -
38.0 266 - 0.2530
38.64 270 0.0005 -
39.0 273 - 0.2526
40.0 280 0.0006 0.2539
41.0 287 - 0.2529
41.48 290 0.0005 -
42.0 294 - 0.2545
42.96 300 0.0006 -
43.0 301 - 0.2534
44.0 308 - 0.2536
44.32 310 0.0004 -
45.0 315 - 0.2521
45.8 320 0.0005 -
46.0 322 - 0.2532
47.0 329 - 0.2519
47.16 330 0.0005 -
48.0 336 - 0.2525
48.64 340 0.0004 -
49.0 343 - 0.2535
50.0 350 0.0005 0.2542
51.0 357 - 0.2540
51.48 360 0.0004 -
52.0 364 - 0.2542
52.96 370 0.0005 -
53.0 371 - 0.2538
54.0 378 - 0.2533
54.32 380 0.0004 -
55.0 385 - 0.2544
55.8 390 0.0004 -
56.0 392 - 0.2539
57.0 399 - 0.2541
57.16 400 0.0005 -
58.0 406 - 0.2532
58.64 410 0.0004 -
59.0 413 - 0.2543
60.0 420 0.0004 0.2532
61.0 427 - 0.2541
61.48 430 0.0004 -
62.0 434 - 0.2542
62.96 440 0.0005 -
63.0 441 - 0.2546
64.0 448 - 0.2549
64.32 450 0.0003 -
65.0 455 - 0.2557
65.8 460 0.0004 -
66.0 462 - 0.2557
67.0 469 - 0.2539
67.16 470 0.0004 -
68.0 476 - 0.2538
68.64 480 0.0004 -
69.0 483 - 0.2538
70.0 490 0.0004 0.2542
71.0 497 - 0.2532
71.48 500 0.0004 -
72.0 504 - 0.2538
72.96 510 0.0004 -
73.0 511 - 0.2545
74.0 518 - 0.2531
74.32 520 0.0003 -
75.0 525 - 0.2534
75.8 530 0.0004 -
76.0 532 - 0.2541
77.0 539 - 0.2545
77.16 540 0.0004 -
78.0 546 - 0.2536
78.64 550 0.0004 -
79.0 553 - 0.2545
80.0 560 0.0004 0.2540
81.0 567 - 0.2545
81.48 570 0.0004 -
82.0 574 - 0.2541
82.96 580 0.0004 -
83.0 581 - 0.2545
84.0 588 - 0.2538
84.32 590 0.0004 -
85.0 595 - 0.2546
85.8 600 0.0004 0.2544
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.8
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}