Add new SentenceTransformer model

Files changed (11)

1_Pooling/config.json +10 -0
README.md +727 -0
config.json +32 -0
config_sentence_transformers.json +12 -0
model.safetensors +3 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +65 -0
vocab.txt +0 -0

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 1024,
+  "pooling_mode_cls_token": true,
+  "pooling_mode_mean_tokens": false,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
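
Note on the pooling config above: with only `pooling_mode_cls_token` enabled, the sentence embedding is the hidden state of the first (`[CLS]`) token, and the architecture listed in the README then applies a `Normalize` module. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the helper names `cls_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions for illustration (the real model uses 1024 dimensions).

```python
import math


def cls_pool(token_embeddings):
    """Return the first ([CLS]) token vector of every sequence in the batch."""
    return [sequence[0] for sequence in token_embeddings]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy batch: 2 sequences, 3 tokens each, 4 dimensions per token (hypothetical values).
batch = [
    [[3.0, 4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 0.0]],
    [[0.0, 0.0, 5.0, 12.0], [9.0, 9.0, 9.0, 9.0], [1.0, 0.0, 0.0, 0.0]],
]

# Only the first token of each sequence contributes to the sentence embedding.
sentence_embeddings = [l2_normalize(vector) for vector in cls_pool(batch)]
print(sentence_embeddings)
```

Because the outputs are unit-length, the dot product of two embeddings equals their cosine similarity, which matches the cosine similarity function declared in the model card.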

README.md ADDED

	@@ -0,0 +1,727 @@

+---
+language:
+- en
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:14737
+- loss:MultipleNegativesRankingLoss
+base_model: BAAI/bge-large-en-v1.5
+widget:
+- source_sentence: 'Represent this sentence for searching relevant passages: What
+    are some best practices for ensuring images in horizontal cards are visually appealing
+    despite being cropped to fit a square format?'
+  sentences:
+  - 'Tree view
+    Usage guidelines
+    Horizontal scrolling: If you have a layout that doesn''t allow for users to adjust
+    the width of the container for a tree view, allow them to horizontally scroll
+    in order to see the full depth of the hierarchy.
+    Do: Allow horizontal scrolling in a fixed layout.
+    '
+  - 'Cards
+    Options
+    Vertical or horizontal : Standard cards can be laid out vertically (components
+    are organized in a column) or horizontally (components are organized in a row).
+    Horizontal cards always have a square preview, and the image is cropped to fit
+    inside the square. These can only be laid out in a tile grid where every card
+    is the same size.'
+  - 'Alert dialog
+    Behaviors
+    Button group overflow: An alert dialog can have up to 3 buttons. When horizontal
+    space is limited, button groups stack vertically. They should appear in ascending
+    order based on importance, with the most critical action at the bottom.'
+- source_sentence: 'Represent this sentence for searching relevant passages: Are there
+    any guidelines for the timing and smoothness of the fading effect when hovering
+    over a segment in a donut chart?'
+  sentences:
+  - 'Color for data visualization
+    Usage guidelines
+    Categorical colors are not ordered. Use these for categorical scales. Do not use
+    these for ordinal, interval, or ratio scales.
+    Sequential colors are ordered. Use these for ordinal and interval scales. It’s
+    also acceptable to use these for ratio scales. Do not use these for categorical
+    scales.
+    Diverging colors are ordered. Use these for ordinal and ratio scales, especially
+    when there is a meaningful middle value. These may also be used for interval scales.
+    Do not use these for categorical scales.'
+  - 'Action group
+    Options
+    Density: Action groups come in 2 densities: regular and compact. The compact density
+    retains the same font and icon sizes, but has tighter spacing. The action buttons
+    also become connected for non-quiet action groups.'
+  - 'Donut chart
+    Behaviors
+    Hover: Hovering over a segment of a donut chart causes all other segments to fade
+    back from the view. A tooltip displays the segment name, percentage of total,
+    and metric value.'
+- source_sentence: 'Represent this sentence for searching relevant passages: Why is
+    it important to orient the legend to match the chart whenever possible?'
+  sentences:
+  - 'Breadcrumbs
+    Options
+    Multiline: The multiline variation places emphasis on the selected breadcrumb
+    item as a page title, helping a user to more clearly identify their current location.'
+  - 'Cards
+    Layout
+    Card width: Cards are laid out in either a fluid card grid or have fixed widths.
+    Most cards can be organized within a grid where the width of each card is fluid
+    depending on the nature of the grid. In rare cases where cards can’t be laid out
+    in a card grid, they’ll have a fixed width that is defined manually.'
+  - 'Legend
+    Options
+    Orientation: Legends can have horizontal or vertical orientation. Whenever possible,
+    orient the legend to match the chart.'
+- source_sentence: 'Represent this sentence for searching relevant passages: What
+    is the primary use case for radio buttons according to the Adobe Spectrum Design
+    Documentation?'
+  sentences:
+  - 'Radio group
+    Usage guidelines
+    Use radio buttons for mutually exclusive options: Radio buttons and [checkboxes](/page/checkbox)
+    are not interchangeable. Radio buttons are best used for selecting a single option
+    from a list of mutually exclusive options. Checkboxes are best used for selecting
+    multiple options at once (or no options).
+    '
+  - 'Additional resources: - [Human Interface Guidelines: iOS Tab Bars](https://developer.apple.com/design/human-interface-guidelines/ios/bars/tab-bars/)
+    - [Human Interface Guidelines: Accessibility](https://developer.apple.com/design/human-interface-guidelines/accessibility/overview/introduction/)
+    '
+  - 'Picker
+    Options
+    Label position: Labels can be placed  either on top or on the side. Top labels
+    are the default and are recommended because they work better with long copy, localization,
+    and responsive layouts. Side labels are most useful when vertical space is limited.'
+- source_sentence: 'Represent this sentence for searching relevant passages: How can
+    a designer balance the need for clear text links and the need for emphasized text
+    in a user interface?'
+  sentences:
+  - 'Meter
+    Options
+    Positive variant: The positive variant has a green fill to show the value. This
+    can be used to represent a positive semantic value, such as when there’s a lot
+    of space remaining.'
+  - 'Badge
+    Options
+    Size: Badges come in four different sizes: small, medium, large, and extra-large.
+    The small size is the default and most frequently used option. Use the other sizes
+    sparingly to create a hierarchy of importance on a page.'
+  - 'Typography
+    Usage guidelines
+    Don''t use underlines for adding emphasis: Underlines are reserved for text links
+    only. They should not be used as a way for adding emphasis to words.
+    '
+datasets:
+- JianLiao/spectrum-design-docs
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+metrics:
+- cosine_accuracy@1
+- cosine_accuracy@3
+- cosine_accuracy@5
+- cosine_accuracy@10
+- cosine_precision@1
+- cosine_precision@3
+- cosine_precision@5
+- cosine_precision@10
+- cosine_recall@1
+- cosine_recall@3
+- cosine_recall@5
+- cosine_recall@10
+- cosine_ndcg@10
+- cosine_mrr@10
+- cosine_map@100
+model-index:
+- name: SentenceTransformer based on BAAI/bge-large-en-v1.5
+  results:
+  - task:
+      type: information-retrieval
+      name: Information Retrieval
+    dataset:
+      name: sds
+      type: sds
+    metrics:
+    - type: cosine_accuracy@1
+      value: 0.007462686567164179
+      name: Cosine Accuracy@1
+    - type: cosine_accuracy@3
+      value: 0.015603799185888738
+      name: Cosine Accuracy@3
+    - type: cosine_accuracy@5
+      value: 0.04748982360922659
+      name: Cosine Accuracy@5
+    - type: cosine_accuracy@10
+      value: 0.7815468113975577
+      name: Cosine Accuracy@10
+    - type: cosine_precision@1
+      value: 0.007462686567164179
+      name: Cosine Precision@1
+    - type: cosine_precision@3
+      value: 0.005201266395296246
+      name: Cosine Precision@3
+    - type: cosine_precision@5
+      value: 0.009497964721845319
+      name: Cosine Precision@5
+    - type: cosine_precision@10
+      value: 0.07815468113975575
+      name: Cosine Precision@10
+    - type: cosine_recall@1
+      value: 0.007462686567164179
+      name: Cosine Recall@1
+    - type: cosine_recall@3
+      value: 0.015603799185888738
+      name: Cosine Recall@3
+    - type: cosine_recall@5
+      value: 0.04748982360922659
+      name: Cosine Recall@5
+    - type: cosine_recall@10
+      value: 0.7815468113975577
+      name: Cosine Recall@10
+    - type: cosine_ndcg@10
+      value: 0.25440066233238845
+      name: Cosine Ndcg@10
+    - type: cosine_mrr@10
+      value: 0.10778547737502948
+      name: Cosine Mrr@10
+    - type: cosine_map@100
+      value: 0.11639203259428242
+      name: Cosine Map@100
+---
+# SentenceTransformer based on BAAI/bge-large-en-v1.5
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) on the [spectrum-design-docs](https://huggingface.co/datasets/JianLiao/spectrum-design-docs) dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) <!-- at revision d4aa6901d3a41ba39fb536a557fa166f842b0e09 -->
+- **Maximum Sequence Length:** 512 tokens
+- **Output Dimensionality:** 1024 dimensions
+- **Similarity Function:** Cosine Similarity
+- **Training Dataset:**
+    - [spectrum-design-docs](https://huggingface.co/datasets/JianLiao/spectrum-design-docs)
+- **Language:** en
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+# Download from the 🤗 Hub
+model = SentenceTransformer("JianLiao/spectrum-doc-fine-tuned")
+# Run inference
+sentences = [
+    'Represent this sentence for searching relevant passages: How can a designer balance the need for clear text links and the need for emphasized text in a user interface?',
+    "Typography\nUsage guidelines\nDon't use underlines for adding emphasis: Underlines are reserved for text links only. They should not be used as a way for adding emphasis to words.\n\n",
+    'Meter\nOptions\nPositive variant: The positive variant has a green fill to show the value. This can be used to represent a positive semantic value, such as when there’s a lot of space remaining.',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 1024)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+## Evaluation
+### Metrics
+#### Information Retrieval
+* Dataset: `sds`
+* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.0075     |
+| cosine_accuracy@3   | 0.0156     |
+| cosine_accuracy@5   | 0.0475     |
+| cosine_accuracy@10  | 0.7815     |
+| cosine_precision@1  | 0.0075     |
+| cosine_precision@3  | 0.0052     |
+| cosine_precision@5  | 0.0095     |
+| cosine_precision@10 | 0.0782     |
+| cosine_recall@1     | 0.0075     |
+| cosine_recall@3     | 0.0156     |
+| cosine_recall@5     | 0.0475     |
+| cosine_recall@10    | 0.7815     |
+| **cosine_ndcg@10**  | **0.2544** |
+| cosine_mrr@10       | 0.1078     |
+| cosine_map@100      | 0.1164     |
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### spectrum-design-docs
+* Dataset: [spectrum-design-docs](https://huggingface.co/datasets/JianLiao/spectrum-design-docs) at [23f5565](https://huggingface.co/datasets/JianLiao/spectrum-design-docs/tree/23f5565f9fc1cfe31d1245ca9e5368f00fcaec00)
+* Size: 14,737 training samples
+* Columns: <code>anchor</code> and <code>positive</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | anchor                                                                             | positive                                                                            |
+  |:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                              |
+  | details | <ul><li>min: 20 tokens</li><li>mean: 30.87 tokens</li><li>max: 47 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 97.17 tokens</li><li>max: 512 tokens</li></ul> |
+* Samples:
+  | anchor                                                                                                                                                                                                               | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
+  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>Represent this sentence for searching relevant passages: Are there any specific guidelines or best practices provided by the Spectrum team for integrating Spectrum CSS into a new or existing project?</code> | <code>Spectrum CSS: An open source CSS-only implementation of Spectrum, maintained by the Spectrum team.  <br><div class="well-box">Dependency chain: Spectrum DNA → Spectrum CSS</div><br><br>[GitHub repository](https://github.com/adobe/spectrum-css/)  <br>[Website](https://opensource.adobe.com/spectrum-css/)  <br>[#spectrum_css](https://adobe.slack.com/archives/C5N154FEY)</code>                                                                                     |
+  | <code>Represent this sentence for searching relevant passages: How does the default setting for progress circles affect their behavior in a UI?</code>                                                               | <code>Progress circle<br>Options<br>Indeterminate: A progress circle can be either determinate or indeterminate. By default, progress circles are determinate. Use a determinate progress circle when progress can be calculated against a specific goal (e.g., downloading a file of a known size). Use an indeterminate progress circle when progress is happening but the time or effort to completion can’t be determined (e.g., attempting to reconnect to a server).</code> |
+  | <code>Represent this sentence for searching relevant passages: What tools or methods can designers use to test the effectiveness of wrapped legends in their designs?</code>                                         | <code>Legend<br>Behaviors<br>Wrapping: When there isn’t enough space, wrap legends to ensure that dimension values are shown.</code>                                                                                                                                                                                                                                                                                                                                              |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: epoch
+- `per_device_train_batch_size`: 22
+- `per_device_eval_batch_size`: 16
+- `gradient_accumulation_steps`: 16
+- `learning_rate`: 2e-05
+- `num_train_epochs`: 100
+- `lr_scheduler_type`: cosine
+- `warmup_ratio`: 0.1
+- `bf16`: True
+- `tf32`: True
+- `load_best_model_at_end`: True
+- `optim`: adamw_torch_fused
+- `prompts`: {'anchor': 'Represent this sentence for searching relevant passages: '}
+- `batch_sampler`: no_duplicates
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: epoch
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 22
+- `per_device_eval_batch_size`: 16
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 16
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 2e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 100
+- `max_steps`: -1
+- `lr_scheduler_type`: cosine
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: True
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: True
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: True
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: True
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch_fused
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `dispatch_batches`: None
+- `split_batches`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: {'anchor': 'Represent this sentence for searching relevant passages: '}
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+</details>
+### Training Logs
+<details><summary>Click to expand</summary>
+| Epoch    | Step    | Training Loss | sds_cosine_ndcg@10 |
+|:--------:|:-------:|:-------------:|:------------------:|
+| 1.0      | 7       | -             | 0.2255             |
+| 1.48     | 10      | 0.2646        | -                  |
+| 2.0      | 14      | -             | 0.2282             |
+| 2.96     | 20      | 0.1412        | -                  |
+| 3.0      | 21      | -             | 0.2358             |
+| 4.0      | 28      | -             | 0.2397             |
+| 4.32     | 30      | 0.0638        | -                  |
+| 5.0      | 35      | -             | 0.2430             |
+| 5.8      | 40      | 0.0425        | -                  |
+| 6.0      | 42      | -             | 0.2449             |
+| 7.0      | 49      | -             | 0.2462             |
+| 7.16     | 50      | 0.0237        | -                  |
+| 8.0      | 56      | -             | 0.2428             |
+| 8.64     | 60      | 0.015         | -                  |
+| 9.0      | 63      | -             | 0.2456             |
+| 10.0     | 70      | 0.0082        | 0.2456             |
+| 11.0     | 77      | -             | 0.2498             |
+| 11.48    | 80      | 0.0052        | -                  |
+| 12.0     | 84      | -             | 0.2474             |
+| 12.96    | 90      | 0.0035        | -                  |
+| 13.0     | 91      | -             | 0.2455             |
+| 14.0     | 98      | -             | 0.2475             |
+| 14.32    | 100     | 0.0022        | -                  |
+| 15.0     | 105     | -             | 0.2472             |
+| 15.8     | 110     | 0.002         | -                  |
+| 16.0     | 112     | -             | 0.2486             |
+| 17.0     | 119     | -             | 0.2506             |
+| 17.16    | 120     | 0.0015        | -                  |
+| 18.0     | 126     | -             | 0.2490             |
+| 18.64    | 130     | 0.0013        | -                  |
+| 19.0     | 133     | -             | 0.2489             |
+| 20.0     | 140     | 0.0012        | 0.2491             |
+| 21.0     | 147     | -             | 0.2493             |
+| 21.48    | 150     | 0.0011        | -                  |
+| 22.0     | 154     | -             | 0.2487             |
+| 22.96    | 160     | 0.001         | -                  |
+| 23.0     | 161     | -             | 0.2486             |
+| 24.0     | 168     | -             | 0.2490             |
+| 24.32    | 170     | 0.0008        | -                  |
+| 25.0     | 175     | -             | 0.2502             |
+| 25.8     | 180     | 0.0008        | -                  |
+| 26.0     | 182     | -             | 0.2505             |
+| 27.0     | 189     | -             | 0.2523             |
+| 27.16    | 190     | 0.0008        | -                  |
+| 28.0     | 196     | -             | 0.2516             |
+| 28.64    | 200     | 0.0007        | -                  |
+| 29.0     | 203     | -             | 0.2509             |
+| 30.0     | 210     | 0.0007        | 0.2522             |
+| 31.0     | 217     | -             | 0.2522             |
+| 31.48    | 220     | 0.0006        | -                  |
+| 32.0     | 224     | -             | 0.2534             |
+| 32.96    | 230     | 0.0007        | -                  |
+| 33.0     | 231     | -             | 0.2523             |
+| 34.0     | 238     | -             | 0.2524             |
+| 34.32    | 240     | 0.0006        | -                  |
+| 35.0     | 245     | -             | 0.2518             |
+| 35.8     | 250     | 0.0006        | -                  |
+| 36.0     | 252     | -             | 0.2529             |
+| 37.0     | 259     | -             | 0.2524             |
+| 37.16    | 260     | 0.0006        | -                  |
+| 38.0     | 266     | -             | 0.2530             |
+| 38.64    | 270     | 0.0005        | -                  |
+| 39.0     | 273     | -             | 0.2526             |
+| 40.0     | 280     | 0.0006        | 0.2539             |
+| 41.0     | 287     | -             | 0.2529             |
+| 41.48    | 290     | 0.0005        | -                  |
+| 42.0     | 294     | -             | 0.2545             |
+| 42.96    | 300     | 0.0006        | -                  |
+| 43.0     | 301     | -             | 0.2534             |
+| 44.0     | 308     | -             | 0.2536             |
+| 44.32    | 310     | 0.0004        | -                  |
+| 45.0     | 315     | -             | 0.2521             |
+| 45.8     | 320     | 0.0005        | -                  |
+| 46.0     | 322     | -             | 0.2532             |
+| 47.0     | 329     | -             | 0.2519             |
+| 47.16    | 330     | 0.0005        | -                  |
+| 48.0     | 336     | -             | 0.2525             |
+| 48.64    | 340     | 0.0004        | -                  |
+| 49.0     | 343     | -             | 0.2535             |
+| 50.0     | 350     | 0.0005        | 0.2542             |
+| 51.0     | 357     | -             | 0.2540             |
+| 51.48    | 360     | 0.0004        | -                  |
+| 52.0     | 364     | -             | 0.2542             |
+| 52.96    | 370     | 0.0005        | -                  |
+| 53.0     | 371     | -             | 0.2538             |
+| 54.0     | 378     | -             | 0.2533             |
+| 54.32    | 380     | 0.0004        | -                  |
+| 55.0     | 385     | -             | 0.2544             |
+| 55.8     | 390     | 0.0004        | -                  |
+| 56.0     | 392     | -             | 0.2539             |
+| 57.0     | 399     | -             | 0.2541             |
+| 57.16    | 400     | 0.0005        | -                  |
+| 58.0     | 406     | -             | 0.2532             |
+| 58.64    | 410     | 0.0004        | -                  |
+| 59.0     | 413     | -             | 0.2543             |
+| 60.0     | 420     | 0.0004        | 0.2532             |
+| 61.0     | 427     | -             | 0.2541             |
+| 61.48    | 430     | 0.0004        | -                  |
+| 62.0     | 434     | -             | 0.2542             |
+| 62.96    | 440     | 0.0005        | -                  |
+| 63.0     | 441     | -             | 0.2546             |
+| 64.0     | 448     | -             | 0.2549             |
+| 64.32    | 450     | 0.0003        | -                  |
+| **65.0** | **455** | **-**         | **0.2557**         |
+| 65.8     | 460     | 0.0004        | -                  |
+| 66.0     | 462     | -             | 0.2557             |
+| 67.0     | 469     | -             | 0.2539             |
+| 67.16    | 470     | 0.0004        | -                  |
+| 68.0     | 476     | -             | 0.2538             |
+| 68.64    | 480     | 0.0004        | -                  |
+| 69.0     | 483     | -             | 0.2538             |
+| 70.0     | 490     | 0.0004        | 0.2542             |
+| 71.0     | 497     | -             | 0.2532             |
+| 71.48    | 500     | 0.0004        | -                  |
+| 72.0     | 504     | -             | 0.2538             |
+| 72.96    | 510     | 0.0004        | -                  |
+| 73.0     | 511     | -             | 0.2545             |
+| 74.0     | 518     | -             | 0.2531             |
+| 74.32    | 520     | 0.0003        | -                  |
+| 75.0     | 525     | -             | 0.2534             |
+| 75.8     | 530     | 0.0004        | -                  |
+| 76.0     | 532     | -             | 0.2541             |
+| 77.0     | 539     | -             | 0.2545             |
+| 77.16    | 540     | 0.0004        | -                  |
+| 78.0     | 546     | -             | 0.2536             |
+| 78.64    | 550     | 0.0004        | -                  |
+| 79.0     | 553     | -             | 0.2545             |
+| 80.0     | 560     | 0.0004        | 0.2540             |
+| 81.0     | 567     | -             | 0.2545             |
+| 81.48    | 570     | 0.0004        | -                  |
+| 82.0     | 574     | -             | 0.2541             |
+| 82.96    | 580     | 0.0004        | -                  |
+| 83.0     | 581     | -             | 0.2545             |
+| 84.0     | 588     | -             | 0.2538             |
+| 84.32    | 590     | 0.0004        | -                  |
+| 85.0     | 595     | -             | 0.2546             |
+| 85.8     | 600     | 0.0004        | 0.2544             |
+* The bold row denotes the saved checkpoint.
+</details>
+### Framework Versions
+- Python: 3.12.8
+- Sentence Transformers: 3.3.1
+- Transformers: 4.47.1
+- PyTorch: 2.5.1+cu124
+- Accelerate: 1.2.1
+- Datasets: 3.2.0
+- Tokenizers: 0.21.0
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->

config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "_name_or_path": "./ft-v3.0.0",
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1024,
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.47.1",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}

config_sentence_transformers.json ADDED

	@@ -0,0 +1,12 @@

+{
+  "__version__": {
+    "sentence_transformers": "3.3.1",
+    "transformers": "4.47.1",
+    "pytorch": "2.5.1+cu124"
+  },
+  "prompts": {
+    "anchor": "Represent this sentence for searching relevant passages: "
+  },
+  "default_prompt_name": null,
+  "similarity_fn_name": "cosine"
+}
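Note on the prompt settings: because `default_prompt_name` is `null`, the query prefix is presumably not applied unless the caller asks for it. In Sentence Transformers that is done by passing `prompt_name="anchor"` to `encode`. The sketch below only mimics that lookup with the standard library so the effect of the config is visible; `apply_prompt` and the sample query are hypothetical, not part of the library or of this commit.

```python
import json

# Relevant fields of config_sentence_transformers.json as committed
# (the "__version__" block is omitted here).
CONFIG = json.loads("""
{
  "prompts": {
    "anchor": "Represent this sentence for searching relevant passages: "
  },
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
""")


def apply_prompt(text, prompt_name=None, config=CONFIG):
    """Hypothetical helper: prepend the named prompt, or the default one if set."""
    name = prompt_name if prompt_name is not None else config["default_prompt_name"]
    if name is None:
        return text  # no default prompt configured, so the text is left untouched
    return config["prompts"][name] + text


query = "How many buttons can an alert dialog have?"  # illustrative query
print(apply_prompt(query))                        # no prompt_name: text unchanged
print(apply_prompt(query, prompt_name="anchor"))  # prompt_name="anchor": text prefixed
```

If this reading is right, queries should be encoded with the `anchor` prompt and passages without it, matching the `source_sentence` format shown in the README widget examples.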

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e3e7b0abdb38881b87c3e79e4d866887ef4cc01a9ce4faccc124b1b2cedbaf3d
+size 1340612432
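Sanity check on the checkpoint size: the LFS pointer's `size` can be compared against a parameter count derived from `config.json`. The sketch below assumes the standard BERT parameter layout (embeddings, 24 encoder layers, and the pooler head) stored as `float32`. The layout is an assumption about what the file contains, not something the diff states.

```python
# Values copied from config.json in this commit.
hidden = 1024
layers = 24
intermediate = 4096
vocab = 30522
max_positions = 512
type_vocab = 2


def linear(n_in, n_out):
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden  # scale and shift vectors

# Word, position and token-type embeddings, followed by one LayerNorm.
embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm

encoder_layer = (
    4 * linear(hidden, hidden)        # query, key, value, attention output
    + layer_norm
    + linear(hidden, intermediate)    # feed-forward expansion
    + linear(intermediate, hidden)    # feed-forward projection
    + layer_norm
)

pooler = linear(hidden, hidden)       # assumption: the BERT pooler head is stored too

params = embeddings + layers * encoder_layer + pooler
tensor_bytes = params * 4             # float32 = 4 bytes per parameter

lfs_size = 1340612432                 # "size" field of the LFS pointer above
print(params)                         # 335141888
print(lfs_size - tensor_bytes)        # 44880 bytes not accounted for by weights
```

Under these assumptions the weights account for all but about 44 KB of the file, which is consistent with a small metadata header, so the upload looks like a full-precision BERT-large checkpoint rather than a quantized or truncated one.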

modules.json ADDED

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
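Note on the module order: `modules.json` chains Transformer, Pooling, and Normalize, and `1_Pooling/config.json` in this commit enables only `pooling_mode_cls_token`. The sketch below walks through the last two stages on toy vectors (4 dimensions instead of 1024) to show why the `cosine` similarity setting reduces to a dot product once the outputs are normalized. The helper names and numbers are made up for illustration, and this is not the library's implementation.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: keep the vector of the first token ([CLS])."""
    return token_embeddings[0]


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy stand-ins for the Transformer output: one row per token.
query_tokens = [[3.0, 0.0, 4.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
passage_tokens = [[0.0, 0.0, 5.0, 0.0], [-1.0, 2.0, -3.0, 4.0]]

q_raw, p_raw = cls_pool(query_tokens), cls_pool(passage_tokens)
q, p = l2_normalize(q_raw), l2_normalize(p_raw)

# After normalization the plain dot product matches cosine similarity
# of the unnormalized CLS vectors (up to floating-point error).
print(dot(q, p), cosine(q_raw, p_raw))
```

Only the first row of each matrix influences the result, which is the practical meaning of CLS pooling: the other token vectors are discarded before scoring.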

sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 512,
+  "do_lower_case": true
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render.

tokenizer_config.json ADDED

	@@ -0,0 +1,65 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
+  "do_lower_case": true,
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "max_length": 512,
+  "model_max_length": 512,
+  "never_split": null,
+  "pad_to_multiple_of": null,
+  "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "[SEP]",
+  "stride": 0,
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "[UNK]"
+}
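Note on sequence framing: the special-token IDs in `added_tokens_decoder`, together with `model_max_length: 512`, `padding_side: "right"`, and `truncation_side: "right"`, imply the input layout sketched below for a single sequence. `frame` is a hypothetical helper written for illustration and the word-piece IDs are made up. The real tokenizer only truncates or pads when asked to, so treat this as a picture of the resulting layout rather than of the tokenizer's code path.

```python
# Special-token IDs from added_tokens_decoder in tokenizer_config.json.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102
MODEL_MAX_LENGTH = 512


def frame(token_ids, pad_to=None, max_length=MODEL_MAX_LENGTH):
    """Hypothetical helper: wrap IDs as [CLS] ... [SEP], truncating and padding on the right."""
    body = token_ids[: max_length - 2]          # leave room for [CLS] and [SEP]
    ids = [CLS_ID] + body + [SEP_ID]
    mask = [1] * len(ids)
    if pad_to is not None and pad_to > len(ids):
        extra = pad_to - len(ids)
        ids += [PAD_ID] * extra                 # padding_side = "right"
        mask += [0] * extra                     # padded positions are masked out
    return ids, mask


short_ids, short_mask = frame([2000, 2001, 2002], pad_to=8)
long_ids, _ = frame(list(range(1000, 1600)))    # 600 made-up IDs, over the limit

print(short_ids)      # [101, 2000, 2001, 2002, 102, 0, 0, 0]
print(short_mask)     # [1, 1, 1, 1, 1, 0, 0, 0]
print(len(long_ids))  # 512
```

With right-side truncation, text beyond the first 510 word pieces does not reach the model, which matters for long passages given that the pooled embedding is taken from the `[CLS]` position.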

vocab.txt ADDED

The diff for this file is too large to render.