Add new SentenceTransformer model.

Files changed (11)

1_Pooling/config.json +10 -0
README.md +525 -0
config.json +57 -0
config_sentence_transformers.json +12 -0
model.safetensors +3 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +63 -0
vocab.txt +0 -0

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 768,
+  "pooling_mode_cls_token": true,
+  "pooling_mode_mean_tokens": false,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
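
The pooling config above enables only `pooling_mode_cls_token`, and the module list in this commit follows pooling with a `Normalize` step. The following is a minimal sketch of what that combination computes, assuming the token embeddings arrive as a plain list of vectors with the `[CLS]` token first; `cls_pool` and `l2_normalize` are illustrative helper names, not functions from the library.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector as the sentence vector."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: 3 tokens with 2 dimensions each (the real model uses 768 dimensions).
tokens = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
sentence_vec = l2_normalize(cls_pool(tokens))
print(sentence_vec)
```

Because every other pooling mode is `false`, the remaining token vectors do not contribute to the sentence embedding in this sketch.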

README.md ADDED

	@@ -0,0 +1,525 @@

+---
+base_model: Snowflake/snowflake-arctic-embed-m-long
+library_name: sentence-transformers
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:29547
+- loss:MultipleNegativesRankingLoss
+widget:
+- source_sentence: According to the Client Money Auditor's Report, how did the Authorised
+    Person manage Client Money—was it pooled in a single client Account or segregated
+    into individual Client Accounts as per COBS Chapter 14?
+  sentences:
+  - "The written notice in Rule ‎6.2.1(a)(i) must make it explicit that, if an Employee\
+    \ is prohibited from undertaking a Personal Account Transaction, he must not,\
+    \ except in the proper course of his employment:\n(a)\tprocure another Person\
+    \ to enter into such a Transaction; or\n(b)\tcommunicate any information or opinion\
+    \ to another Person if he knows, or ought to know, that the Person will as a result,\
+    \ enter into such a Transaction or procure some other Person to do so."
+  - "Client Money Auditor's Report:An Authorised Person must, in procuring the production\
+    \ of a Client Money Auditor's Report, ensure that an Auditor states, as at the\
+    \ date of which the Authorised Person's audited statement of financial position\
+    \ was prepared:\n(1)\tthe amount of Client Money an Authorised Person was holding\
+    \ and controlling in accordance with COBS Chapter 14; and\n(2)\twhether:\n(a)\t\
+    the Authorised Person has maintained throughout the year systems and controls\
+    \ to enable it to comply with the relevant provisions of COBS Chapter 14;\n(b)\t\
+    the Authorised Person's controls are such as to ensure that Client Money is identifiable\
+    \ and secure at all times;\n(c)\tany of the requirements in COBS Chapter 14 have\
+    \ not been met;\n(d)\tClient Money has been pooled in a single client Account\
+    \ or segregated in Client Accounts maintained for individual Clients in accordance\
+    \ with COBS Chapter 14;\n(e)\tif applicable, the Authorised Person as holding\
+    \ and controlling the appropriate amount of Client Money in accordance with COBS\
+    \ Chapter 14 as at the date on which the Authorised Person's audited statement\
+    \ of financial position was prepared;\n(f)\tthe Auditor has received all necessary\
+    \ information and explanations for the purposes of preparing the report to the\
+    \ Regulator; and\n(g)\tif applicable, there have been any material discrepancies\
+    \ in the reconciliation of Client Money."
+  - "CRS Options\n/Table Start\nNo.\tOPTIONS\tCOMMENTS\n1.\tAlternative approach to\
+    \ calculating account balances\tNO\n2.\tUse of other reporting period\tNO\n3.\t\
+    Filing deadlines\t30th June\n4.\tFiling Nil returns\tYES\n5.\tAllowing third party\
+    \ service providers to fulfil the obligations on behalf of\nthe Financial Institutions\t\
+    YES\n6.\tAllowing the due diligence procedures for New Accounts to be used for\n\
+    Pre-existing Accounts\tYES\n7.\tAllowing the due diligence procedures for High\
+    \ Value Accounts to be used\nfor Lower Value Accounts\tYES\n8.\tResidence address\
+    \ test for Lower Value Accounts\tYES\n9.\tExclusion from Due Diligence for Pre-existing\
+    \ Entity Accounts not exceeding $250,000\tYES\n10.\tAlternative documentation\
+    \ procedure for certain employer-sponsored\ngroup insurance contracts or annuity\
+    \ contracts\tYES\n11.\tAllowing Financial Institutions to make greater use of\
+    \ existing\nstandardised industry coding systems for the due diligence process\t\
+    YES\n12.\tCurrency translation\tUSE USD$\n\n13.\tAllow a Financial Institution\
+    \ to treat certain New Accounts held by pre-existing customers as a Pre-existing\
+    \ Account for due diligence purposes\tYES\n14.\tExpanded definition of Related\
+    \ Entity for Investment Entities\tYES\n15.\tGrandfathering rule for bearer shares\
+    \ issued by Exempt Collective\nInvestment Vehicle\tRemoved\n16.\tPhasing in the\
+    \ requirements to report gross proceeds\tNO\n/Table End\n\n"
+- source_sentence: What reporting and disclosure requirements are FinTech Participants
+    expected to comply with when operating within the ADGM RegLab?
+  sentences:
+  - 'INTRODUCTION
+    For more details on the requirements, and process, for making ensuring compliance
+    with the Continuous Disclosure framework, please contact the Listing Authority
+    at [email protected].
+    '
+  - An Authorised Person or Recognised Body must perform an internal Shari'a review
+    to assess the extent to which the Authorised Person or Recognised Body complies
+    with fatwa, rulings and guidelines issued by its Shari'a Supervisory Board.
+  - Similarly, in using a new or developing technology, such as those associated with
+    the Regulated Activity of Developing Financial Technology Services within the
+    RegLab or when undertaking NFTF business, a Relevant Person should pay specific
+    attention to assessing the potential for risks associated with Financial Crime
+    that might arise as a result of implementing that innovative technology. For example,
+    while the use of eKYC Systems may reduce the risk of impersonation fraud at customer
+    onboarding, NFTF interaction with the customer may increase the risk of Financial
+    Crime after a business relationship has been established, through transaction
+    fraud, money laundering or theft of digitally stored CDD documentation.
+- source_sentence: How does the ADGM expect an Authorised Person to document and demonstrate
+    adherence to the lines of authority and responsibility established by the Governing
+    Body for managing Liquidity Risk in compliance with Rule 9.2.2(2)(b)(b)?
+  sentences:
+  - An Authorised Person or a Recognised Body must ensure that its internal audit
+    function undertakes regular reviews and assessments of the effectiveness of the
+    Authorised Person or Recognised Body's money laundering policies, procedures,
+    systems and controls, and its compliance with its obligations in the AML Rulebook.
+  - "If a Fund intends to change its annual or interim accounting period, the Fund\
+    \ Manager must:\n(a)\tobtain written confirmation from its auditor that the change\
+    \ of its annual accounting period would not result in any significant distortion\
+    \ of the financial position of the Fund; and\n(b)\tobtain the Regulator's prior\
+    \ consent before implementing the change."
+  - "Guidance on risks to be covered as part of the IRAP. An Authorised Person should\
+    \ consider the following risks, where relevant, in its IRAP:\na.\tCredit Risk,\
+    \ including Large Exposures and concentration risks;\nb.\tMarket Risk;\nc.\tLiquidity\
+    \ Risk;\nd.\tfor Islamic Financial Business involving PSIAs, displaced commercial\
+    \ risk;\ne.\tinterest rate risk in the Non Trading Book;\nf.\tOperational Risk;\n\
+    g.\tinternal controls and systems; and\nh.\treputational risk."
+- source_sentence: If a Recognised Body receives a notification from the Regulator
+    regarding an application, which of the following actions would allow the Recognised
+    Body to avoid the application of section 268 of the Insolvency Regulations to
+    Market Contracts of a Member or designated non-Member?
+  sentences:
+  - "The procedure is that the Regulator must notify the Recognised Body of the application\
+    \ and unless the Recognised Body:\n(a)\ttakes action under its Default Rules;\n\
+    (b)\tnotifies the Regulator that it proposes to take action forthwith; or\n(c)\t\
+    is directed to take action by the Regulator,\nwithin three Business Days after\
+    \ receipt of that notice section 268 of the Insolvency Regulations will not apply\
+    \ in relation to Market Contracts to which the Member or designated non-Member\
+    \ is a party or to anything done by the Recognised Body for the purpose of, or\
+    \ in connection with, the settlement of Market Contracts."
+  - The Regulator shall have the power to designate a Regulated Activity or specified
+    category of Regulated Activity as not being in compliance with Shari'a in the
+    event that the Regulator believes that such Regulated Activity or specified category
+    of Regulated Activity involves matters that are contrary to the aims of Shari'a.
+  - "An Authorised Person and Recognised Body must:\n(a)\twhen it sends or receives\
+    \ a wire transfer on behalf of a customer, ensure that the wire transfer and any\
+    \ related messages contain accurate originator and beneficiary information;\n\
+    (b)\tensure that, while the wire transfer is under its control, the information\
+    \ in (a) remains with the wire transfer and any related message throughout the\
+    \ payment chain;\n(c)\tmonitor wire transfers for the purpose of detecting those\
+    \ wire transfers that do not contain both originator and beneficiary information\
+    \ and take appropriate measures to identify any money laundering risks; and\n\
+    (d)\tnot effect wire transfers without the information required under (3) and\
+    \ (4)."
+- source_sentence: How should a Relevant Person ensure and demonstrate compliance
+    with both UNSC Sanctions and U.A.E.-administered Sanctions, specifically Targeted
+    Financial Sanctions, within the ADGM jurisdiction?
+  sentences:
+  - 'REGULATORY REQUIREMENTS - SPOT COMMODITY ACTIVITIES
+    RIEs operating an MTF or OTF using Accepted Spot Commodities
+    Authorised Persons that are operating an MTF or OTF wishing to also operate a
+    RIE will be required to relinquish their FSP upon obtaining a Recognition Order
+    (to operate the RIE).  If licensed by the FSRA to carry out both Regulated Activities
+    (e.g., operating an MTF and operating an RIE), the Recognition Order will include
+    a stipulation to that effect pursuant to MIR Rule 3.4.1.
+    '
+  - "Where a Relevant Person seeks to rely on a Person in (1) it may only do so if\
+    \ and to the extent that:\n(a)\tit immediately obtains the necessary CDD information\
+    \ from the third party in (1);\n(b)\tit takes adequate steps to satisfy itself\
+    \ that certified copies of the documents used to undertake the relevant elements\
+    \ of CDD will be available from the third party on request without delay;\n(c)\t\
+    the Person in (1)(b) to (d) is subject to regulation, including AML/TFS compliance\
+    \ requirements, by a Non-ADGM Financial Services Regulator or other competent\
+    \ authority in a country with AML/TFS regulations which are equivalent to the\
+    \ standards set out in the FATF Recommendations and it is supervised for compliance\
+    \ with such regulations;\n(d)\tthe Person in (1) has not relied on any exception\
+    \ from the requirement to conduct any relevant elements of CDD which the Relevant\
+    \ Person seeks to rely on; and\n(e)\tin relation to (2), the information is up\
+    \ to date."
+  - "Financial Services Permissions. VC Managers operating in ADGM require a Financial\
+    \ Services Permission (“FSP”) to undertake any Regulated Activity pertaining to\
+    \ VC Funds and/or co-investments by third parties in VC Funds. The Regulated Activities\
+    \ covered by the FSP will be dependent on the VC Managers’ investment strategy\
+    \ and business model.\n(a)\tManaging a Collective Investment Fund: this includes\
+    \ carrying out fund management activities in respect of a VC Fund.\n(b)\tAdvising\
+    \ on Investments or Credit : for VC Managers these activities will be restricted\
+    \ to activities related to co-investment alongside a VC Fund which the VC Manager\
+    \ manages, such as recommending that a client invest in an investee company alongside\
+    \ the VC Fund and on the strategy and structure required to make the investment.\n\
+    (c)\tArranging Deals in Investments: VC Managers may also wish to make arrangements\
+    \ to facilitate co-investments in the investee company.\nAuthorisation fees and\
+    \ supervision fees for a VC Manager are capped at USD 10,000 regardless of whether\
+    \ one or both of the additional Regulated Activities in b) and c) above in relation\
+    \ to co-investments are included in its FSP. The FSP will include restrictions\
+    \ appropriate to the business model of a VC Manager."
+---
+# SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-long
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-m-long](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long) on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [Snowflake/snowflake-arctic-embed-m-long](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long) <!-- at revision 89d0f6ab196eead40b90cb6f9fefec01a908d2d1 -->
+- **Maximum Sequence Length:** 8192 tokens
+- **Output Dimensionality:** 768 dimensions
+- **Similarity Function:** Cosine Similarity
+- **Training Dataset:**
+    - csv
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel
+  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
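+# Download from the 🤗 Hub (config.json maps NomicBertModel to custom code via auto_map, so trust_remote_code=True is needed)
+model = SentenceTransformer("jebish7/snowflake-arctic-embed-m-long_MNR_1", trust_remote_code=True)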
+# Run inference
+sentences = [
+    'How should a Relevant Person ensure and demonstrate compliance with both UNSC Sanctions and U.A.E.-administered Sanctions, specifically Targeted Financial Sanctions, within the ADGM jurisdiction?',
+    'Where a Relevant Person seeks to rely on a Person in (1) it may only do so if and to the extent that:\n(a)\tit immediately obtains the necessary CDD information from the third party in (1);\n(b)\tit takes adequate steps to satisfy itself that certified copies of the documents used to undertake the relevant elements of CDD will be available from the third party on request without delay;\n(c)\tthe Person in (1)(b) to (d) is subject to regulation, including AML/TFS compliance requirements, by a Non-ADGM Financial Services Regulator or other competent authority in a country with AML/TFS regulations which are equivalent to the standards set out in the FATF Recommendations and it is supervised for compliance with such regulations;\n(d)\tthe Person in (1) has not relied on any exception from the requirement to conduct any relevant elements of CDD which the Relevant Person seeks to rely on; and\n(e)\tin relation to (2), the information is up to date.',
+    'REGULATORY REQUIREMENTS - SPOT COMMODITY ACTIVITIES\nRIEs operating an MTF or OTF using Accepted Spot Commodities\nAuthorised Persons that are operating an MTF or OTF wishing to also operate a RIE will be required to relinquish their FSP upon obtaining a Recognition Order (to operate the RIE).  If licensed by the FSRA to carry out both Regulated Activities (e.g., operating an MTF and operating an RIE), the Recognition Order will include a stipulation to that effect pursuant to MIR Rule 3.4.1.\n',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 768)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### csv
+* Dataset: csv
+* Size: 29,547 training samples
+* Columns: <code>Question</code> and <code>positive</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | Question                                                                           | positive                                                                              |
+  |:--------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                                |
+  | details | <ul><li>min: 18 tokens</li><li>mean: 34.91 tokens</li><li>max: 83 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 118.51 tokens</li><li>max: 1090 tokens</li></ul> |
+* Samples:
+  | Question                                                                                                                                                                                           | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>Under which circumstances is a Mining Reporting Entity exempt from immediate disclosure of material information about its mining activities according to the FSRA guidelines?</code>         | <code>INTERACTION OF CHAPTER 11 WITH OTHER RULE DISCLOSURE OBLIGATIONS. Prior to a Mining Reporting Entity having all the information available to it, the FSRA considers that whatever material information it may have about the mining activity will generally be insufficiently definite to warrant disclosure under the Rules.  Therefore, provided the material information is and remains confidential, and the FSRA has not formed the view that the information ceases to remain confidential (e.g., where there are exceptions from disclosing the information), the material information is not immediately required to be disclosed under Rule 7.2.1.  For more information, please refer to Chapter 7 of the Rules, and any relevant Guidance that the FSRA may publish from time in relation to the FSRA’s expectations as to how Reporting Entities are to comply with Chapter 7.<br><br></code> |
+  | <code>What specific IAASB standards or other standards acceptable to the Regulator are required for the audit of a Public Listed Company's financial statements?</code>                            | <code>Where an Authorised Person does not hold or control any Client Money as at the date on which the Authorised Person's audited statement of financial position was prepared, the Regulator expects that a nil balance be stated to comply with Rule ‎6.6.6.<br></code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+  | <code>How does the ADGM monitor compliance with the principles of effective dialogue with shareholders, and what are the consequences for companies that fail to establish such a dialogue?</code> | <code>Audit committee. The Board as a whole has responsibility for ensuring that a satisfactory dialogue with Shareholders takes place. Such dialogue should be based on the mutual understanding of objectives and provision of adequate information relating to the Reporting Entity including financial information, and how the business and affairs of the Reporting Entity are carried out.</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `per_device_train_batch_size`: 4
+- `learning_rate`: 2e-05
+- `num_train_epochs`: 1
+- `warmup_ratio`: 0.1
+- `batch_sampler`: no_duplicates
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: no
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 4
+- `per_device_eval_batch_size`: 8
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 2e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 1
+- `max_steps`: -1
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: False
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `dispatch_batches`: None
+- `split_batches`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+</details>
+### Training Logs
+| Epoch  | Step | Training Loss |
+|:------:|:----:|:-------------:|
+| 0.0271 | 100  | 0.6411        |
+| 0.0541 | 200  | 0.3289        |
+| 0.0812 | 300  | 0.2395        |
+| 0.1083 | 400  | 0.2711        |
+| 0.1354 | 500  | 0.2746        |
+| 0.1624 | 600  | 0.2602        |
+| 0.1895 | 700  | 0.285         |
+| 0.2166 | 800  | 0.2965        |
+| 0.2436 | 900  | 0.2772        |
+| 0.2707 | 1000 | 0.3043        |
+| 0.2978 | 1100 | 0.3059        |
+| 0.3249 | 1200 | 0.316         |
+| 0.3519 | 1300 | 0.2765        |
+| 0.3790 | 1400 | 0.249         |
+| 0.4061 | 1500 | 0.2601        |
+| 0.4331 | 1600 | 0.2538        |
+| 0.4602 | 1700 | 0.2443        |
+| 0.4873 | 1800 | 0.2151        |
+| 0.5143 | 1900 | 0.2335        |
+| 0.5414 | 2000 | 0.2611        |
+| 0.5685 | 2100 | 0.2557        |
+| 0.5956 | 2200 | 0.2793        |
+| 0.0694 | 100  | 0.2141        |
+| 0.1389 | 200  | 0.273         |
+| 0.2083 | 300  | 0.295         |
+| 0.2778 | 400  | 0.2079        |
+| 0.3472 | 500  | 0.2556        |
+| 0.4167 | 600  | 0.252         |
+| 0.4861 | 700  | 0.2142        |
+| 0.5556 | 800  | 0.2181        |
+| 0.625  | 900  | 0.2347        |
+| 0.6944 | 1000 | 0.1754        |
+| 0.7639 | 1100 | 0.2313        |
+| 0.8333 | 1200 | 0.2104        |
+| 0.9028 | 1300 | 0.2435        |
+| 0.9722 | 1400 | 0.2399        |
+### Framework Versions
+- Python: 3.10.14
+- Sentence Transformers: 3.1.1
+- Transformers: 4.45.2
+- PyTorch: 2.4.0
+- Accelerate: 0.34.2
+- Datasets: 3.0.1
+- Tokenizers: 0.20.0
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
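
The model card lists `MultipleNegativesRankingLoss` with `scale` 20.0 and `cos_sim` as the similarity function. As a rough sketch of the idea (not the library's implementation), each question is scored against every positive passage in the batch, the scores are multiplied by the scale, and a cross-entropy loss rewards the matching passage. The helper names and the toy 2-dimensional vectors below are assumptions for illustration only.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy batch of two (question, passage) pairs.
questions = [[1.0, 0.0], [0.0, 1.0]]
passages = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(questions, passages)          # each question matches its own passage
swapped = mnr_loss(questions, passages[::-1])    # each question matches the wrong passage
print(aligned, swapped)
```

With `per_device_train_batch_size` set to 4, each question in such a scheme would see three in-batch negatives per device, and the `no_duplicates` batch sampler helps avoid a passage appearing as both the positive and a negative.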

config.json ADDED

	@@ -0,0 +1,57 @@

+{
+  "_name_or_path": "Snowflake/snowflake-arctic-embed-m-long",
+  "activation_function": "swiglu",
+  "architectures": [
+    "NomicBertModel"
+  ],
+  "attn_pdrop": 0.0,
+  "auto_map": {
+    "AutoConfig": "Snowflake/snowflake-arctic-embed-m-long--configuration_hf_nomic_bert.NomicBertConfig",
+    "AutoModel": "Snowflake/snowflake-arctic-embed-m-long--modeling_hf_nomic_bert.NomicBertModel"
+  },
+  "bos_token_id": null,
+  "causal": false,
+  "dense_seq_output": true,
+  "embd_pdrop": 0.1,
+  "eos_token_id": null,
+  "fused_bias_fc": true,
+  "fused_dropout_add_ln": true,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-12,
+  "max_trained_positions": 2048,
+  "mlp_fc1_bias": false,
+  "mlp_fc2_bias": false,
+  "model_type": "nomic_bert",
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": 3072,
+  "n_layer": 12,
+  "n_positions": 8192,
+  "pad_vocab_size_multiple": 64,
+  "parallel_block": false,
+  "parallel_block_tied_norm": false,
+  "prenorm": false,
+  "qkv_proj_bias": false,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "rotary_emb_base": 1000,
+  "rotary_emb_fraction": 1.0,
+  "rotary_emb_interleaved": false,
+  "rotary_emb_scale_base": null,
+  "rotary_scaling_factor": 2,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.45.2",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "use_flash_attn": true,
+  "use_rms_norm": false,
+  "use_xentropy": true,
+  "vocab_size": 30528
+}

config_sentence_transformers.json ADDED

	@@ -0,0 +1,12 @@

+{
+  "__version__": {
+    "sentence_transformers": "3.1.1",
+    "transformers": "4.45.2",
+    "pytorch": "2.4.0"
+  },
+  "prompts": {
+    "query": "Represent this sentence for searching relevant passages: "
+  },
+  "default_prompt_name": null,
+  "similarity_fn_name": null
+}
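
Review note: the `prompts` entry above defines a single `query` prompt, and `default_prompt_name` is `null`, so no prompt is applied unless the caller names one. With sentence-transformers this is normally requested through `model.encode(texts, prompt_name="query")`. The snippet below is a minimal sketch of the string handling only, assuming the prompt is simply prepended to each input text; the `apply_prompt` helper and the sample query are hypothetical and not part of the library or this commit.

```python
import json

# Relevant fields copied from config_sentence_transformers.json in this commit.
CONFIG_TEXT = """
{
  "prompts": {
    "query": "Represent this sentence for searching relevant passages: "
  },
  "default_prompt_name": null
}
"""


def apply_prompt(texts, config, prompt_name=None):
    """Hypothetical helper: prepend the named prompt to every text.

    Falls back to `default_prompt_name`; if that is also None, the
    texts are returned unchanged.
    """
    name = prompt_name if prompt_name is not None else config.get("default_prompt_name")
    if name is None:
        return list(texts)
    prompt = config["prompts"][name]
    return [prompt + text for text in texts]


config = json.loads(CONFIG_TEXT)
queries = ["How must Client Money be held?"]  # made-up example query

with_prompt = apply_prompt(queries, config, prompt_name="query")
without_prompt = apply_prompt(queries, config)  # default is null, so unchanged
```

Under this reading, queries would be encoded with the `query` prompt and passages without any prompt, which matches the asymmetric retrieval setup the prompt wording suggests.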

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f10acc8253b625b8e8a407abcb816b2b1abce765c03f823621517db531265397
+size 546938168
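
Review note: the three lines above are a Git LFS pointer, not the weights themselves. They record the pointer spec version, the SHA-256 digest of the real file, and its size in bytes. The sketch below parses such a pointer and checks a downloaded file against it. The function names and the local path passed to `matches_pointer` are hypothetical. The parameter estimate assumes 4 bytes per value (`torch_dtype` is `float32` in `config.json`) and ignores the safetensors header, so it is only a rough upper bound.

```python
import hashlib

# Pointer contents copied from model.safetensors in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:f10acc8253b625b8e8a407abcb816b2b1abce765c03f823621517db531265397
size 546938168
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)

# Rough estimate only: float32 values, safetensors header not subtracted.
approx_params = pointer["size"] // 4
```

A call such as `matches_pointer("model.safetensors", pointer)` would then confirm that a local download is complete and uncorrupted.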

modules.json ADDED

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
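
Review note: `modules.json` chains three stages: the Transformer produces one vector per token, the Pooling stage reduces them to one vector per sentence (`1_Pooling/config.json` enables `pooling_mode_cls_token`, so the first token's vector is taken), and the Normalize stage scales that vector to unit L2 length. The following is a minimal pure-Python sketch of the last two stages, assuming the token embeddings are already available. The function names and the toy 2-dimensional vectors are illustrative only; the real model uses 768 dimensions.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


def embed(token_embeddings):
    """Pooling followed by Normalize, in the order listed in modules.json."""
    return l2_normalize(cls_pool(token_embeddings))


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


# Toy token embeddings: row 0 stands in for the [CLS] position.
toy_tokens = [[3.0, 4.0], [10.0, -2.0], [0.5, 0.5]]
sentence_vector = embed(toy_tokens)
self_similarity = dot(sentence_vector, sentence_vector)
```

Because the output vectors have unit length, a plain dot product between two embeddings is the same as their cosine similarity.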

sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 8192,
+  "do_lower_case": false
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,63 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "additional_special_tokens": [],
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_lower_case": true,
+  "mask_token": "[MASK]",
+  "max_length": 512,
+  "model_max_length": 8192,
+  "pad_to_multiple_of": null,
+  "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "[SEP]",
+  "stride": 0,
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "[UNK]"
+}
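
Review note: the tokenizer settings above give the special-token IDs (`[PAD]` = 0, `[CLS]` = 101, `[SEP]` = 102), right-side padding and truncation, and a `model_max_length` of 8192. The sketch below shows how a single sequence is typically framed under the BERT convention `[CLS] tokens [SEP]`. It is an illustration under those assumptions, not the `BertTokenizer` implementation; `build_inputs` and the sample word-piece IDs are hypothetical.

```python
# IDs taken from added_tokens_decoder in tokenizer_config.json.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102
MODEL_MAX_LENGTH = 8192


def build_inputs(token_ids, max_length=MODEL_MAX_LENGTH, pad_to=None):
    """Frame one sequence as [CLS] ... [SEP], truncating and padding on the right.

    Returns (input_ids, attention_mask); the mask is 0 over padding.
    """
    # Reserve two positions for [CLS] and [SEP], dropping tokens from the right.
    body = list(token_ids[: max_length - 2])
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and pad_to > len(input_ids):
        padding = pad_to - len(input_ids)
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding
    return input_ids, attention_mask


# Made-up word-piece IDs standing in for a tokenized sentence.
sample_ids = [7592, 2088, 2003]

ids, mask = build_inputs(sample_ids, pad_to=8)
short_ids, short_mask = build_inputs(list(range(1000, 1020)), max_length=6)
```

Since `[CLS]` always lands at position 0, it is the position whose final hidden state the CLS pooling stage reads.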

vocab.txt ADDED

The diff for this file is too large to render. See raw diff