Commit ac74b03 by jebish7 · verified · 1 Parent(s): 7a05f96

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
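As a rough illustration of what this pooling config selects, the sketch below applies CLS-token pooling to a toy matrix of token embeddings and then L2-normalizes the result, mirroring the Pooling and Normalize modules listed in the README's architecture summary. It is plain Python with made-up 4-dimensional vectors (the real model uses 768), and `cls_pool` and `l2_normalize` are hypothetical helper names, not library functions.

```python
import math


def cls_pool(token_embeddings):
    """Return the vector at the first position (the [CLS] token)."""
    return list(token_embeddings[0])


def l2_normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


# Toy input: 3 tokens x 4 dimensions (illustrative values only).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] position
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 2.0, 0.0],
]

sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)
```

Because the output is unit length, the cosine similarity between two such embeddings reduces to their dot product.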
README.md ADDED
@@ -0,0 +1,525 @@
1
+ ---
2
+ base_model: Snowflake/snowflake-arctic-embed-m-long
3
+ library_name: sentence-transformers
4
+ pipeline_tag: sentence-similarity
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:29547
11
+ - loss:MultipleNegativesRankingLoss
12
+ widget:
13
+ - source_sentence: According to the Client Money Auditor's Report, how did the Authorised
14
+ Person manage Client Money—was it pooled in a single client Account or segregated
15
+ into individual Client Accounts as per COBS Chapter 14?
16
+ sentences:
17
+ - "The written notice in Rule ‎6.2.1(a)(i) must make it explicit that, if an Employee\
18
+ \ is prohibited from undertaking a Personal Account Transaction, he must not,\
19
+ \ except in the proper course of his employment:\n(a)\tprocure another Person\
20
+ \ to enter into such a Transaction; or\n(b)\tcommunicate any information or opinion\
21
+ \ to another Person if he knows, or ought to know, that the Person will as a result,\
22
+ \ enter into such a Transaction or procure some other Person to do so."
23
+ - "Client Money Auditor's Report:An Authorised Person must, in procuring the production\
24
+ \ of a Client Money Auditor's Report, ensure that an Auditor states, as at the\
25
+ \ date of which the Authorised Person's audited statement of financial position\
26
+ \ was prepared:\n(1)\tthe amount of Client Money an Authorised Person was holding\
27
+ \ and controlling in accordance with COBS Chapter 14; and\n(2)\twhether:\n(a)\t\
28
+ the Authorised Person has maintained throughout the year systems and controls\
29
+ \ to enable it to comply with the relevant provisions of COBS Chapter 14;\n(b)\t\
30
+ the Authorised Person's controls are such as to ensure that Client Money is identifiable\
31
+ \ and secure at all times;\n(c)\tany of the requirements in COBS Chapter 14 have\
32
+ \ not been met;\n(d)\tClient Money has been pooled in a single client Account\
33
+ \ or segregated in Client Accounts maintained for individual Clients in accordance\
34
+ \ with COBS Chapter 14;\n(e)\tif applicable, the Authorised Person as holding\
35
+ \ and controlling the appropriate amount of Client Money in accordance with COBS\
36
+ \ Chapter 14 as at the date on which the Authorised Person's audited statement\
37
+ \ of financial position was prepared;\n(f)\tthe Auditor has received all necessary\
38
+ \ information and explanations for the purposes of preparing the report to the\
39
+ \ Regulator; and\n(g)\tif applicable, there have been any material discrepancies\
40
+ \ in the reconciliation of Client Money."
41
+ - "CRS Options\n/Table Start\nNo.\tOPTIONS\tCOMMENTS\n1.\tAlternative approach to\
42
+ \ calculating account balances\tNO\n2.\tUse of other reporting period\tNO\n3.\t\
43
+ Filing deadlines\t30th June\n4.\tFiling Nil returns\tYES\n5.\tAllowing third party\
44
+ \ service providers to fulfil the obligations on behalf of\nthe Financial Institutions\t\
45
+ YES\n6.\tAllowing the due diligence procedures for New Accounts to be used for\n\
46
+ Pre-existing Accounts\tYES\n7.\tAllowing the due diligence procedures for High\
47
+ \ Value Accounts to be used\nfor Lower Value Accounts\tYES\n8.\tResidence address\
48
+ \ test for Lower Value Accounts\tYES\n9.\tExclusion from Due Diligence for Pre-existing\
49
+ \ Entity Accounts not exceeding $250,000\tYES\n10.\tAlternative documentation\
50
+ \ procedure for certain employer-sponsored\ngroup insurance contracts or annuity\
51
+ \ contracts\tYES\n11.\tAllowing Financial Institutions to make greater use of\
52
+ \ existing\nstandardised industry coding systems for the due diligence process\t\
53
+ YES\n12.\tCurrency translation\tUSE USD$\n\n13.\tAllow a Financial Institution\
54
+ \ to treat certain New Accounts held by pre-existing customers as a Pre-existing\
55
+ \ Account for due diligence purposes\tYES\n14.\tExpanded definition of Related\
56
+ \ Entity for Investment Entities\tYES\n15.\tGrandfathering rule for bearer shares\
57
+ \ issued by Exempt Collective\nInvestment Vehicle\tRemoved\n16.\tPhasing in the\
58
+ \ requirements to report gross proceeds\tNO\n/Table End\n\n"
59
+ - source_sentence: What reporting and disclosure requirements are FinTech Participants
60
+ expected to comply with when operating within the ADGM RegLab?
61
+ sentences:
62
+ - 'INTRODUCTION
63
+
64
+ For more details on the requirements, and process, for making ensuring compliance
65
+ with the Continuous Disclosure framework, please contact the Listing Authority
66
67
+
68
+
69
+ '
70
+ - An Authorised Person or Recognised Body must perform an internal Shari'a review
71
+ to assess the extent to which the Authorised Person or Recognised Body complies
72
+ with fatwa, rulings and guidelines issued by its Shari'a Supervisory Board.
73
+ - Similarly, in using a new or developing technology, such as those associated with
74
+ the Regulated Activity of Developing Financial Technology Services within the
75
+ RegLab or when undertaking NFTF business, a Relevant Person should pay specific
76
+ attention to assessing the potential for risks associated with Financial Crime
77
+ that might arise as a result of implementing that innovative technology. For example,
78
+ while the use of eKYC Systems may reduce the risk of impersonation fraud at customer
79
+ onboarding, NFTF interaction with the customer may increase the risk of Financial
80
+ Crime after a business relationship has been established, through transaction
81
+ fraud, money laundering or theft of digitally stored CDD documentation.
82
+ - source_sentence: How does the ADGM expect an Authorised Person to document and demonstrate
83
+ adherence to the lines of authority and responsibility established by the Governing
84
+ Body for managing Liquidity Risk in compliance with Rule 9.2.2(2)(b)(b)?
85
+ sentences:
86
+ - An Authorised Person or a Recognised Body must ensure that its internal audit
87
+ function undertakes regular reviews and assessments of the effectiveness of the
88
+ Authorised Person or Recognised Body's money laundering policies, procedures,
89
+ systems and controls, and its compliance with its obligations in the AML Rulebook.
90
+ - "If a Fund intends to change its annual or interim accounting period, the Fund\
91
+ \ Manager must:\n(a)\tobtain written confirmation from its auditor that the change\
92
+ \ of its annual accounting period would not result in any significant distortion\
93
+ \ of the financial position of the Fund; and\n(b)\tobtain the Regulator's prior\
94
+ \ consent before implementing the change."
95
+ - "Guidance on risks to be covered as part of the IRAP. An Authorised Person should\
96
+ \ consider the following risks, where relevant, in its IRAP:\na.\tCredit Risk,\
97
+ \ including Large Exposures and concentration risks;\nb.\tMarket Risk;\nc.\tLiquidity\
98
+ \ Risk;\nd.\tfor Islamic Financial Business involving PSIAs, displaced commercial\
99
+ \ risk;\ne.\tinterest rate risk in the Non Trading Book;\nf.\tOperational Risk;\n\
100
+ g.\tinternal controls and systems; and\nh.\treputational risk."
101
+ - source_sentence: If a Recognised Body receives a notification from the Regulator
102
+ regarding an application, which of the following actions would allow the Recognised
103
+ Body to avoid the application of section 268 of the Insolvency Regulations to
104
+ Market Contracts of a Member or designated non-Member?
105
+ sentences:
106
+ - "The procedure is that the Regulator must notify the Recognised Body of the application\
107
+ \ and unless the Recognised Body:\n(a)\ttakes action under its Default Rules;\n\
108
+ (b)\tnotifies the Regulator that it proposes to take action forthwith; or\n(c)\t\
109
+ is directed to take action by the Regulator,\nwithin three Business Days after\
110
+ \ receipt of that notice section 268 of the Insolvency Regulations will not apply\
111
+ \ in relation to Market Contracts to which the Member or designated non-Member\
112
+ \ is a party or to anything done by the Recognised Body for the purpose of, or\
113
+ \ in connection with, the settlement of Market Contracts."
114
+ - The Regulator shall have the power to designate a Regulated Activity or specified
115
+ category of Regulated Activity as not being in compliance with Shari'a in the
116
+ event that the Regulator believes that such Regulated Activity or specified category
117
+ of Regulated Activity involves matters that are contrary to the aims of Shari'a.
118
+ - "An Authorised Person and Recognised Body must:\n(a)\twhen it sends or receives\
119
+ \ a wire transfer on behalf of a customer, ensure that the wire transfer and any\
120
+ \ related messages contain accurate originator and beneficiary information;\n\
121
+ (b)\tensure that, while the wire transfer is under its control, the information\
122
+ \ in (a) remains with the wire transfer and any related message throughout the\
123
+ \ payment chain;\n(c)\tmonitor wire transfers for the purpose of detecting those\
124
+ \ wire transfers that do not contain both originator and beneficiary information\
125
+ \ and take appropriate measures to identify any money laundering risks; and\n\
126
+ (d)\tnot effect wire transfers without the information required under (3) and\
127
+ \ (4)."
128
+ - source_sentence: How should a Relevant Person ensure and demonstrate compliance
129
+ with both UNSC Sanctions and U.A.E.-administered Sanctions, specifically Targeted
130
+ Financial Sanctions, within the ADGM jurisdiction?
131
+ sentences:
132
+ - 'REGULATORY REQUIREMENTS - SPOT COMMODITY ACTIVITIES
133
+
134
+ RIEs operating an MTF or OTF using Accepted Spot Commodities
135
+
136
+ Authorised Persons that are operating an MTF or OTF wishing to also operate a
137
+ RIE will be required to relinquish their FSP upon obtaining a Recognition Order
138
+ (to operate the RIE). If licensed by the FSRA to carry out both Regulated Activities
139
+ (e.g., operating an MTF and operating an RIE), the Recognition Order will include
140
+ a stipulation to that effect pursuant to MIR Rule 3.4.1.
141
+
142
+ '
143
+ - "Where a Relevant Person seeks to rely on a Person in (1) it may only do so if\
144
+ \ and to the extent that:\n(a)\tit immediately obtains the necessary CDD information\
145
+ \ from the third party in (1);\n(b)\tit takes adequate steps to satisfy itself\
146
+ \ that certified copies of the documents used to undertake the relevant elements\
147
+ \ of CDD will be available from the third party on request without delay;\n(c)\t\
148
+ the Person in (1)(b) to (d) is subject to regulation, including AML/TFS compliance\
149
+ \ requirements, by a Non-ADGM Financial Services Regulator or other competent\
150
+ \ authority in a country with AML/TFS regulations which are equivalent to the\
151
+ \ standards set out in the FATF Recommendations and it is supervised for compliance\
152
+ \ with such regulations;\n(d)\tthe Person in (1) has not relied on any exception\
153
+ \ from the requirement to conduct any relevant elements of CDD which the Relevant\
154
+ \ Person seeks to rely on; and\n(e)\tin relation to (2), the information is up\
155
+ \ to date."
156
+ - "Financial Services Permissions. VC Managers operating in ADGM require a Financial\
157
+ \ Services Permission (“FSP”) to undertake any Regulated Activity pertaining to\
158
+ \ VC Funds and/or co-investments by third parties in VC Funds. The Regulated Activities\
159
+ \ covered by the FSP will be dependent on the VC Managers’ investment strategy\
160
+ \ and business model.\n(a)\tManaging a Collective Investment Fund: this includes\
161
+ \ carrying out fund management activities in respect of a VC Fund.\n(b)\tAdvising\
162
+ \ on Investments or Credit : for VC Managers these activities will be restricted\
163
+ \ to activities related to co-investment alongside a VC Fund which the VC Manager\
164
+ \ manages, such as recommending that a client invest in an investee company alongside\
165
+ \ the VC Fund and on the strategy and structure required to make the investment.\n\
166
+ (c)\tArranging Deals in Investments: VC Managers may also wish to make arrangements\
167
+ \ to facilitate co-investments in the investee company.\nAuthorisation fees and\
168
+ \ supervision fees for a VC Manager are capped at USD 10,000 regardless of whether\
169
+ \ one or both of the additional Regulated Activities in b) and c) above in relation\
170
+ \ to co-investments are included in its FSP. The FSP will include restrictions\
171
+ \ appropriate to the business model of a VC Manager."
172
+ ---
+
+ # SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-long
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-m-long](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long) on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [Snowflake/snowflake-arctic-embed-m-long](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long) <!-- at revision 89d0f6ab196eead40b90cb6f9fefec01a908d2d1 -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - csv
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
206
+
207
+ ## Usage
208
+
209
+ ### Direct Usage (Sentence Transformers)
210
+
211
+ First install the Sentence Transformers library:
212
+
213
+ ```bash
214
+ pip install -U sentence-transformers
215
+ ```
216
+
217
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub.
+ # trust_remote_code=True is likely required here: config.json maps
+ # NomicBertModel to custom modeling code through its auto_map entry.
+ model = SentenceTransformer("jebish7/snowflake-arctic-embed-m-long_MNR_1", trust_remote_code=True)
+ # Run inference
+ sentences = [
+     'How should a Relevant Person ensure and demonstrate compliance with both UNSC Sanctions and U.A.E.-administered Sanctions, specifically Targeted Financial Sanctions, within the ADGM jurisdiction?',
+     'Where a Relevant Person seeks to rely on a Person in (1) it may only do so if and to the extent that:\n(a)\tit immediately obtains the necessary CDD information from the third party in (1);\n(b)\tit takes adequate steps to satisfy itself that certified copies of the documents used to undertake the relevant elements of CDD will be available from the third party on request without delay;\n(c)\tthe Person in (1)(b) to (d) is subject to regulation, including AML/TFS compliance requirements, by a Non-ADGM Financial Services Regulator or other competent authority in a country with AML/TFS regulations which are equivalent to the standards set out in the FATF Recommendations and it is supervised for compliance with such regulations;\n(d)\tthe Person in (1) has not relied on any exception from the requirement to conduct any relevant elements of CDD which the Relevant Person seeks to rely on; and\n(e)\tin relation to (2), the information is up to date.',
+     'REGULATORY REQUIREMENTS - SPOT COMMODITY ACTIVITIES\nRIEs operating an MTF or OTF using Accepted Spot Commodities\nAuthorised Persons that are operating an MTF or OTF wishing to also operate a RIE will be required to relinquish their FSP upon obtaining a Recognition Order (to operate the RIE). If licensed by the FSRA to carry out both Regulated Activities (e.g., operating an MTF and operating an RIE), the Recognition Order will include a stipulation to that effect pursuant to MIR Rule 3.4.1.\n',
+ ]
+ embeddings = model.encode(sentences)  # NumPy array by default
+ print(embeddings.shape)
+ # (3, 768)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)  # torch.Tensor
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
238
+
239
+ <!--
240
+ ### Direct Usage (Transformers)
241
+
242
+ <details><summary>Click to see the direct usage in Transformers</summary>
243
+
244
+ </details>
245
+ -->
246
+
247
+ <!--
248
+ ### Downstream Usage (Sentence Transformers)
249
+
250
+ You can finetune this model on your own dataset.
251
+
252
+ <details><summary>Click to expand</summary>
253
+
254
+ </details>
255
+ -->
256
+
257
+ <!--
258
+ ### Out-of-Scope Use
259
+
260
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
261
+ -->
262
+
263
+ <!--
264
+ ## Bias, Risks and Limitations
265
+
266
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
267
+ -->
268
+
269
+ <!--
270
+ ### Recommendations
271
+
272
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
273
+ -->
274
+
275
+ ## Training Details
276
+
277
+ ### Training Dataset
278
+
279
+ #### csv
280
+
+ * Dataset: csv
+ * Size: 29,547 training samples
+ * Columns: <code>Question</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | Question | positive |
+   |:--------|:---------|:---------|
+   | type    | string | string |
+   | details | <ul><li>min: 18 tokens</li><li>mean: 34.91 tokens</li><li>max: 83 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 118.51 tokens</li><li>max: 1090 tokens</li></ul> |
+ * Samples:
+   | Question | positive |
+   |:---------|:---------|
+   | <code>Under which circumstances is a Mining Reporting Entity exempt from immediate disclosure of material information about its mining activities according to the FSRA guidelines?</code> | <code>INTERACTION OF CHAPTER 11 WITH OTHER RULE DISCLOSURE OBLIGATIONS. Prior to a Mining Reporting Entity having all the information available to it, the FSRA considers that whatever material information it may have about the mining activity will generally be insufficiently definite to warrant disclosure under the Rules. Therefore, provided the material information is and remains confidential, and the FSRA has not formed the view that the information ceases to remain confidential (e.g., where there are exceptions from disclosing the information), the material information is not immediately required to be disclosed under Rule 7.2.1. For more information, please refer to Chapter 7 of the Rules, and any relevant Guidance that the FSRA may publish from time in relation to the FSRA’s expectations as to how Reporting Entities are to comply with Chapter 7.<br><br></code> |
+   | <code>What specific IAASB standards or other standards acceptable to the Regulator are required for the audit of a Public Listed Company's financial statements?</code> | <code>Where an Authorised Person does not hold or control any Client Money as at the date on which the Authorised Person's audited statement of financial position was prepared, the Regulator expects that a nil balance be stated to comply with Rule 6.6.6.<br></code> |
+   | <code>How does the ADGM monitor compliance with the principles of effective dialogue with shareholders, and what are the consequences for companies that fail to establish such a dialogue?</code> | <code>Audit committee. The Board as a whole has responsibility for ensuring that a satisfactory dialogue with Shareholders takes place. Such dialogue should be based on the mutual understanding of objectives and provision of adequate information relating to the Reporting Entity including financial information, and how the business and affairs of the Reporting Entity are carried out.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
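To make the loss parameters above concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective as it is commonly formulated: each question's similarity to every passage in the batch is multiplied by `scale`, and cross-entropy is taken with the question's own passage as the target. This is an illustration under that assumption, not the library's tensor implementation; `cos_sim` and `mnr_loss` are hypothetical helpers and the 2-dimensional vectors are invented.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the match for anchors[i]
    and every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy batch: two "questions" and their "passages" (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each passage close to its own question
swapped = [[0.1, 0.9], [0.9, 0.1]]  # passages paired with the wrong question

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

Under this formulation, a `per_device_train_batch_size` of 4 would give each question three in-batch negatives, which is one reason the `no_duplicates` batch sampler matters: a repeated passage in a batch would otherwise be treated as a negative for its own question.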
302
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 4
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `batch_sampler`: no_duplicates
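The combination of `learning_rate` 2e-05, `warmup_ratio` 0.1 and the linear scheduler listed in the full hyperparameters can be pictured with the small sketch below: the rate climbs linearly from zero during the first 10% of steps, then decays linearly back to zero. The function name `linear_schedule_lr` and the total of 1000 steps are assumptions chosen for illustration; the commit does not state the actual number of optimizer steps.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # hypothetical value, not taken from the training run

for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```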
311
+
312
+ #### All Hyperparameters
313
+ <details><summary>Click to expand</summary>
314
+
315
+ - `overwrite_output_dir`: False
316
+ - `do_predict`: False
317
+ - `eval_strategy`: no
318
+ - `prediction_loss_only`: True
319
+ - `per_device_train_batch_size`: 4
320
+ - `per_device_eval_batch_size`: 8
321
+ - `per_gpu_train_batch_size`: None
322
+ - `per_gpu_eval_batch_size`: None
323
+ - `gradient_accumulation_steps`: 1
324
+ - `eval_accumulation_steps`: None
325
+ - `torch_empty_cache_steps`: None
326
+ - `learning_rate`: 2e-05
327
+ - `weight_decay`: 0.0
328
+ - `adam_beta1`: 0.9
329
+ - `adam_beta2`: 0.999
330
+ - `adam_epsilon`: 1e-08
331
+ - `max_grad_norm`: 1.0
332
+ - `num_train_epochs`: 1
333
+ - `max_steps`: -1
334
+ - `lr_scheduler_type`: linear
335
+ - `lr_scheduler_kwargs`: {}
336
+ - `warmup_ratio`: 0.1
337
+ - `warmup_steps`: 0
338
+ - `log_level`: passive
339
+ - `log_level_replica`: warning
340
+ - `log_on_each_node`: True
341
+ - `logging_nan_inf_filter`: True
342
+ - `save_safetensors`: True
343
+ - `save_on_each_node`: False
344
+ - `save_only_model`: False
345
+ - `restore_callback_states_from_checkpoint`: False
346
+ - `no_cuda`: False
347
+ - `use_cpu`: False
348
+ - `use_mps_device`: False
349
+ - `seed`: 42
350
+ - `data_seed`: None
351
+ - `jit_mode_eval`: False
352
+ - `use_ipex`: False
353
+ - `bf16`: False
354
+ - `fp16`: False
355
+ - `fp16_opt_level`: O1
356
+ - `half_precision_backend`: auto
357
+ - `bf16_full_eval`: False
358
+ - `fp16_full_eval`: False
359
+ - `tf32`: None
360
+ - `local_rank`: 0
361
+ - `ddp_backend`: None
362
+ - `tpu_num_cores`: None
363
+ - `tpu_metrics_debug`: False
364
+ - `debug`: []
365
+ - `dataloader_drop_last`: False
366
+ - `dataloader_num_workers`: 0
367
+ - `dataloader_prefetch_factor`: None
368
+ - `past_index`: -1
369
+ - `disable_tqdm`: False
370
+ - `remove_unused_columns`: True
371
+ - `label_names`: None
372
+ - `load_best_model_at_end`: False
373
+ - `ignore_data_skip`: False
374
+ - `fsdp`: []
375
+ - `fsdp_min_num_params`: 0
376
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
377
+ - `fsdp_transformer_layer_cls_to_wrap`: None
378
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
379
+ - `deepspeed`: None
380
+ - `label_smoothing_factor`: 0.0
381
+ - `optim`: adamw_torch
382
+ - `optim_args`: None
383
+ - `adafactor`: False
384
+ - `group_by_length`: False
385
+ - `length_column_name`: length
386
+ - `ddp_find_unused_parameters`: None
387
+ - `ddp_bucket_cap_mb`: None
388
+ - `ddp_broadcast_buffers`: False
389
+ - `dataloader_pin_memory`: True
390
+ - `dataloader_persistent_workers`: False
391
+ - `skip_memory_metrics`: True
392
+ - `use_legacy_prediction_loop`: False
393
+ - `push_to_hub`: False
394
+ - `resume_from_checkpoint`: None
395
+ - `hub_model_id`: None
396
+ - `hub_strategy`: every_save
397
+ - `hub_private_repo`: False
398
+ - `hub_always_push`: False
399
+ - `gradient_checkpointing`: False
400
+ - `gradient_checkpointing_kwargs`: None
401
+ - `include_inputs_for_metrics`: False
402
+ - `eval_do_concat_batches`: True
403
+ - `fp16_backend`: auto
404
+ - `push_to_hub_model_id`: None
405
+ - `push_to_hub_organization`: None
406
+ - `mp_parameters`:
407
+ - `auto_find_batch_size`: False
408
+ - `full_determinism`: False
409
+ - `torchdynamo`: None
410
+ - `ray_scope`: last
411
+ - `ddp_timeout`: 1800
412
+ - `torch_compile`: False
413
+ - `torch_compile_backend`: None
414
+ - `torch_compile_mode`: None
415
+ - `dispatch_batches`: None
416
+ - `split_batches`: None
417
+ - `include_tokens_per_second`: False
418
+ - `include_num_input_tokens_seen`: False
419
+ - `neftune_noise_alpha`: None
420
+ - `optim_target_modules`: None
421
+ - `batch_eval_metrics`: False
422
+ - `eval_on_start`: False
423
+ - `use_liger_kernel`: False
424
+ - `eval_use_gather_object`: False
425
+ - `batch_sampler`: no_duplicates
426
+ - `multi_dataset_batch_sampler`: proportional
427
+
428
+ </details>
429
+
+ ### Training Logs
+ | Epoch  | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 0.0271 | 100  | 0.6411        |
+ | 0.0541 | 200  | 0.3289        |
+ | 0.0812 | 300  | 0.2395        |
+ | 0.1083 | 400  | 0.2711        |
+ | 0.1354 | 500  | 0.2746        |
+ | 0.1624 | 600  | 0.2602        |
+ | 0.1895 | 700  | 0.285         |
+ | 0.2166 | 800  | 0.2965        |
+ | 0.2436 | 900  | 0.2772        |
+ | 0.2707 | 1000 | 0.3043        |
+ | 0.2978 | 1100 | 0.3059        |
+ | 0.3249 | 1200 | 0.316         |
+ | 0.3519 | 1300 | 0.2765        |
+ | 0.3790 | 1400 | 0.249         |
+ | 0.4061 | 1500 | 0.2601        |
+ | 0.4331 | 1600 | 0.2538        |
+ | 0.4602 | 1700 | 0.2443        |
+ | 0.4873 | 1800 | 0.2151        |
+ | 0.5143 | 1900 | 0.2335        |
+ | 0.5414 | 2000 | 0.2611        |
+ | 0.5685 | 2100 | 0.2557        |
+ | 0.5956 | 2200 | 0.2793        |
+ | 0.0694 | 100  | 0.2141        |
+ | 0.1389 | 200  | 0.273         |
+ | 0.2083 | 300  | 0.295         |
+ | 0.2778 | 400  | 0.2079        |
+ | 0.3472 | 500  | 0.2556        |
+ | 0.4167 | 600  | 0.252         |
+ | 0.4861 | 700  | 0.2142        |
+ | 0.5556 | 800  | 0.2181        |
+ | 0.625  | 900  | 0.2347        |
+ | 0.6944 | 1000 | 0.1754        |
+ | 0.7639 | 1100 | 0.2313        |
+ | 0.8333 | 1200 | 0.2104        |
+ | 0.9028 | 1300 | 0.2435        |
+ | 0.9722 | 1400 | 0.2399        |
469
+
470
+
+ ### Framework Versions
+ - Python: 3.10.14
+ - Sentence Transformers: 3.1.1
+ - Transformers: 4.45.2
+ - PyTorch: 2.4.0
+ - Accelerate: 0.34.2
+ - Datasets: 3.0.1
+ - Tokenizers: 0.20.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
508
+
509
+ <!--
510
+ ## Glossary
511
+
512
+ *Clearly define terms in order to be accessible across audiences.*
513
+ -->
514
+
515
+ <!--
516
+ ## Model Card Authors
517
+
518
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
519
+ -->
520
+
521
+ <!--
522
+ ## Model Card Contact
523
+
524
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
525
+ -->
config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "_name_or_path": "Snowflake/snowflake-arctic-embed-m-long",
+   "activation_function": "swiglu",
+   "architectures": [
+     "NomicBertModel"
+   ],
+   "attn_pdrop": 0.0,
+   "auto_map": {
+     "AutoConfig": "Snowflake/snowflake-arctic-embed-m-long--configuration_hf_nomic_bert.NomicBertConfig",
+     "AutoModel": "Snowflake/snowflake-arctic-embed-m-long--modeling_hf_nomic_bert.NomicBertModel"
+   },
+   "bos_token_id": null,
+   "causal": false,
+   "dense_seq_output": true,
+   "embd_pdrop": 0.1,
+   "eos_token_id": null,
+   "fused_bias_fc": true,
+   "fused_dropout_add_ln": true,
+   "initializer_range": 0.02,
+   "layer_norm_epsilon": 1e-12,
+   "max_trained_positions": 2048,
+   "mlp_fc1_bias": false,
+   "mlp_fc2_bias": false,
+   "model_type": "nomic_bert",
+   "n_embd": 768,
+   "n_head": 12,
+   "n_inner": 3072,
+   "n_layer": 12,
+   "n_positions": 8192,
+   "pad_vocab_size_multiple": 64,
+   "parallel_block": false,
+   "parallel_block_tied_norm": false,
+   "prenorm": false,
+   "qkv_proj_bias": false,
+   "reorder_and_upcast_attn": false,
+   "resid_pdrop": 0.1,
+   "rotary_emb_base": 1000,
+   "rotary_emb_fraction": 1.0,
+   "rotary_emb_interleaved": false,
+   "rotary_emb_scale_base": null,
+   "rotary_scaling_factor": 2,
+   "scale_attn_by_inverse_layer_idx": false,
+   "scale_attn_weights": true,
+   "summary_activation": null,
+   "summary_first_dropout": 0.1,
+   "summary_proj_to_labels": true,
+   "summary_type": "cls_index",
+   "summary_use_proj": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.45.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "use_flash_attn": true,
+   "use_rms_norm": false,
+   "use_xentropy": true,
+   "vocab_size": 30528
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.1",
+     "transformers": "4.45.2",
+     "pytorch": "2.4.0"
+   },
+   "prompts": {
+     "query": "Represent this sentence for searching relevant passages: "
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
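The `prompts` and `default_prompt_name` fields above suggest that the query prefix is applied only when a prompt is requested by name, since the default is `null`. The following is a minimal plain-Python sketch of that behaviour, not the library's own code: `apply_prompt` is a hypothetical helper, and the example query and passage are made up for illustration.

```python
# Values copied from config_sentence_transformers.json above.
PROMPTS = {"query": "Represent this sentence for searching relevant passages: "}
DEFAULT_PROMPT_NAME = None  # "default_prompt_name": null


def apply_prompt(texts, prompt_name=DEFAULT_PROMPT_NAME):
    """Hypothetical helper: prepend the named prompt to each text, if any."""
    if prompt_name is None:
        # No default prompt is configured, so texts pass through unchanged.
        return list(texts)
    prompt = PROMPTS[prompt_name]
    return [prompt + text for text in texts]


# Illustrative inputs (not taken from the training data).
queries = apply_prompt(["What is client money?"], prompt_name="query")
documents = apply_prompt(["Client money must be held separately."])
```

With Sentence Transformers itself, the equivalent request is usually made by passing `prompt_name="query"` to `encode` for queries and omitting it for passages.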
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f10acc8253b625b8e8a407abcb816b2b1abce765c03f823621517db531265397
+ size 546938168
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
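The three entries in `modules.json` describe a pipeline applied in `idx` order: Transformer, then Pooling, then Normalize. Below is a rough NumPy sketch of the last two stages, assuming the CLS pooling set in `1_Pooling/config.json`. The function names are hypothetical, and `token_embeddings` is a random stand-in for the Transformer output rather than real model activations.

```python
import numpy as np


def cls_pool(token_embeddings):
    """Keep the first ([CLS]) token vector: (batch, seq, dim) -> (batch, dim)."""
    return token_embeddings[:, 0, :]


def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit L2 norm, guarding against division by zero."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)


# Random placeholder for the Transformer stage: 2 sequences, 5 tokens, 768 dims.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 768))

sentence_embeddings = l2_normalize(cls_pool(token_embeddings))
```

Because the final stage normalizes to unit length, the dot product of two such embeddings equals their cosine similarity.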
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 8192,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,63 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [],
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 512,
+   "model_max_length": 8192,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff