iris49 committed on
Commit
8892830
·
verified ·
1 Parent(s): d5a0df7

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md CHANGED
@@ -1,5 +1,837 @@
- Using Matryoshka Representation Learning

- Based on BAAI/bge-base-en-v1.5

- Trained on q-a pairs retrieved from :23.501,23.040, 23.502,24.301,29.274,29.109,29.272,29.283,29.109,29.244,29.281,33.513,33.220,33.221,36.413, 38.415,23040,33513,33220,33221
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:56041
11
+ - loss:MatryoshkaLoss
12
+ - loss:MultipleNegativesRankingLoss
13
+ base_model: BAAI/bge-base-en-v1.5
14
+ widget:
15
+ - source_sentence: What is the significance of the tables 6.1.6.2.5-1 and 6.1.6.2.6-1
16
+ in the context of the Namf_Communication Service API?
17
+ sentences:
18
+ - The 'notifId' attribute in the PolicyDataSubscription type serves as a Notification
19
+ Correlation ID assigned by the NF service consumer. It is included when the 'ConditionalSubscriptionwithPartialNotification'
20
+ or the 'ConditionalSubscriptionWithExcludeNotification' feature is supported.
21
+ This ID is used to correlate notifications with the specific subscription request,
22
+ ensuring that the NF service consumer can track and manage notifications effectively.
23
+ - The 'sessRuleReports' attribute in the 'ErrorReport' type is specifically used
24
+ to report failures related to session rules, whereas the 'ruleReports' attribute
25
+ reports failures related to PCC rules. 'sessRuleReports' contains an array of
26
+ 'SessionRuleReport' objects, which provide details about the session rule failures.
27
+ Like 'ruleReports', it is optional and can have one or more entries (cardinality
28
+ 1..N).
29
+ - Tables 6.1.6.2.5-1 and 6.1.6.2.6-1 are significant in the Namf_Communication Service
30
+ API as they provide the definitions for the types 'AssignEbiData' and 'AssignedEbiData',
31
+ respectively. These tables outline the structure, attributes, and possibly the
32
+ constraints or rules associated with these data types, which are essential for
33
+ understanding and implementing the API's functionality related to EBI assignment
34
+ and management.
35
+ - source_sentence: What document defines the basic principles for online charging,
36
+ and where is this information referenced?
37
+ sentences:
38
+ - The UDM (Unified Data Management) returns the Ranging and Sidelink Positioning
39
+ Subscription Data for the UE (User Equipment) identified by the supi (Subscription
40
+ Permanent Identifier). This data is retrieved using the GET method, which supports
41
+ the URI query parameters outlined in table 6.1.3.37.3.1-1.
42
+ - The Nsmf_PDUSession_SMContextStatusNotify service operation is used by the SMF
43
+ (Session Management Function) to notify its consumers about the status of an SM
44
+ (Session Management) context related to a PDU (Packet Data Unit) Session. In the
45
+ context of I-SMF (Intermediate SMF) context transfer, this service operation is
46
+ used to indicate the transfer of the SM context to a new I-SMF or SMF set. It
47
+ also allows the SMF to update the SMF-derived CN (Core Network) assisted RAN (Radio
48
+ Access Network) parameters tuning in the AMF (Access and Mobility Management Function).
49
+ Additionally, it can report DDN (Downlink Data Notification) failures and provide
50
+ target DNAI (Data Network Access Identifier) information for the current or next
51
+ PDU session.
52
+ - The basic principles for online charging are defined in TS 32.240 [1]. This information
53
+ is referenced in section 5.2.1 of the document, which is part of the '5.2 Online
54
+ charging scenario' chapter.
55
+ - source_sentence: What are the possible values for the 'ReportingLevel' enumeration,
56
+ and what do they indicate?
57
+ sentences:
58
+ - If protected User Plane (UP) messages reach the SN before the SN has received
59
+ the SN Counter value in the SN Reconfiguration Complete message, the SN chooses
60
+ the first unused KSN key of the UE to establish the security association. This
61
+ ensures that communication can proceed securely even if the SN Counter value has
62
+ not yet been received. Once the SN Counter value is received, the SN verifies
63
+ it to ensure there is no KSN mismatch.
64
+ - 'The ''ReportingLevel'' enumeration has three possible values: ''SER_ID_LEVEL'',
65
+ ''RAT_GR_LEVEL'', and ''SPON_CON_LEVEL''. ''SER_ID_LEVEL'' indicates that usage
66
+ should be reported at the service ID and rating group combination level. ''RAT_GR_LEVEL''
67
+ indicates that usage should be reported at the rating group level. ''SPON_CON_LEVEL''
68
+ indicates that usage should be reported at the sponsor identity and rating group
69
+ combination level. These levels help in categorizing and reporting usage data
70
+ based on different granularities.'
71
+ - Structured data types in the Nudr_GroupIDmap Service API are more complex than
72
+ simple data types. While simple data types represent single values like integers
73
+ or strings, structured data types are composed of multiple simple data types or
74
+ other structured data types, forming a more complex data structure. For example,
75
+ a structured data type might represent a user profile containing fields for name,
76
+ age, and address, each of which could be a simple data type. This allows for the
77
+ representation of more intricate and hierarchical data within the API.
78
+ - source_sentence: What is the purpose of the Intermediate Spending Limit Report Request
79
+ procedure described in the document?
80
+ sentences:
81
+ - The Resource URI variables defined in table 6.1.3.8.2-1 for the 'sm-data' resource
82
+ serve to dynamically construct the URI based on specific parameters. These variables
83
+ include {apiRoot}, <apiVersion>, and {supi}. The {apiRoot} variable specifies
84
+ the base URL of the API, <apiVersion> indicates the version of the API to be used,
85
+ and {supi} represents the Subscription Permanent Identifier, which is used to
86
+ uniquely identify the subscriber. These variables ensure that the URI is correctly
87
+ formatted and points to the appropriate resource for the given subscriber and
88
+ API version.
89
+ - The purpose of the Intermediate Spending Limit Report Request procedure is to
90
+ allow the PCF (Policy Control Function) to request the status of additional policy
91
+ counters available at the CHF (Charging Function) or to remove the request for
92
+ the status of policy counters. The PCF can modify the list of subscribed policy
93
+ counters based on its policy decisions, and the CHF responds by providing the
94
+ policy counter status, optionally including pending statuses and their activation
95
+ times, for the requested policy counters.
96
+ - When ABC online charging is employed, the TDF uses Debit / Reserve Units Request[Initial],
97
+ update, or termination to convey charging information related to the detected
98
+ application traffic. The OCS responds with Debit / Reserve Units Response, which
99
+ includes quotas for rating groups or instructions on handling the application
100
+ traffic (e.g., terminate, continue, reroute). The TDF must request a quota before
101
+ service delivery. If only certain quotas are authorized by the OCS (e.g., due
102
+ to insufficient credit), the rating groups without authorized quotas are handled
103
+ according to the received Result Code value. The quota supervision mechanism is
104
+ further described in TS 32.299 [50].
105
+ - source_sentence: What types of data structures are supported by the GET request
106
+ body on the resource described in table 5.2.11.3.4-2, and how do they influence
107
+ the request?
108
+ sentences:
109
+ - In Direct Communication mode, the NF Service consumer can subscribe to status
110
+ change notifications of NF instances from the NRF. If the NF Service consumer
111
+ is notified by the NRF or detects by itself (e.g., through a lack of response
112
+ to a request) that the NF producer instance is no longer available, it selects
113
+ another available NF producer instance within the same NF Set. In Indirect Communication
114
+ mode, the SCP or NF Service consumer may also subscribe to status change notifications
115
+ from the NRF and select another NF producer instance within the same NF Set if
116
+ the original instance serving the UE becomes unavailable. The specific implementation
117
+ details of how the SCP detects the unavailability of an NF producer instance are
118
+ left to the implementation.
119
+ - The data structures supported by the GET request body on the resource are detailed
120
+ in table 5.2.11.3.4-2. These structures define the format and content of the data
121
+ that can be sent in the request body. They might include fields such as 'filterCriteria',
122
+ 'sortOrder', or 'pagination', which influence how the server processes the request
123
+ and returns the appropriate data.
124
+ - 'The specific triggers on the Ro interface that can lead to the termination of
125
+ the IMS service include: 1) Reception of an unsuccessful Operation Result different
126
+ from DIAMETER_CREDIT_CONTROL_NOT_APPLICABLE in the Debit/Reserve Units Response
127
+ message. 2) Reception of an unsuccessful Result Code different from DIAMETER_CREDIT_CONTROL_NOT_APPLICABLE
128
+ within the multiple units operation in the Debit/Reserve Units Response message
129
+ when only one instance of the multiple units operation field is used. 3) Execution
130
+ of the termination action procedure as defined in TS 32.299 when only one instance
131
+ of the Multiple Unit Operation field is used. 4) Execution of the failure handling
132
+ procedures when the Failure Action is set to ''Terminate'' or ''Retry & Terminate''.
133
+ 5) Reception in the IMS-GWF of an Abort-Session-Request message from OCS.'
134
+ pipeline_tag: sentence-similarity
135
+ library_name: sentence-transformers
136
+ metrics:
137
+ - cosine_accuracy@1
138
+ - cosine_accuracy@3
139
+ - cosine_accuracy@5
140
+ - cosine_accuracy@10
141
+ - cosine_precision@1
142
+ - cosine_precision@3
143
+ - cosine_precision@5
144
+ - cosine_precision@10
145
+ - cosine_recall@1
146
+ - cosine_recall@3
147
+ - cosine_recall@5
148
+ - cosine_recall@10
149
+ - cosine_ndcg@10
150
+ - cosine_mrr@10
151
+ - cosine_map@100
152
+ model-index:
153
+ - name: BGE_base_3gpp-qa-v2_Matryoshka
154
+ results:
155
+ - task:
156
+ type: information-retrieval
157
+ name: Information Retrieval
158
+ dataset:
159
+ name: dim 768
160
+ type: dim_768
161
+ metrics:
162
+ - type: cosine_accuracy@1
163
+ value: 0.8347103013864849
164
+ name: Cosine Accuracy@1
165
+ - type: cosine_accuracy@3
166
+ value: 0.9628129405256866
167
+ name: Cosine Accuracy@3
168
+ - type: cosine_accuracy@5
169
+ value: 0.9806391748898128
170
+ name: Cosine Accuracy@5
171
+ - type: cosine_accuracy@10
172
+ value: 0.9927196159954319
173
+ name: Cosine Accuracy@10
174
+ - type: cosine_precision@1
175
+ value: 0.8347103013864849
176
+ name: Cosine Precision@1
177
+ - type: cosine_precision@3
178
+ value: 0.32093764684189546
179
+ name: Cosine Precision@3
180
+ - type: cosine_precision@5
181
+ value: 0.1961278349779626
182
+ name: Cosine Precision@5
183
+ - type: cosine_precision@10
184
+ value: 0.09927196159954321
185
+ name: Cosine Precision@10
186
+ - type: cosine_recall@1
187
+ value: 0.8347103013864849
188
+ name: Cosine Recall@1
189
+ - type: cosine_recall@3
190
+ value: 0.9628129405256866
191
+ name: Cosine Recall@3
192
+ - type: cosine_recall@5
193
+ value: 0.9806391748898128
194
+ name: Cosine Recall@5
195
+ - type: cosine_recall@10
196
+ value: 0.9927196159954319
197
+ name: Cosine Recall@10
198
+ - type: cosine_ndcg@10
199
+ value: 0.9235193716202091
200
+ name: Cosine Ndcg@10
201
+ - type: cosine_mrr@10
202
+ value: 0.9002603606826465
203
+ name: Cosine Mrr@10
204
+ - type: cosine_map@100
205
+ value: 0.9006611894428589
206
+ name: Cosine Map@100
207
+ - task:
208
+ type: information-retrieval
209
+ name: Information Retrieval
210
+ dataset:
211
+ name: dim 512
212
+ type: dim_512
213
+ metrics:
214
+ - type: cosine_accuracy@1
215
+ value: 0.8341214467978801
216
+ name: Cosine Accuracy@1
217
+ - type: cosine_accuracy@3
218
+ value: 0.9630270694669973
219
+ name: Cosine Accuracy@3
220
+ - type: cosine_accuracy@5
221
+ value: 0.980835459752681
222
+ name: Cosine Accuracy@5
223
+ - type: cosine_accuracy@10
224
+ value: 0.9925947074463339
225
+ name: Cosine Accuracy@10
226
+ - type: cosine_precision@1
227
+ value: 0.8341214467978801
228
+ name: Cosine Precision@1
229
+ - type: cosine_precision@3
230
+ value: 0.32100902315566576
231
+ name: Cosine Precision@3
232
+ - type: cosine_precision@5
233
+ value: 0.19616709195053625
234
+ name: Cosine Precision@5
235
+ - type: cosine_precision@10
236
+ value: 0.09925947074463341
237
+ name: Cosine Precision@10
238
+ - type: cosine_recall@1
239
+ value: 0.8341214467978801
240
+ name: Cosine Recall@1
241
+ - type: cosine_recall@3
242
+ value: 0.9630270694669973
243
+ name: Cosine Recall@3
244
+ - type: cosine_recall@5
245
+ value: 0.980835459752681
246
+ name: Cosine Recall@5
247
+ - type: cosine_recall@10
248
+ value: 0.9925947074463339
249
+ name: Cosine Recall@10
250
+ - type: cosine_ndcg@10
251
+ value: 0.9232781516394674
252
+ name: Cosine Ndcg@10
253
+ - type: cosine_mrr@10
254
+ value: 0.8999735171216805
255
+ name: Cosine Mrr@10
256
+ - type: cosine_map@100
257
+ value: 0.9003855301087177
258
+ name: Cosine Map@100
259
+ - task:
260
+ type: information-retrieval
261
+ name: Information Retrieval
262
+ dataset:
263
+ name: dim 256
264
+ type: dim_256
265
+ metrics:
266
+ - type: cosine_accuracy@1
267
+ value: 0.8326047001302618
268
+ name: Cosine Accuracy@1
269
+ - type: cosine_accuracy@3
270
+ value: 0.9624382148783927
271
+ name: Cosine Accuracy@3
272
+ - type: cosine_accuracy@5
273
+ value: 0.9801930729287486
274
+ name: Cosine Accuracy@5
275
+ - type: cosine_accuracy@10
276
+ value: 0.9922913581128102
277
+ name: Cosine Accuracy@10
278
+ - type: cosine_precision@1
279
+ value: 0.8326047001302618
280
+ name: Cosine Precision@1
281
+ - type: cosine_precision@3
282
+ value: 0.3208127382927975
283
+ name: Cosine Precision@3
284
+ - type: cosine_precision@5
285
+ value: 0.19603861458574973
286
+ name: Cosine Precision@5
287
+ - type: cosine_precision@10
288
+ value: 0.09922913581128105
289
+ name: Cosine Precision@10
290
+ - type: cosine_recall@1
291
+ value: 0.8326047001302618
292
+ name: Cosine Recall@1
293
+ - type: cosine_recall@3
294
+ value: 0.9624382148783927
295
+ name: Cosine Recall@3
296
+ - type: cosine_recall@5
297
+ value: 0.9801930729287486
298
+ name: Cosine Recall@5
299
+ - type: cosine_recall@10
300
+ value: 0.9922913581128102
301
+ name: Cosine Recall@10
302
+ - type: cosine_ndcg@10
303
+ value: 0.9223721780180253
304
+ name: Cosine Ndcg@10
305
+ - type: cosine_mrr@10
306
+ value: 0.898869719250338
307
+ name: Cosine Mrr@10
308
+ - type: cosine_map@100
309
+ value: 0.8993021227310489
310
+ name: Cosine Map@100
311
+ - task:
312
+ type: information-retrieval
313
+ name: Information Retrieval
314
+ dataset:
315
+ name: dim 128
316
+ type: dim_128
317
+ metrics:
318
+ - type: cosine_accuracy@1
319
+ value: 0.8294462982459271
320
+ name: Cosine Accuracy@1
321
+ - type: cosine_accuracy@3
322
+ value: 0.9610642208383148
323
+ name: Cosine Accuracy@3
324
+ - type: cosine_accuracy@5
325
+ value: 0.9796399064970289
326
+ name: Cosine Accuracy@5
327
+ - type: cosine_accuracy@10
328
+ value: 0.991720347602648
329
+ name: Cosine Accuracy@10
330
+ - type: cosine_precision@1
331
+ value: 0.8294462982459271
332
+ name: Cosine Precision@1
333
+ - type: cosine_precision@3
334
+ value: 0.3203547402794382
335
+ name: Cosine Precision@3
336
+ - type: cosine_precision@5
337
+ value: 0.19592798129940583
338
+ name: Cosine Precision@5
339
+ - type: cosine_precision@10
340
+ value: 0.09917203476026483
341
+ name: Cosine Precision@10
342
+ - type: cosine_recall@1
343
+ value: 0.8294462982459271
344
+ name: Cosine Recall@1
345
+ - type: cosine_recall@3
346
+ value: 0.9610642208383148
347
+ name: Cosine Recall@3
348
+ - type: cosine_recall@5
349
+ value: 0.9796399064970289
350
+ name: Cosine Recall@5
351
+ - type: cosine_recall@10
352
+ value: 0.991720347602648
353
+ name: Cosine Recall@10
354
+ - type: cosine_ndcg@10
355
+ value: 0.9204835891487085
356
+ name: Cosine Ndcg@10
357
+ - type: cosine_mrr@10
358
+ value: 0.8965493659262566
359
+ name: Cosine Mrr@10
360
+ - type: cosine_map@100
361
+ value: 0.897020544909686
362
+ name: Cosine Map@100
363
+ - task:
364
+ type: information-retrieval
365
+ name: Information Retrieval
366
+ dataset:
367
+ name: dim 64
368
+ type: dim_64
369
+ metrics:
370
+ - type: cosine_accuracy@1
371
+ value: 0.8210595813779198
372
+ name: Cosine Accuracy@1
373
+ - type: cosine_accuracy@3
374
+ value: 0.9574775610713585
375
+ name: Cosine Accuracy@3
376
+ - type: cosine_accuracy@5
377
+ value: 0.9771595795935119
378
+ name: Cosine Accuracy@5
379
+ - type: cosine_accuracy@10
380
+ value: 0.9906497028960939
381
+ name: Cosine Accuracy@10
382
+ - type: cosine_precision@1
383
+ value: 0.8210595813779198
384
+ name: Cosine Precision@1
385
+ - type: cosine_precision@3
386
+ value: 0.3191591870237861
387
+ name: Cosine Precision@3
388
+ - type: cosine_precision@5
389
+ value: 0.19543191591870243
390
+ name: Cosine Precision@5
391
+ - type: cosine_precision@10
392
+ value: 0.09906497028960942
393
+ name: Cosine Precision@10
394
+ - type: cosine_recall@1
395
+ value: 0.8210595813779198
396
+ name: Cosine Recall@1
397
+ - type: cosine_recall@3
398
+ value: 0.9574775610713585
399
+ name: Cosine Recall@3
400
+ - type: cosine_recall@5
401
+ value: 0.9771595795935119
402
+ name: Cosine Recall@5
403
+ - type: cosine_recall@10
404
+ value: 0.9906497028960939
405
+ name: Cosine Recall@10
406
+ - type: cosine_ndcg@10
407
+ value: 0.9158816707476002
408
+ name: Cosine Ndcg@10
409
+ - type: cosine_mrr@10
410
+ value: 0.8908051588080549
411
+ name: Cosine Mrr@10
412
+ - type: cosine_map@100
413
+ value: 0.8913320555914594
414
+ name: Cosine Map@100
415
+ ---
416
 
417
+ # BGE_base_3gpp-qa-v2_Matryoshka
418
 
419
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
420
+
421
+ ## Model Details
422
+
423
+ ### Model Description
424
+ - **Model Type:** Sentence Transformer
425
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
426
+ - **Maximum Sequence Length:** 512 tokens
427
+ - **Output Dimensionality:** 768 dimensions
428
+ - **Similarity Function:** Cosine Similarity
429
+ - **Training Dataset:**
430
+ - json
431
+ - **Language:** en
432
+ - **License:** apache-2.0
433
+
434
+ ### Model Sources
435
+
436
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
437
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
438
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
439
+
440
+ ### Full Model Architecture
441
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
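
The last two stages of this architecture (CLS pooling followed by `Normalize()`) can be illustrated without loading the model. The following is a minimal NumPy sketch, assuming token embeddings of shape `(batch, seq_len, 768)` have already been produced by the transformer. The helper name `cls_pool_and_normalize` and the random input are hypothetical and are not part of sentence-transformers.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Take the first ([CLS]) token vector of each sequence and L2-normalize it."""
    cls_vectors = token_embeddings[:, 0, :]             # CLS pooling: position 0 only
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)    # Normalize(): unit length

# Random stand-in for transformer output: 3 sequences, 12 tokens, 768 dims.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(3, 12, 768))
sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)

# For unit-length vectors, cosine similarity reduces to a plain dot product.
cosine_scores = sentence_embeddings @ sentence_embeddings.T
```

Because the output vectors are unit length, cosine similarity and dot-product similarity give the same ranking.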
449
+
450
+ ## Usage
451
+
452
+ ### Direct Usage (Sentence Transformers)
453
+
454
+ First install the Sentence Transformers library:
455
+
456
+ ```bash
457
+ pip install -U sentence-transformers
458
+ ```
459
+
460
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("iris49/3gpp-embedding-model-v0")
+ # Run inference
+ sentences = [
+     'What types of data structures are supported by the GET request body on the resource described in table 5.2.11.3.4-2, and how do they influence the request?',
+     "The data structures supported by the GET request body on the resource are detailed in table 5.2.11.3.4-2. These structures define the format and content of the data that can be sent in the request body. They might include fields such as 'filterCriteria', 'sortOrder', or 'pagination', which influence how the server processes the request and returns the appropriate data.",
+     "The specific triggers on the Ro interface that can lead to the termination of the IMS service include: 1) Reception of an unsuccessful Operation Result different from DIAMETER_CREDIT_CONTROL_NOT_APPLICABLE in the Debit/Reserve Units Response message. 2) Reception of an unsuccessful Result Code different from DIAMETER_CREDIT_CONTROL_NOT_APPLICABLE within the multiple units operation in the Debit/Reserve Units Response message when only one instance of the multiple units operation field is used. 3) Execution of the termination action procedure as defined in TS 32.299 when only one instance of the Multiple Unit Operation field is used. 4) Execution of the failure handling procedures when the Failure Action is set to 'Terminate' or 'Retry & Terminate'. 5) Reception in the IMS-GWF of an Abort-Session-Request message from OCS.",
+ ]
+ embeddings = model.encode(sentences)  # NumPy array by default
+ print(embeddings.shape)
+ # (3, 768)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)  # torch.Tensor
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
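
Because the model was trained with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64, an embedding can be shortened to one of those prefixes and re-normalized before computing similarities. The sketch below shows that truncation step on random stand-in vectors. `truncate_embeddings` is a hypothetical helper, and the random matrix stands in for the output of `model.encode`. Recent sentence-transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`; check the documentation of the installed version before relying on it.

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize each row to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Stand-in for model.encode(sentences): 3 unit-length 768-dimensional vectors.
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 768))
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = truncate_embeddings(full, 256)   # 256 is one of the trained Matryoshka dims
similarities = small @ small.T           # cosine similarity of the truncated vectors
```

Re-normalizing after truncation matters: a prefix of a unit vector is generally shorter than unit length, so a raw dot product would no longer equal cosine similarity.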
481
+
482
+ <!--
483
+ ### Direct Usage (Transformers)
484
+
485
+ <details><summary>Click to see the direct usage in Transformers</summary>
486
+
487
+ </details>
488
+ -->
489
+
490
+ <!--
491
+ ### Downstream Usage (Sentence Transformers)
492
+
493
+ You can finetune this model on your own dataset.
494
+
495
+ <details><summary>Click to expand</summary>
496
+
497
+ </details>
498
+ -->
499
+
500
+ <!--
501
+ ### Out-of-Scope Use
502
+
503
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
504
+ -->
505
+
506
+ ## Evaluation
507
+
508
+ ### Metrics
509
+
510
+ #### Information Retrieval
511
+
512
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
513
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
514
+
+ | Metric              | dim_768    | dim_512    | dim_256    | dim_128    | dim_64     |
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
+ | cosine_accuracy@1   | 0.8347     | 0.8341     | 0.8326     | 0.8294     | 0.8211     |
+ | cosine_accuracy@3   | 0.9628     | 0.963      | 0.9624     | 0.9611     | 0.9575     |
+ | cosine_accuracy@5   | 0.9806     | 0.9808     | 0.9802     | 0.9796     | 0.9772     |
+ | cosine_accuracy@10  | 0.9927     | 0.9926     | 0.9923     | 0.9917     | 0.9906     |
+ | cosine_precision@1  | 0.8347     | 0.8341     | 0.8326     | 0.8294     | 0.8211     |
+ | cosine_precision@3  | 0.3209     | 0.321      | 0.3208     | 0.3204     | 0.3192     |
+ | cosine_precision@5  | 0.1961     | 0.1962     | 0.196      | 0.1959     | 0.1954     |
+ | cosine_precision@10 | 0.0993     | 0.0993     | 0.0992     | 0.0992     | 0.0991     |
+ | cosine_recall@1     | 0.8347     | 0.8341     | 0.8326     | 0.8294     | 0.8211     |
+ | cosine_recall@3     | 0.9628     | 0.963      | 0.9624     | 0.9611     | 0.9575     |
+ | cosine_recall@5     | 0.9806     | 0.9808     | 0.9802     | 0.9796     | 0.9772     |
+ | cosine_recall@10    | 0.9927     | 0.9926     | 0.9923     | 0.9917     | 0.9906     |
+ | **cosine_ndcg@10**  | **0.9235** | **0.9233** | **0.9224** | **0.9205** | **0.9159** |
+ | cosine_mrr@10       | 0.9003     | 0.9        | 0.8989     | 0.8965     | 0.8908     |
+ | cosine_map@100      | 0.9007     | 0.9004     | 0.8993     | 0.897      | 0.8913     |
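
The numbers in this table are consistent with each query having exactly one relevant passage: recall@k equals accuracy@k, and precision@k is accuracy@k divided by k (for example, 0.9628 / 3 ≈ 0.3209 for `dim_768`). The sketch below computes these metrics under that single-relevant-passage assumption. The function name and the toy ranks are made up for illustration and are not the evaluator's implementation.

```python
def retrieval_metrics(ranks, k):
    """Metrics at cutoff k, given the 1-based rank of the single relevant
    passage for each query (None if it was not retrieved at all)."""
    n = len(ranks)
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return {
        "accuracy": hits / n,          # fraction of queries with a hit in the top k
        "precision": hits / (n * k),   # at most one of the k results can be relevant
        "recall": hits / n,            # one relevant passage, so recall == accuracy
        "mrr": sum(1.0 / r for r in ranks if r is not None and r <= k) / n,
    }

toy_ranks = [1, 1, 2, 4, None]          # hypothetical ranks for five queries
m3 = retrieval_metrics(toy_ranks, k=3)

# Check the relationship against the reported dim_768 values.
reported_accuracy_at_3 = 0.9628129405256866
reported_precision_at_3 = 0.32093764684189546
precision_gap = abs(reported_accuracy_at_3 / 3 - reported_precision_at_3)
```

This also explains why precision@10 cannot exceed 0.1 here: with a single relevant passage, at most one of the ten returned results can count as correct.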
532
+
533
+ <!--
534
+ ## Bias, Risks and Limitations
535
+
536
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
537
+ -->
538
+
539
+ <!--
540
+ ### Recommendations
541
+
542
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
543
+ -->
544
+
545
+ ## Training Details
546
+
547
+ ### Training Dataset
548
+
549
+ #### json
550
+
551
+ * Dataset: json
552
+ * Size: 56,041 training samples
553
+ * Columns: <code>anchor</code> and <code>positive</code>
554
+ * Approximate statistics based on the first 1000 samples:
555
+ | | anchor | positive |
556
+ |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
557
+ | type | string | string |
558
+ | details | <ul><li>min: 15 tokens</li><li>mean: 30.56 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 42 tokens</li><li>mean: 109.65 tokens</li><li>max: 298 tokens</li></ul> |
559
+ * Samples:
560
+ | anchor | positive |
561
+ |:------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
562
+ | <code>What does the 'dataStatProps' attribute represent in the 'AnalyticsMetadataInfo' type, and what is its data type?</code> | <code>The 'dataStatProps' attribute in the 'AnalyticsMetadataInfo' type represents a list of dataset statistical properties of the data used to generate the analytics. It is defined as an optional attribute with a data type of 'array(DatasetStatisticalProperty)' and a cardinality of 1..N, meaning it can contain one or more elements.</code> |
563
+ | <code>Why is it important to have standardized methods for resource management in the Nudm_SubscriberDataManagement Service API?</code> | <code>Standardized methods for resource management in the Nudm_SubscriberDataManagement Service API are important because they ensure uniformity, predictability, and compatibility across different implementations and systems. This standardization facilitates seamless integration, reduces errors, and enhances the efficiency of managing subscriber data, which is critical for maintaining reliable communication services.</code> |
564
+ | <code>What is the purpose of the Nsmf_PDUSession_SMContextStatusNotify service operation in the context of I-SMF context transfer?</code> | <code>The Nsmf_PDUSession_SMContextStatusNotify service operation is used by the SMF (Session Management Function) to notify its consumers about the status of an SM (Session Management) context related to a PDU (Packet Data Unit) Session. In the context of I-SMF (Intermediate SMF) context transfer, this service operation is used to indicate the transfer of the SM context to a new I-SMF or SMF set. It also allows the SMF to update the SMF-derived CN (Core Network) assisted RAN (Radio Access Network) parameters tuning in the AMF (Access and Mobility Management Function). Additionally, it can report DDN (Downlink Data Notification) failures and provide target DNAI (Data Network Access Identifier) information for the current or next PDU session.</code> |
565
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
566
+ ```json
567
+ {
568
+ "loss": "MultipleNegativesRankingLoss",
569
+ "matryoshka_dims": [
570
+ 768,
571
+ 512,
572
+ 256,
573
+ 128,
574
+ 64
575
+ ],
576
+ "matryoshka_weights": [
577
+ 1,
578
+ 1,
579
+ 1,
580
+ 1,
581
+ 1
582
+ ],
583
+ "n_dims_per_step": -1
584
+ }
585
+ ```
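
To make these parameters concrete, the following NumPy sketch applies an in-batch ranking loss to each truncated prefix of the embeddings and sums the results with the listed weights. It is an illustration of the idea, not the library's code: the function names are hypothetical, the scale of 20 and the use of cosine similarity are assumptions about the base loss defaults, and `n_dims_per_step: -1` is read here as "use every dimension at every step".

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """In-batch softmax cross-entropy: positive i is the target for anchor i,
    and every other positive in the batch acts as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # scaled cosine scores
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64), weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the base loss over truncated prefixes of the embeddings."""
    return sum(w * mnrl_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))   # noisy copies as stand-ins
total = matryoshka_loss(anchors, positives)              # correctly paired batch
shuffled = matryoshka_loss(anchors, positives[::-1])     # deliberately mispaired batch
```

The correctly paired batch yields a much lower loss than the mispaired one at every prefix length, which is the pressure that keeps the leading dimensions useful on their own.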
586
+
587
+ ### Training Hyperparameters
588
+ #### Non-Default Hyperparameters
589
+
590
+ - `eval_strategy`: epoch
591
+ - `per_device_train_batch_size`: 32
592
+ - `per_device_eval_batch_size`: 16
593
+ - `gradient_accumulation_steps`: 16
594
+ - `learning_rate`: 2e-05
595
+ - `num_train_epochs`: 4
596
+ - `lr_scheduler_type`: cosine
597
+ - `warmup_ratio`: 0.1
598
+ - `fp16`: True
599
+ - `load_best_model_at_end`: True
600
+ - `optim`: adamw_torch_fused
601
+ - `batch_sampler`: no_duplicates
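
As a rough worked example of what these settings imply, the sketch below derives the effective batch size, an approximate optimizer step count, and a warmup-plus-cosine learning-rate curve. It assumes a single device (the card does not state the device count), and the step counts are estimates: the trainer's own rounding and the `no_duplicates` batch sampler may shift them slightly.

```python
import math

train_samples = 56_041
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_train_epochs = 4
warmup_ratio = 0.1
peak_lr = 2e-5
num_devices = 1  # assumption: not stated in the model card

# 32 samples per micro-batch, 16 micro-batches per optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(train_samples / effective_batch)   # approximate
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = round(warmup_ratio * total_steps)

def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch)  # 512
```

The large effective batch matters for `MultipleNegativesRankingLoss` only per micro-batch: in-batch negatives come from the 32 samples processed together, not from the 512 accumulated across steps.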
602
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
+ |:----------:|:-------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
+ | 0.0913 | 10 | 1.4273 | - | - | - | - | - |
+ | 0.1826 | 20 | 0.5399 | - | - | - | - | - |
+ | 0.2740 | 30 | 0.1252 | - | - | - | - | - |
+ | 0.3653 | 40 | 0.0625 | - | - | - | - | - |
+ | 0.4566 | 50 | 0.0507 | - | - | - | - | - |
+ | 0.5479 | 60 | 0.0366 | - | - | - | - | - |
+ | 0.6393 | 70 | 0.029 | - | - | - | - | - |
+ | 0.7306 | 80 | 0.0239 | - | - | - | - | - |
+ | 0.8219 | 90 | 0.0252 | - | - | - | - | - |
+ | 0.9132 | 100 | 0.0237 | - | - | - | - | - |
+ | 0.9954 | 109 | - | 0.9199 | 0.9195 | 0.9180 | 0.9150 | 0.9081 |
+ | 1.0046 | 110 | 0.026 | - | - | - | - | - |
+ | 1.0959 | 120 | 0.017 | - | - | - | - | - |
+ | 1.1872 | 130 | 0.02 | - | - | - | - | - |
+ | 1.2785 | 140 | 0.0125 | - | - | - | - | - |
+ | 1.3699 | 150 | 0.0134 | - | - | - | - | - |
+ | 1.4612 | 160 | 0.0128 | - | - | - | - | - |
+ | 1.5525 | 170 | 0.0123 | - | - | - | - | - |
+ | 1.6438 | 180 | 0.0097 | - | - | - | - | - |
+ | 1.7352 | 190 | 0.0101 | - | - | - | - | - |
+ | 1.8265 | 200 | 0.0124 | - | - | - | - | - |
+ | 1.9178 | 210 | 0.0116 | - | - | - | - | - |
+ | 2.0 | 219 | - | 0.9220 | 0.9216 | 0.9206 | 0.9184 | 0.9130 |
+ | 2.0091 | 220 | 0.012 | - | - | - | - | - |
+ | 2.1005 | 230 | 0.0111 | - | - | - | - | - |
+ | 2.1918 | 240 | 0.0101 | - | - | - | - | - |
+ | 2.2831 | 250 | 0.0101 | - | - | - | - | - |
+ | 2.3744 | 260 | 0.009 | - | - | - | - | - |
+ | 2.4658 | 270 | 0.0103 | - | - | - | - | - |
+ | 2.5571 | 280 | 0.009 | - | - | - | - | - |
+ | 2.6484 | 290 | 0.0083 | - | - | - | - | - |
+ | 2.7397 | 300 | 0.0076 | - | - | - | - | - |
+ | 2.8311 | 310 | 0.0093 | - | - | - | - | - |
+ | 2.9224 | 320 | 0.0104 | - | - | - | - | - |
+ | 2.9954 | 328 | - | 0.9234 | 0.9230 | 0.9221 | 0.9201 | 0.9156 |
+ | 3.0137 | 330 | 0.0104 | - | - | - | - | - |
+ | 3.1050 | 340 | 0.0089 | - | - | - | - | - |
+ | 3.1963 | 350 | 0.0084 | - | - | - | - | - |
+ | 3.2877 | 360 | 0.0082 | - | - | - | - | - |
+ | 3.3790 | 370 | 0.0089 | - | - | - | - | - |
+ | 3.4703 | 380 | 0.0083 | - | - | - | - | - |
+ | 3.5616 | 390 | 0.0061 | - | - | - | - | - |
+ | 3.6530 | 400 | 0.0065 | - | - | - | - | - |
+ | 3.7443 | 410 | 0.0063 | - | - | - | - | - |
+ | 3.8356 | 420 | 0.0084 | - | - | - | - | - |
+ | 3.9269 | 430 | 0.0083 | - | - | - | - | - |
+ | **3.9817** | **436** | **-** | **0.9235** | **0.9233** | **0.9224** | **0.9205** | **0.9159** |
+
+ * The bold row denotes the saved checkpoint.
+
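One way to read the saved-checkpoint row is to ask how much of the full-size NDCG@10 each truncated embedding retains. The short sketch below does that arithmetic using only the figures in the bold row; the variable names are illustrative.

```python
# NDCG@10 per embedding size, copied from the saved-checkpoint row (step 436).
final_ndcg = {768: 0.9235, 512: 0.9233, 256: 0.9224, 128: 0.9205, 64: 0.9159}

full = final_ndcg[768]

# Fraction of the 768-dimensional score kept at each truncated size.
retention = {dim: score / full for dim, score in final_ndcg.items()}

for dim, kept in retention.items():
    shrink = 768 // dim  # how many times smaller the vector is
    print(f"dim {dim:>3} ({shrink:>2}x smaller): {kept:.2%} of full-size NDCG@10")
```

On these numbers the 64-dimensional embedding, which is 12 times smaller than the full vector, still keeps more than 99% of the 768-dimensional NDCG@10 on this evaluation set.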
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 1.2.1
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "/content/drive/MyDrive/my_3gpp_model_study/sentence_transformer_w_qa_v2_full",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ec4faa50a6dabb8a103c310999b4cdf73deb1705f52bfc37c198e76efd5b03de
+ size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
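
The three modules listed in `modules.json` run in order: the Transformer produces one vector per token, the Pooling module reduces them to a single sentence vector, and Normalize rescales that vector to unit length. Since `1_Pooling/config.json` in this commit sets `pooling_mode_cls_token` to true, pooling here amounts to taking the first token's vector. The sketch below illustrates the last two stages in plain Python on made-up numbers; the function names and the toy token matrix are assumptions for illustration, not library code.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vec):
    """Rescale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


# Toy stand-in for the Transformer output: 3 tokens x 4 hidden dimensions.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

With unit-length outputs, the `cosine` similarity function named in `config_sentence_transformers.json` reduces to a plain dot product between embeddings.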
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff