bbmb committed
Commit a3df1fe · verified
1 Parent(s): 2d73958

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "word_embedding_dimension": 768,
3
+ "pooling_mode_cls_token": true,
4
+ "pooling_mode_mean_tokens": false,
5
+ "pooling_mode_max_tokens": false,
6
+ "pooling_mode_mean_sqrt_len_tokens": false,
7
+ "pooling_mode_weightedmean_tokens": false,
8
+ "pooling_mode_lasttoken": false,
9
+ "include_prompt": true
10
+ }
README.md ADDED
@@ -0,0 +1,1032 @@
1
+ ---
2
+ base_model: BAAI/bge-base-en-v1.5
3
+ language:
4
+ - en
5
+ library_name: sentence-transformers
6
+ license: apache-2.0
7
+ metrics:
8
+ - cosine_accuracy@1
9
+ - cosine_accuracy@3
10
+ - cosine_accuracy@5
11
+ - cosine_accuracy@10
12
+ - cosine_precision@1
13
+ - cosine_precision@3
14
+ - cosine_precision@5
15
+ - cosine_precision@10
16
+ - cosine_recall@1
17
+ - cosine_recall@3
18
+ - cosine_recall@5
19
+ - cosine_recall@10
20
+ - cosine_ndcg@10
21
+ - cosine_mrr@10
22
+ - cosine_map@100
23
+ pipeline_tag: sentence-similarity
24
+ tags:
25
+ - sentence-transformers
26
+ - sentence-similarity
27
+ - feature-extraction
28
+ - generated_from_trainer
29
+ - dataset_size:3078
30
+ - loss:MatryoshkaLoss
31
+ - loss:MultipleNegativesRankingLoss
32
+ widget:
33
+ - source_sentence: '[ q_{\text{ult}} = \frac{1}{2} \rho g B N_{\gamma} + c N_{c} +
34
+ (p_{q} + \rho g D_{f}) N_{q} \quad \text{[SI]} \quad (36.1a) ] [ q_{\text{ult}}
35
+ = \frac{1}{2} \gamma B N_{\gamma} + c N_{c} + (p_{q} + \gamma D_{f}) N_{q} \quad
36
+ \text{[U.S.]} \quad (36.1b) ]
37
+
38
+
39
+ Various researchers have made improvements on the theory supporting this equation,
40
+ leading to somewhat different terms and sophistication in evaluating (N_0), (N_c),
41
+ and (N_g). The approaches differ in the assumptions made of the shape of the failure
42
+ zone beneath the footing. However, the general form of the equation is the same
43
+ in most cases.
44
+
45
+
46
+ Figure 36.2 and Table 36.2 can be used to evaluate the capacity factors (N_0),
47
+ (N_c), and (N_g) in Equation 36.1. Alternatively, Table 36.3 can be used. The
48
+ bearing capacity factors in Table 36.2 are based on Terzaghi''s 1943 studies.
49
+ The values in Table 36.3 are based on Meyerhof''s 1955 studies and others, and
50
+ have been widely used. Other values are also in use.
51
+
52
+
53
+ Equation 36.1 is appropriate for a foundation in a continuous wall footing. Corrections,
54
+ called shape factors, for various footing geometries are presented in Table 36.4
55
+ and Table 36.5 using the parameters identified in Figure 36.3. The bearing capacity
56
+ factors (N_c) and (N_0) are multiplied by the appropriate shape factors when they
57
+ are used in Equation 36.1.
58
+
59
+
60
+ Several researchers have recommended corrections to (N_0) to account for footing
61
+ depth. (Corrections to (N_0) for footing depth have also been suggested. No corrections
62
+ to (N_c) for footing depth have been suggested.) There is considerable variation
63
+ in the method of calculating this correction if it is used at all. A multiplicative
64
+ correction factor, (d_c), which is used most often, has the form:
65
+
66
+
67
+ [ d_{c} = 1 + \frac{K D_{f}}{B} ]
68
+
69
+
70
+ (K) is a constant for which values of 0.2 and 0.4 have been proposed. The depth
71
+ factor correction is applied to (N_0) along with the shape factor correction in
72
+ Equation 36.1. Once the ultimate bearing capacity is determined, it is corrected
73
+ by the overburden, giving the net bearing capacity. This is the net pressure the
74
+ soil can support beyond the pressure applied by the existing overburden.
75
+
76
+
77
+ [ q_{\text{net}} = q_{\text{ult}} - \rho g D_{f} \quad \text{[SI]} \quad (36.3a)
78
+ ] [ q_{\text{net}} = q_{\text{ult}} - \gamma D_{f} \quad \text{[U.S.]} \quad (36.3b)
79
+ ]
80
+
81
+
88
+ Figure 36.2: Terzaghi Bearing Capacity Factors'
89
+ sentences:
90
+ - What does the net bearing capacity represent in foundation engineering?
91
+ - Can anyone explain the difference between ductility and percent elongation?
92
+ - How do you compute the inverse of a 3x3 matrix?
93
+ - source_sentence: 'Backwashing with filtered water pumped back through the filter
94
+ from the bottom to the top expands the sand layer by 30-50%, which dislodges trapped
95
+ material. Backwashing for 3-5 minutes at a rate of 8-15 gpm/ft² (5.4-10 L/s-m²)
96
+ is a typical specification. The head loss is reduced to approximately 1 ft (0.3
97
+ m) after washing. Experience has shown that supplementary agitation of the filter
98
+ media is necessary to prevent "caking" and "mudballs" in almost all installations.
99
+ Prior to backwashing, the filter material may be expanded by an air prewash volume
100
+ of 1-8 (2-5 typical) times the sand filter volume per minute for 2-10 minutes
101
+ (3-5 minutes typical). Alternatively, turbulence in the filter material may be
102
+ encouraged during backwashing with an air wash or with rotating hydraulic surface
103
+ jets.
104
+
105
+
106
+ During backwashing, the water in the filter housing will rise at a rate of 1-3
107
+ ft/min (0.5-1.5 cm/s). This rise should not exceed the settling velocity of the
108
+ smallest particle that is to be retained in the filter. The wash water, which
109
+ is collected in troughs for disposal, constitutes approximately 1-5% of the total
110
+ processed water. The total water used is approximately 75-100 gal/ft² (3-4 kL/m²).
111
+ The actual amount of backwash water is given by the equation:
112
+
113
+
114
+ $$ V = A_{\text{filter}} \cdot (\text{rate of rise}) \cdot t_{\text{backwash}}
115
+ $$
116
+
117
+
118
+ The temperature of the water used in backwashing is important since viscosity
119
+ changes with temperature (the effect of temperature on water density is negligible).
120
+ Water at 40°F (4°C) is more viscous than water at 70°F (21°C). Therefore, media
121
+ particles may be expanded to the same extent using lower upflow rates at lower
122
+ backwash temperatures.
123
+
124
+
125
+ 26. Other Filtration Methods
126
+
127
+
128
+ Pressure (sand) filters for water supply treatment operate similarly to rapid
129
+ sand filters except that incoming water is typically pressurized to 25-75 psig
130
+ (170-520 kPa gage). Single media filter rates are typically 4-5 gpm/ft² (1.4-14
131
+ L/s-m²), with 2-10 gpm/ft² (2.7-3.4 L/s-m² typical), while dual media filters
132
+ run at 1.5 to 2.0 times these rates. Pressure filters are not used in large installations.
133
+
134
+
135
+ Ultrafilters are membranes that act as sieves to retain turbidity, microorganisms,
136
+ and large organic molecules that are THM precursors, while allowing water, salts,
137
+ and small molecules to pass through. Ultrafiltration is effective in removing
138
+ particles ranging in size of 0.001 to 10 µm. A pressure of 15-75 psig (100-500
139
+ kPa) is required to drive the water through the membrane.
140
+
141
+
142
+ Biofilm filtration (biofilm process) uses microorganisms to remove selected contaminants
143
+ (e.g., aromatics and other hydrocarbons). The operation of biofilters is similar
144
+ to trickling filters used in wastewater processing. Sand filter facilities are
145
+ relatively easy to modify—sand is replaced with gravel in the 4-14 mm size range,
146
+ application rates are decreased, and exposure to chlorine from incoming and backwash
147
+ water is eliminated. While the maximum may never be used, a maximum backwash rate
148
+ of 20 gpm/ft² (14 L/s-m²) should be provided for. A µm is the same as a micron.'
149
+ sentences:
150
+ - How do I calculate the total water used for backwashing?
151
+ - How do I calculate flow rate if the water depth is 5 ft and channel width is 8
152
+ ft?
153
+ - What is the formula for estimating the percent time spent following on highways?
154
+ - source_sentence: 'Here is the LaTeX representation of the angles and the radius
155
+ of the circle:
156
+
157
+
158
+ \begin{align} \alpha &= \angle PQR \ \beta &= \angle QNR \ \gamma &= \angle RPN
159
+ \ \end{align}
160
+
161
+
162
+ \begin{align} a &= \text{radius of the circle} \ I &= \text{line segment} \ \end{align}
163
+
164
+
165
+ The figure also includes a dashed line representing a chord and a tangent line
166
+ from point P to the circle, with a point of tangency labeled ''T''. The tangent
167
+ line is perpendicular to the radius of the circle at point T.
168
+
169
+
170
+ Figure 79.5 Tangent and Chord Offset Geometry
171
+
172
+
173
+ The short chord distance is
174
+
175
+
176
+ [ \mathrm{NQ} = C = 2R \sin \alpha ] [ \mathrm{NP} = (2R \sin \alpha) \cos \alpha
177
+ = C \cos \alpha ] [ \mathrm{PQ} = (2R \sin \alpha) \sin \alpha = 2R \sin^2 \alpha
178
+ ]
179
+
180
+
181
+ \tag{79.23} \tag{79.24} \tag{79.25}
182
+
183
+
184
+ 7. Curve Layout By Chord Offset
185
+
186
+
187
+ The chord offset method is a third method for laying out horizontal curves. This
188
+ method is also suitable for short curves. The method is named for the way in which
189
+ the measurements are made, which is by measuring distances along the main chord
190
+ from the instrument location at PC.
191
+
192
+
193
+ [ \mathrm{NR} = \mathrm{chord~distance} = \mathrm{NQ}\cos\left({\frac{I}{2}} -
194
+ \alpha\right) ]
195
+
196
+
197
+ [ \mathrm{NR} = (2R\sin\alpha)\cos\left(\frac{I}{2} - \alpha\right) = C\cos\left(\frac{I}{2}
198
+ - \alpha\right) ]
199
+
200
+
201
+ [ \mathrm{RQ} = \mathrm{chord~offset} = \mathrm{NQ}~\sin\left({\frac{I}{2}} -
202
+ \alpha\right) ]
203
+
204
+
205
+ [ = (2R\sin\alpha)\sin\left({\frac{I}{2}} - \alpha\right) = C\sin\left({\frac{I}{2}}
206
+ - \alpha\right) ]
207
+
208
+
209
+ [ 79.27 ]
210
+
211
+
212
+ 8. Horizontal Curves Through Points
213
+
214
+
215
+ Occasionally, it is necessary to design a horizontal curve to pass through a specific
216
+ point. The following procedure can be used. (Refer to Fig. 79.6.)
217
+
218
+
219
+ Step 1: Calculate ( \alpha ) and ( m ) from ( x ) and ( y ). (If ( x ) and ( m
220
+ ) are known, skip this step.) [ \alpha = \arctan\left(\frac{y}{x}\right) ] [ m
221
+ = \sqrt{x^{2} + y^{2}} ]
222
+
223
+
224
+ Step 2: Calculate ( \gamma ). Since ( \gamma + 90^\circ + \frac{I}{2} + \alpha = 180^\circ ),
225
+ [ \gamma = 90^\circ - \frac{I}{2} - \alpha ]
226
+
227
+
228
+ Step 3: Calculate ( \phi ). [ \phi = 180^\circ - \arcsin\left(\frac{\sin\gamma}{\cos\left(\frac{I}{2}\right)}\right)
229
+ ] [ = 180^\circ - \gamma - \phi ]
230
+
231
+
232
+ Step 4: Calculate ( O ). (Refer to Eq. 79.32)'
233
+ sentences:
234
+ - What's the difference between horizontal and vertical parabolas in their equations?
235
+ - What does the distance of 50 ft represent in the wave illustration?
236
+ - What is the relationship between tangent lines and radius in circular geometry?
237
+ - source_sentence: 'Description: The image provided is not clear enough to discern
238
+ any specific details, text, or formulas. It appears to be a blurred image with
239
+ no distinguishable content. Therefore, I cannot extract any formulas or provide
240
+ a description of the image content.
241
+
242
+
243
+ Unfortunately, it is extremely difficult to prove compensatory fraud (i.e., fraud
244
+ for which damages are available). Proving fraud requires showing beyond a reasonable
245
+ doubt (a) a reckless or intentional misstatement of a material fact, (b) an intention
246
+ to deceive, (c) it resulted in misleading the innocent party to contract, and
247
+ (d) it was to the innocent party''s detriment. For example, if an engineer claims
248
+ to have experience in designing steel buildings but actually has none, the court
249
+ might consider the misrepresentation a fraudulent action. If, however, the engineer
250
+ has some experience, but an insufficient amount to do an adequate job, the engineer
251
+ probably will not be considered to have acted fraudulently.
252
+
253
+
254
+ Torts
255
+
256
+
257
+ A tort is a civil wrong committed by one person causing damage to another person
258
+ or person''s property, emotional well-being, or reputation. It is a breach of
259
+ the rights of an individual to be secure in person or property. In order to correct
260
+ the wrong, a civil lawsuit (tort action or civil complaint) is brought by the
261
+ alleged injured party (the plaintiff) against the defendant. To be a valid tort
262
+ action (i.e., lawsuit), there must have been injury (i.e., damage). Generally,
263
+ there will be no contract between the two parties, so the tort action cannot claim
264
+ a breach of contract. Tort law is concerned with compensation for the injury,
265
+ not punishment. Therefore, tort awards usually consist
266
+
267
+
268
+ of general, compensatory, and special damages and rarely include punitive and exemplary
269
+ damages. (See "Damages" for definitions of these damages.)
270
+
271
+
272
+ Strict Liability In Tort
273
+
274
+
275
+ Strict liability in tort means that the injured party wins if the injury can be
276
+ proven. It is not necessary to prove negligence, breach of explicit or implicit
277
+ warranty, or the existence of a contract (privity of contract). Strict liability
278
+ in tort is most commonly encountered in product liability cases. A defect in a
279
+ product, regardless of how the defect got there, is sufficient to create strict
280
+ liability in tort.
281
+
282
+
283
+ Case law surrounding defective products has developed and refined the following
284
+ requirements for winning a strict liability in tort case. The following points
285
+ must be proved.
286
+
287
+
288
+ The difference between a civil tort (lawsuit) and a criminal lawsuit is the alleged
289
+ injured party. A crime is a wrong against society. A criminal lawsuit is brought
290
+ by the state against a defendant.
291
+
292
+
293
+ It is possible for an injury to be both a breach of contract and a tort.
294
+
295
+
296
+ Suppose an owner has an agreement with a contractor to construct a building, and
297
+ the contract requires the contractor to comply with all state and federal safety
298
+ regulations. If the owner is subsequently injured on a stairway because there was
299
+ no guardrail, the injury could be recoverable both as a tort and as a breach of
300
+ contract. If a third party unrelated to the contract was injured, however, that
301
+ party could recover only through a tort action. · The product was defective in manufacture,
302
+ design, labeling, and so on.
303
+
304
+
305
+ The product was defective when used.
306
+
307
+
308
+ The defect rendered the product unreasonably dangerous.
309
+
310
+
311
+ The defect caused the injury. .
312
+
313
+
314
+ The specific use of the product that caused the damage was reasonably foreseeable.
315
+
316
+
317
+ Manufacturing And Design Liability'
318
+ sentences:
319
+ - What factors influence the instantaneous center of rotation in welded structures?
320
+ - How do you establish if fraud has occurred in a contract?
321
+ - How do you calculate the probability of multiple events happening?
322
+ - source_sentence: '9. Area
323
+
324
+
325
+ Equation 9.35 calculates the area, ( A ), bounded by ( x = a ), ( x = b ), ( f_1(x)
326
+ ) above, and ( f_2(x) ) below. (Note: ( f_2(x) = 0 ) if the area is bounded by
327
+ the x-axis.) This is illustrated in Fig. 9.1. [ A = \int_{a}^{b} \left( f_{1}(x)
328
+ - f_{2}(x) \right) \, dx \qquad \qquad (9.35) ] Figure 9.1 Area Between Two Curves
329
+
330
+
331
+ Description: The image shows a graph with two curves labeled f1(x) and f2(x).
332
+ The graph is plotted on a Cartesian coordinate system with an x-axis and a y-axis.
333
+ There are two vertical dashed lines intersecting the x-axis at points labeled
334
+ ''a'' and ''b''. The curve f1(x) is above the line y = 0 and the curve f2(x) is
335
+ below the line y = 0. The area between the two curves from x = a to x = b is shaded,
336
+ indicating a region of interest or calculation.
337
+
338
+
339
+ The LaTeX representation of the curves is not provided in the image, so I cannot
340
+ write them in LaTeX form. However, if the curves were described by functions,
341
+ they could be represented as follows:
342
+
343
+
344
+ f1(x) could be represented as ( f_1(x) = ax^2 + bx + c ) for some constants a,
345
+ b, and c.
346
+
347
+
348
+ f2(x) could be represented as ( f_2(x) = -ax^2 - bx - c ) for some constants a,
349
+ b, and c.
350
+
351
+
352
+ The area between the curves from x = a to x = b could be calculated using the
353
+ integral of the difference between the two functions over the interval [a, b].
354
+
355
+
356
+ Description: The image provided is not clear enough to describe in detail or to
357
+ extract any formulas. The text is not legible, and no other discernible features
358
+ can be identified.
359
+
360
+
361
+ Find the area between the x-axis and the parabola ( y = x^2 ) in the interval
362
+ ([0, 4]).
363
+
364
+
365
+ Description: The image shows a graph with a curve that represents a function y
366
+ = x^2. There is a vertical dashed line at x = 4, indicating a point of interest
367
+ or a specific value on the x-axis. The graph is plotted on a Cartesian coordinate
368
+ system with the x-axis labeled ''x'' and the y-axis labeled ''y''. The curve is
369
+ a parabola that opens upwards, showing that as x increases, y increases at an
370
+ increasing rate. The point where x = 4 is marked on the x-axis, and the corresponding
371
+ y-value on the curve is not explicitly shown but can be inferred from the equation
372
+ y = x^2.
373
+
374
+
375
+ Solution: Referring to Eq. 9.35, [ f_{1}(x) = x^{2} \quad \text{and} \quad f_{2}(x)
376
+ = 0 ] Thus, [ A = \int_{a}^{b} \left( f_1(x) - f_2(x) \right) dx = \int_{0}^{4}
377
+ x^2 \, dx = \left[ \frac{x^3}{3} \right]_{0}^{4} = \frac{64}{3} ] ...
378
+
379
+
380
+ 10. Arc Length'
381
+ sentences:
382
+ - Can you show me how to find the area using the integral of the difference of two
383
+ functions?
384
+ - Can you explain how to calculate the force BC using trigonometric components?
385
+ - What is the minimum requirement for steel area in slab reinforcement according
386
+ to ACI guidelines?
387
+ model-index:
388
+ - name: deep learning project
389
+ results:
390
+ - task:
391
+ type: information-retrieval
392
+ name: Information Retrieval
393
+ dataset:
394
+ name: dim 768
395
+ type: dim_768
396
+ metrics:
397
+ - type: cosine_accuracy@1
398
+ value: 0.2543859649122807
399
+ name: Cosine Accuracy@1
400
+ - type: cosine_accuracy@3
401
+ value: 0.5789473684210527
402
+ name: Cosine Accuracy@3
403
+ - type: cosine_accuracy@5
404
+ value: 0.7017543859649122
405
+ name: Cosine Accuracy@5
406
+ - type: cosine_accuracy@10
407
+ value: 0.7982456140350878
408
+ name: Cosine Accuracy@10
409
+ - type: cosine_precision@1
410
+ value: 0.2543859649122807
411
+ name: Cosine Precision@1
412
+ - type: cosine_precision@3
413
+ value: 0.19298245614035087
414
+ name: Cosine Precision@3
415
+ - type: cosine_precision@5
416
+ value: 0.14035087719298245
417
+ name: Cosine Precision@5
418
+ - type: cosine_precision@10
419
+ value: 0.07982456140350876
420
+ name: Cosine Precision@10
421
+ - type: cosine_recall@1
422
+ value: 0.2543859649122807
423
+ name: Cosine Recall@1
424
+ - type: cosine_recall@3
425
+ value: 0.5789473684210527
426
+ name: Cosine Recall@3
427
+ - type: cosine_recall@5
428
+ value: 0.7017543859649122
429
+ name: Cosine Recall@5
430
+ - type: cosine_recall@10
431
+ value: 0.7982456140350878
432
+ name: Cosine Recall@10
433
+ - type: cosine_ndcg@10
434
+ value: 0.5289463979794752
435
+ name: Cosine Ndcg@10
436
+ - type: cosine_mrr@10
437
+ value: 0.4422630650700826
438
+ name: Cosine Mrr@10
439
+ - type: cosine_map@100
440
+ value: 0.45071327302764325
441
+ name: Cosine Map@100
442
+ - task:
443
+ type: information-retrieval
444
+ name: Information Retrieval
445
+ dataset:
446
+ name: dim 512
447
+ type: dim_512
448
+ metrics:
449
+ - type: cosine_accuracy@1
450
+ value: 0.2631578947368421
451
+ name: Cosine Accuracy@1
452
+ - type: cosine_accuracy@3
453
+ value: 0.5760233918128655
454
+ name: Cosine Accuracy@3
455
+ - type: cosine_accuracy@5
456
+ value: 0.695906432748538
457
+ name: Cosine Accuracy@5
458
+ - type: cosine_accuracy@10
459
+ value: 0.783625730994152
460
+ name: Cosine Accuracy@10
461
+ - type: cosine_precision@1
462
+ value: 0.2631578947368421
463
+ name: Cosine Precision@1
464
+ - type: cosine_precision@3
465
+ value: 0.19200779727095513
466
+ name: Cosine Precision@3
467
+ - type: cosine_precision@5
468
+ value: 0.13918128654970757
469
+ name: Cosine Precision@5
470
+ - type: cosine_precision@10
471
+ value: 0.0783625730994152
472
+ name: Cosine Precision@10
473
+ - type: cosine_recall@1
474
+ value: 0.2631578947368421
475
+ name: Cosine Recall@1
476
+ - type: cosine_recall@3
477
+ value: 0.5760233918128655
478
+ name: Cosine Recall@3
479
+ - type: cosine_recall@5
480
+ value: 0.695906432748538
481
+ name: Cosine Recall@5
482
+ - type: cosine_recall@10
483
+ value: 0.783625730994152
484
+ name: Cosine Recall@10
485
+ - type: cosine_ndcg@10
486
+ value: 0.525405284677311
487
+ name: Cosine Ndcg@10
488
+ - type: cosine_mrr@10
489
+ value: 0.4422096908939014
490
+ name: Cosine Mrr@10
491
+ - type: cosine_map@100
492
+ value: 0.45077185641932777
493
+ name: Cosine Map@100
494
+ - task:
495
+ type: information-retrieval
496
+ name: Information Retrieval
497
+ dataset:
498
+ name: dim 256
499
+ type: dim_256
500
+ metrics:
501
+ - type: cosine_accuracy@1
502
+ value: 0.260233918128655
503
+ name: Cosine Accuracy@1
504
+ - type: cosine_accuracy@3
505
+ value: 0.5526315789473685
506
+ name: Cosine Accuracy@3
507
+ - type: cosine_accuracy@5
508
+ value: 0.6754385964912281
509
+ name: Cosine Accuracy@5
510
+ - type: cosine_accuracy@10
511
+ value: 0.7573099415204678
512
+ name: Cosine Accuracy@10
513
+ - type: cosine_precision@1
514
+ value: 0.260233918128655
515
+ name: Cosine Precision@1
516
+ - type: cosine_precision@3
517
+ value: 0.18421052631578946
518
+ name: Cosine Precision@3
519
+ - type: cosine_precision@5
520
+ value: 0.1350877192982456
521
+ name: Cosine Precision@5
522
+ - type: cosine_precision@10
523
+ value: 0.07573099415204677
524
+ name: Cosine Precision@10
525
+ - type: cosine_recall@1
526
+ value: 0.260233918128655
527
+ name: Cosine Recall@1
528
+ - type: cosine_recall@3
529
+ value: 0.5526315789473685
530
+ name: Cosine Recall@3
531
+ - type: cosine_recall@5
532
+ value: 0.6754385964912281
533
+ name: Cosine Recall@5
534
+ - type: cosine_recall@10
535
+ value: 0.7573099415204678
536
+ name: Cosine Recall@10
537
+ - type: cosine_ndcg@10
538
+ value: 0.5082788808907895
539
+ name: Cosine Ndcg@10
540
+ - type: cosine_mrr@10
541
+ value: 0.4281189083820665
542
+ name: Cosine Mrr@10
543
+ - type: cosine_map@100
544
+ value: 0.4372871346521922
545
+ name: Cosine Map@100
546
+ - task:
547
+ type: information-retrieval
548
+ name: Information Retrieval
549
+ dataset:
550
+ name: dim 128
551
+ type: dim_128
552
+ metrics:
553
+ - type: cosine_accuracy@1
554
+ value: 0.2134502923976608
555
+ name: Cosine Accuracy@1
556
+ - type: cosine_accuracy@3
557
+ value: 0.5116959064327485
558
+ name: Cosine Accuracy@3
559
+ - type: cosine_accuracy@5
560
+ value: 0.6403508771929824
561
+ name: Cosine Accuracy@5
562
+ - type: cosine_accuracy@10
563
+ value: 0.7368421052631579
564
+ name: Cosine Accuracy@10
565
+ - type: cosine_precision@1
566
+ value: 0.2134502923976608
567
+ name: Cosine Precision@1
568
+ - type: cosine_precision@3
569
+ value: 0.1705653021442495
570
+ name: Cosine Precision@3
571
+ - type: cosine_precision@5
572
+ value: 0.12807017543859647
573
+ name: Cosine Precision@5
574
+ - type: cosine_precision@10
575
+ value: 0.07368421052631578
576
+ name: Cosine Precision@10
577
+ - type: cosine_recall@1
578
+ value: 0.2134502923976608
579
+ name: Cosine Recall@1
580
+ - type: cosine_recall@3
581
+ value: 0.5116959064327485
582
+ name: Cosine Recall@3
583
+ - type: cosine_recall@5
584
+ value: 0.6403508771929824
585
+ name: Cosine Recall@5
586
+ - type: cosine_recall@10
587
+ value: 0.7368421052631579
588
+ name: Cosine Recall@10
589
+ - type: cosine_ndcg@10
590
+ value: 0.4726924534205871
591
+ name: Cosine Ndcg@10
592
+ - type: cosine_mrr@10
593
+ value: 0.3880070546737214
594
+ name: Cosine Mrr@10
595
+ - type: cosine_map@100
596
+ value: 0.39701781193586744
597
+ name: Cosine Map@100
598
+ - task:
599
+ type: information-retrieval
600
+ name: Information Retrieval
601
+ dataset:
602
+ name: dim 64
603
+ type: dim_64
604
+ metrics:
605
+ - type: cosine_accuracy@1
606
+ value: 0.1871345029239766
607
+ name: Cosine Accuracy@1
608
+ - type: cosine_accuracy@3
609
+ value: 0.47076023391812866
610
+ name: Cosine Accuracy@3
611
+ - type: cosine_accuracy@5
612
+ value: 0.5789473684210527
613
+ name: Cosine Accuracy@5
614
+ - type: cosine_accuracy@10
615
+ value: 0.6695906432748538
616
+ name: Cosine Accuracy@10
617
+ - type: cosine_precision@1
618
+ value: 0.1871345029239766
619
+ name: Cosine Precision@1
620
+ - type: cosine_precision@3
621
+ value: 0.15692007797270952
622
+ name: Cosine Precision@3
623
+ - type: cosine_precision@5
624
+ value: 0.11578947368421051
625
+ name: Cosine Precision@5
626
+ - type: cosine_precision@10
627
+ value: 0.06695906432748537
628
+ name: Cosine Precision@10
629
+ - type: cosine_recall@1
630
+ value: 0.1871345029239766
631
+ name: Cosine Recall@1
632
+ - type: cosine_recall@3
633
+ value: 0.47076023391812866
634
+ name: Cosine Recall@3
635
+ - type: cosine_recall@5
636
+ value: 0.5789473684210527
637
+ name: Cosine Recall@5
638
+ - type: cosine_recall@10
639
+ value: 0.6695906432748538
640
+ name: Cosine Recall@10
641
+ - type: cosine_ndcg@10
642
+ value: 0.42447214920635656
643
+ name: Cosine Ndcg@10
644
+ - type: cosine_mrr@10
645
+ value: 0.3461802654785111
646
+ name: Cosine Mrr@10
647
+ - type: cosine_map@100
648
+ value: 0.3562882551304709
649
+ name: Cosine Map@100
650
+ ---
651
+
652
+ # deep learning project
653
+
654
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
655
+
656
+ ## Model Details
657
+
658
+ ### Model Description
659
+ - **Model Type:** Sentence Transformer
660
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
661
+ - **Maximum Sequence Length:** 512 tokens
662
+ - **Output Dimensionality:** 768 dimensions
663
+ - **Similarity Function:** Cosine Similarity
664
+ - **Training Dataset:**
665
+ - json
666
+ - **Language:** en
667
+ - **License:** apache-2.0
668
+
669
+ ### Model Sources
670
+
671
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
672
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
673
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
674
+
675
+ ### Full Model Architecture
676
+
677
+ ```
678
+ SentenceTransformer(
679
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
680
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
681
+ (2): Normalize()
682
+ )
683
+ ```
684
+
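+ Note that the Pooling module uses the `[CLS]` token embedding and the final Normalize module L2-normalizes every output vector, so dot products and cosine similarities of the embeddings coincide. A quick sanity check (a sketch, not part of the original card):
+
+ ```python
+ import numpy as np
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer("bbmb/deep-learning-for-embedding-model-ssilwal-qpham6")
+ emb = model.encode(["net bearing capacity"])[0]
+ print(np.linalg.norm(emb))  # ~1.0, because of the Normalize module
+ ```
+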
685
+ ## Usage
686
+
687
+ ### Direct Usage (Sentence Transformers)
688
+
689
+ First install the Sentence Transformers library:
690
+
691
+ ```bash
692
+ pip install -U sentence-transformers
693
+ ```
694
+
695
+ Then you can load this model and run inference.
696
+ ```python
697
+ from sentence_transformers import SentenceTransformer
698
+
699
+ # Download from the 🤗 Hub
700
+ model = SentenceTransformer("bbmb/deep-learning-for-embedding-model-ssilwal-qpham6")
701
+ # Run inference
702
+ sentences = [
703
+ "9. Area\n\nEquation 9.35 calculates the area, ( A ), bounded by ( x = a ), ( x = b ), ( f_1(x) ) above, and ( f_2(x) ) below. (Note: ( f_2(x) = 0 ) if the area is bounded by the x-axis.) This is illustrated in Fig. 9.1. [ A = \\int_{a}^{b} \\left( f_{1}(x) - f_{2}(x) \\right) \\, dx \\qquad \\qquad (9.35) ] Figure 9.1 Area Between Two Curves\n\nDescription: The image shows a graph with two curves labeled f1(x) and f2(x). The graph is plotted on a Cartesian coordinate system with an x-axis and a y-axis. There are two vertical dashed lines intersecting the x-axis at points labeled 'a' and 'b'. The curve f1(x) is above the line y = 0 and the curve f2(x) is below the line y = 0. The area between the two curves from x = a to x = b is shaded, indicating a region of interest or calculation.\n\nThe LaTeX representation of the curves is not provided in the image, so I cannot write them in LaTeX form. However, if the curves were described by functions, they could be represented as follows:\n\nf1(x) could be represented as ( f_1(x) = ax^2 + bx + c ) for some constants a, b, and c.\n\nf2(x) could be represented as ( f_2(x) = -ax^2 - bx - c ) for some constants a, b, and c.\n\nThe area between the curves from x = a to x = b could be calculated using the integral of the difference between the two functions over the interval [a, b].\n\nDescription: The image provided is not clear enough to describe in detail or to extract any formulas. The text is not legible, and no other discernible features can be identified.\n\nFind the area between the x-axis and the parabola ( y = x^2 ) in the interval ([0, 4]).\n\nDescription: The image shows a graph with a curve that represents a function y = x^2. There is a vertical dashed line at x = 4, indicating a point of interest or a specific value on the x-axis. The graph is plotted on a Cartesian coordinate system with the x-axis labeled 'x' and the y-axis labeled 'y'. The curve is a parabola that opens upwards, showing that as x increases, y increases at an increasing rate. The point where x = 4 is marked on the x-axis, and the corresponding y-value on the curve is not explicitly shown but can be inferred from the equation y = x^2.\n\nSolution: Referring to Eq. 9.35, [ f_{1}(x) = x^{2} \\quad \\text{and} \\quad f_{2}(x) = 0 ] Thus, [ A = \\int_{a}^{b} \\left( f_1(x) - f_2(x) \\right) dx = \\int_{0}^{4} x^2 \\, dx = \\left[ \\frac{x^3}{3} \\right]_{0}^{4} = \\frac{64}{3} ] ...\n\n10. Arc Length",
704
+ 'Can you show me how to find the area using the integral of the difference of two functions?',
705
+ 'What is the minimum requirement for steel area in slab reinforcement according to ACI guidelines?',
706
+ ]
707
+ embeddings = model.encode(sentences)
708
+ print(embeddings.shape)
709
+ # [3, 768]
710
+
711
+ # Get the similarity scores for the embeddings
712
+ similarities = model.similarity(embeddings, embeddings)
713
+ print(similarities.shape)
714
+ # [3, 3]
715
+ ```
716
+
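+ Because the model was trained with MatryoshkaLoss, its embeddings can also be truncated to one of the smaller evaluated dimensions (512, 256, 128, or 64) with only a modest drop in retrieval quality (see the metrics below). A minimal sketch, assuming the `truncate_dim` argument available in recent Sentence Transformers releases:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Load the same checkpoint, but keep only the first 256 dimensions of every embedding.
+ model = SentenceTransformer(
+     "bbmb/deep-learning-for-embedding-model-ssilwal-qpham6",
+     truncate_dim=256,
+ )
+
+ embeddings = model.encode(["What does the net bearing capacity represent in foundation engineering?"])
+ print(embeddings.shape)
+ # (1, 256)
+ ```
+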
717
+ <!--
718
+ ### Direct Usage (Transformers)
719
+
720
+ <details><summary>Click to see the direct usage in Transformers</summary>
721
+
722
+ </details>
723
+ -->
724
+
725
+ <!--
726
+ ### Downstream Usage (Sentence Transformers)
727
+
728
+ You can finetune this model on your own dataset.
729
+
730
+ <details><summary>Click to expand</summary>
731
+
732
+ </details>
733
+ -->
734
+
735
+ <!--
736
+ ### Out-of-Scope Use
737
+
738
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
739
+ -->
740
+
741
+ ## Evaluation
742
+
743
+ ### Metrics
744
+
745
+ #### Information Retrieval
746
+
747
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
748
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
749
+
750
+ | Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
751
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
752
+ | cosine_accuracy@1 | 0.2544 | 0.2632 | 0.2602 | 0.2135 | 0.1871 |
753
+ | cosine_accuracy@3 | 0.5789 | 0.576 | 0.5526 | 0.5117 | 0.4708 |
754
+ | cosine_accuracy@5 | 0.7018 | 0.6959 | 0.6754 | 0.6404 | 0.5789 |
755
+ | cosine_accuracy@10 | 0.7982 | 0.7836 | 0.7573 | 0.7368 | 0.6696 |
756
+ | cosine_precision@1 | 0.2544 | 0.2632 | 0.2602 | 0.2135 | 0.1871 |
757
+ | cosine_precision@3 | 0.193 | 0.192 | 0.1842 | 0.1706 | 0.1569 |
758
+ | cosine_precision@5 | 0.1404 | 0.1392 | 0.1351 | 0.1281 | 0.1158 |
759
+ | cosine_precision@10 | 0.0798 | 0.0784 | 0.0757 | 0.0737 | 0.067 |
760
+ | cosine_recall@1 | 0.2544 | 0.2632 | 0.2602 | 0.2135 | 0.1871 |
761
+ | cosine_recall@3 | 0.5789 | 0.576 | 0.5526 | 0.5117 | 0.4708 |
762
+ | cosine_recall@5 | 0.7018 | 0.6959 | 0.6754 | 0.6404 | 0.5789 |
763
+ | cosine_recall@10 | 0.7982 | 0.7836 | 0.7573 | 0.7368 | 0.6696 |
764
+ | **cosine_ndcg@10** | **0.5289** | **0.5254** | **0.5083** | **0.4727** | **0.4245** |
765
+ | cosine_mrr@10 | 0.4423 | 0.4422 | 0.4281 | 0.388 | 0.3462 |
766
+ | cosine_map@100 | 0.4507 | 0.4508 | 0.4373 | 0.397 | 0.3563 |
767
+
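+ The numbers above were computed with the `InformationRetrievalEvaluator`. As a rough sketch of how such an evaluation can be run against this model (the query, corpus, and relevance dictionaries below are placeholders, not the actual evaluation split):
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.evaluation import InformationRetrievalEvaluator
+
+ model = SentenceTransformer("bbmb/deep-learning-for-embedding-model-ssilwal-qpham6")
+
+ # Placeholder data: query ids and corpus ids mapped to text, plus the relevant corpus ids per query.
+ queries = {"q1": "What does the net bearing capacity represent in foundation engineering?"}
+ corpus = {"d1": "Once the ultimate bearing capacity is determined, it is corrected by the overburden ..."}
+ relevant_docs = {"q1": {"d1"}}
+
+ evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_768")
+ results = evaluator(model)
+ print(results)  # accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100
+ ```
+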
768
+ <!--
769
+ ## Bias, Risks and Limitations
770
+
771
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
772
+ -->
773
+
774
+ <!--
775
+ ### Recommendations
776
+
777
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
778
+ -->
779
+
780
+ ## Training Details
781
+
782
+ ### Training Dataset
783
+
784
+ #### json
785
+
786
+ * Dataset: json
787
+ * Size: 3,078 training samples
788
+ * Columns: <code>positive</code> and <code>anchor</code>
789
+ * Approximate statistics based on the first 1000 samples:
790
+ | | positive | anchor |
791
+ |:--------|:-------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
792
+ | type | string | string |
793
+ | details | <ul><li>min: 117 tokens</li><li>mean: 508.1 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 15.93 tokens</li><li>max: 28 tokens</li></ul> |
794
+ * Samples:
795
+ | positive | anchor |
796
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------|
797
+ | <code>The PHF is used to convert hourly volumes to flow rates and represents the hourly variation in traffic flow. If the demand volume is measured in 15 min increments, it is unnecessary to use the PHF to convert to flow rates.<br><br>Therefore, since two-lane highway analysis is based on demand flow rates for a peak 15 min period within the analysis hour (usually the peak hour), the PHF in Equation 73.22 and Equation 73.23 is given a value of 1.00.<br><br>The average travel speed in the analysis direction, ( ATS_d ), is estimated from the FFS, the demand flow rate, the opposing flow rate, and the adjustment factor for the percentage of no-passing zones in the analysis direction, ( f_{np} ), as given in HCM Exh. 15-15. Equation 73.24 only applies to Class I and Class III two-lane highways.<br><br>[ \mathrm{ATS}{d} = \mathrm{FFS} - 0.0076(v{d,s} + v_{o,s}) - f_{\mathrm{np},s} \quad (73.24) ]<br><br>If the PTSF methodology is used, the formula for the demand flow rate, ( v_{i, \text{ATS}} ), is the same, although di...</code> | <code>What is the formula for estimating the percent time spent following on highways?</code> |
798
+ | <code>However, if the initial point on the limb is close to the critical point (i.e., the nose of the curve), then a small change in the specific energy (such as might be caused by a small variation in the channel floor) will cause a large change in depth. That is why severe turbulence commonly occurs near points of critical flow. Given that ( 4 \, \text{ft/sec} ) (or ( 1.2 \, \text{m/s} )) of water flows in a ( 7 \, \text{ft} ) (or ( 2.1 \, \text{m} )) wide, ( 6 \, \text{ft} ) (or ( 1.8 \, \text{m} )) deep open channel, the flow encounters a ( 1.0 \, \text{ft} ) (or ( 0.3 \, \text{m} )) step in the channel bottom. What is the depth of flow above the step? Actually, specific energy curves are typically plotted for flow per unit width, ( q = \frac{Q}{w} ). If that is the case, a jump from one limb to the other could take place if the width were allowed to change as well as the depth. A rise in the channel bottom does not always produce a drop in the water surface. Only if the flow is initiall...</code> | <code>What happens to the water depth when it encounters a step in a channel?</code> |
799
+ | <code>The shear strength, ( S ) or ( S_{ys} ), of a material is the maximum shear stress that the material can support without yielding in shear. (The ultimate shear strength, ( S_{us} ), is rarely encountered.) For ductile materials, maximum shear stress theory predicts the shear strength as one-half of the tensile yield strength. A more accurate relationship is derived from the distortion energy theory (also known as von Mises theory).<br><br>Figure 43.16: Uniform Bar in Torsion<br><br>Description: The image shows a diagram of a mechanical system with a cylindrical object, a rod, and a spring. There are two forces acting on the system: one is the weight of the rod, labeled 'L', acting downwards, and the other is the spring force, labeled 'T', acting upwards. The rod is shown to be in equilibrium, with the spring force balancing the weight of the rod. The distance from the pivot point to the center of mass of the rod is labeled 'r'. There is also a variable 'y' indicating the vertical displacement of t...</code> | <code>Can you explain what maximum shear stress theory is?</code> |
800
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
801
+ ```json
802
+ {
803
+ "loss": "MultipleNegativesRankingLoss",
804
+ "matryoshka_dims": [
805
+ 768,
806
+ 512,
807
+ 256,
808
+ 128,
809
+ 64
810
+ ],
811
+ "matryoshka_weights": [
812
+ 1,
813
+ 1,
814
+ 1,
815
+ 1,
816
+ 1
817
+ ],
818
+ "n_dims_per_step": -1
819
+ }
820
+ ```
821
+
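+ As a rough sketch of how this loss configuration maps onto the Sentence Transformers API (an illustration based on the parameters above, not the original training script):
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
+
+ model = SentenceTransformer("BAAI/bge-base-en-v1.5")
+
+ # Apply the ranking loss at every Matryoshka dimension, each weighted equally.
+ inner_loss = MultipleNegativesRankingLoss(model)
+ train_loss = MatryoshkaLoss(
+     model,
+     loss=inner_loss,
+     matryoshka_dims=[768, 512, 256, 128, 64],
+     matryoshka_weights=[1, 1, 1, 1, 1],
+ )
+ ```
+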
822
+ ### Training Hyperparameters
823
+ #### Non-Default Hyperparameters
824
+
825
+ - `eval_strategy`: epoch
826
+ - `per_device_train_batch_size`: 32
827
+ - `per_device_eval_batch_size`: 16
828
+ - `gradient_accumulation_steps`: 16
829
+ - `learning_rate`: 2e-05
830
+ - `num_train_epochs`: 4
831
+ - `lr_scheduler_type`: cosine
832
+ - `warmup_ratio`: 0.1
833
+ - `bf16`: True
834
+ - `tf32`: True
835
+ - `load_best_model_at_end`: True
836
+ - `optim`: adamw_torch_fused
837
+ - `batch_sampler`: no_duplicates
838
+
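+ For reference, a minimal sketch of how the non-default hyperparameters above translate into `SentenceTransformerTrainingArguments` (the output directory is a placeholder, and `save_strategy` is an assumption required by `load_best_model_at_end`):
+
+ ```python
+ from sentence_transformers.training_args import BatchSamplers, SentenceTransformerTrainingArguments
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="bge-base-matryoshka",  # placeholder path
+     num_train_epochs=4,
+     per_device_train_batch_size=32,
+     per_device_eval_batch_size=16,
+     gradient_accumulation_steps=16,
+     learning_rate=2e-5,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.1,
+     bf16=True,
+     tf32=True,
+     eval_strategy="epoch",
+     save_strategy="epoch",  # assumed; must match eval_strategy when load_best_model_at_end=True
+     load_best_model_at_end=True,
+     optim="adamw_torch_fused",
+     batch_sampler=BatchSamplers.NO_DUPLICATES,
+ )
+ ```
+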
839
+ #### All Hyperparameters
840
+ <details><summary>Click to expand</summary>
841
+
842
+ - `overwrite_output_dir`: False
843
+ - `do_predict`: False
844
+ - `eval_strategy`: epoch
845
+ - `prediction_loss_only`: True
846
+ - `per_device_train_batch_size`: 32
847
+ - `per_device_eval_batch_size`: 16
848
+ - `per_gpu_train_batch_size`: None
849
+ - `per_gpu_eval_batch_size`: None
850
+ - `gradient_accumulation_steps`: 16
851
+ - `eval_accumulation_steps`: None
852
+ - `learning_rate`: 2e-05
853
+ - `weight_decay`: 0.0
854
+ - `adam_beta1`: 0.9
855
+ - `adam_beta2`: 0.999
856
+ - `adam_epsilon`: 1e-08
857
+ - `max_grad_norm`: 1.0
858
+ - `num_train_epochs`: 4
859
+ - `max_steps`: -1
860
+ - `lr_scheduler_type`: cosine
861
+ - `lr_scheduler_kwargs`: {}
862
+ - `warmup_ratio`: 0.1
863
+ - `warmup_steps`: 0
864
+ - `log_level`: passive
865
+ - `log_level_replica`: warning
866
+ - `log_on_each_node`: True
867
+ - `logging_nan_inf_filter`: True
868
+ - `save_safetensors`: True
869
+ - `save_on_each_node`: False
870
+ - `save_only_model`: False
871
+ - `restore_callback_states_from_checkpoint`: False
872
+ - `no_cuda`: False
873
+ - `use_cpu`: False
874
+ - `use_mps_device`: False
875
+ - `seed`: 42
876
+ - `data_seed`: None
877
+ - `jit_mode_eval`: False
878
+ - `use_ipex`: False
879
+ - `bf16`: True
880
+ - `fp16`: False
881
+ - `fp16_opt_level`: O1
882
+ - `half_precision_backend`: auto
883
+ - `bf16_full_eval`: False
884
+ - `fp16_full_eval`: False
885
+ - `tf32`: True
886
+ - `local_rank`: 0
887
+ - `ddp_backend`: None
888
+ - `tpu_num_cores`: None
889
+ - `tpu_metrics_debug`: False
890
+ - `debug`: []
891
+ - `dataloader_drop_last`: False
892
+ - `dataloader_num_workers`: 0
893
+ - `dataloader_prefetch_factor`: None
894
+ - `past_index`: -1
895
+ - `disable_tqdm`: False
896
+ - `remove_unused_columns`: True
897
+ - `label_names`: None
898
+ - `load_best_model_at_end`: True
899
+ - `ignore_data_skip`: False
900
+ - `fsdp`: []
901
+ - `fsdp_min_num_params`: 0
902
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
903
+ - `fsdp_transformer_layer_cls_to_wrap`: None
904
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
905
+ - `deepspeed`: None
906
+ - `label_smoothing_factor`: 0.0
907
+ - `optim`: adamw_torch_fused
908
+ - `optim_args`: None
909
+ - `adafactor`: False
910
+ - `group_by_length`: False
911
+ - `length_column_name`: length
912
+ - `ddp_find_unused_parameters`: None
913
+ - `ddp_bucket_cap_mb`: None
914
+ - `ddp_broadcast_buffers`: False
915
+ - `dataloader_pin_memory`: True
916
+ - `dataloader_persistent_workers`: False
917
+ - `skip_memory_metrics`: True
918
+ - `use_legacy_prediction_loop`: False
919
+ - `push_to_hub`: False
920
+ - `resume_from_checkpoint`: None
921
+ - `hub_model_id`: None
922
+ - `hub_strategy`: every_save
923
+ - `hub_private_repo`: False
924
+ - `hub_always_push`: False
925
+ - `gradient_checkpointing`: False
926
+ - `gradient_checkpointing_kwargs`: None
927
+ - `include_inputs_for_metrics`: False
928
+ - `eval_do_concat_batches`: True
929
+ - `fp16_backend`: auto
930
+ - `push_to_hub_model_id`: None
931
+ - `push_to_hub_organization`: None
932
+ - `mp_parameters`:
933
+ - `auto_find_batch_size`: False
934
+ - `full_determinism`: False
935
+ - `torchdynamo`: None
936
+ - `ray_scope`: last
937
+ - `ddp_timeout`: 1800
938
+ - `torch_compile`: False
939
+ - `torch_compile_backend`: None
940
+ - `torch_compile_mode`: None
941
+ - `dispatch_batches`: None
942
+ - `split_batches`: None
943
+ - `include_tokens_per_second`: False
944
+ - `include_num_input_tokens_seen`: False
945
+ - `neftune_noise_alpha`: None
946
+ - `optim_target_modules`: None
947
+ - `batch_eval_metrics`: False
948
+ - `prompts`: None
949
+ - `batch_sampler`: no_duplicates
950
+ - `multi_dataset_batch_sampler`: proportional
951
+
952
+ </details>
953
+
954
+ ### Training Logs
955
+ | Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
956
+ |:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
957
+ | 0.9897 | 6 | - | 0.5417 | 0.5428 | 0.5145 | 0.4630 | 0.3945 |
958
+ | 1.6495 | 10 | 3.7867 | - | - | - | - | - |
959
+ | 1.9794 | 12 | - | 0.5269 | 0.5206 | 0.4992 | 0.4751 | 0.4082 |
960
+ | **2.9691** | **18** | **-** | **0.5298** | **0.5238** | **0.5107** | **0.4761** | **0.4268** |
961
+ | 3.2990 | 20 | 1.9199 | - | - | - | - | - |
962
+ | 3.9588 | 24 | - | 0.5289 | 0.5254 | 0.5083 | 0.4727 | 0.4245 |
963
+
964
+ * The bold row denotes the saved checkpoint.
965
+
966
+ ### Framework Versions
967
+ - Python: 3.10.12
968
+ - Sentence Transformers: 3.3.1
969
+ - Transformers: 4.41.2
970
+ - PyTorch: 2.1.2+cu121
971
+ - Accelerate: 0.34.2
972
+ - Datasets: 2.19.1
973
+ - Tokenizers: 0.19.1
974
+
975
+ ## Citation
976
+
977
+ ### BibTeX
978
+
979
+ #### Sentence Transformers
980
+ ```bibtex
981
+ @inproceedings{reimers-2019-sentence-bert,
982
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
983
+ author = "Reimers, Nils and Gurevych, Iryna",
984
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
985
+ month = "11",
986
+ year = "2019",
987
+ publisher = "Association for Computational Linguistics",
988
+ url = "https://arxiv.org/abs/1908.10084",
989
+ }
990
+ ```
991
+
992
+ #### MatryoshkaLoss
993
+ ```bibtex
994
+ @misc{kusupati2024matryoshka,
995
+ title={Matryoshka Representation Learning},
996
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
997
+ year={2024},
998
+ eprint={2205.13147},
999
+ archivePrefix={arXiv},
1000
+ primaryClass={cs.LG}
1001
+ }
1002
+ ```
1003
+
1004
+ #### MultipleNegativesRankingLoss
1005
+ ```bibtex
1006
+ @misc{henderson2017efficient,
1007
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
1008
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
1009
+ year={2017},
1010
+ eprint={1705.00652},
1011
+ archivePrefix={arXiv},
1012
+ primaryClass={cs.CL}
1013
+ }
1014
+ ```
1015
+
1016
+ <!--
1017
+ ## Glossary
1018
+
1019
+ *Clearly define terms in order to be accessible across audiences.*
1020
+ -->
1021
+
1022
+ <!--
1023
+ ## Model Card Authors
1024
+
1025
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
1026
+ -->
1027
+
1028
+ <!--
1029
+ ## Model Card Contact
1030
+
1031
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
1032
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
1
+ {
2
+ "_name_or_path": "BAAI/bge-base-en-v1.5",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "id2label": {
13
+ "0": "LABEL_0"
14
+ },
15
+ "initializer_range": 0.02,
16
+ "intermediate_size": 3072,
17
+ "label2id": {
18
+ "LABEL_0": 0
19
+ },
20
+ "layer_norm_eps": 1e-12,
21
+ "max_position_embeddings": 512,
22
+ "model_type": "bert",
23
+ "num_attention_heads": 12,
24
+ "num_hidden_layers": 12,
25
+ "pad_token_id": 0,
26
+ "position_embedding_type": "absolute",
27
+ "torch_dtype": "float32",
28
+ "transformers_version": "4.41.2",
29
+ "type_vocab_size": 2,
30
+ "use_cache": true,
31
+ "vocab_size": 30522
32
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.3.1",
4
+ "transformers": "4.41.2",
5
+ "pytorch": "2.1.2+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e7de8f53df682a86cc9d75b689b3ab609fe702597f2738029baee6dea93791ca
3
+ size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": true
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "never_split": null,
51
+ "pad_token": "[PAD]",
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "unk_token": "[UNK]"
57
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff