metadata
base_model: BAAI/bge-base-en-v1.5
language:
  - en
library_name: sentence-transformers
license: apache-2.0
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:3078
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      [ q_{\text{ut}} = \frac{1}{2} \rho g B N_{\gamma} + c N_{c} + (p_{q} +
      \rho g D_{f}) N_{q} \quad \text{[SI]} \quad (36.1a) ] [ q_{\text{ut}} =
      \frac{1}{2} \gamma B N_{\gamma} + c N_{c} + (p_{q} + \gamma D_{f}) N_{q}
      \quad \text{[U.S.]} \quad (36.1b) ]


      Various researchers have made improvements on the theory supporting this
      equation, leading to somewhat different terms and sophistication in
      evaluating (N_0), (N_c), and (N_g). The approaches differ in the
      assumptions made of the shape of the failure zone beneath the footing.
      However, the general form of the equation is the same in most cases.


      Figure 36.2 and Table 36.2 can be used to evaluate the capacity factors
      (N_0), (N_c), and (N_g) in Equation 36.1. Alternatively, Table 36.3 can be
      used. The bearing capacity factors in Table 36.2 are based on Terzaghi's
      1943 studies. The values in Table 36.3 are based on Meyerhof's 1955
      studies and others, and have been widely used. Other values are also in
      use.


      Equation 36.1 is appropriate for a foundation in a continuous wall
      footing. Corrections, called shape factors, for various footing geometries
      are presented in Table 36.4 and Table 36.5 using the parameters identified
      in Figure 36.3. The bearing capacity factors (N_c) and (N_0) are
      multiplied by the appropriate shape factors when they are used in Equation
      36.1.


      Several researchers have recommended corrections to (N_0) to account for
      footing depth. (Corrections to (N_0) for footing depth have also been
      suggested. No corrections to (N_c) for footing depth have been suggested.)
      There is considerable variation in the method of calculating this
      correction if it is used at all. A multiplicative correction factor,
      (d_c), which is used most often, has the form:


      [ d_{c} = 1 + \frac{K D_{f}}{B} ]


      (K) is a constant for which values of 0.2 and 0.4 have been proposed. The
      depth factor correction is applied to (N_0) along with the shape factor
      correction in Equation 36.1. Once the ultimate bearing capacity is
      determined, it is corrected by the overburden, giving the net bearing
      capacity. This is the net pressure the soil can support beyond the
      pressure applied by the existing overburden.


      [ q_{\text{net}} = q_{\text{ult}} - \rho g D_{f} \quad \text{[SI]} \quad
      (36.3a) ] [ q_{\text{net}} = q_{\text{ut}} - \gamma D_{f} \quad
      \text{[U.S.]} \quad (36.3b) ]


      [ \begin{array}{r l}{{\mathrm{[U.S.]}}}&{{}36.3(b)}\end{array} ]


      [ q_{\text{net}} = q_{\text{ult}} - \gamma D_{f} ]


      Figure 36.2: Terzaghi Bearing Capacity Factors
    sentences:
      - What does the net bearing capacity represent in foundation engineering?
      - >-
        Can anyone explain the difference between ductility and percent
        elongation?
      - How do you compute the inverse of a 3x3 matrix?
  - source_sentence: >-
      Backwashing with filtered water pumped back through the filter from the
      bottom to the top expands the sand layer by 30-50%, which dislodges
      trapped material. Backwashing for 3-5 minutes at a rate of 8-15 gpm/ft²
      (5.4-10 L/s-m²) is a typical specification. The head loss is reduced to
      approximately 1 ft (0.3 m) after washing. Experience has shown that
      supplementary agitation of the filter media is necessary to prevent
      "caking" and "mudballs" in almost all installations. Prior to backwashing,
      the filter material may be expanded by an air prewash volume of 1-8 (2-5
      typical) times the sand filter volume per minute for 2-10 minutes (3-5
      minutes typical). Alternatively, turbulence in the filter material may be
      encouraged during backwashing with an air wash or with rotating hydraulic
      surface jets.


      During backwashing, the water in the filter housing will rise at a rate of
      1-3 ft/min (0.5-1.5 cm/s). This rise should not exceed the settling
      velocity of the smallest particle that is to be retained in the filter.
      The wash water, which is collected in troughs for disposal, constitutes
      approximately 1-5% of the total processed water. The total water used is
      approximately 75-100 gal/ft² (3-4 kL/m²). The actual amount of backwash
      water is given by the equation:


      $$ V = A_{\text{filter}} \cdot (\text{rate of rise}) \cdot
      t_{\text{backwash}} $$


      The temperature of the water used in backwashing is important since
      viscosity changes with temperature (the effect of temperature on water
      density is negligible). Water at 40°F (4°C) is more viscous than water at
      70°F (21°C). Therefore, media particles may be expanded to the same extent
      using lower upflow rates at lower backwash temperatures. ```markdown


      26. Other Filtration Methods


      Pressure (sand) filters for water supply treatment operate similarly to
      rapid sand filters except that incoming water is typically pressurized to
      25-75 psig (170-520 kPa gage). Single media filter rates are typically 4-5
      gpm/ft² (1.4-14 L/s-m²), with 2-10 gpm/ft² (2.7-3.4 L/s-m² typical), while
      dual media filters run at 1.5 to 2.0 times these rates. Pressure filters
      are not used in large installations.


      Ultrafilters are membranes that act as sieves to retain turbidity,
      microorganisms, and large organic molecules that are THM precursors, while
      allowing water, salts, and small molecules to pass through.
      Ultrafiltration is effective in removing particles ranging in size of
      0.001 to 10 µm. A pressure of 15-75 psig (100-500 kPa) is required to
      drive the water through the membrane.


      Biofilm filtration (biofilm process) uses microorganisms to remove
      selected contaminants (e.g., aromatics and other hydrocarbons). The
      operation of biofilters is similar to trickling filters used in wastewater
      processing. Sand filter facilities are relatively easy to modify—sand is
      replaced with gravel in the 4-14 mm size range, application rates are
      decreased, and exposure to chlorine from incoming and backwash water is
      eliminated. While the maximum may never be used, a maximum backwash rate
      of 20 gpm/ft² (14 L/s-m²) should be provided for. A µm is the same as a
      micron.
    sentences:
      - How do I calculate the total water used for backwashing?
      - >-
        How do I calculate flow rate if the water depth is 5 ft and channel
        width is 8 ft?
      - >-
        What is the formula for estimating the percent time spent following on
        highways?
  - source_sentence: >-
      Here is the LaTeX representation of the angles and the radius of the
      circle:


      \begin{align} \alpha &= \angle PQR \ \beta &= \angle QNR \ \gamma &=
      \angle RPN \ \end{align}


      \begin{align} a &= \text{radius of the circle} \ I &= \text{line segment}
      \ \end{align}


      The figure also includes a dashed line representing a chord and a tangent
      line from point P to the circle, with a point of tangency labeled 'T'. The
      tangent line is perpendicular to the radius of the circle at point T.


      Figure 79.5 Tangent and Chord Offset Geometry


      The short chord distance is


      [ \mathrm{NQ} = C = 2R \sin \alpha ] [ \mathrm{NP} = (2R \sin \alpha) \cos
      \alpha = C \cos \alpha ] [ \mathrm{PQ} = (2R \sin \alpha) \sin \alpha = 2R
      \sin^2 \alpha ]


      \tag{79.23} \tag{79.24} \tag{79.25} ```


      7. Curve Layout By Chord Offset


      The chord offset method is a third method for laying out horizontal
      curves. This method is also suitable for short curves. The method is named
      for the way in which the measurements are made, which is by measuring
      distances along the main chord from the instrument location at PC.


      [ \mathrm{NR} = \mathrm{chord~distance} =
      \mathrm{NQ}\cos\left({\frac{I}{2}} - \alpha\right) ]


      [ \sqrt{2} = (2R\sin\alpha)\cos\left(\frac{I}{2} - \alpha\right) =
      C\cos\left(\frac{I}{2} - \alpha\right) = (I - \alpha)^{2} ]


      [ \mathrm{RQ} = \mathrm{chord~offset} =
      \mathrm{NQ}~\sin\left({\frac{I}{2}} - \alpha\right) ]


      [ = (2R\sin\alpha)\sin\left({\frac{I}{2}} - \alpha\right) =
      C\sin\left({\frac{I}{2}} - \alpha\right) ]


      [ 79.27 ]


      8. Horizontal Curves Through Points


      Occasionally, it is necessary to design a horizontal curve to pass through
      a specific point. The following procedure can be used. (Refer to Fig.
      79.6.)


      Step 1: Calculate ( \alpha ) and ( m ) from ( x ) and ( y ). (If ( x ) and
      ( m ) are known, skip this step.) [ \alpha =
      \arctan\left(\frac{y}{x}\right) ] [ m = \sqrt{x^{2} + y^{2}} ]


      Step 2: Calculate ( y ). Since ( 90^\circ + \frac{I}{2} + \alpha =
      180^\circ ), [ \gamma = 90^\circ - \frac{I}{2} - \alpha ]


      Step 3: Calculate ( \phi ). [ \phi = 180^\circ -
      \arcsin\left(\frac{\sin\gamma}{\cos\left(\frac{I}{2}\right)}\right) ] [ =
      180^\circ - \gamma - \phi ]


      Step 4: Calculate ( O ). (Refer to Eq. 79.32)
    sentences:
      - >-
        What's the difference between horizontal and vertical parabolas in their
        equations?
      - What does the distance of 50 ft represent in the wave illustration?
      - >-
        What is the relationship between tangent lines and radius in circular
        geometry?
  - source_sentence: >-
      Description: The image provided is not clear enough to discern any
      specific details, text, or formulas. It appears to be a blurred image with
      no distinguishable content. Therefore, I cannot extract any formulas or
      provide a description of the image content.


      Unfortunately, it is extremely difficult to prove compensatory fraud
      (i.e., fraud for which damages are available). Proving fraud requires
      showing beyond a reasonable doubt (a) a reckless or intentional
      misstatement of a material fact, (b) an intention to deceive, (c) it
      resulted in misleading the innocent party to contract, and (d) it was to
      the innocent party's detriment. For example, if an engineer claims to have
      experience in designing steel buildings but actually has none, the court
      might consider the misrepresentation a fraudulent action. If, however, the
      engineer has some experience, but an insufficient amount to do an adequate
      job, the engineer probably will not be considered to have acted
      fraudulently.


      Torts


      A tort is a civil wrong committed by one person causing lamage to another
      person or person's property, emoional well-being, or reputation.11 It is a
      breach of the ights of an individual to be secure in person or propxty. In
      order to correct the wrong, a civil lawsuit (tort iction or civil
      complaint) is brought by the alleged njured party (the plaintiff) against
      the defendant. To be a valid tort action (i.e., lawsuit), there must have
      been injury (i.e., damage). Generally, there will be no contract between
      the two parties, so the tort action annot claim a breach of contract. 12
      Cort law is concerned with compensation for the injury, not punishment.
      Therefore, tort awards usually consist


      f general, compensatory, and special damages and arely include punitive
      and exemplary damages. (See Damages" for definitions of these damages.)


      Strict Liability In Tort


      itrict liability in tort means that the injured party wins f the injury
      can be proven. It is not necessary to prove egligence, breach of explicit
      or implicit warranty, or he existence of a contract (privity of contract).
      Strict ability in tort is most commonly encountered in prodct liability
      cases. A defect in a product, regardless of ow the defect got there, is
      sufficient to create strict ability in tort.


      lase law surrounding defective products has developed nd refined the
      following requirements for winning a trict liability in tort case. The
      following points must e proved.


      The difference between a civil tort (lausuit) and a criminal lausuit is ie
      alleged injured party. A crime is a wrong against society. A iminal
      lawsuit is brought by the state against a defendant.


      It is possible for an injury to be both a breach of contract and a tort.


      ippose an owner has an agreement with a contractor to construct a ilding,
      and the contract requires the contractor to comply with all ate and
      federal safety regulations. If the owner is subsequently jured on a
      stairway because there was no guardrail, the injury could · recoverable
      both as a tort and as a breach of contract. If a third irty unrelated to
      the contract was injured, however, that party could cover only through a
      tort action. · The product was defective in manufacture, design, labeling,
      and so on.


      The product was defective when used.


      The defect rendered the product unreasonably dangerous.


      The defect caused the injury. .


      The specific use of the product that caused the damage was reasonably
      foreseeable.


      Manufacturing And Design Liability
    sentences:
      - >-
        What factors influence the instantaneous center of rotation in welded
        structures?
      - How do you establish if fraud has occurred in a contract?
      - How do you calculate the probability of multiple events happening?
  - source_sentence: >-
      9. Area


      Equation 9.35 calculates the area, ( A ), bounded by ( x = a ), ( x = b ),
      ( f_1(x) ) above, and ( f_2(x) ) below. (Note: ( f_2(x) = 0 ) if the area
      is bounded by the x-axis.) This is illustrated in Fig. 9.1. [ A =
      \int_{a}^{b} \left( f_{1}(x) - f_{2}(x) \right) \, dx \qquad \qquad (9.35)
      ] Figure 9.1 Area Between Two Curves


      Description: The image shows a graph with two curves labeled f1(x) and
      f2(x). The graph is plotted on a Cartesian coordinate system with an
      x-axis and a y-axis. There are two vertical dashed lines intersecting the
      x-axis at points labeled 'a' and 'b'. The curve f1(x) is above the line y
      = 0 and the curve f2(x) is below the line y = 0. The area between the two
      curves from x = a to x = b is shaded, indicating a region of interest or
      calculation.


      The LaTeX representation of the curves is not provided in the image, so I
      cannot write them in LaTeX form. However, if the curves were described by
      functions, they could be represented as follows:


      f1(x) could be represented as ( f_1(x) = ax^2 + bx + c ) for some
      constants a, b, and c.


      f2(x) could be represented as ( f_2(x) = -ax^2 - bx - c ) for some
      constants a, b, and c.


      The area between the curves from x = a to x = b could be calculated using
      the integral of the difference between the two functions over the interval
      [a, b].


      Description: The image provided is not clear enough to describe in detail
      or to extract any formulas. The text is not legible, and no other
      discernible features can be identified.


      Find the area between the x-axis and the parabola ( y = x^2 ) in the
      interval ([0, 4]).


      Description: The image shows a graph with a curve that represents a
      function y = x^2. There is a vertical dashed line at x = 4, indicating a
      point of interest or a specific value on the x-axis. The graph is plotted
      on a Cartesian coordinate system with the x-axis labeled 'x' and the
      y-axis labeled 'y'. The curve is a parabola that opens upwards, showing
      that as x increases, y increases at an increasing rate. The point where x
      = 4 is marked on the x-axis, and the corresponding y-value on the curve is
      not explicitly shown but can be inferred from the equation y = x^2.


      Solution: Referring to Eq. 9.35, [ f_{1}(x) = x^{2} \quad \text{and} \quad
      f_{2}(x) = 0 ] Thus, [ A = \int_{a}^{b} \left( f_1(x) - f_2(x) \right) dx
      = \int_{0}^{4} x^2 \, dx = \left[ \frac{x^3}{3} \right]_{0}^{4} =
      \frac{64}{3} ] ...


      10. Arc Length
    sentences:
      - >-
        Can you show me how to find the area using the integral of the
        difference of two functions?
      - >-
        Can you explain how to calculate the force BC using trigonometric
        components?
      - >-
        What is the minimum requirement for steel area in slab reinforcement
        according to ACI guidelines?
model-index:
  - name: deep learning project
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.2543859649122807
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.5789473684210527
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.7017543859649122
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.7982456140350878
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.2543859649122807
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.19298245614035087
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.14035087719298245
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.07982456140350876
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.2543859649122807
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.5789473684210527
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.7017543859649122
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.7982456140350878
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.5289463979794752
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.4422630650700826
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.45071327302764325
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.2631578947368421
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.5760233918128655
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.695906432748538
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.783625730994152
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.2631578947368421
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.19200779727095513
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.13918128654970757
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.0783625730994152
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.2631578947368421
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.5760233918128655
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.695906432748538
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.783625730994152
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.525405284677311
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.4422096908939014
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.45077185641932777
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.260233918128655
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.5526315789473685
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.6754385964912281
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.7573099415204678
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.260233918128655
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.18421052631578946
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.1350877192982456
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.07573099415204677
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.260233918128655
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.5526315789473685
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.6754385964912281
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.7573099415204678
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.5082788808907895
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.4281189083820665
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.4372871346521922
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.2134502923976608
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.5116959064327485
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.6403508771929824
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.7368421052631579
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.2134502923976608
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.1705653021442495
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.12807017543859647
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.07368421052631578
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.2134502923976608
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.5116959064327485
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.6403508771929824
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.7368421052631579
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.4726924534205871
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.3880070546737214
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.39701781193586744
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.1871345029239766
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.47076023391812866
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.5789473684210527
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.6695906432748538
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.1871345029239766
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.15692007797270952
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.11578947368421051
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.06695906432748537
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.1871345029239766
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.47076023391812866
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.5789473684210527
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.6695906432748538
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.42447214920635656
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.3461802654785111
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.3562882551304709
            name: Cosine Map@100

deep learning project

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
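
As a rough illustration of what the Pooling and Normalize stages listed above do, the following sketch takes the first token's vector (the [CLS] position, since `pooling_mode_cls_token` is True) and scales it to unit length. The token vectors are made-up 4-dimensional numbers and the helper names `cls_pool` and `l2_normalize` are hypothetical; the real model works on 768-dimensional BERT outputs.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy sequence of three tokens with 4-dimensional vectors (illustrative values only).
tokens = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] position
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

embedding = l2_normalize(cls_pool(tokens))
print(embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity, which matches the similarity function listed in the model description.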

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bbmb/deep-learning-for-embedding-model-ssilwal-qpham6")
# Run inference
sentences = [
    "9. Area\n\nEquation 9.35 calculates the area, ( A ), bounded by ( x = a ), ( x = b ), ( f_1(x) ) above, and ( f_2(x) ) below. (Note: ( f_2(x) = 0 ) if the area is bounded by the x-axis.) This is illustrated in Fig. 9.1. [ A = \\int_{a}^{b} \\left( f_{1}(x) - f_{2}(x) \\right) \\, dx \\qquad \\qquad (9.35) ] Figure 9.1 Area Between Two Curves\n\nDescription: The image shows a graph with two curves labeled f1(x) and f2(x). The graph is plotted on a Cartesian coordinate system with an x-axis and a y-axis. There are two vertical dashed lines intersecting the x-axis at points labeled 'a' and 'b'. The curve f1(x) is above the line y = 0 and the curve f2(x) is below the line y = 0. The area between the two curves from x = a to x = b is shaded, indicating a region of interest or calculation.\n\nThe LaTeX representation of the curves is not provided in the image, so I cannot write them in LaTeX form. However, if the curves were described by functions, they could be represented as follows:\n\nf1(x) could be represented as ( f_1(x) = ax^2 + bx + c ) for some constants a, b, and c.\n\nf2(x) could be represented as ( f_2(x) = -ax^2 - bx - c ) for some constants a, b, and c.\n\nThe area between the curves from x = a to x = b could be calculated using the integral of the difference between the two functions over the interval [a, b].\n\nDescription: The image provided is not clear enough to describe in detail or to extract any formulas. The text is not legible, and no other discernible features can be identified.\n\nFind the area between the x-axis and the parabola ( y = x^2 ) in the interval ([0, 4]).\n\nDescription: The image shows a graph with a curve that represents a function y = x^2. There is a vertical dashed line at x = 4, indicating a point of interest or a specific value on the x-axis. The graph is plotted on a Cartesian coordinate system with the x-axis labeled 'x' and the y-axis labeled 'y'. The curve is a parabola that opens upwards, showing that as x increases, y increases at an increasing rate. The point where x = 4 is marked on the x-axis, and the corresponding y-value on the curve is not explicitly shown but can be inferred from the equation y = x^2.\n\nSolution: Referring to Eq. 9.35, [ f_{1}(x) = x^{2} \\quad \\text{and} \\quad f_{2}(x) = 0 ] Thus, [ A = \\int_{a}^{b} \\left( f_1(x) - f_2(x) \\right) dx = \\int_{0}^{4} x^2 \\, dx = \\left[ \\frac{x^3}{3} \\right]_{0}^{4} = \\frac{64}{3} ] ...\n\n10. Arc Length",
    'Can you show me how to find the area using the integral of the difference of two functions?',
    'What is the minimum requirement for steel area in slab reinforcement according to ACI guidelines?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
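
The metadata lists MatryoshkaLoss and reports results at 768, 512, 256, 128, and 64 dimensions, which suggests the embeddings are meant to remain usable when only their leading dimensions are kept. The sketch below shows that idea on stand-in vectors, using plain Python so it runs without downloading the model. The helper names `truncate_and_normalize` and `cosine` and the 4-dimensional vectors are assumptions for illustration, not part of the library.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stand-in embeddings (illustrative values, not model output).
query = [0.5, 0.5, 0.5, 0.5]
passage = [0.5, 0.5, -0.5, 0.5]

full_score = cosine(query, passage)

# Compare again using only the first two dimensions of each vector.
short_query = truncate_and_normalize(query, 2)
short_passage = truncate_and_normalize(passage, 2)
short_score = cosine(short_query, short_passage)

print(full_score, short_score)
```

With real embeddings, the same slicing would be applied to the rows returned by `model.encode`. Recent sentence-transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`; check the version you have installed before relying on it.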

Evaluation

Metrics

Information Retrieval

| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
|:--------------------|--------:|--------:|--------:|--------:|-------:|
| cosine_accuracy@1 | 0.2544 | 0.2632 | 0.2602 | 0.2135 | 0.1871 |
| cosine_accuracy@3 | 0.5789 | 0.576 | 0.5526 | 0.5117 | 0.4708 |
| cosine_accuracy@5 | 0.7018 | 0.6959 | 0.6754 | 0.6404 | 0.5789 |
| cosine_accuracy@10 | 0.7982 | 0.7836 | 0.7573 | 0.7368 | 0.6696 |
| cosine_precision@1 | 0.2544 | 0.2632 | 0.2602 | 0.2135 | 0.1871 |
| cosine_precision@3 | 0.193 | 0.192 | 0.1842 | 0.1706 | 0.1569 |
| cosine_precision@5 | 0.1404 | 0.1392 | 0.1351 | 0.1281 | 0.1158 |
| cosine_precision@10 | 0.0798 | 0.0784 | 0.0757 | 0.0737 | 0.067 |
| cosine_recall@1 | 0.2544 | 0.2632 | 0.2602 | 0.2135 | 0.1871 |
| cosine_recall@3 | 0.5789 | 0.576 | 0.5526 | 0.5117 | 0.4708 |
| cosine_recall@5 | 0.7018 | 0.6959 | 0.6754 | 0.6404 | 0.5789 |
| cosine_recall@10 | 0.7982 | 0.7836 | 0.7573 | 0.7368 | 0.6696 |
| cosine_ndcg@10 | 0.5289 | 0.5254 | 0.5083 | 0.4727 | 0.4245 |
| cosine_mrr@10 | 0.4423 | 0.4422 | 0.4281 | 0.388 | 0.3462 |
| cosine_map@100 | 0.4507 | 0.4508 | 0.4373 | 0.397 | 0.3563 |
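
In the figures above, accuracy@k and recall@k coincide and precision@k equals accuracy@k divided by k (for example, 0.5789 / 3 ≈ 0.193 at 768 dimensions). That pattern is what one would expect if every query has exactly one relevant passage; this is an inference from the numbers, not something the card states. Under that assumption, the metrics reduce to simple functions of the rank at which the relevant passage is retrieved. The function name `single_relevant_metrics` and the ranks below are hypothetical.

```python
import math

def single_relevant_metrics(ranks, k):
    """Compute retrieval metrics at cutoff k.

    ranks: 1-based rank of the single relevant passage for each query,
           or None when it was not retrieved at all.
    Assumes exactly one relevant passage per query.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one of the k results is relevant
        "recall": accuracy,         # the one relevant passage is found or not
        "mrr": sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n,
        # With one relevant passage the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, hit in zip(ranks, hits) if hit) / n,
    }

# Four illustrative queries: relevant passage at ranks 1, 3, 2, and not retrieved.
ranks = [1, 3, 2, None]
metrics = single_relevant_metrics(ranks, k=3)
print(metrics)
```

This also explains why MRR@10 and NDCG@10 sit between accuracy@1 and accuracy@10 in the table: both give partial credit according to how high the relevant passage is ranked.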

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 3,078 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:
    |         | positive | anchor |
    |:--------|:---------|:-------|
    | type    | string   | string |
    | details | min: 117 tokens; mean: 508.1 tokens; max: 512 tokens | min: 8 tokens; mean: 15.93 tokens; max: 28 tokens |
  • Samples:
    positive anchor
    The PHF is used to convert hourly volumes to flow rates and represents the hourly variation in traffic flow. If the demand volume is measured in 15 min increments, it is unnecessary to use the PHF to convert to flow rates.

    Therefore, since two-lane highway analysis is based on demand flow rates for a peak 15 min period within the analysis hour (usually the peak hour), the PHF in Equation 73.22 and Equation 73.23 is given a value of 1.00.

    The average travel speed in the analysis direction, ( ATS_d ), is estimated from the FFS, the demand flow rate, the opposing flow rate, and the adjustment factor for the percentage of no-passing zones in the analysis direction, ( f_{np} ), as given in HCM Exh. 15-15. Equation 73.24 only applies to Class I and Class III two-lane highways.

    [ \mathrm{ATS}_{d} = \mathrm{FFS} - 0.0076(v_{d,s} + v_{o,s}) - f_{\mathrm{np},s} \quad (73.24) ]

    If the PTSF methodology is used, the formula for the demand flow rate, ( v_{i, \text{ATS}} ), is the same, although di...
    What is the formula for estimating the percent time spent following on highways?
    However, if the initial point on the limb is close to the critical point (i.e., the nose of the curve), then a small change in the specific energy (such as might be caused by a small variation in the channel floor) will cause a large change in depth. That is why severe turbulence commonly occurs near points of critical flow. Given that ( 4 , \text{ft/sec} ) (or ( 1.2 , \text{m/s} )) of water flows in a ( 7 , \text{ft} ) (or ( 2.1 , \text{m} )) wide, ( 6 , \text{ft} ) (or ( 1.8 , \text{m} )) deep open channel, the flow encounters a ( 1.0 , \text{ft} ) (or ( 0.3 , \text{m} )) step in the channel bottom. What is the depth of flow above the step? Actually, specific energy curves are typically plotted for flow per unit width, ( q = \frac{Q}{w} ). If that is the case, a jump from one limb to the other could take place if the width were allowed to change as well as the depth. A rise in the channel bottom does not always produce a drop in the water surface. Only if the flow is initiall...
    What happens to the water depth when it encounters a step in a channel?
    The shear strength, ( S ) or ( S_{ys} ), of a material is the maximum shear stress that the material can support without yielding in shear. (The ultimate shear strength, ( S_{us} ), is rarely encountered.) For ductile materials, maximum shear stress theory predicts the shear strength as one-half of the tensile yield strength. A more accurate relationship is derived from the distortion energy theory (also known as von Mises theory).

    Figure 43.16: Uniform Bar in Torsion

    Description: The image shows a diagram of a mechanical system with a cylindrical object, a rod, and a spring. There are two forces acting on the system: one is the weight of the rod, labeled 'L', acting downwards, and the other is the spring force, labeled 'T', acting upwards. The rod is shown to be in equilibrium, with the spring force balancing the weight of the rod. The distance from the pivot point to the center of mass of the rod is labeled 'r'. There is also a variable 'y' indicating the vertical displacement of t...
    Can you explain what maximum shear stress theory is?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
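
The parameters above describe a weighted sum of a ranking loss evaluated on truncated embeddings, with `n_dims_per_step` of -1 meaning every listed dimension is used at each step. The following is a rough, dependency-free sketch of that idea and not the Sentence Transformers implementation. The in-batch ranking loss, the cosine scaling factor of 20, the toy 4-dimensional vectors, and all function names are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the target for anchors[i] and
    every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        top = max(scores)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_sum_exp - scores[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over embeddings cut to each size."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        cut_anchors = [v[:dim] for v in anchors]
        cut_positives = [v[:dim] for v in positives]
        total += weight * in_batch_ranking_loss(cut_anchors, cut_positives)
    return total

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
toy_anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
toy_positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

# Dims [4, 2] with equal weights stand in for [768, 512, 256, 128, 64].
loss_value = matryoshka_loss(toy_anchors, toy_positives, dims=[4, 2], weights=[1, 1])
print(loss_value)
```

Because each truncated size contributes its own term, the model is pushed to keep the most useful information in the leading components.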
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
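
As a rough illustration of how these settings interact, the sketch below derives the effective batch size, the number of optimizer steps, and a cosine learning-rate curve with linear warmup. It assumes a single device, 3,078 training samples with no dropped final batch, and warmup steps rounded up. The exact rounding and schedule shape depend on the Transformers version, so the function `cosine_lr_with_warmup` is an approximation written for this example and not the library's code.

```python
import math

def cosine_lr_with_warmup(step, base_lr, warmup_steps, total_steps):
    """Linear warmup to base_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Values taken from the hyperparameters and dataset size in this card.
samples = 3078
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_train_epochs = 4
warmup_ratio = 0.1
learning_rate = 2e-05

# Derived quantities (single device assumed).
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
batches_per_epoch = math.ceil(samples / per_device_train_batch_size)
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_steps = num_train_epochs * updates_per_epoch
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, batches_per_epoch, total_steps, warmup_steps)
for step in range(0, total_steps + 1, 6):
    print(step, cosine_lr_with_warmup(step, learning_rate, warmup_steps, total_steps))
```

Under these assumptions the run takes 24 optimizer steps, which agrees with the last step listed in the Training Logs.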

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.9897 6 - 0.5417 0.5428 0.5145 0.4630 0.3945
1.6495 10 3.7867 - - - - -
1.9794 12 - 0.5269 0.5206 0.4992 0.4751 0.4082
2.9691 18 - 0.5298 0.5238 0.5107 0.4761 0.4268
3.2990 20 1.9199 - - - - -
3.9588 24 - 0.5289 0.5254 0.5083 0.4727 0.4245
  • The bold row denotes the saved checkpoint. The step 24 row matches the cosine_ndcg@10 values in the Information Retrieval table above.
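
The fractional Epoch values in the table can be reproduced from the hyperparameters, assuming 97 batches per epoch (3,078 samples at 32 per batch, final partial batch kept) and 16 batches per optimizer step. The sketch below is only a consistency check on the logged numbers; `epoch_at_step` is a hypothetical helper, not a library function.

```python
import math

samples = 3078
per_device_train_batch_size = 32
gradient_accumulation_steps = 16

# Number of batches the data loader yields per epoch (no dropped last batch).
batches_per_epoch = math.ceil(samples / per_device_train_batch_size)

def epoch_at_step(step):
    """Fraction of epochs completed after `step` optimizer steps."""
    return step * gradient_accumulation_steps / batches_per_epoch

logged_steps = (6, 10, 12, 18, 20, 24)
epochs = {step: round(epoch_at_step(step), 4) for step in logged_steps}
for step in logged_steps:
    print(step, epochs[step])
```

Each optimizer step consumes 16 of the 97 batches, so six steps fall just short of a full epoch, which is why evaluation is logged at epoch 0.9897 and not at 1.0.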

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2+cu121
  • Accelerate: 0.34.2
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}