wjunwei committed on
Commit
dc4821e
1 Parent(s): 7a54906

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
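
The pooling configuration above enables only `pooling_mode_cls_token`, so the sentence embedding is taken from the first token's output rather than from an average over all tokens. A minimal sketch of that difference in plain Python, using toy 4-dimensional vectors instead of the real 768 dimensions; the function names `cls_pooling` and `mean_pooling` are illustrative and are not the sentence-transformers API:

```python
from typing import List

Vector = List[float]


def cls_pooling(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's vector ([CLS] for BERT-style models)."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pooling(token_embeddings: List[Vector]) -> Vector:
    """Average all token vectors (the mode this config leaves disabled)."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    count = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / count for i in range(dim)]


# Toy per-token outputs for a 3-token input (hypothetical values).
toy_tokens = [
    [0.1, 0.2, 0.3, 0.4],  # position 0: [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]

cls_vector = cls_pooling(toy_tokens)
mean_vector = mean_pooling(toy_tokens)
print(cls_vector)
print(mean_vector)
```

With CLS pooling the later tokens influence the result only through the transformer's attention, not through the pooling step itself.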
README.md ADDED
@@ -0,0 +1,553 @@
+ ---
+ base_model: avsolatorio/GIST-Embedding-v0
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:5864
+ - loss:ContrastiveTensionLoss
+ widget:
+ - source_sentence: pyrex blue cup rectangular plastic cover pc pack original genuine
+ pyrex made in the usa genuine pyrex replacement lid nonporous surface does not
+ absorb food odors flavors or stains refrigerator microwave and toprack dishwasher
+ safe will not fit anchor hocking products included pyrex pc blue cup rectangle
+ plastic lids lids only containers not included
+ sentences:
+ - 'kpywzer vintage leather sling bag bags for men women backpack shoulder messenger
+ crossbody outdoor travel hiking camping tactical chest pack daypack brown '
+ - dove beauty bar original moisturizing bar made with moisturizing cream gentle
+ for soft skin count oz each lock in moisture dove original beauty bar is the
+ classic white bar cleanser you trust to nourish and pamper skin and give you softer
+ smoother more radiantlooking skin effective cleanser gently wash away dirt and
+ germs for use every day on body face and hands this dove beauty bar creates luxurious
+ lather and wont dry your skin like ordinary bar soaps no dermatologist recommended
+ formulated with moisturizing cream dove moisturizing beauty bar and gentle skin
+ cleanser leaves skin instantly soft and smooth with lasting nourishment versatile
+ cleanser the unique formulation for effective skin care can be enjoyed by the
+ entire family as this allinone cleanser nourishes your face body and hands for
+ beautiful skin thoughtfully made this beauty bar is certified crueltyfree by peta
+ and sulfate free cleansers so you can feel good about switching from ordinary
+ bar soap to dove beauty bar care with your cleanser our mission is to help the
+ next generation of women develop a positive relationship with the way they look
+ helping them raise their selfesteem
+ - professional stainless steel pots and pans setply piece induction cookware setinduction
+ oven broiler safe f pots and pans silver stainless steel professional cookware
+ set geanbioso ply stainless steel pots and pans set combines the epitome in rapid
+ heating heat retention and cooking stability each piece features cast stainless
+ steel handles durable rivets measurement markings easytopour edges and allcooktop
+ compatibility superior qualitypots and pans are crafted from heavyduty ply clad
+ stainless steelthe outer layer is magneticconducting stainless steel the middle
+ is layer aluminum alloycore for even heat transfer throughout the panthe pot
+ is formed in one piece riveted links safe and solid healthy and environmentally
+ friendly physically nonstick and the service life of up to years or more suitable
+ for all cooktopsgeanbioso ply stainless steel cookware set is suitable for all
+ cooktops and is compatible with oven electric gas ceramic halogen and induction
+ cooktops it is convenient to put the pans in the dishwasheroven safe up to polished
+ technique easytoclean cookware setthis stainless steel pans and pots set interior
+ surface used mirror polished technique it make stainless steel cookware sets effectively
+ avoid deformation warping and rust easy to wash and quick cleanup this kitchen
+ set not only have long use life also make housework much more easier and decorate
+ your kitchen looks shining and high class this fabulous set consists of basics
+ and more the quart covered stockpot with lid takes care of everything from oatmeal
+ to bechamel sauce also included are and inch open skillets that turn out perfect
+ grilled cheese hamburgers and crepesand an quart covered stockpot for stews and
+ soups dishwashersafe this set has no coatings to flake off or plastic to melt
+ and will surely be family businessmotivated by our year familyhistory in kitchen
+ supplywe setout to multigenerational familyowned manufacturers in the usand western
+ europe to buildsuperior culinary toolsour aim is to enhance the experience of
+ your culinary creation and let them become your best kitchen companion
+ - source_sentence: logitech g lightspeed wireless gaming headset with suspension headband
+ lightsync rgb blue voce mic technology and prog audio drivers black total freedom
+ with up to meter wireless range and lightspeed wireless audio transmission keep
+ playing for up to hours of battery life play in stereo on playstationr note if
+ the size of the earbud tips does not match the size of your ear canals or the
+ headset is not worn properly in your ears you may not obtain the correct sound
+ qualities or call performance change the earbud tips to ones that fit more snugly
+ in your earspecific uses for product gamingpersonalstreaming personalize your
+ headset lighting across the full spectrum m colors play in colors with frontfacing
+ dualzone lightsync rgb lighting and choose from preset animations or create your
+ own with g hub software colorful reversible suspension headbands are designed
+ for comfort during long play sessions advanced mic filters that make your voice
+ sound richer cleaner and more professional customize with g hub and find your
+ sound hear every audio cue with breathtaking clarity and get immerse in your game
+ prog drivers are designed to significantly reduce distortion and reproduce precise
+ consistent rich sound quality soft duallayer memory foam that conforms to your
+ head and reduces stress points for longlasting comfort console compatible playstation playstation nintendo
+ switch in dock stereo sound only via usb wireless dongle frequency response hz khz
+ sentences:
+ - 'levis kids pajamas '
+ - bright dream placemats washable easy to clean pvc placemat for kitchen table heatresistand
+ woven vinyl hard table mats x inches set of blue placemats pvc and polyesterx
+ inch xcm dont put tomato juice on the placemats its hard to clean uptemperature
+ should not exceed washable and easy to cleanhand wash in warm soapy water and
+ wipe dryif there is stains stuch in the placematuse a brush and soapy water to
+ clean it no machine wash ecofriendly kitchen accessorieswashablenonfadingnonstainwipe
+ clean protect your table from scratches and stains liquid can go through placemats
+ clean it when finishedperfect additon to your dinner tablebeautiful stylish placemats
+ to addmore fun to your kitchen table
+ - outdoor portable volleyball net system adjustable height poles with soft volleyball
+ ball pump hammer boundary line and carry bag for backyard beach lawn for this
+ volleyball net set easy conversion for the height design for men height for
+ coplay height and for women height standard volleyball net sturdy nylon
+ net with strengthen polyester ropes knit together this net could durable inch
+ piece steel poles with black powdercoating for the total height is official instant
+ set up designed for easy use the makes each assembly step very simple set
+ up this set just in minutes volleyball set for backyardlawnbeach this set includes
+ standard volleyball net steel poles official size and weight volleyball balls
+ and inflation pump with needle boundary line and a carry bag patent pending
+ - source_sentence: king queen apron set wedding engagement gifts for couples funny
+ kitchen gifts for couples husband wife boyfriend girlfriend him her bridal shower
+ gifts unique valentines day gifts ideas perfect gift for couples searching for
+ a couples gift that is guaranteed to put glowing smiles on the faces of that special
+ pair look no further whats better than give the couple a great matching apron
+ of king queen to help them spending the happy cooking time together beautiful
+ gift for couples parents friends fantastic gift idea for fathers day mothers day
+ birthday valentines day anniversaries weddings or christmas superior quality practical couples
+ aprons are made of denselywoven polyester soft breathable fabric to be used and
+ enjoyed for years to come designed with adjustable neck waist straps comfortably
+ fit all body shapes sizes these aprons dont just look cute they add extra fun
+ to your daily routine quality time with your partner just got way better premium
+ quality design and printing carefully crafted for those who appreciate distinctive
+ elegance and quality our peel proof graphics are machine washable our aprons are
+ built to last and waterproof and deformable these matching aprons make a perfect
+ idea for valentines day gifts christmas gifts thanksgiving gifts birthday gifts
+ couple gifts engagement wedding anniversary gifts or housewarming gifts gifts
+ for couples who have everything king and queen apron gift set is made of denselywoven
+ polyester fabrics do waterproof and stainproof it can be worn while doing housework
+ gardening painting and baking our king and queen matching aprons come packaged
+ in a beautiful box ready to be gifted want to be the hit of the party add to cart
+ hasslefree guarantee never worry about not having a suitable choice for engagement
+ wedding anniversary gifts valentines day gifts christmas gifts housewarming gifts king queen
+ aprons gift set is perfect couples gifts your girlfriend boyfriend wife husband
+ partner parents best friend and family members will have a great laugh we take
+ pride in our king queen aprons with pockets and back these up with a lifetime
+ guarantee
+ sentences:
+ - 'maven thread workout headbands for women wide headbands for women sports running
+ headband for exercise gym hairband athletic workouts and yoga thick non slip
+ sweatbands for women pack fashionable functional these headbands for women make
+ the perfect accessory for a killer workout outfit or for adorable casual wear
+ whether youre doing yoga playing sports running hitting the gym or simply lounging
+ at home our headband will keep your hair out of the way and keep you looking fresh
+ nomess sweatband a girls gotta sweat and when she does our thick headband accessories
+ are ready for it this stretchy band features nonslip grip and moisturewicking
+ material so no matter the sport youre playing or the exercise youre doing you
+ can give it your all this hairband will wrap snugly around your head but wont
+ be too tight cute designs our headbands come in a variety of cute modern designs
+ that both girls and women love whether youre looking for an accessory to match
+ your favorite sweats or a headband to spruce up your ponytail were sure to have
+ a look youll love this basics set features darkgray headband lightgray headband navy
+ headband and black headband style for any wear whether youre a nurse trying to
+ keep her hair pulled back at work a runner pounding the pavement a biker hitting
+ the trails or just someone looking to add a boho or athletic touch to your outfit
+ these wraps are for you the cloth will keep thin straight curly or even thick
+ hair back if youre going to get sweaty toss an extra headband in your purse comfort quality
+ combined at maven thread we take pride in creating highquality products that are
+ both stylish and durable our headbands are made of comfortable fabric with of
+ blend of polyester rayon and spandex this fabric gives our bands fourway stretch
+ for onesizefitsall sizing we also offer sweatbands in widths and '
+ - 'disney minnie mouse girls short sleeve tshirt and shorts set for toddlers and
+ big kids '
+ - 'gymboree boys and toddler linen chino shorts '
+ - source_sentence: 'ntbay flannel fuzzy toddler blanket fluffy warm and lightweight
+ reversible stripes design baby plush blanket x inches pink flannel package include x
+ toddler blanket x inches product features our cozy baby blanket is made of flannel
+ fabric which is skin friendly and soft the lightweight toddler blanket is not
+ so thick but it can keep your baby warm and does not feel the pressure of the
+ blanket weight product function the size of the baby blanket is suitable and lightweight
+ which means that you can easily fold it up and pack it in bags while traveling
+ our baby blanket can be a multipurpose for baby stroller covers car seat covers
+ and all seasons applicable easy care machine wash cold gentle cycle only nonchlorine
+ bleach when needed tumble dry low cool iron if needed satisfactory sales service
+ we are so confident in the quality of our product that we offer days return and
+ replacement service we also provide it with a life time free customer service
+ if you find that our products have quality problems just contact us and we will
+ solve the problem for you immediately '
+ sentences:
+ - neutrogena original fragrancefree facial cleansing bar with glycerin pure transparent
+ gentle face wash bar soap free of harsh detergents dyes hardeners oz ounce neutrogena
+ facial cleansing bar in original fragrancefree formula for gentle daily facial
+ cleansing from a dermatologistrecommended skin care brand this pure cleansing
+ bar is designed for all skin types and lathers up to remove excess oil and gently
+ cleanse skin featuring the original neutrogena formula the face soap bar is glycerinrich
+ to attract moisture and contains no harsh detergents dyes or hardeners that can
+ irritate the skin our face cleansing bar is exceptionally cleanrinsing without
+ leaving behind any poreclogging residue for clear healthierlooking skin the pure
+ and gentle formula of this facial soap bar is fragrancefree hypoallergenic and
+ suitable for daily use
+ - 'french toast girls big vneck jumper blue gold plaid '
+ - philips norelco shaver rechargeable cordless electric shaver with popup trimmer
+ x with comfortcut blades you get a clean shave thats comfortable on your skin
+ rounded caps shield selfsharpening blades to gently cut hair just above skin
+ level and help the shaver glide smoothly over your skin d flex heads move independently
+ in directions to automatically adjust to the curves of your face neck and jaw
+ line popup trimmer for mustache and sideburns finish your look with the builtin
+ trimmer its ideal for maintaining your mustache and trimming your sideburns at
+ the touch of a button you can pop the heads open for an easy and thorough clean
+ under the tap up to minutes of cordless runtime thats about shaves or plug
+ it in for instant continuous power
+ - source_sentence: magic the gathering modern horizons bundle gift edition deluxe
+ bundle with collector booster play boosters fullart lands exclusive accessories
+ give the gift of magicthe perfect present for any magic fan this gift bundle is
+ full of exclusive accessories and the coolest cards from modern horizons including
+ a collector booster packed with rare cards shiny foils and altart altframe cards
+ moderns never been more marvelousintroducing a heaping helping of exciting cards
+ for modern one of magics most celebrated formats plus the return of competitive
+ favorites theres something for every fan to love in modern horizons a collector
+ booster full of treasuresboth powerful and flashy the included collector booster
+ is packed with only the coolest cards and is the only place fans may find foiletched
+ cards textured foil cards or serialized cards printed with a unique number boosters
+ for playing building decksplay boosters are the best packs for playing magic
+ with friends and are fun to open with a possibility of multiple rares and at least shining
+ foil card in every pack fullart lands special alternateart cardevery mh bundle
+ gift edition also comes with traditional foil promo card featuring bundleexclusive
+ alternate art and of the included land cards feature stunning fullart foil nonfoil
+ exclusive accessorieseach gift bundle also comes with exclusive accessories including
+ a special die to track your life total as you play and a sturdy box to store everything
+ in bundle gift edition contents collector booster play boosters traditional
+ foil altart card land cards foil nonfoil spindown life counter card storage
+ box and reference cards
+ sentences:
+ - dove men care in bar cleanser for body face and shaving extra fresh body and
+ facial cleanser more moisturizing than bar soap to clean and hydrate skin ounce
+ pack of hydrate your skin unlike other regular bar soap that can dry out skin
+ dove mencare extra fresh body and face bar deeply cleans and hydrates skin leaving
+ skin feeling smooth and comfortable from the dermatologist recommended mens brand
+ this face and body bar is formulated with more moisturizer than ordinary soap
+ bar and washes off easily to leave skin feeling healthy in for bar cleanser
+ for body face and shaving dove mencare extra fresh body and face bar formulation
+ is so hydrating and gentle on skin that you can also use it to wash your face
+ or to shave refreshing clean feel clean and invigorated with this hydrating body
+ cleanser and face bar and enjoy its crisp refreshing scent of mandarin citrus
+ patchouli and rosemary designed for men mens skin care is different men sweat
+ more have thicker skin and lose hydration faster dove mencare is specifically
+ developed for mens skin to protect it against skin dryness pledge for paternity
+ from the dermatologist recommended mens brand were committed to making paid paternity
+ leave the new standard
+ - backdrop stand xft photo video studio adjustable backdrop stand for parties wedding
+ photography advertising display note balloons colorful decorations are not included
+ meaningful family gathering cheerful and memorable birthday party romantic and
+ warm wedding professional photography attractive advertising display and other
+ uses just get ft backdrop stands adjustable stable backdrop stand loading capacity
+ up to lbkg good flexibility with width ftft height ft ft heavy duty spring clamp
+ and backdrop elastic string clip to holds curtains canvas muslin projector screen
+ or seamless paper prevent background slippage photography weight sandbag to stabilize
+ the backdrop support system note this bag comes empty its capacity is about lbub
+ the kit includes pcs crossbar parts pcs tripod bracket pcs spring clamp pcs
+ backdrop elastic string clip pcs sand bag pc carry bag
+ - 'burts bees baby baby girls mittens noscratch mitts organic cotton set of '
+ ---
+
+ # SentenceTransformer based on avsolatorio/GIST-Embedding-v0
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [avsolatorio/GIST-Embedding-v0](https://huggingface.co/avsolatorio/GIST-Embedding-v0). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [avsolatorio/GIST-Embedding-v0](https://huggingface.co/avsolatorio/GIST-Embedding-v0) <!-- at revision bf6b2e55e92f510a570ad4d7d2da2ec8cd22590c -->
+ - **Maximum Sequence Length:** 512 tokens
256
+ - **Output Dimensionality:** 768 tokens
257
+ - **Similarity Function:** Cosine Similarity
258
+ <!-- - **Training Dataset:** Unknown -->
259
+ <!-- - **Language:** Unknown -->
260
+ <!-- - **License:** Unknown -->
261
+
262
+ ### Model Sources
263
+
264
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
265
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
266
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
267
+
268
+ ### Full Model Architecture
269
+
270
+ ```
271
+ SentenceTransformer(
272
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
273
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
274
+ (2): Normalize()
275
+ )
276
+ ```
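
Because the last module in the architecture is `Normalize()`, every embedding leaves the model with unit L2 norm, so the cosine similarity of two outputs reduces to a plain dot product. A small sketch of that identity in plain Python, using toy 2-dimensional vectors; the helper names are illustrative, not library functions:

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical pooled outputs before the Normalize() step.
raw_a = [3.0, 4.0]
raw_b = [4.0, 3.0]

unit_a = l2_normalize(raw_a)
unit_b = l2_normalize(raw_b)

# After normalization, a dot product gives the same value as cosine similarity.
print(dot(unit_a, unit_b))
print(cosine_similarity(raw_a, raw_b))
```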
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("wjunwei/ecommerce_text_embedding")
+ # Run inference
+ sentences = [
+ 'magic the gathering modern horizons bundle gift edition deluxe bundle with collector booster play boosters fullart lands exclusive accessories give the gift of magicthe perfect present for any magic fan this gift bundle is full of exclusive accessories and the coolest cards from modern horizons including a collector booster packed with rare cards shiny foils and altart altframe cards moderns never been more marvelousintroducing a heaping helping of exciting cards for modern one of magics most celebrated formats plus the return of competitive favorites theres something for every fan to love in modern horizons a collector booster full of treasuresboth powerful and flashy the included collector booster is packed with only the coolest cards and is the only place fans may find foiletched cards textured foil cards or serialized cards printed with a unique number boosters for playing building decksplay boosters are the best packs for playing magic with friends and are fun to open with a possibility of multiple rares and at least shining foil card in every pack fullart lands special alternateart cardevery mh bundle gift edition also comes with traditional foil promo card featuring bundleexclusive alternate art and of the included land cards feature stunning fullart foil nonfoil exclusive accessorieseach gift bundle also comes with exclusive accessories including a special die to track your life total as you play and a sturdy box to store everything in bundle gift edition contents collector booster play boosters traditional foil altart card land cards foil nonfoil spindown life counter card storage box and reference cards',
+ 'backdrop stand xft photo video studio adjustable backdrop stand for parties wedding photography advertising display note balloons colorful decorations are not included meaningful family gathering cheerful and memorable birthday party romantic and warm wedding professional photography attractive advertising display and other uses just get ft backdrop stands adjustable stable backdrop stand loading capacity up to lbkg good flexibility with width ftft height ft ft heavy duty spring clamp and backdrop elastic string clip to holds curtains canvas muslin projector screen or seamless paper prevent background slippage photography weight sandbag to stabilize the backdrop support system note this bag comes empty its capacity is about lbub the kit includes pcs crossbar parts pcs tripod bracket pcs spring clamp pcs backdrop elastic string clip pcs sand bag pc carry bag',
+ 'burts bees baby baby girls mittens noscratch mitts organic cotton set of ',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 768)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
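
The pairwise similarity matrix from the snippet above can be used to rank the other items against any one row. A minimal sketch, assuming the matrix has first been converted to nested Python lists (for example with `similarities.tolist()`); `rank_neighbors` is a hypothetical helper, and the scores below are made-up placeholders rather than outputs of this model:

```python
from typing import List, Optional, Tuple


def rank_neighbors(
    similarity_matrix: List[List[float]],
    query_index: int,
    top_k: Optional[int] = None,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for all other items, most similar first."""
    row = similarity_matrix[query_index]
    # Skip the query itself: its self-similarity is always the maximum.
    candidates = [(j, score) for j, score in enumerate(row) if j != query_index]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates if top_k is None else candidates[:top_k]


# Placeholder 3x3 matrix standing in for `similarities.tolist()`.
toy_similarities = [
    [1.00, 0.12, 0.08],
    [0.12, 1.00, 0.31],
    [0.08, 0.31, 1.00],
]

ranked = rank_neighbors(toy_similarities, query_index=0)
print(ranked)
```

Sorting a full row is fine for a handful of items; for a large catalog an approximate nearest-neighbor index would be the more usual choice.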
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 5,864 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:---|:---|:---|
+ | type | string | string | int |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 144.3 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 143.08 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>0: ~87.50%</li><li>1: ~12.50%</li></ul> |
+ * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:---|:---|:---|
+ | <code>philips norelco shaver rechargeable electric shaver with popup trimmer s comfortcut blades get a clean shave thats comfortable on your skin rounded blade caps shield selfsharpening blades to gently cut hair just above skin level and help the shaver glide smoothly over your skin experience a convenient clean shave with heads that flex and float in directions the head adjusts to the curves of your face ensuring smooth contact with your skin without a lot of pressure d flex heads follow your faces contours for a clean shave popup trimmer for mustache and sideburns finish your look with the builtin trimmer ideal for maintaining your mustache and trimming your sideburns onetouch open for easy cleaning experience a convenient clean shave with heads that flex and float in directions the head adjusts to the curves of your face ensuring smooth contact with your skin without a lot of pressure minutes of cordless shaving from an hour charge thats about shaves or plug it in for instant continuous power</code> | <code>philips norelco shaver rechargeable electric shaver with popup trimmer s comfortcut blades get a clean shave thats comfortable on your skin rounded blade caps shield selfsharpening blades to gently cut hair just above skin level and help the shaver glide smoothly over your skin experience a convenient clean shave with heads that flex and float in directions the head adjusts to the curves of your face ensuring smooth contact with your skin without a lot of pressure d flex heads follow your faces contours for a clean shave popup trimmer for mustache and sideburns finish your look with the builtin trimmer ideal for maintaining your mustache and trimming your sideburns onetouch open for easy cleaning experience a convenient clean shave with heads that flex and float in directions the head adjusts to the curves of your face ensuring smooth contact with your skin without a lot of pressure minutes of cordless shaving from an hour charge thats about shaves or plug it in for instant continuous power</code> | <code>1</code> |
364
+ | <code>speedo girls swimsuit one piece thin straps </code> | <code>lace tulle flower girl long dress for wedding oneck princess dresses long sleeve pageant party gown </code> | <code>0</code> |
365
+ | <code>pyrex blue cup rectangular plastic cover pc pack original genuine pyrex made in the usa genuine pyrex replacement lid nonporous surface does not absorb food odors flavors or stains refrigerator microwave and toprack dishwasher safe will not fit anchor hocking products included pyrex pc blue cup rectangle plastic lids lids only containers not included</code> | <code>kpywzer vintage leather sling bag bags for men women backpack shoulder messenger crossbody outdoor travel hiking camping tactical chest pack daypack brown </code> | <code>0</code> |
366
+ * Loss: [<code>ContrastiveTensionLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastivetensionloss)
367
+
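The loss named above can be illustrated with a small sketch. It assumes, following the Contrastive Tension paper cited in this card, that each pair is scored by the dot product of embeddings from two encoder copies and that the score is fed into a sum-reduced binary cross-entropy on logits. The function names and toy vectors are hypothetical, and this is not the library's implementation.

```python
import math

def bce_with_logits(logit: float, label: int) -> float:
    """Numerically stable binary cross-entropy on a raw score (logit)."""
    # Stable form: max(x, 0) - x*y + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def contrastive_tension_loss(embs_a, embs_b, labels):
    """Sum of per-pair BCE losses over dot-product scores.

    embs_a and embs_b stand in for the outputs of the two encoder copies;
    label 1 marks an identical-text pair, label 0 a random pair.
    """
    return sum(bce_with_logits(dot(u, v), y)
               for u, v, y in zip(embs_a, embs_b, labels))

# Toy unit-length embeddings (made up for illustration).
embs_a = [[1.0, 0.0], [0.0, 1.0]]
embs_b = [[1.0, 0.0], [1.0, 0.0]]
labels = [1, 0]

loss = contrastive_tension_loss(embs_a, embs_b, labels)
print(f"{loss:.4f}")
```

Because the model ends in a `Normalize` module, dot products between its embeddings lie in [-1, 1], so under these assumptions each pair contributes at least log(1 + e^-1), roughly 0.313, to the summed loss.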
368
+ ### Training Hyperparameters
369
+ #### Non-Default Hyperparameters
370
+
371
+ - `num_train_epochs`: 5
372
+ - `multi_dataset_batch_sampler`: round_robin
373
+
374
+ #### All Hyperparameters
375
+ <details><summary>Click to expand</summary>
376
+
377
+ - `overwrite_output_dir`: False
378
+ - `do_predict`: False
379
+ - `eval_strategy`: no
380
+ - `prediction_loss_only`: True
381
+ - `per_device_train_batch_size`: 8
382
+ - `per_device_eval_batch_size`: 8
383
+ - `per_gpu_train_batch_size`: None
384
+ - `per_gpu_eval_batch_size`: None
385
+ - `gradient_accumulation_steps`: 1
386
+ - `eval_accumulation_steps`: None
387
+ - `learning_rate`: 5e-05
388
+ - `weight_decay`: 0.0
389
+ - `adam_beta1`: 0.9
390
+ - `adam_beta2`: 0.999
391
+ - `adam_epsilon`: 1e-08
392
+ - `max_grad_norm`: 1
393
+ - `num_train_epochs`: 5
394
+ - `max_steps`: -1
395
+ - `lr_scheduler_type`: linear
396
+ - `lr_scheduler_kwargs`: {}
397
+ - `warmup_ratio`: 0.0
398
+ - `warmup_steps`: 0
399
+ - `log_level`: passive
400
+ - `log_level_replica`: warning
401
+ - `log_on_each_node`: True
402
+ - `logging_nan_inf_filter`: True
403
+ - `save_safetensors`: True
404
+ - `save_on_each_node`: False
405
+ - `save_only_model`: False
406
+ - `restore_callback_states_from_checkpoint`: False
407
+ - `no_cuda`: False
408
+ - `use_cpu`: False
409
+ - `use_mps_device`: False
410
+ - `seed`: 42
411
+ - `data_seed`: None
412
+ - `jit_mode_eval`: False
413
+ - `use_ipex`: False
414
+ - `bf16`: False
415
+ - `fp16`: False
416
+ - `fp16_opt_level`: O1
417
+ - `half_precision_backend`: auto
418
+ - `bf16_full_eval`: False
419
+ - `fp16_full_eval`: False
420
+ - `tf32`: None
421
+ - `local_rank`: 0
422
+ - `ddp_backend`: None
423
+ - `tpu_num_cores`: None
424
+ - `tpu_metrics_debug`: False
425
+ - `debug`: []
426
+ - `dataloader_drop_last`: False
427
+ - `dataloader_num_workers`: 0
428
+ - `dataloader_prefetch_factor`: None
429
+ - `past_index`: -1
430
+ - `disable_tqdm`: False
431
+ - `remove_unused_columns`: True
432
+ - `label_names`: None
433
+ - `load_best_model_at_end`: False
434
+ - `ignore_data_skip`: False
435
+ - `fsdp`: []
436
+ - `fsdp_min_num_params`: 0
437
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
438
+ - `fsdp_transformer_layer_cls_to_wrap`: None
439
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
440
+ - `deepspeed`: None
441
+ - `label_smoothing_factor`: 0.0
442
+ - `optim`: adamw_torch
443
+ - `optim_args`: None
444
+ - `adafactor`: False
445
+ - `group_by_length`: False
446
+ - `length_column_name`: length
447
+ - `ddp_find_unused_parameters`: None
448
+ - `ddp_bucket_cap_mb`: None
449
+ - `ddp_broadcast_buffers`: False
450
+ - `dataloader_pin_memory`: True
451
+ - `dataloader_persistent_workers`: False
452
+ - `skip_memory_metrics`: True
453
+ - `use_legacy_prediction_loop`: False
454
+ - `push_to_hub`: False
455
+ - `resume_from_checkpoint`: None
456
+ - `hub_model_id`: None
457
+ - `hub_strategy`: every_save
458
+ - `hub_private_repo`: False
459
+ - `hub_always_push`: False
460
+ - `gradient_checkpointing`: False
461
+ - `gradient_checkpointing_kwargs`: None
462
+ - `include_inputs_for_metrics`: False
463
+ - `eval_do_concat_batches`: True
464
+ - `fp16_backend`: auto
465
+ - `push_to_hub_model_id`: None
466
+ - `push_to_hub_organization`: None
467
+ - `mp_parameters`:
468
+ - `auto_find_batch_size`: False
469
+ - `full_determinism`: False
470
+ - `torchdynamo`: None
471
+ - `ray_scope`: last
472
+ - `ddp_timeout`: 1800
473
+ - `torch_compile`: False
474
+ - `torch_compile_backend`: None
475
+ - `torch_compile_mode`: None
476
+ - `dispatch_batches`: None
477
+ - `split_batches`: None
478
+ - `include_tokens_per_second`: False
479
+ - `include_num_input_tokens_seen`: False
480
+ - `neftune_noise_alpha`: None
481
+ - `optim_target_modules`: None
482
+ - `batch_eval_metrics`: False
483
+ - `batch_sampler`: batch_sampler
484
+ - `multi_dataset_batch_sampler`: round_robin
485
+
486
+ </details>
487
+
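With `lr_scheduler_type` set to linear and no warmup, the learning rate should decay from `learning_rate` to zero over the run. A minimal sketch of that schedule follows. The `total_steps` value of 3665 is an assumption derived from the dataset size tag (5864), the batch size (8), and 5 epochs on a single device, and `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step: int, base_lr: float = 5e-05,
              total_steps: int = 3665, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear decay.

    total_steps is an assumed value (733 steps per epoch x 5 epochs).
    """
    if step < warmup_steps:
        # Linear ramp-up; unused here because warmup_steps is 0.
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))

# Learning rate at the start, at a few logged steps, and at the end.
for step in (0, 500, 2000, 3500, 3665):
    print(step, linear_lr(step))
```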
488
+ ### Training Logs
489
+ | Epoch | Step | Training Loss |
490
+ |:------:|:----:|:-------------:|
491
+ | 0.6821 | 500 | 7.4403 |
492
+ | 1.3643 | 1000 | 4.8536 |
493
+ | 2.0464 | 1500 | 3.8646 |
494
+ | 2.7285 | 2000 | 3.8877 |
495
+ | 3.4106 | 2500 | 3.8789 |
496
+ | 4.0928 | 3000 | 3.8052 |
497
+ | 4.7749 | 3500 | 3.8385 |
498
+
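The fractional epochs in the table can be cross-checked against the hyperparameters. The sketch below assumes a single device, the `dataset_size:5864` tag from the card metadata, and the listed batch size of 8 with no gradient accumulation.

```python
import math

dataset_size = 5864   # from the `dataset_size:5864` tag
batch_size = 8        # per_device_train_batch_size
num_epochs = 5        # num_train_epochs

# Optimizer steps per epoch and over the whole run (single device assumed).
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * num_epochs

# Steps at which the training loss was logged in the table above.
logged_steps = [500, 1000, 1500, 2000, 2500, 3000, 3500]
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]

print(steps_per_epoch, total_steps)
print(epochs)
```

Under these assumptions the computed epoch values agree with the Epoch column, which suggests logging every 500 steps over a run of 3665 steps.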
499
+
500
+ ### Framework Versions
501
+ - Python: 3.10.12
502
+ - Sentence Transformers: 3.0.1
503
+ - Transformers: 4.41.2
504
+ - PyTorch: 2.3.0+cu121
505
+ - Accelerate: 0.31.0
506
+ - Datasets: 2.20.0
507
+ - Tokenizers: 0.19.1
508
+
509
+ ## Citation
510
+
511
+ ### BibTeX
512
+
513
+ #### Sentence Transformers
514
+ ```bibtex
515
+ @inproceedings{reimers-2019-sentence-bert,
516
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
517
+ author = "Reimers, Nils and Gurevych, Iryna",
518
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
519
+ month = "11",
520
+ year = "2019",
521
+ publisher = "Association for Computational Linguistics",
522
+ url = "https://arxiv.org/abs/1908.10084",
523
+ }
524
+ ```
525
+
526
+ #### ContrastiveTensionLoss
527
+ ```bibtex
528
+ @inproceedings{carlsson2021semantic,
529
+ title={Semantic Re-tuning with Contrastive Tension},
530
+ author={Fredrik Carlsson and Amaru Cuba Gyllensten and Evangelia Gogoulou and Erik Ylip{\"a}{\"a} Hellqvist and Magnus Sahlgren},
531
+ booktitle={International Conference on Learning Representations},
532
+ year={2021},
533
+ url={https://openreview.net/forum?id=Ov_sMNau-PF}
534
+ }
535
+ ```
536
+
537
+ <!--
538
+ ## Glossary
539
+
540
+ *Clearly define terms in order to be accessible across audiences.*
541
+ -->
542
+
543
+ <!--
544
+ ## Model Card Authors
545
+
546
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
547
+ -->
548
+
549
+ <!--
550
+ ## Model Card Contact
551
+
552
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
553
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
1
+ {
2
+ "_name_or_path": "avsolatorio/GIST-Embedding-v0",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "id2label": {
13
+ "0": "LABEL_0"
14
+ },
15
+ "initializer_range": 0.02,
16
+ "intermediate_size": 3072,
17
+ "label2id": {
18
+ "LABEL_0": 0
19
+ },
20
+ "layer_norm_eps": 1e-12,
21
+ "max_position_embeddings": 512,
22
+ "model_type": "bert",
23
+ "num_attention_heads": 12,
24
+ "num_hidden_layers": 12,
25
+ "pad_token_id": 0,
26
+ "position_embedding_type": "absolute",
27
+ "torch_dtype": "float32",
28
+ "transformers_version": "4.41.2",
29
+ "type_vocab_size": 2,
30
+ "use_cache": true,
31
+ "vocab_size": 30522
32
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.0.1",
4
+ "transformers": "4.41.2",
5
+ "pytorch": "2.3.0+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": null
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1a8e1769362336ab67d958528542b4e985e2302c57e2758d94576226de94cd5d
3
+ size 437951328
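The pointer's `size` can be sanity-checked against the architecture in `config.json`. The sketch below assumes the standard BERT layout (embeddings, 12 encoder layers, and a pooler) with every weight stored as float32. The variable names are illustrative, and attributing the leftover bytes to the file header is an inference, not something verified here.

```python
# Values copied from config.json in this commit.
vocab_size, hidden, num_layers = 30522, 768, 12
intermediate, max_positions, type_vocab = 3072, 512, 2

def linear(n_in: int, n_out: int) -> int:
    """Parameter count of a dense layer with bias."""
    return n_in * n_out + n_out

layer_norm = 2 * hidden  # scale and bias

# Word, position, and token-type embeddings plus one LayerNorm.
embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
# Query, key, value, and output projections plus one LayerNorm.
attention = 4 * linear(hidden, hidden) + layer_norm
# Two dense layers plus one LayerNorm.
feed_forward = (linear(hidden, intermediate)
                + linear(intermediate, hidden) + layer_norm)
pooler = linear(hidden, hidden)  # assumed to be saved with the model

total_params = embeddings + num_layers * (attention + feed_forward) + pooler
weight_bytes = total_params * 4          # float32 = 4 bytes per parameter
file_size = 437951328                    # from the LFS pointer above
overhead = file_size - weight_bytes      # bytes not explained by weights

print(total_params, weight_bytes, overhead)
```

Under these assumptions the weights account for all but a few kilobytes of the file, which would be consistent with a small metadata header.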
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
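The three modules run in order: the transformer produces per-token vectors, the pooling module (configured for the `[CLS]` token in `1_Pooling/config.json`) selects one vector, and the normalize module scales it to unit length. A minimal sketch of the last two stages on toy data follows. The helper names are hypothetical, and the token vectors are invented rather than real model output.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep the vector of the first token ([CLS])."""
    return token_embeddings[0]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy per-token output for one sentence: 3 tokens, 2 dimensions.
token_embeddings = [[3.0, 4.0], [10.0, 0.0], [0.0, -2.0]]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.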
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": true
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "never_split": null,
51
+ "pad_token": "[PAD]",
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "unk_token": "[UNK]"
57
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff