himanshu23099 committed on
Commit
a84c06b
·
verified ·
1 Parent(s): 267650a

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
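The config above enables only `pooling_mode_cls_token`, so the sentence vector is taken from the first (`[CLS]`) token instead of being averaged over all tokens. The following is a minimal sketch of that selection in plain Python, not the library's code. `token_embeddings` and its toy 4-dimensional values are hypothetical stand-ins for the 384-dimensional transformer output, and the normalization step is included only because the README in this commit lists a `Normalize()` module after pooling.

```python
import math

def cls_pool(token_embeddings):
    """Return a copy of the first token's vector (the [CLS] position)."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return vector if norm == 0 else [x / norm for x in vector]

# Toy input: 3 tokens, 4 dimensions each (hypothetical values).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],   # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Only the first row influences the result here; with `pooling_mode_mean_tokens` enabled instead, every row would contribute.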
README.md ADDED
@@ -0,0 +1,749 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - generated_from_trainer
7
+ - dataset_size:3507
8
+ - loss:GISTEmbedLoss
9
+ base_model: BAAI/bge-small-en-v1.5
10
+ widget:
11
+ - source_sentence: What skills and traditions do the Akharas display?
12
+ sentences:
13
+ - "Are there specific vendors recommended for tent city booking?\n Yes, there are\
14
+ \ 7 approved vendors for setting up bookings in the Tent City for Kumbh Mela including\
15
+ \ : UP Tourism Tent Colony; Rishikul Kumbh Cottages; Aagman Maha Kumbh; Kumbh\
16
+ \ Village; Kumbh Camp India; Shivadya Kumbh Canvas. For more information about\
17
+ \ these vendors and their services, please click here"
18
+ - The Akharas display a wide range of skills and traditions that reflect their deep
19
+ spiritual heritage and ascetic practices. These include martial arts training,
20
+ such as wrestling, sword fighting, and the use of traditional weapons like tridents
21
+ (trishuls), maces (gada), and spears. Such skills symbolize their readiness to
22
+ protect Dharma and their spiritual communities. Additionally, Akharas emphasize
23
+ the tradition of Yoga and meditation, teaching various asanas and techniques for
24
+ self-discipline and spiritual growth. They also focus on Vedic rituals, chanting,
25
+ and sacred ceremonies to maintain their connection with the divine. Akharas uphold
26
+ the practice of 'Vairagya' or renunciation, where sadhus detach from worldly desires
27
+ to pursue a path of spiritual enlightenment. These traditions are on full display
28
+ during the Kumbh Mela, especially during the Shahi Snan, where the Naga Sadhus
29
+ lead the processions with their unique practices and skills.
30
+ - On a bright summer afternoon, the children gathered at the edge of the park, their
31
+ laughter echoing through the trees. They played games, running around with colorful
32
+ kites soaring high against the azure sky. Some kids chose to ride their bicycles
33
+ along the winding paths, while others set up a picnic with sandwiches and juice
34
+ boxes spread out on a checkered blanket. Nearby, a couple of dogs chased each
35
+ other joyfully, their tails wagging with uncontainable excitement as the scent
36
+ of fresh grass filled the air. The sun slowly dipped toward the horizon, casting
37
+ a warm golden glow, and everyone paused to watch the beauty of the sunset while
38
+ sharing stories, bonding over the simple joys of life. The day shimmered with
39
+ happiness, creating memories that would last long after the sun had set.
40
+ - source_sentence: Refund kab milega
41
+ sentences:
42
+ - "How late can I make changes to my booking before the tour date?\n Refunds and\
43
+ \ changes to bookings are subject to the following cancellation policy:\n \n 15\
44
+ \ days or more in advance: 90% of the booking amount will be refunded\n 10-15\
45
+ \ days in advance: 75% of the booking amount will be refunded\n 3-10 days in advance:\
46
+ \ 50% of the booking amount will be refunded\n Less than 3 days in advance: No\
47
+ \ refund\n \n Please make any changes or cancellations well in advance to avoid\
48
+ \ forfeiting your booking amount."
49
+ - "Is there any provision for women-only E-Rickshaws for added safety and comfort?\n\
50
+ \ No, there is no provision for women-only E-Rickshaws"
51
+ - 'Can I pay for the tour in installments?
52
+
53
+ No, the tour fee must be paid in full at the time of booking. Unfortunately, installment
54
+ plans are not available. Ensure that full payment is made to secure your booking
55
+ well in advance.'
56
+ - source_sentence: Are there any dedicated helpdesks or kiosks at the Airport for
57
+ information about transport to the Mela?
58
+ sentences:
59
+ - The forest is alive with the sounds of rustling leaves and chirping birds. As
60
+ the sun rises, a golden light filters through the trees, creating a magical atmosphere.
61
+ Walkers often find solace in nature, where the peaceful surroundings can soothe
62
+ the mind and inspire creativity. Each path taken may lead to a hidden waterfall
63
+ or a scenic overlook, inviting exploration and adventure.
64
+ - "What is Aarti\n In India, since ancient times, rivers are worshipped due to their\
65
+ \ importance to the human life. \n \n Likewise, in Tirathraj Prayagraj, Aartis’\
66
+ \ are performed on the banks of Ganga, Yamuna and at Sangam with great admiration,\
67
+ \ deep-rooted honor and devotion. In Prayagraj, Prayagraj Mela Authority and various\
68
+ \ other communities make grand arrangements for these Aartis.\n \n The Aartis\
69
+ \ are performed in the mornings and evenings, in which priests (Batuks), normally\
70
+ \ 5 to 7 in number, chant hymns with great fervor, holding meticulously designed\
71
+ \ lamps and worship the rivers with utmost devotion. \n \n The lamps held by the\
72
+ \ batuks represent the importance of panchtatva. On one hand, flames of the lamps\
73
+ \ signify bowing to the waters of the sacred rivers and on the other, the holy\
74
+ \ fumes emanating from the lamps appear to play the mystic of heaven on earth.\
75
+ \ \n List of Aliases: [['Prayag', 'Sangam'], ['Allahabad', 'PYG', 'Prayagraj'],\
76
+ \ ['Batuks', 'priests']]"
77
+ - Yes, there are people available to help you with transport information at the
78
+ airport. Tourist information centers would also be available across the city to
79
+ guide pilgrims to the Mela.
80
+ - source_sentence: Peeshwai Akhara time
81
+ sentences:
82
+ - "What is the connection between Akharas and Shahi Snan?\nAkharas are the central\
83
+ \ focus of the Shahi Snan during the Mahakumbh Mela. \U0001F549️\n \n The Akharas\
84
+ \ lead this ritual bath, with their Mahamandaleshwar taking the first dip in the\
85
+ \ sacred waters of the Sangam.\n \n The Akharas enter the bathing ghats in a grand\
86
+ \ procession, which includes chariots, elephants, horses, bands, and chanting\
87
+ \ saints and their followers."
88
+ - "When does Peshwai take place?\n The Peshwai of the Akharas is the first major\
89
+ \ attraction of the Mahakumbh. When the Akharas enter the Kumbh city with full\
90
+ \ grandeur, this is called the Peshwai. The Peshwai of each Akhara is conducted\
91
+ \ with proper rituals before the fair officially begins. \n List of Aliases:\
92
+ \ [['Peshwai', 'entry of Akharas with full grandeur', 'event', 'first major attraction\
93
+ \ of the Mahakumbh'], ['Akhada Darshan', 'Akharas'], , ['Akhand', 'Akhara', 'Kalpwasi\
94
+ \ Camp', 'Naga', 'Nagas', 'Sadhu', 'sadhus']]"
95
+ - Yes, towing services are available if your vehicle breaks down in the parking
96
+ lot.
97
+ - source_sentence: How long does it typically take to enter or exit the parking area
98
+ during peak times?
99
+ sentences:
100
+ - In a remote village, the annual kite festival attracts many visitors who come
101
+ to see the vibrant displays. The event showcases dozens of kites soaring high,
102
+ each crafted with unique designs. Local artisans prepare for months, selecting
103
+ colors and materials to make the best creations. Everyone enjoys the lively atmosphere
104
+ filled with music and laughter.
105
+ - 'What is the history and significance of the University of Allahabad?
106
+
107
+ Established in 1887, University of Allahabad is a prestigious educational institution.
108
+ It has a grand campus with prominent architectural structures:
109
+
110
+ The Science Faculty, formerly known as Muir Central College, is a notable building
111
+ showcasing Indo-Saracenic architecture. The structure includes a central 200 ft.
112
+ tower, and the interiors are adorned with marble and mosaic from Mirzapur.
113
+
114
+ The Arts Faculty and other buildings, constructed between 1910 and 1915, are renowned
115
+ for their architectural significance. It’s also historically significant as Rudyard
116
+ Kipling stayed here during 1888-89.'
117
+ - The time to enter or exit the parking area during peak times can vary based on
118
+ crowd density, time of day, and traffic management. Generally, it takes about
119
+ 2 to 10 minutes.
120
+ pipeline_tag: sentence-similarity
121
+ library_name: sentence-transformers
122
+ metrics:
123
+ - cosine_accuracy@1
124
+ - cosine_accuracy@5
125
+ - cosine_accuracy@10
126
+ - cosine_precision@1
127
+ - cosine_precision@5
128
+ - cosine_precision@10
129
+ - cosine_recall@1
130
+ - cosine_recall@5
131
+ - cosine_recall@10
132
+ - cosine_ndcg@5
133
+ - cosine_ndcg@10
134
+ - cosine_ndcg@100
135
+ - cosine_mrr@5
136
+ - cosine_mrr@10
137
+ - cosine_mrr@100
138
+ - cosine_map@100
139
+ model-index:
140
+ - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
141
+ results:
142
+ - task:
143
+ type: information-retrieval
144
+ name: Information Retrieval
145
+ dataset:
146
+ name: val evaluator
147
+ type: val_evaluator
148
+ metrics:
149
+ - type: cosine_accuracy@1
150
+ value: 0.3443557582668187
151
+ name: Cosine Accuracy@1
152
+ - type: cosine_accuracy@5
153
+ value: 0.7229190421892816
154
+ name: Cosine Accuracy@5
155
+ - type: cosine_accuracy@10
156
+ value: 0.8038768529076397
157
+ name: Cosine Accuracy@10
158
+ - type: cosine_precision@1
159
+ value: 0.3443557582668187
160
+ name: Cosine Precision@1
161
+ - type: cosine_precision@5
162
+ value: 0.14458380843785631
163
+ name: Cosine Precision@5
164
+ - type: cosine_precision@10
165
+ value: 0.08038768529076395
166
+ name: Cosine Precision@10
167
+ - type: cosine_recall@1
168
+ value: 0.3443557582668187
169
+ name: Cosine Recall@1
170
+ - type: cosine_recall@5
171
+ value: 0.7229190421892816
172
+ name: Cosine Recall@5
173
+ - type: cosine_recall@10
174
+ value: 0.8038768529076397
175
+ name: Cosine Recall@10
176
+ - type: cosine_ndcg@5
177
+ value: 0.5504290811876199
178
+ name: Cosine Ndcg@5
179
+ - type: cosine_ndcg@10
180
+ value: 0.5765613499697346
181
+ name: Cosine Ndcg@10
182
+ - type: cosine_ndcg@100
183
+ value: 0.614171229811746
184
+ name: Cosine Ndcg@100
185
+ - type: cosine_mrr@5
186
+ value: 0.4926263778031162
187
+ name: Cosine Mrr@5
188
+ - type: cosine_mrr@10
189
+ value: 0.5033795768402376
190
+ name: Cosine Mrr@10
191
+ - type: cosine_mrr@100
192
+ value: 0.5113051664568566
193
+ name: Cosine Mrr@100
194
+ - type: cosine_map@100
195
+ value: 0.5113051664568576
196
+ name: Cosine Map@100
197
+ ---
198
+
199
+ # SentenceTransformer based on BAAI/bge-small-en-v1.5
200
+
201
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
202
+
203
+ ## Model Details
204
+
205
+ ### Model Description
206
+ - **Model Type:** Sentence Transformer
207
+ - **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
208
+ - **Maximum Sequence Length:** 512 tokens
209
+ - **Output Dimensionality:** 384 dimensions
210
+ - **Similarity Function:** Cosine Similarity
211
+ <!-- - **Training Dataset:** Unknown -->
212
+ <!-- - **Language:** Unknown -->
213
+ <!-- - **License:** Unknown -->
214
+
215
+ ### Model Sources
216
+
217
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
218
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
219
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
220
+
221
+ ### Full Model Architecture
222
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
230
+
231
+ ## Usage
232
+
233
+ ### Direct Usage (Sentence Transformers)
234
+
235
+ First install the Sentence Transformers library:
236
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
240
+
241
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("himanshu23099/bge_embedding_finetune_v2")
+ # Run inference
+ sentences = [
+     'How long does it typically take to enter or exit the parking area during peak times?',
+     'The time to enter or exit the parking area during peak times can vary based on crowd density, time of day, and traffic management. Generally, it takes about 2 to 10 minutes.',
+     'In a remote village, the annual kite festival attracts many visitors who come to see the vibrant displays. The event showcases dozens of kites soaring high, each crafted with unique designs. Local artisans prepare for months, selecting colors and materials to make the best creations. Everyone enjoys the lively atmosphere filled with music and laughter.',
+ ]
+ # encode() returns a NumPy array by default: one 384-dimensional row per sentence
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the pairwise similarity scores for the embeddings (returned as a torch tensor)
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
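The model card names cosine similarity as the similarity function, so the scores in the snippet above are cosines between embedding vectors. As a rough illustration of how such scores could be used to rank passages for a query, here is a sketch in plain Python. The helper names and the 3-dimensional toy vectors are hypothetical and are not produced by the model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)

# Toy vectors (hypothetical stand-ins for real embeddings).
query = [1.0, 0.0, 1.0]
passages = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [2.0, 0.0, 2.0],   # same direction as the query
    [1.0, 1.0, 0.0],   # partially aligned with the query
]

ranking = rank_passages(query, passages)
print(ranking)
```

Because the architecture ends with `Normalize()`, real embeddings from this model already have unit length, so the cosine reduces to a plain dot product.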
262
+
263
+ <!--
264
+ ### Direct Usage (Transformers)
265
+
266
+ <details><summary>Click to see the direct usage in Transformers</summary>
267
+
268
+ </details>
269
+ -->
270
+
271
+ <!--
272
+ ### Downstream Usage (Sentence Transformers)
273
+
274
+ You can finetune this model on your own dataset.
275
+
276
+ <details><summary>Click to expand</summary>
277
+
278
+ </details>
279
+ -->
280
+
281
+ <!--
282
+ ### Out-of-Scope Use
283
+
284
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
285
+ -->
286
+
287
+ ## Evaluation
288
+
289
+ ### Metrics
290
+
291
+ #### Information Retrieval
292
+
293
+ * Dataset: `val_evaluator`
294
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
295
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.3444     |
+ | cosine_accuracy@5   | 0.7229     |
+ | cosine_accuracy@10  | 0.8039     |
+ | cosine_precision@1  | 0.3444     |
+ | cosine_precision@5  | 0.1446     |
+ | cosine_precision@10 | 0.0804     |
+ | cosine_recall@1     | 0.3444     |
+ | cosine_recall@5     | 0.7229     |
+ | cosine_recall@10    | 0.8039     |
+ | cosine_ndcg@5       | 0.5504     |
+ | cosine_ndcg@10      | 0.5766     |
+ | **cosine_ndcg@100** | **0.6142** |
+ | cosine_mrr@5        | 0.4926     |
+ | cosine_mrr@10       | 0.5034     |
+ | cosine_mrr@100      | 0.5113     |
+ | cosine_map@100      | 0.5113     |
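In the table above, recall@k equals accuracy@k, precision@k equals accuracy@k divided by k (for example 0.7229 / 5 ≈ 0.1446), and MRR@100 matches MAP@100. That pattern is what one would expect if every validation query has exactly one relevant passage; this is an inference from the numbers, not something the commit states. Under that assumption, the metrics can be sketched as follows. The function names and document ids are hypothetical, and this is not the `InformationRetrievalEvaluator` implementation.

```python
def first_hit_ranks(rankings, relevant):
    """1-based rank of the relevant document per query, or None if it is absent."""
    ranks = []
    for ranked_ids, target in zip(rankings, relevant):
        ranks.append(ranked_ids.index(target) + 1 if target in ranked_ids else None)
    return ranks

def accuracy_at_k(ranks, k):
    """Share of queries whose relevant document appears in the top k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting 0 when the hit falls outside the top k."""
    total = sum(1.0 / r if r is not None and r <= k else 0.0 for r in ranks)
    return total / len(ranks)

# Toy retrieval results for 4 queries (hypothetical document ids).
rankings = [
    ["d1", "d7", "d3"],
    ["d4", "d2", "d9"],
    ["d5", "d6", "d8"],
    ["d3", "d1", "d2"],
]
relevant = ["d1", "d2", "d0", "d2"]

ranks = first_hit_ranks(rankings, relevant)
acc1 = accuracy_at_k(ranks, 1)
acc3 = accuracy_at_k(ranks, 3)
prec3 = acc3 / 3   # valid only with one relevant document per query
mrr3 = mrr_at_k(ranks, 3)
print(ranks, acc1, acc3, prec3, mrr3)
```

With a single relevant document per query, recall@k is the same quantity as accuracy@k, which is why the sketch does not compute it separately.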
314
+
315
+ <!--
316
+ ## Bias, Risks and Limitations
317
+
318
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
319
+ -->
320
+
321
+ <!--
322
+ ### Recommendations
323
+
324
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
325
+ -->
326
+
327
+ ## Training Details
328
+
329
+ ### Training Dataset
330
+
331
+ #### Unnamed Dataset
332
+
333
+
334
+ * Size: 3,507 training samples
335
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
336
+ * Approximate statistics based on the first 1000 samples:
337
+ | | anchor | positive | negative |
338
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
339
+ | type | string | string | string |
340
+ | details | <ul><li>min: 5 tokens</li><li>mean: 12.02 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 117.69 tokens</li><li>max: 504 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 119.62 tokens</li><li>max: 422 tokens</li></ul> |
341
+ * Samples:
342
+ | anchor | positive | negative |
343
+ |:-------------------------------------------------------------------|||
344
+ | <code>Tour departs how city</code> | <code>What is the itinerary for 1-day Maihar tour?<br> Maihar tour departs from Hotel Ilawart, Prayagraj at 7:00 AM and includes visit to Maa Sharda Devi Temple located atop Trikoota Hill. For more details and booking, click here: https://bit.ly/3YBcbI6 <br> List of Aliases: [['Allahabad', 'PYG', 'Prayagraj']]</code> | <code>What one-day outstation tours are available from Prayagraj?<br>The one-day outstation tours from Prayagraj include destinations such as Ayodhya, Varanasi, Maihar, and Chitrakoot. These tours offer a quick yet enriching journey to some of the most significant spiritual and cultural sites near Prayagraj.<br><br>For more details, visit : https://bit.ly/4eWFRoH</code> |
345
+ | <code>How train for Prayag reach</code> | <code>Which airlines operate flights to Prayagraj?<br> Several airlines operate flights to Prayagraj, India. However, availability may depend on your location and the time of travel. Some of the airlines that typically operate flights to Prayagraj include:<br> <br> 1. Air India<br> 2. IndiGo<br> 3. SpiceJet<br> <br> For the most accurate and up-to-date information on train timings to Prayagraj, please visit the IRCTC website <https://www.irctc.co.in/nget/> <br> List of Aliases: [['Allahabad', 'PYG', 'Prayagraj']]</code> | <code>What is the best train route to Prayagraj from Ayodhya?<br>To travel by train from Ayodhya to Prayagraj, you can use the Indian Railways' services. Here is a general guide for the route:<br><br>1. Ayodhya Cantt (AY) to Prayagraj Junction (PRYJ) via Train No. 14203: This is one of the direct trains to Prayagraj from Ayodhya. It generally runs on Tuesday and Friday.<br><br>2. Ayodhya Cantt (AY) to Prayagraj Rambag (PRRB) via Train No. 14205: This train runs regularly and is another direct route to Prayagraj.<br><br>For the most accurate and up-to-date information on train timings to Prayagraj, please visit the IRCTC website <https://www.irctc.co.in/nget/></code> |
346
+   | <code>Why should one do the Prayagraj Panchkoshi Parikrama?</code> | <code>The Prayagraj Panchkoshi Parikrama is a deeply revered spiritual journey that offers multiple benefits to devotees. It is believed to grant blessings equivalent to visiting all sacred pilgrimage sites in India, providing divine grace and spiritual merit. The Parikrama route covers significant temples like the Dwadash Madhav temples, Akshayavat, and Mankameshwar, which are steeped in Hindu mythology and history, allowing pilgrims to connect with the spiritual and cultural heritage of Prayagraj. This circumambulation around sacred sites is also seen as a way to cleanse one's sins and progress towards Moksha (liberation from the cycle of birth and rebirth), making it a path of introspection and spiritual growth. The pilgrimage fosters unity among people from diverse backgrounds, offering a unique cultural exchange and shared spiritual experience. By participating, devotees also help revive an ancient tradition integral to the Kumbh Mela for centuries, reconnecting with age-old practices t...</code> | <code>Elevators are remarkable inventions that revolutionized how we navigate tall buildings. They provide a swift, efficient means of transportation between floors, making urban life more accessible. These mechanical wonders operate on a system of pulleys and counterweights, enabling them to carry heavy loads effortlessly. Safety features like emergency brakes and backup power systems ensure that passengers remain secure during their journey. Various designs and styles can be seen in buildings around the world, from sleek modern glass models to vintage models that evoke nostalgia. Elevators also highlight the advancement of engineering and technology over time, evolving from rudimentary designs to sophisticated machines with smart technology. They are essential in various settings, including residential, commercial, and industrial spaces, offering convenience and practicality. Their presence also allows for the efficient use of vertical space, fostering creativity in architectural designs a...</code> |
347
+ * Loss: [<code>GISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+   ```json
+   {'guide': SentenceTransformer(
+     (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+     (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+     (2): Normalize()
+   ), 'temperature': 0.01}
+   ```
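The loss parameters above pair the trained model with a separate guide model and a temperature of 0.01. The general idea behind guided in-batch contrastive training is that the guide scores each candidate negative, and candidates it rates as more similar to the anchor than the labelled positive are dropped as likely false negatives before the contrastive loss is computed. The following is a deliberately simplified single-anchor sketch of that idea in plain Python. It is not the `GISTEmbedLoss` source (the library's batched implementation differs in detail), and all similarity values are hypothetical.

```python
import math

def gist_style_loss(model_sims, guide_sims, positive_index, temperature=0.01):
    """Contrastive loss for one anchor with guide-based negative filtering.

    model_sims / guide_sims hold the anchor's similarity to each candidate,
    as scored by the model being trained and by the guide model.
    """
    guide_positive = guide_sims[positive_index]
    logits = []
    for i, (m_sim, g_sim) in enumerate(zip(model_sims, guide_sims)):
        # Skip negatives the guide rates above the labelled positive:
        # they are treated as likely false negatives.
        if i != positive_index and g_sim > guide_positive:
            continue
        logits.append(m_sim / temperature)
    # Softmax cross-entropy over the remaining candidates,
    # using log-sum-exp for numerical stability.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum - model_sims[positive_index] / temperature

model_sims = [0.80, 0.79, 0.10]   # hypothetical scores from the trained model
guide_sims = [0.70, 0.90, 0.05]   # hypothetical scores from the guide model

# Candidate 1 is filtered out because the guide prefers it to the positive.
loss_filtered = gist_style_loss(model_sims, guide_sims, positive_index=0)
# With a guide that does not flag candidate 1, it stays in as a hard negative.
loss_unfiltered = gist_style_loss(model_sims, [0.70, 0.10, 0.05], positive_index=0)
print(loss_filtered, loss_unfiltered)
```

The small temperature sharpens the softmax, so a near-tie between the positive and a hard negative already produces a noticeable loss, which is one reason filtering false negatives matters at this setting.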
355
+
356
+ ### Evaluation Dataset
357
+
358
+ #### Unnamed Dataset
359
+
360
+
361
+ * Size: 877 evaluation samples
362
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
363
+ * Approximate statistics based on the first 877 samples:
364
+ | | anchor | positive | negative |
365
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
366
+ | type | string | string | string |
367
+ | details | <ul><li>min: 4 tokens</li><li>mean: 12.13 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 117.82 tokens</li><li>max: 504 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 117.68 tokens</li><li>max: 422 tokens</li></ul> |
368
+ * Samples:
369
+ | anchor | positive | negative |
370
+ |:-------------------------------------------------------------------------------------------------------|||
371
+ | <code>Akhara means what</code> | <code>Is the word Akhara related to Akhand?<br> Many scholars believe that the word 'Akhara' originated from the word 'Akhand.' Initially, a group of armed ascetics was referred to as 'Akhand.' Over time, when these 'Akhand' groups evolved into centers for training in weaponry and martial arts, they came to be known as 'Akhara.' <br> List of Aliases: [['Akhand', 'Akhara', 'Kalpwasi Camp', 'Naga', 'Nagas', 'Sadhu', 'sadhus']]</code> | <code>Why did Adi Shankaracharya organize the Akharas?<br>According to the evidence available in the Akharas and the descriptions mentioned in their history, centuries ago, Adi Shankaracharya established these Akharas with the purpose of protecting Hindu temples and monasteries from foreign and non-believer invaders, as well as safeguarding the followers of Hinduism.<br> <br> Adi Shankaracharya believed that young saints should not only be proficient in scriptures (Shastra) but also in the art of weaponry (Shastra), so they could fulfill the duty of protecting the monasteries, temples, and their followers when necessary.</code> |
372
+   | <code>Why do so many people gather for this?</code> | <code>Millions gather for the Kumbh Mela due to its profound spiritual, cultural, and social significance. Rooted in ancient Hindu mythology, the Mela is believed to be an auspicious time when bathing in the sacred rivers—Ganga, Yamuna, and Saraswati—can cleanse sins and lead to spiritual liberation (Moksha). The event, occurring during rare celestial alignments, amplifies these spiritual benefits. It is a unique confluence of faith, where people from diverse backgrounds come together, creating a “mini-India” that fosters unity in diversity. \n The Mela also offers opportunities for spiritual learning through discourses by saints, religious rituals like Kalpvas, Deep Daan, and cultural performances. Moreover, the Kumbh Mela is a rare platform for connecting with spiritual leaders, experiencing religious tolerance, and participating in one of the world's largest peaceful gatherings, making it a must-attend event for millions seeking spiritual growth, community, and divine blessings.</code> | <code>In the bustling world of urban development, architects and city planners often seek innovative solutions to optimize living spaces. The integration of green spaces within urban environments not only enhances aesthetic appeal but also significantly improves residents' quality of life. Vertical gardens, rooftops, and community parks play a crucial role in providing habitats for local wildlife while promoting biodiversity in densely populated areas. <br><br>Furthermore, advancements in sustainable technology, such as solar panels and rainwater harvesting systems, are being incorporated into these designs, offering environmentally friendly alternatives that reduce utility costs for residents. Public art installations also contribute to community identity, fostering a sense of belonging among citizens. <br><br>Collaborative efforts between various stakeholders—governments, private sectors, and local communities—are essential to ensure these projects reflect the needs and desires of the people. The succ...</code> |
373
+ | <code>Do parking charges vary between different parking zones or proximity to the Mela grounds?</code> | <code>No, the parking charges are standardized and remain the same throughout, regardless of the parking zone or proximity to the Mela grounds. Charges are fixed at ₹5 for cycles, ₹15 for two-wheelers, ₹65 for 3-4 wheelers, and ₹260 for buses and heavy vehicles for 24 hours.</code> | <code>The ancient art of pottery involves molding clay into various shapes before firing it in a kiln. Traditionally, artisans use hand tools and techniques passed down through generations. Each region often has its own distinctive styles, resulting in a rich diversity of forms, glazes, and colors. Pottery can serve practical purposes, such as in cooking and storage, while also being a medium for artistic expression and cultural storytelling.</code> |
374
+ * Loss: [<code>GISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+   ```json
+   {'guide': SentenceTransformer(
+     (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+     (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+     (2): Normalize()
+   ), 'temperature': 0.01}
+   ```
382
+
383
+ ### Training Hyperparameters
384
+ #### Non-Default Hyperparameters
385
+
386
+ - `eval_strategy`: steps
387
+ - `per_device_train_batch_size`: 16
388
+ - `gradient_accumulation_steps`: 2
389
+ - `learning_rate`: 1e-05
390
+ - `weight_decay`: 0.01
391
+ - `num_train_epochs`: 30
392
+ - `warmup_ratio`: 0.1
393
+ - `load_best_model_at_end`: True
394
+
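To make the list above concrete, the sketch below works out the effective batch size, an approximate optimizer-step count, and the learning rate implied by a linear schedule with warmup (the full list in this commit gives `lr_scheduler_type` as linear). It assumes a single device and that a partial final batch still counts toward an update. The exact rounding depends on the trainer version, so the step counts are estimates, not values taken from the training logs. `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Values from the hyperparameter list; the dataset size comes from the model card.
num_samples = 3507
per_device_batch = 16
grad_accum = 2
epochs = 30
warmup_ratio = 0.1
peak_lr = 1e-05

# Assumptions: one device, partial batches are kept (dataloader_drop_last is False).
effective_batch = per_device_batch * grad_accum
batches_per_epoch = math.ceil(num_samples / per_device_batch)
updates_per_epoch = math.ceil(batches_per_epoch / grad_accum)
total_steps = updates_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, total_steps, warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, peak_lr))
```

Under these assumptions the learning rate climbs for the first tenth of training, peaks at `1e-05`, and then falls linearly to zero by the final step.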
395
+ #### All Hyperparameters
396
+ <details><summary>Click to expand</summary>
397
+
398
+ - `overwrite_output_dir`: False
399
+ - `do_predict`: False
400
+ - `eval_strategy`: steps
401
+ - `prediction_loss_only`: True
402
+ - `per_device_train_batch_size`: 16
403
+ - `per_device_eval_batch_size`: 8
404
+ - `per_gpu_train_batch_size`: None
405
+ - `per_gpu_eval_batch_size`: None
406
+ - `gradient_accumulation_steps`: 2
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 1e-05
+ - `weight_decay`: 0.01
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 30
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
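
The hyperparameters above combine `learning_rate: 1e-05`, `lr_scheduler_type: linear`, `warmup_ratio: 0.1`, and `num_train_epochs: 30`. The following is a minimal sketch of what that schedule looks like, not the trainer's own code. It assumes 110 optimizer steps per epoch, a figure inferred from the training log in this card (step 110 is logged at epoch 1.0), and it assumes the usual linear warmup followed by linear decay to zero. The names `linear_lr` and `STEPS_PER_EPOCH` are illustrative.

```python
import math

BASE_LR = 1e-05        # `learning_rate` from the list above
NUM_EPOCHS = 30        # `num_train_epochs`
WARMUP_RATIO = 0.1     # `warmup_ratio` (`warmup_steps` is 0, so the ratio applies)
STEPS_PER_EPOCH = 110  # assumption: inferred from the training log, not stated explicitly


def linear_lr(step, total_steps, warmup_steps, base_lr=BASE_LR):
    """Learning rate at `step` for linear warmup then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = NUM_EPOCHS * STEPS_PER_EPOCH
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

if __name__ == "__main__":
    print("total steps:", total_steps, "warmup steps:", warmup_steps)
    for step in (0, 165, warmup_steps, 1710, total_steps):
        print(step, linear_lr(step, total_steps, warmup_steps))
```

Under these assumptions the planned run is 3300 steps with a 330-step warmup, so the learning rate would still be roughly half of its peak at step 1710, where the log in this card ends.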
+
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+
+ | Epoch | Step | Training Loss | Validation Loss | val_evaluator_cosine_ndcg@100 |
+ |:-------:|:----:|:-------------:|:---------------:|:-----------------------------:|
+ | 0.0909 | 10 | 1.9717 | 1.2192 | 0.4285 |
+ | 0.1818 | 20 | 1.8228 | 1.1896 | 0.4307 |
+ | 0.2727 | 30 | 1.9999 | 1.1429 | 0.4310 |
+ | 0.3636 | 40 | 1.6463 | 1.0845 | 0.4311 |
+ | 0.4545 | 50 | 1.9207 | 1.0205 | 0.4334 |
+ | 0.5455 | 60 | 1.5777 | 0.9509 | 0.4338 |
+ | 0.6364 | 70 | 1.4277 | 0.8810 | 0.4376 |
+ | 0.7273 | 80 | 1.408 | 0.8130 | 0.4432 |
+ | 0.8182 | 90 | 1.3565 | 0.7535 | 0.4436 |
+ | 0.9091 | 100 | 1.3322 | 0.6935 | 0.4495 |
+ | 1.0 | 110 | 0.8344 | 0.6420 | 0.4518 |
+ | 1.0909 | 120 | 1.1696 | 0.5956 | 0.4515 |
+ | 1.1818 | 130 | 0.9622 | 0.5524 | 0.4565 |
+ | 1.2727 | 140 | 0.9005 | 0.5173 | 0.4616 |
+ | 1.3636 | 150 | 0.962 | 0.4802 | 0.4662 |
+ | 1.4545 | 160 | 0.7924 | 0.4497 | 0.4693 |
+ | 1.5455 | 170 | 0.8955 | 0.4262 | 0.4711 |
+ | 1.6364 | 180 | 0.7652 | 0.4031 | 0.4736 |
+ | 1.7273 | 190 | 0.7517 | 0.3804 | 0.4773 |
+ | 1.8182 | 200 | 0.5669 | 0.3636 | 0.4784 |
+ | 1.9091 | 210 | 0.6641 | 0.3469 | 0.4813 |
+ | 2.0 | 220 | 0.5227 | 0.3267 | 0.4820 |
+ | 2.0909 | 230 | 0.6146 | 0.3075 | 0.4843 |
+ | 2.1818 | 240 | 0.4709 | 0.2908 | 0.4882 |
+ | 2.2727 | 250 | 0.5963 | 0.2780 | 0.4955 |
+ | 2.3636 | 260 | 0.5103 | 0.2668 | 0.4977 |
+ | 2.4545 | 270 | 0.4833 | 0.2566 | 0.5027 |
+ | 2.5455 | 280 | 0.4389 | 0.2431 | 0.5045 |
+ | 2.6364 | 290 | 0.4653 | 0.2317 | 0.5059 |
+ | 2.7273 | 300 | 0.3559 | 0.2263 | 0.5086 |
+ | 2.8182 | 310 | 0.4623 | 0.2197 | 0.5127 |
+ | 2.9091 | 320 | 0.3889 | 0.2103 | 0.5183 |
+ | 3.0 | 330 | 0.4014 | 0.2037 | 0.5206 |
+ | 3.0909 | 340 | 0.2977 | 0.1999 | 0.5228 |
+ | 3.1818 | 350 | 0.4656 | 0.1956 | 0.5266 |
+ | 3.2727 | 360 | 0.436 | 0.1873 | 0.5288 |
+ | 3.3636 | 370 | 0.3111 | 0.1803 | 0.5311 |
+ | 3.4545 | 380 | 0.333 | 0.1759 | 0.5325 |
+ | 3.5455 | 390 | 0.2899 | 0.1717 | 0.5381 |
+ | 3.6364 | 400 | 0.4245 | 0.1663 | 0.5419 |
+ | 3.7273 | 410 | 0.4247 | 0.1658 | 0.5421 |
+ | 3.8182 | 420 | 0.2251 | 0.1646 | 0.5442 |
+ | 3.9091 | 430 | 0.2784 | 0.1635 | 0.5448 |
+ | 4.0 | 440 | 0.2503 | 0.1613 | 0.5490 |
+ | 4.0909 | 450 | 0.2342 | 0.1588 | 0.5501 |
+ | 4.1818 | 460 | 0.3139 | 0.1584 | 0.5527 |
+ | 4.2727 | 470 | 0.2356 | 0.1552 | 0.5498 |
+ | 4.3636 | 480 | 0.3147 | 0.1496 | 0.5518 |
+ | 4.4545 | 490 | 0.2691 | 0.1469 | 0.5508 |
+ | 4.5455 | 500 | 0.2639 | 0.1466 | 0.5561 |
+ | 4.6364 | 510 | 0.1581 | 0.1432 | 0.5625 |
+ | 4.7273 | 520 | 0.1922 | 0.1406 | 0.5663 |
+ | 4.8182 | 530 | 0.2453 | 0.1406 | 0.5688 |
+ | 4.9091 | 540 | 0.2631 | 0.1399 | 0.5705 |
+ | 5.0 | 550 | 0.3324 | 0.1402 | 0.5681 |
+ | 5.0909 | 560 | 0.1801 | 0.1389 | 0.5715 |
+ | 5.1818 | 570 | 0.2096 | 0.1371 | 0.5736 |
+ | 5.2727 | 580 | 0.2167 | 0.1344 | 0.5743 |
+ | 5.3636 | 590 | 0.1553 | 0.1297 | 0.5791 |
+ | 5.4545 | 600 | 0.1903 | 0.1263 | 0.5790 |
+ | 5.5455 | 610 | 0.1388 | 0.1241 | 0.5816 |
+ | 5.6364 | 620 | 0.2642 | 0.1231 | 0.5809 |
+ | 5.7273 | 630 | 0.2119 | 0.1238 | 0.5792 |
+ | 5.8182 | 640 | 0.1767 | 0.1216 | 0.5809 |
+ | 5.9091 | 650 | 0.2167 | 0.1218 | 0.5810 |
+ | 6.0 | 660 | 0.26 | 0.1232 | 0.5793 |
+ | 6.0909 | 670 | 0.1603 | 0.1222 | 0.5807 |
+ | 6.1818 | 680 | 0.1534 | 0.1209 | 0.5794 |
+ | 6.2727 | 690 | 0.1742 | 0.1165 | 0.5821 |
+ | 6.3636 | 700 | 0.1133 | 0.1120 | 0.5824 |
+ | 6.4545 | 710 | 0.1198 | 0.1106 | 0.5817 |
+ | 6.5455 | 720 | 0.2019 | 0.1114 | 0.5832 |
+ | 6.6364 | 730 | 0.2268 | 0.1116 | 0.5823 |
+ | 6.7273 | 740 | 0.1779 | 0.1077 | 0.5887 |
+ | 6.8182 | 750 | 0.1586 | 0.1048 | 0.5892 |
+ | 6.9091 | 760 | 0.2074 | 0.1057 | 0.5872 |
+ | 7.0 | 770 | 0.1625 | 0.1091 | 0.5881 |
+ | 7.0909 | 780 | 0.2266 | 0.1079 | 0.5900 |
+ | 7.1818 | 790 | 0.148 | 0.1054 | 0.5895 |
+ | 7.2727 | 800 | 0.1248 | 0.1048 | 0.5916 |
+ | 7.3636 | 810 | 0.1753 | 0.1047 | 0.5956 |
+ | 7.4545 | 820 | 0.109 | 0.1045 | 0.5981 |
+ | 7.5455 | 830 | 0.1369 | 0.1056 | 0.5953 |
+ | 7.6364 | 840 | 0.1209 | 0.1068 | 0.5946 |
+ | 7.7273 | 850 | 0.182 | 0.1079 | 0.5952 |
+ | 7.8182 | 860 | 0.1116 | 0.1083 | 0.5978 |
+ | 7.9091 | 870 | 0.1813 | 0.1033 | 0.5985 |
+ | 8.0 | 880 | 0.1559 | 0.1010 | 0.6027 |
+ | 8.0909 | 890 | 0.1384 | 0.1019 | 0.6017 |
+ | 8.1818 | 900 | 0.1057 | 0.1034 | 0.6004 |
+ | 8.2727 | 910 | 0.1359 | 0.1033 | 0.5994 |
+ | 8.3636 | 920 | 0.0909 | 0.1008 | 0.6011 |
+ | 8.4545 | 930 | 0.0995 | 0.0986 | 0.6030 |
+ | 8.5455 | 940 | 0.1261 | 0.0973 | 0.6046 |
+ | 8.6364 | 950 | 0.1031 | 0.0955 | 0.6013 |
+ | 8.7273 | 960 | 0.1163 | 0.0949 | 0.6018 |
+ | 8.8182 | 970 | 0.1493 | 0.0963 | 0.6041 |
+ | 8.9091 | 980 | 0.13 | 0.0967 | 0.6044 |
+ | 9.0 | 990 | 0.1059 | 0.0937 | 0.6044 |
+ | 9.0909 | 1000 | 0.1287 | 0.0923 | 0.6045 |
+ | 9.1818 | 1010 | 0.1019 | 0.0924 | 0.6086 |
+ | 9.2727 | 1020 | 0.1645 | 0.0921 | 0.6086 |
+ | 9.3636 | 1030 | 0.1395 | 0.0931 | 0.6075 |
+ | 9.4545 | 1040 | 0.1067 | 0.0935 | 0.6051 |
+ | 9.5455 | 1050 | 0.1334 | 0.0930 | 0.6058 |
+ | 9.6364 | 1060 | 0.136 | 0.0919 | 0.6069 |
+ | 9.7273 | 1070 | 0.0968 | 0.0930 | 0.6052 |
+ | 9.8182 | 1080 | 0.1447 | 0.0946 | 0.6077 |
+ | 9.9091 | 1090 | 0.1288 | 0.0967 | 0.6049 |
+ | 10.0 | 1100 | 0.1001 | 0.0960 | 0.6034 |
+ | 10.0909 | 1110 | 0.1642 | 0.0952 | 0.6000 |
+ | 10.1818 | 1120 | 0.1737 | 0.0926 | 0.6028 |
+ | 10.2727 | 1130 | 0.1283 | 0.0906 | 0.6023 |
+ | 10.3636 | 1140 | 0.0959 | 0.0906 | 0.6073 |
+ | 10.4545 | 1150 | 0.0875 | 0.0927 | 0.6065 |
+ | 10.5455 | 1160 | 0.1284 | 0.0934 | 0.6058 |
+ | 10.6364 | 1170 | 0.1482 | 0.0937 | 0.6049 |
+ | 10.7273 | 1180 | 0.1089 | 0.0925 | 0.6018 |
+ | 10.8182 | 1190 | 0.0876 | 0.0896 | 0.6068 |
+ | 10.9091 | 1200 | 0.0849 | 0.0897 | 0.6062 |
+ | 11.0 | 1210 | 0.1041 | 0.0897 | 0.6073 |
+ | 11.0909 | 1220 | 0.107 | 0.0889 | 0.6043 |
+ | 11.1818 | 1230 | 0.1018 | 0.0868 | 0.6059 |
+ | 11.2727 | 1240 | 0.0835 | 0.0846 | 0.6106 |
+ | 11.3636 | 1250 | 0.1455 | 0.0831 | 0.6069 |
+ | 11.4545 | 1260 | 0.1071 | 0.0832 | 0.6051 |
+ | 11.5455 | 1270 | 0.0777 | 0.0839 | 0.6054 |
+ | 11.6364 | 1280 | 0.1218 | 0.0855 | 0.6051 |
+ | 11.7273 | 1290 | 0.0702 | 0.0862 | 0.6048 |
+ | 11.8182 | 1300 | 0.1017 | 0.0865 | 0.6068 |
+ | 11.9091 | 1310 | 0.1452 | 0.0860 | 0.6074 |
+ | 12.0 | 1320 | 0.1563 | 0.0855 | 0.6073 |
+ | 12.0909 | 1330 | 0.1026 | 0.0858 | 0.6102 |
+ | 12.1818 | 1340 | 0.108 | 0.0861 | 0.6062 |
+ | 12.2727 | 1350 | 0.078 | 0.0854 | 0.6055 |
+ | 12.3636 | 1360 | 0.0655 | 0.0847 | 0.6082 |
+ | 12.4545 | 1370 | 0.1075 | 0.0836 | 0.6085 |
+ | 12.5455 | 1380 | 0.0875 | 0.0846 | 0.6049 |
+ | 12.6364 | 1390 | 0.1082 | 0.0828 | 0.6096 |
+ | 12.7273 | 1400 | 0.1133 | 0.0816 | 0.6077 |
+ | 12.8182 | 1410 | 0.0931 | 0.0814 | 0.6106 |
+ | 12.9091 | 1420 | 0.0728 | 0.0818 | 0.6085 |
+ | 13.0 | 1430 | 0.1338 | 0.0827 | 0.6082 |
+ | 13.0909 | 1440 | 0.1232 | 0.0813 | 0.6076 |
+ | 13.1818 | 1450 | 0.093 | 0.0796 | 0.6110 |
+ | 13.2727 | 1460 | 0.0994 | 0.0793 | 0.6090 |
+ | 13.3636 | 1470 | 0.0424 | 0.0806 | 0.6109 |
+ | 13.4545 | 1480 | 0.0598 | 0.0833 | 0.6086 |
+ | 13.5455 | 1490 | 0.0813 | 0.0841 | 0.6093 |
+ | 13.6364 | 1500 | 0.0913 | 0.0817 | 0.6125 |
+ | 13.7273 | 1510 | 0.1048 | 0.0801 | 0.6133 |
+ | 13.8182 | 1520 | 0.0503 | 0.0800 | 0.6110 |
+ | 13.9091 | 1530 | 0.0954 | 0.0800 | 0.6111 |
+ | 14.0 | 1540 | 0.067 | 0.0791 | 0.6099 |
+ | 14.0909 | 1550 | 0.0808 | 0.0779 | 0.6111 |
+ | 14.1818 | 1560 | 0.1047 | 0.0783 | 0.6110 |
+ | 14.2727 | 1570 | 0.0685 | 0.0791 | 0.6125 |
+ | 14.3636 | 1580 | 0.1215 | 0.0793 | 0.6120 |
+ | 14.4545 | 1590 | 0.0761 | 0.0794 | 0.6157 |
+ | 14.5455 | 1600 | 0.0705 | 0.0790 | 0.6136 |
+ | 14.6364 | 1610 | 0.0722 | 0.0785 | 0.6098 |
+ | 14.7273 | 1620 | 0.0881 | 0.0785 | 0.6120 |
+ | 14.8182 | 1630 | 0.0668 | 0.0791 | 0.6122 |
+ | 14.9091 | 1640 | 0.1261 | 0.0787 | 0.6152 |
+ | 15.0 | 1650 | 0.0601 | 0.0784 | 0.6148 |
+ | 15.0909 | 1660 | 0.0701 | 0.0799 | 0.6167 |
+ | 15.1818 | 1670 | 0.1244 | 0.0794 | 0.6160 |
+ | 15.2727 | 1680 | 0.0531 | 0.0788 | 0.6174 |
+ | 15.3636 | 1690 | 0.0518 | 0.0780 | 0.6154 |
+ | 15.4545 | 1700 | 0.0961 | 0.0784 | 0.6142 |
+ | 15.5455 | 1710 | 0.1041 | - | - |
+
+ </details>
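
Because `load_best_model_at_end` is `True`, the saved weights presumably correspond to the best evaluated checkpoint rather than the last logged step. This section does not show which metric decided "best", so the sketch below simply ranks a handful of rows copied from the log in both ways. The helper names are illustrative and are not part of any library.

```python
# (step, training_loss, validation_loss, ndcg_at_100), copied from the training log.
LOG_ROWS = [
    (1550, 0.0808, 0.0779, 0.6111),
    (1590, 0.0761, 0.0794, 0.6157),
    (1680, 0.0531, 0.0788, 0.6174),
    (1690, 0.0518, 0.0780, 0.6154),
    (1700, 0.0961, 0.0784, 0.6142),
]


def best_by_validation_loss(rows):
    """Row with the lowest validation loss (lower is better)."""
    return min(rows, key=lambda row: row[2])


def best_by_ndcg(rows):
    """Row with the highest NDCG@100 (higher is better)."""
    return max(rows, key=lambda row: row[3])


if __name__ == "__main__":
    print("lowest validation loss at step", best_by_validation_loss(LOG_ROWS)[0])
    print("highest NDCG@100 at step", best_by_ndcg(LOG_ROWS)[0])
```

The two criteria pick different steps here (1550 and 1680), so the choice of selection metric matters when comparing this checkpoint with other runs.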
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.3.0
+ - Transformers: 4.46.2
+ - PyTorch: 2.5.1+cu121
+ - Accelerate: 1.1.1
+ - Datasets: 3.1.0
+ - Tokenizers: 0.20.3
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### GISTEmbedLoss
+ ```bibtex
+ @misc{solatorio2024gistembed,
+     title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
+     author={Aivin V. Solatorio},
+     year={2024},
+     eprint={2402.16829},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "BAAI/bge-small-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.46.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.0",
+     "transformers": "4.46.2",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:94adacda2ea8c6ad6837c2c1636afacb8ae8f0ad0661fe6d28fcd9526ce9f191
+ size 133462128
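
As a rough sanity check, the LFS pointer size can be compared with a parameter count derived from `config.json`. The sketch below assumes the standard `BertModel` layout (word, position, and token-type embeddings with LayerNorm; per-layer self-attention and feed-forward blocks, each followed by LayerNorm; and a pooler head) and 4 bytes per `float32` weight. The helper names are illustrative, and the count is an estimate, not a value read from the checkpoint.

```python
# Values copied from config.json in this commit.
VOCAB_SIZE = 30522
MAX_POSITIONS = 512
TYPE_VOCAB_SIZE = 2
HIDDEN = 384
INTERMEDIATE = 1536
NUM_LAYERS = 12

# Value copied from the model.safetensors LFS pointer.
SAFETENSORS_BYTES = 133462128
BYTES_PER_FLOAT32 = 4


def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Scale plus shift of a LayerNorm."""
    return 2 * n


def bert_param_count():
    """Estimated parameter count, assuming the standard BertModel layout."""
    embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB_SIZE) * HIDDEN + layer_norm(HIDDEN)
    attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)  # Q, K, V, output
    feed_forward = (
        linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm(HIDDEN)
    )
    pooler = linear(HIDDEN, HIDDEN)
    return embeddings + NUM_LAYERS * (attention + feed_forward) + pooler


params = bert_param_count()
weight_bytes = params * BYTES_PER_FLOAT32
overhead = SAFETENSORS_BYTES - weight_bytes

if __name__ == "__main__":
    print("estimated parameters:", params)
    print("estimated weight bytes:", weight_bytes)
    print("bytes not explained by weights:", overhead)
```

Under these assumptions the estimate is about 33.4 million parameters, which leaves roughly 22 KB of the file unaccounted for. That remainder is presumably the safetensors header and metadata.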
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
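
`modules.json` chains three stages: Transformer, Pooling, and Normalize. With `pooling_mode_cls_token: true` in `1_Pooling/config.json` and `similarity_fn_name: "cosine"` in `config_sentence_transformers.json`, the post-transformer part can be illustrated in plain Python. This is a toy sketch on hand-written vectors, not the library's implementation. The function names are illustrative, and it assumes the `[CLS]` vector is the first token's embedding.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: take the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine_of_unit_vectors(a, b):
    """For unit-length vectors, cosine similarity reduces to a dot product."""
    return sum(x * y for x, y in zip(a, b))


# Toy "transformer outputs": one 2-dimensional vector per token, [CLS] first.
sentence_a = [[3.0, 4.0], [9.0, 9.0], [1.0, 0.0]]
sentence_b = [[6.0, 8.0], [0.0, 1.0]]

embedding_a = l2_normalize(cls_pool(sentence_a))
embedding_b = l2_normalize(cls_pool(sentence_b))

if __name__ == "__main__":
    print(embedding_a, embedding_b)
    print(cosine_of_unit_vectors(embedding_a, embedding_b))
```

Because the last stage normalizes every embedding, cosine similarity and dot product rank results identically for this model.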
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
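
The tokenizer config fixes the special-token ids (`[PAD]` = 0, `[CLS]` = 101, `[SEP]` = 102) and a `model_max_length` of 512. The sketch below shows how a single sequence would typically be framed with those ids. It is not the `BertTokenizer` implementation: `frame_ids` is a hypothetical helper, the WordPiece ids passed in are arbitrary placeholders, and right-side padding is an assumption.

```python
# Special-token ids copied from `added_tokens_decoder` in tokenizer_config.json.
PAD_ID = 0
CLS_ID = 101
SEP_ID = 102
MODEL_MAX_LENGTH = 512  # `model_max_length`


def frame_ids(wordpiece_ids, pad_to=None, max_length=MODEL_MAX_LENGTH):
    """Wrap ids as [CLS] ... [SEP], truncate to max_length, optionally right-pad.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens, 0 for padding.
    """
    body = list(wordpiece_ids)[: max_length - 2]  # leave room for [CLS] and [SEP]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None:
        padding = max(0, min(pad_to, max_length) - len(input_ids))
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding
    return input_ids, attention_mask


if __name__ == "__main__":
    # Placeholder ids standing in for the WordPiece ids of a short sentence.
    print(frame_ids([2000, 2001], pad_to=6))
    # An over-long input is truncated so the framed sequence fits in 512 positions.
    print(len(frame_ids(range(1000, 2000))[0]))
```

With CLS pooling, the embedding is read from position 0 of this framed sequence, which is why the `[CLS]` token must always survive truncation.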
vocab.txt ADDED
The diff for this file is too large to render. See raw diff