metadata
base_model: BAAI/bge-small-en-v1.5
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@5
- cosine_ndcg@10
- cosine_ndcg@100
- cosine_mrr@5
- cosine_mrr@10
- cosine_mrr@100
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@5
- dot_recall@10
- dot_ndcg@5
- dot_ndcg@10
- dot_ndcg@100
- dot_mrr@5
- dot_mrr@10
- dot_mrr@100
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1606
- loss:GISTEmbedLoss
widget:
- source_sentence: Do the tours include visits to all the major ghats and Akhara camps?
sentences:
- >-
Yes, many tours do cover all major ghats such as Sangam, Ram Ghat, and
Dashashwamedh Ghat, along with visits to some of the most significant
Akhara camps. These tours offer pilgrims a unique opportunity to witness
the religious and cultural significance of these locations. However, we
recommend reviewing the specific itinerary of your chosen tour for
precise details.
- >-
Yes, many tours do cover all major ghats such as Sangam, Ram Ghat, and
Dashashwamedh Ghat, along with visits to some of the most significant
Akhara camps. These tours offer pilgrims a unique opportunity to witness
the religious and cultural significance of these locations. However, we
recommend reviewing the specific itinerary of your chosen tour for
precise details.
- >-
The orchestra rehearsed late into the night, perfecting their
performance for the upcoming concert. Each musician contributed their
unique sound, creating a harmonious blend of instruments. The conductor
insisted on precision and emotion, ensuring every note resonated with
the audience's heart. Attendees can expect a captivating experience,
filled with dynamic melodies and intricate crescendos that highlight the
orchestra's talent and dedication. For a firsthand experience, consider
arriving early to enjoy the pre-concert discussions.
- source_sentence: What is the significance of the Naga Sadhus in the Shahi Snan?
sentences:
- >-
The Naga Sadhus hold a significant place in the Shahi Snan during the
Kumbh Mela as they are considered the guardians of faith and ancient
traditions within Hinduism. Known for their ash-covered, unclothed
bodies, long matted hair, and intense spiritual practices, the Naga
Sadhus are the first to take the holy dip during the Shahi Snan,
symbolizing purity, renunciation, and spiritual strength. Their
participation is believed to purify the waters of the sacred rivers,
making them spiritually potent for the millions of pilgrims who follow.
The Naga Sadhus’ procession to the river, marked by their vibrant
chants, tridents, and fearless demeanor, is one of the most
awe-inspiring spectacles of the Kumbh Mela. Their presence represents
the commitment to asceticism, devotion, and the protection of religious
traditions, adding a deeper layer of spiritual intensity and
significance to the Shahi Snan ritual.
- >-
During the processions of Peshwai and Shahi Snaans at the Maha Kumbh
Mela, Mahamandaleshwaras play a unique and central role as the spiritual
leaders of their Akharas. They lead their followers in grand, royal
processions to the riverbanks for the Shahi Snan (royal bath),
symbolizing the beginning of the holy ritual. Riding on beautifully
decorated chariots, elephants, or horses, they lead the march with great
reverence and authority, followed by their disciples, saints, and
devotees. The presence of Mahamandaleshwaras in these processions
signifies the spiritual sanctity and importance of the ritual, inspiring
pilgrims to partake in the spiritual energy and blessings of the holy
dip. Their leadership adds a sense of grandeur and divine significance
to the Shahi Snaans, making them the focal point of the Kumbh Mela.
- >-
The vibrant world of reptiles is fascinating to explore, particularly
focusing on the unique adaptations they possess for survival. Snakes,
for instance, exhibit remarkable methods of locomotion, allowing them to
navigate diverse terrains with ease. Some species are known for their
ability to blend into their surroundings, employing camouflage
techniques that render them nearly invisible to both predators and prey.
Additionally, many reptiles display fascinating reproductive behaviors,
with some laying eggs in protected environments while others give birth
to live young. The intricate ecosystems that support these creatures
highlight the interdependence between various species, illustrating the
delicate balance of nature. Understanding these dynamics can enhance our
appreciation for the biodiversity that exists in our world and the
intricate roles each species plays within its habitat.
- source_sentence: Are there any carpool or ride-sharing options to travel to Prayagraj?
sentences:
- >-
In the realm of culinary experiences, exploring the myriad flavors of
Italian cuisine can be quite delightful. One might consider the
following aspects:<br><br>1. Pasta Varieties: There are numerous types
of pasta, from spaghetti to fettuccine, each offering a distinct texture
and taste in dishes.<br>2. Regional Sauces: Different areas of Italy are
known for unique sauces, such as marinara, pesto, and Alfredo, which can
transform a simple meal into a feast. Additionally, using fresh, local
ingredients enhances the flavors.<br>3. Dining Etiquette: Understanding
Italian dining customs, such as the significance of antipasti, can
enrich one's experience while enjoying meals with family and friends.
- >-
Yes, there are multiple carpooling and ride-sharing options you can use
to travel to Prayagraj. These include:<br><br>1. BlaBlaCar: This is a
trusted community carpooling app where you can connect with people who
are traveling in the same direction.<br>2. Uber and Ola Share: Both Uber
and Ola offer ride-sharing options where you can share your ride with
other passengers. Please note this might depend on the city you are
traveling from.<br>3. Local Carpooling groups: There may be local
carpooling groups on social media platforms like Facebook and WhatsApp
where people share their travel plans.
- >-
The Kumbh Mela hosts a diverse array of spiritual gurus, each
representing different spiritual traditions and philosophies within
Hinduism. Prominent among them are the Mahamandaleshwaras of the various
Akharas, who are highly respected for their deep knowledge of scriptures
and spiritual leadership. Then there are the Naga Sadhus, known for
their ascetic lifestyle and unique appearance, who represent intense
spiritual discipline and renunciation. \n \n The Acharyas and
Prayagwals serve as guides and teachers for pilgrims, offering religious
services and performing important rituals like Pind Daan and Shraadh.
Additionally, there are Dandi Sanyasis who follow the path of austerity
and renunciation, emphasizing self-discipline and simplicity.
- source_sentence: What is the best train route to Prayagraj from Varanasi?
sentences:
- >-
The best train route from Varanasi to Prayagraj is via the Indian
Railways. There are multiple trains that operate on this route daily.
<br><br>1. VBS BSB Express (14235)<br>2. Shiv Ganga Express
(12559)<br>3. Mahanagri Express (11093)<br>4. Kashi Vishwanath Express
(14257)<br>5. Vande Bharat <br><br>For the most accurate and up-to-date
information on train timings to Prayagraj, please visit the IRCTC
website <<u><a target='_blank'
href='https://www.irctc.co.in/nget/'>https://www.irctc.co.in/nget/</a></u>>
- >-
Yes, towing services are available if your vehicle breaks down in the
parking lot.
- >-
A delightful assortment of pastries can significantly enhance any
gathering. Chocolate eclairs, fruit tarts, and macarons are popular
choices among guests. <br><br>1. Lemon meringue tart<br>2. Almond
croissant<br>3. Raspberry mille-feuille<br>4. Vanilla cream puff<br>5.
Caramel flan <br><br>For an exquisite culinary experience, consider
attending a pastry-making workshop for hands-on learning and tips from
skilled bakers.
- source_sentence: What does Deep Daan symbolize?
sentences:
- >-
In the quiet corners of a bustling city, the sound of a distant siren
punctuates the air, hinting at life’s unpredictability. A lone musician
sets up his stand, strings resonating softly as pedestrians pass by,
each lost in their own thoughts. The warmth of the sun flows over the
pavement, while children chase after colorful kites soaring high above.
Nearby, a group gathers for laughter and stories, each voice woven into
a tapestry of community and connection. As day turns to dusk, the sky
transforms into a palette of vibrant colors, inviting dreams and
possibilities under the expansive canvas of the universe.
- >-
Deep Daan involves the ritual of lighting oil lamps (diyas) and floating
them on the river as an offering to the divine. This act symbolizes the
removal of darkness and ignorance, representing the soul’s journey
towards enlightenment and spiritual awakening. The flickering lamps also
signify hope, devotion, and a wish for divine blessings. During the
Kumbh Mela, Deep Daan is considered a powerful ritual that purifies the
mind and soul, bringing peace and fulfillment to the devotees performing
it.
- >-
The duration of the tours typically ranges from 1-day to 3-day packages.
Start times for the tours are usually early in the morning to ensure
participants make the most of the day’s activities, which may include
attending religious rituals, visiting temples, and sightseeing. Exact
timings will be communicated to you once your booking is confirmed.
model-index:
- name: SentenceTransformer based on BAAI/bge-small-en-v1.5
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: val evaluator
type: val_evaluator
metrics:
- type: cosine_accuracy@1
value: 0.5621890547263682
name: Cosine Accuracy@1
- type: cosine_accuracy@5
value: 0.9328358208955224
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9676616915422885
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.5621890547263682
name: Cosine Precision@1
- type: cosine_precision@5
value: 0.1865671641791045
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09676616915422885
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.5621890547263682
name: Cosine Recall@1
- type: cosine_recall@5
value: 0.9328358208955224
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9676616915422885
name: Cosine Recall@10
- type: cosine_ndcg@5
value: 0.7755192663647908
name: Cosine Ndcg@5
- type: cosine_ndcg@10
value: 0.7872765799335859
name: Cosine Ndcg@10
- type: cosine_ndcg@100
value: 0.7949599458501615
name: Cosine Ndcg@100
- type: cosine_mrr@5
value: 0.7216832504145936
name: Cosine Mrr@5
- type: cosine_mrr@10
value: 0.726826186527679
name: Cosine Mrr@10
- type: cosine_mrr@100
value: 0.7287172339895628
name: Cosine Mrr@100
- type: cosine_map@100
value: 0.7287172339895628
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.5621890547263682
name: Dot Accuracy@1
- type: dot_accuracy@5
value: 0.9353233830845771
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.9676616915422885
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.5621890547263682
name: Dot Precision@1
- type: dot_precision@5
value: 0.1870646766169154
name: Dot Precision@5
- type: dot_precision@10
value: 0.09676616915422885
name: Dot Precision@10
- type: dot_recall@1
value: 0.5621890547263682
name: Dot Recall@1
- type: dot_recall@5
value: 0.9353233830845771
name: Dot Recall@5
- type: dot_recall@10
value: 0.9676616915422885
name: Dot Recall@10
- type: dot_ndcg@5
value: 0.776654033153749
name: Dot Ndcg@5
- type: dot_ndcg@10
value: 0.7875252591924246
name: Dot Ndcg@10
- type: dot_ndcg@100
value: 0.795208625109
name: Dot Ndcg@100
- type: dot_mrr@5
value: 0.7223880597014923
name: Dot Mrr@5
- type: dot_mrr@10
value: 0.7271164021164023
name: Dot Mrr@10
- type: dot_mrr@100
value: 0.7290074495782858
name: Dot Mrr@100
- type: dot_map@100
value: 0.7290074495782857
name: Dot Map@100
SentenceTransformer based on BAAI/bge-small-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-small-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
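Because the last module is Normalize(), every embedding has unit L2 norm, so cosine similarity and dot product give the same score. The following is a minimal pure-Python sketch of that identity. The 3-dimensional vectors are made up for illustration (real embeddings from this model have 384 dimensions), and the helper names are not part of any library.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 norm, which is what a normalization layer does."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy vectors, not real model outputs.
u = l2_normalize([1.0, 2.0, 2.0])
v = l2_normalize([2.0, 0.0, 1.0])

# On unit vectors the two scores agree up to floating-point rounding.
print(math.isclose(dot(u, v), cosine(u, v)))
```

This is consistent with the cosine_* and dot_* metrics in the evaluation being close to each other.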
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("himanshu23099/bge_embedding_finetune1")
# Run inference
sentences = [
'What does Deep Daan symbolize?',
'Deep Daan involves the ritual of lighting oil lamps (diyas) and floating them on the river as an offering to the divine. This act symbolizes the removal of darkness and ignorance, representing the soul’s journey towards enlightenment and spiritual awakening. The flickering lamps also signify hope, devotion, and a wish for divine blessings. During the Kumbh Mela, Deep Daan is considered a powerful ritual that purifies the mind and soul, bringing peace and fulfillment to the devotees performing it.',
'In the quiet corners of a bustling city, the sound of a distant siren punctuates the air, hinting at life’s unpredictability. A lone musician sets up his stand, strings resonating softly as pedestrians pass by, each lost in their own thoughts. The warmth of the sun flows over the pavement, while children chase after colorful kites soaring high above. Nearby, a group gathers for laughter and stories, each voice woven into a tapestry of community and connection. As day turns to dusk, the sky transforms into a palette of vibrant colors, inviting dreams and possibilities under the expansive canvas of the universe.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
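For retrieval-style use, the same calls can rank candidate passages against a query. The sketch below is an illustration rather than part of the original card: `top_k` and `main` are hypothetical helper names, and the passages are shortened from the widget samples. Calling `main()` downloads the model from the Hub, so it is left commented out.

```python
def top_k(scores, k):
    """Return the k (index, score) pairs with the highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

def main():
    # Imported lazily so the helper above works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("himanshu23099/bge_embedding_finetune1")
    query = "What does Deep Daan symbolize?"
    passages = [
        "Yes, towing services are available if your vehicle breaks down in the parking lot.",
        "Deep Daan involves the ritual of lighting oil lamps (diyas) and floating them on the river as an offering to the divine.",
    ]
    query_emb = model.encode([query])      # one row per query
    passage_embs = model.encode(passages)  # one row per passage
    scores = model.similarity(query_emb, passage_embs)[0].tolist()
    for idx, score in top_k(scores, k=2):
        print(f"{score:.3f}  {passages[idx]}")

# main()  # uncomment to run; requires sentence-transformers and network access
```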
Evaluation
Metrics
Information Retrieval
- Dataset: val_evaluator
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.5622 |
cosine_accuracy@5 | 0.9328 |
cosine_accuracy@10 | 0.9677 |
cosine_precision@1 | 0.5622 |
cosine_precision@5 | 0.1866 |
cosine_precision@10 | 0.0968 |
cosine_recall@1 | 0.5622 |
cosine_recall@5 | 0.9328 |
cosine_recall@10 | 0.9677 |
cosine_ndcg@5 | 0.7755 |
cosine_ndcg@10 | 0.7873 |
cosine_ndcg@100 | 0.795 |
cosine_mrr@5 | 0.7217 |
cosine_mrr@10 | 0.7268 |
cosine_mrr@100 | 0.7287 |
cosine_map@100 | 0.7287 |
dot_accuracy@1 | 0.5622 |
dot_accuracy@5 | 0.9353 |
dot_accuracy@10 | 0.9677 |
dot_precision@1 | 0.5622 |
dot_precision@5 | 0.1871 |
dot_precision@10 | 0.0968 |
dot_recall@1 | 0.5622 |
dot_recall@5 | 0.9353 |
dot_recall@10 | 0.9677 |
dot_ndcg@5 | 0.7767 |
dot_ndcg@10 | 0.7875 |
dot_ndcg@100 | 0.7952 |
dot_mrr@5 | 0.7224 |
dot_mrr@10 | 0.7271 |
dot_mrr@100 | 0.729 |
dot_map@100 | 0.729 |
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,606 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
Column | Type | Min tokens | Mean tokens | Max tokens |
---|---|---|---|---|
anchor | string | 8 | 18.11 | 32 |
positive | string | 3 | 110.54 | 504 |
negative | string | 27 | 114.86 | 424 |
- Samples:
Each sample lists its anchor, positive, and negative, in that order:
Why should one do the Prayagraj Panchkoshi Parikrama?
The Prayagraj Panchkoshi Parikrama is a deeply revered spiritual journey that offers multiple benefits to devotees. It is believed to grant blessings equivalent to visiting all sacred pilgrimage sites in India, providing divine grace and spiritual merit. The Parikrama route covers significant temples like the Dwadash Madhav temples, Akshayavat, and Mankameshwar, which are steeped in Hindu mythology and history, allowing pilgrims to connect with the spiritual and cultural heritage of Prayagraj. This circumambulation around sacred sites is also seen as a way to cleanse one's sins and progress towards Moksha (liberation from the cycle of birth and rebirth), making it a path of introspection and spiritual growth. The pilgrimage fosters unity among people from diverse backgrounds, offering a unique cultural exchange and shared spiritual experience. By participating, devotees also help revive an ancient tradition integral to the Kumbh Mela for centuries, reconnecting with age-old practices that have shaped the region's spiritual landscape. The Prayagraj Panchkoshi Parikrama is a profound journey of faith and devotion, enriching the spiritual lives of those who undertake it.
Elevators are remarkable inventions that revolutionized how we navigate tall buildings. They provide a swift, efficient means of transportation between floors, making urban life more accessible. These mechanical wonders operate on a system of pulleys and counterweights, enabling them to carry heavy loads effortlessly. Safety features like emergency brakes and backup power systems ensure that passengers remain secure during their journey. Various designs and styles can be seen in buildings around the world, from sleek modern glass models to vintage models that evoke nostalgia. Elevators also highlight the advancement of engineering and technology over time, evolving from rudimentary designs to sophisticated machines with smart technology. They are essential in various settings, including residential, commercial, and industrial spaces, offering convenience and practicality. Their presence also allows for the efficient use of vertical space, fostering creativity in architectural designs and city planning. Overall, elevators have become an essential part of contemporary infrastructure, enhancing the way we live and work.
Can I hire an E-Rickshaw for a specific duration or multiple stops within the Mela?
Yes, E-Rickshaws have designated pick-up points, and you can hire them for a specific duration or multiple stops depending on your needs and arrangements with the driver
The process of assigning roles in a theatrical production often involves extensive auditions and interviews. Each candidate brings unique skills, and the director must carefully consider how their abilities will fit into the overall vision for the performance. Team dynamics play a crucial role, as collaboration is essential for a successful show.
What are the best routes to avoid traffic while traveling from Prayagraj Junction to the Mela grounds?
The distance between Prayagraj Junction and the Mela Grounds during the Kumbh Mela in Prayagraj, India is approximately 5-7 kilometers. By bus, this could take anywhere from 20-40 minutes, depending on traffic and the specific route.
The ancient art of glassblowing has captivated artisans for centuries. Bubbles of molten glass are deftly shaped into exquisite forms, revealing the synergy between fire and craftsmanship. The process requires both skill and creativity, resulting in functional pieces or striking sculptures that bring vibrancy to any space. Each creation is unique, echoing the delicate dance of temperature and technique involved in the art form.
- Loss: GISTEmbedLoss with these parameters:
{'guide': SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
), 'temperature': 0.01}
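The idea behind the guided loss can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: it only looks at anchor-positive similarities, takes precomputed similarity matrices as input, and the function name `gist_style_loss` is made up here. An in-batch candidate is dropped as a negative when the guide model rates it as more similar to the anchor than the anchor's own positive; the remaining scores go through a temperature-scaled softmax cross-entropy.

```python
import math

def gist_style_loss(model_sim, guide_sim, temperature=0.01):
    """Mean contrastive loss over a batch with guide-based masking of negatives.

    model_sim[i][j] and guide_sim[i][j] hold the similarity of anchor i and
    positive j according to the trained model and the guide model.
    The true pair for row i is column i.
    """
    n = len(model_sim)
    total = 0.0
    for i in range(n):
        threshold = guide_sim[i][i]  # guide score of the true (anchor, positive) pair
        logits = []
        for j in range(n):
            # Skip candidates the guide rates above the true pair: likely false negatives.
            if j != i and guide_sim[i][j] > threshold:
                continue
            logits.append(model_sim[i][j] / temperature)
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_denominator - model_sim[i][i] / temperature
    return total / n

# Toy similarity matrices for a batch of two pairs (made-up numbers).
model_sim = [[0.9, 0.8], [0.1, 0.9]]
guide_masks = [[0.7, 0.95], [0.2, 0.8]]  # guide prefers candidate 1 for anchor 0
guide_keeps = [[1.0, 0.0], [0.0, 1.0]]   # guide masks nothing

loss_masked = gist_style_loss(model_sim, guide_masks)
loss_unmasked = gist_style_loss(model_sim, guide_keeps)
print(loss_masked < loss_unmasked)
```

With masking, the hard candidate for anchor 0 no longer counts against the model, so the loss is lower than in the unmasked case.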
Evaluation Dataset
Unnamed Dataset
- Size: 402 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 402 samples:
Column | Type | Min tokens | Mean tokens | Max tokens |
---|---|---|---|---|
anchor | string | 8 | 17.98 | 30 |
positive | string | 3 | 111.36 | 471 |
negative | string | 30 | 116.68 | 501 |
- Samples:
Each sample lists its anchor, positive, and negative, in that order:
What is the Female Helpline number?
The Women and Child Helpline number for assistance during the Maha Kumbh 2025 is 1091. This service is available for any support related to the safety and well-being of women and children.
The average lifespan of a species can vary significantly. In some cases, dolphins can live up to 60 years, while certain types of tortoises have been known to exceed 150 years. Understanding the factors that influence longevity is essential in the study of wildlife conservation.
What is the estimated travel time from the Airport to the Mela grounds during peak hours?
The estimated travel time from the Airport to the Mela grounds is about 1 hour on non-peak days. Travel times may vary significantly during peak hours due to traffic and road conditions.
The recipe for chocolate cake requires several key ingredients to achieve the perfect texture. Begin by preheating the oven to 350°F. Combine flour, sugar, cocoa powder, and eggs in a large mixing bowl, stirring until smooth. Baking can be an enjoyable process filled with delightful aromas and flavors.
How safe is it to travel by public transport from Prayagraj city to the Kumbh Mela at night?
There is no direct metro service to the Mela grounds from Prayagraj city. However, Govt operated dedicated shuttle buses are available within Prayagraj for transportation to the Mela. These buses operate on fixed routes and fixed times.
The fastest way to prepare a delicious apple pie starts with choosing the right variety of apples. Granny Smith apples are great for tartness, while Honeycrisp provides sweetness. After washing and peeling the apples, slice them into thin pieces, ensuring an even texture. Combine the apple slices with sugar, cinnamon, and a hint of lemon juice. Roll out your pie crust and fill it generously with the apple mixture, top it with another crust, and create small vents to allow steam to escape. Bake at 425°F until golden brown, and enjoy the fantastic aroma that fills your kitchen!
- Loss: GISTEmbedLoss with these parameters:
{'guide': SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
), 'temperature': 0.01}
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- gradient_accumulation_steps: 2
- learning_rate: 1e-05
- weight_decay: 0.01
- num_train_epochs: 30
- warmup_ratio: 0.1
- load_best_model_at_end: True
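These settings determine the step counts seen in the training log. The arithmetic below is a sketch that assumes a single training device and the rounding rules shown in the comments (both inferred from the logged step and epoch values, not stated in the card). The learning-rate function follows the usual linear warmup then linear decay shape of a linear scheduler; the function names are illustrative.

```python
import math

train_samples = 1606
per_device_batch = 16
grad_accum = 2
epochs = 30
warmup_ratio = 0.1
base_lr = 1e-05

batches_per_epoch = math.ceil(train_samples / per_device_batch)  # last batch is partial
steps_per_epoch = batches_per_epoch // grad_accum                 # optimizer steps per epoch
total_steps = epochs * steps_per_epoch
warmup_steps = math.ceil(total_steps * warmup_ratio)

def epoch_at(step):
    """Fractional epoch reached after a given optimizer step."""
    return step / (batches_per_epoch / grad_accum)

def linear_lr(step):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(total_steps, warmup_steps, round(epoch_at(10), 4), linear_lr(75))
```

Under these assumptions the run has 1500 optimizer steps with an effective batch size of 32, and step 10 corresponds to epoch 0.1980, which agrees with the first row of the training log.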
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 2
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 30
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | val_evaluator_dot_map@100 |
---|---|---|---|---|
0.1980 | 10 | 0.8028 | 0.4071 | 0.6415 |
0.3960 | 20 | 0.7561 | 0.3701 | 0.6406 |
0.5941 | 30 | 0.9729 | 0.3100 | 0.6415 |
0.7921 | 40 | 0.6137 | 0.2505 | 0.6478 |
0.9901 | 50 | 0.4747 | 0.1978 | 0.6501 |
1.1881 | 60 | 0.4595 | 0.1609 | 0.6541 |
1.3861 | 70 | 0.3862 | 0.1300 | 0.6570 |
1.5842 | 80 | 0.293 | 0.1003 | 0.6606 |
1.7822 | 90 | 0.2806 | 0.0760 | 0.6588 |
1.9802 | 100 | 0.1249 | 0.0586 | 0.6616 |
2.1782 | 110 | 0.2265 | 0.0503 | 0.6677 |
2.3762 | 120 | 0.1292 | 0.0482 | 0.6701 |
2.5743 | 130 | 0.1649 | 0.0448 | 0.6756 |
2.7723 | 140 | 0.1213 | 0.0442 | 0.6810 |
2.9703 | 150 | 0.1363 | 0.0419 | 0.6843 |
3.1683 | 160 | 0.0972 | 0.0376 | 0.6859 |
3.3663 | 170 | 0.1079 | 0.0326 | 0.6896 |
3.5644 | 180 | 0.1265 | 0.0293 | 0.6899 |
3.7624 | 190 | 0.0645 | 0.0279 | 0.6952 |
3.9604 | 200 | 0.1116 | 0.0272 | 0.6934 |
4.1584 | 210 | 0.0757 | 0.0258 | 0.6954 |
4.3564 | 220 | 0.1492 | 0.0248 | 0.6991 |
4.5545 | 230 | 0.0536 | 0.0246 | 0.6971 |
4.7525 | 240 | 0.0346 | 0.0248 | 0.6958 |
4.9505 | 250 | 0.0501 | 0.0247 | 0.6974 |
5.1485 | 260 | 0.0443 | 0.0248 | 0.6975 |
5.3465 | 270 | 0.0585 | 0.0245 | 0.6998 |
5.5446 | 280 | 0.0514 | 0.0246 | 0.7013 |
5.7426 | 290 | 0.0948 | 0.0244 | 0.7073 |
5.9406 | 300 | 0.054 | 0.0243 | 0.7049 |
6.1386 | 310 | 0.0317 | 0.0241 | 0.7069 |
6.3366 | 320 | 0.1327 | 0.0249 | 0.7061 |
6.5347 | 330 | 0.0665 | 0.0255 | 0.7073 |
6.7327 | 340 | 0.09 | 0.0257 | 0.7073 |
6.9307 | 350 | 0.111 | 0.0255 | 0.7067 |
7.1287 | 360 | 0.0473 | 0.0255 | 0.7096 |
7.3267 | 370 | 0.0429 | 0.0248 | 0.7063 |
7.5248 | 380 | 0.0686 | 0.0249 | 0.7087 |
7.7228 | 390 | 0.1096 | 0.0251 | 0.7113 |
7.9208 | 400 | 0.0794 | 0.0255 | 0.7083 |
8.1188 | 410 | 0.0354 | 0.0246 | 0.7094 |
8.3168 | 420 | 0.078 | 0.0239 | 0.7093 |
8.5149 | 430 | 0.091 | 0.0234 | 0.7057 |
8.7129 | 440 | 0.084 | 0.0236 | 0.7107 |
8.9109 | 450 | 0.0702 | 0.0235 | 0.7114 |
9.1089 | 460 | 0.0701 | 0.0233 | 0.7142 |
9.3069 | 470 | 0.0706 | 0.0231 | 0.7140 |
9.5050 | 480 | 0.029 | 0.0230 | 0.7125 |
9.7030 | 490 | 0.0411 | 0.0233 | 0.7107 |
9.9010 | 500 | 0.0691 | 0.0233 | 0.7140 |
10.0990 | 510 | 0.0421 | 0.0232 | 0.7165 |
10.2970 | 520 | 0.0497 | 0.0232 | 0.7200 |
10.4950 | 530 | 0.0639 | 0.0232 | 0.7188 |
10.6931 | 540 | 0.0201 | 0.0238 | 0.7161 |
10.8911 | 550 | 0.0833 | 0.0241 | 0.7170 |
11.0891 | 560 | 0.0266 | 0.0242 | 0.7197 |
11.2871 | 570 | 0.0472 | 0.0241 | 0.7220 |
11.4851 | 580 | 0.0614 | 0.0240 | 0.7234 |
11.6832 | 590 | 0.0507 | 0.0242 | 0.7243 |
11.8812 | 600 | 0.031 | 0.0239 | 0.7226 |
12.0792 | 610 | 0.0413 | 0.0239 | 0.7216 |
12.2772 | 620 | 0.0222 | 0.0230 | 0.7234 |
12.4752 | 630 | 0.0466 | 0.0221 | 0.7239 |
12.6733 | 640 | 0.0482 | 0.0219 | 0.7218 |
12.8713 | 650 | 0.0657 | 0.0218 | 0.7197 |
13.0693 | 660 | 0.0521 | 0.0218 | 0.7235 |
13.2673 | 670 | 0.051 | 0.0218 | 0.7234 |
13.4653 | 680 | 0.0674 | 0.0220 | 0.7243 |
13.6634 | 690 | 0.0477 | 0.0220 | 0.7232 |
13.8614 | 700 | 0.0827 | 0.0218 | 0.7232 |
14.0594 | 710 | 0.0501 | 0.0217 | 0.7247 |
14.2574 | 720 | 0.0278 | 0.0216 | 0.7233 |
14.4554 | 730 | 0.0162 | 0.0216 | 0.7201 |
14.6535 | 740 | 0.0515 | 0.0217 | 0.7219 |
14.8515 | 750 | 0.0514 | 0.0218 | 0.7256 |
15.0495 | 760 | 0.088 | 0.0217 | 0.7252 |
15.2475 | 770 | 0.0298 | 0.0217 | 0.7226 |
15.4455 | 780 | 0.0682 | 0.0217 | 0.7259 |
15.6436 | 790 | 0.0485 | 0.0217 | 0.7253 |
15.8416 | 800 | 0.0419 | 0.0217 | 0.7286 |
16.0396 | 810 | 0.0823 | 0.0216 | 0.7268 |
16.2376 | 820 | 0.0533 | 0.0215 | 0.7250 |
16.4356 | 830 | 0.0336 | 0.0215 | 0.7262 |
16.6337 | 840 | 0.0375 | 0.0214 | 0.7270 |
16.8317 | 850 | 0.0243 | 0.0213 | 0.7281 |
17.0297 | 860 | 0.0675 | 0.0212 | 0.7265 |
17.2277 | 870 | 0.0482 | 0.0211 | 0.7260 |
17.4257 | 880 | 0.0511 | 0.0211 | 0.7297 |
17.6238 | 890 | 0.0396 | 0.0211 | 0.7282 |
17.8218 | 900 | 0.0493 | 0.0211 | 0.7275 |
18.0198 | 910 | 0.0378 | 0.0210 | 0.7279 |
18.2178 | 920 | 0.0546 | 0.0210 | 0.7265 |
18.4158 | 930 | 0.0421 | 0.0209 | 0.7286 |
18.6139 | 940 | 0.0599 | 0.0208 | 0.7286 |
18.8119 | 950 | 0.0766 | 0.0205 | 0.7297 |
19.0099 | 960 | 0.0204 | 0.0205 | 0.7275 |
19.2079 | 970 | 0.0321 | 0.0205 | 0.7282 |
19.4059 | 980 | 0.0069 | 0.0204 | 0.7266 |
19.6040 | 990 | 0.0563 | 0.0205 | 0.7245 |
19.8020 | 1000 | 0.0575 | 0.0205 | 0.7236 |
20.0 | 1010 | 0.0207 | 0.0205 | 0.7261 |
20.1980 | 1020 | 0.03 | 0.0205 | 0.7253 |
20.3960 | 1030 | 0.0712 | 0.0205 | 0.7269 |
20.5941 | 1040 | 0.0482 | 0.0205 | 0.7277 |
20.7921 | 1050 | 0.05 | 0.0205 | 0.7283 |
20.9901 | 1060 | 0.0407 | 0.0205 | 0.7282 |
21.1881 | 1070 | 0.0591 | 0.0205 | 0.7286 |
21.3861 | 1080 | 0.0228 | 0.0205 | 0.7265 |
21.5842 | 1090 | 0.0318 | 0.0205 | 0.7264 |
21.7822 | 1100 | 0.0768 | 0.0205 | 0.7254 |
21.9802 | 1110 | 0.0415 | 0.0205 | 0.7264 |
22.1782 | 1120 | 0.0681 | 0.0205 | 0.7252 |
22.3762 | 1130 | 0.0622 | 0.0205 | 0.7255 |
22.5743 | 1140 | 0.0508 | 0.0205 | 0.7251 |
22.7723 | 1150 | 0.0642 | 0.0205 | 0.7237 |
22.9703 | 1160 | 0.0469 | 0.0206 | 0.7245 |
23.1683 | 1170 | 0.0172 | 0.0206 | 0.7256 |
23.3663 | 1180 | 0.055 | 0.0206 | 0.7255 |
23.5644 | 1190 | 0.0488 | 0.0206 | 0.7266 |
23.7624 | 1200 | 0.0208 | 0.0206 | 0.7243 |
23.9604 | 1210 | 0.0415 | 0.0206 | 0.7249 |
24.1584 | 1220 | 0.0804 | 0.0206 | 0.7264 |
24.3564 | 1230 | 0.0243 | 0.0205 | 0.7256 |
24.5545 | 1240 | 0.037 | 0.0205 | 0.7258 |
24.7525 | 1250 | 0.0604 | 0.0205 | 0.7284 |
24.9505 | 1260 | 0.0278 | 0.0205 | 0.7245 |
25.1485 | 1270 | 0.0317 | 0.0205 | 0.7235 |
25.3465 | 1280 | 0.0824 | 0.0205 | 0.7253 |
25.5446 | 1290 | 0.0639 | 0.0205 | 0.7258 |
25.7426 | 1300 | 0.0269 | 0.0205 | 0.7247 |
25.9406 | 1310 | 0.0429 | 0.0205 | 0.7278 |
26.1386 | 1320 | 0.0692 | 0.0205 | 0.7279 |
26.3366 | 1330 | 0.0771 | 0.0205 | 0.7301 |
26.5347 | 1340 | 0.0578 | 0.0205 | 0.7280 |
26.7327 | 1350 | 0.025 | 0.0205 | 0.7258 |
26.9307 | 1360 | 0.0414 | 0.0205 | 0.7286 |
27.1287 | 1370 | 0.0484 | 0.0205 | 0.7284 |
27.3267 | 1380 | 0.0581 | 0.0205 | 0.7294 |
27.5248 | 1390 | 0.069 | 0.0205 | 0.7288 |
27.7228 | 1400 | 0.0864 | 0.0205 | 0.7301 |
27.9208 | 1410 | 0.0605 | 0.0205 | 0.7285 |
28.1188 | 1420 | 0.0327 | 0.0205 | 0.7271 |
28.3168 | 1430 | 0.0789 | 0.0205 | 0.7258 |
28.5149 | 1440 | 0.056 | 0.0205 | 0.7276 |
28.7129 | 1450 | 0.0256 | 0.0205 | 0.7272 |
28.9109 | 1460 | 0.0316 | 0.0205 | 0.7273 |
29.1089 | 1470 | 0.0528 | 0.0205 | 0.7287 |
29.3069 | 1480 | 0.0552 | 0.0205 | 0.7274 |
29.5050 | 1490 | 0.0441 | 0.0205 | 0.7287 |
29.7030 | 1500 | 0.0246 | 0.0205 | 0.7290 |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.1
- Transformers: 4.44.2
- PyTorch: 2.5.0+cu121
- Accelerate: 0.34.2
- Datasets: 3.1.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
GISTEmbedLoss
@misc{solatorio2024gistembed,
title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
author={Aivin V. Solatorio},
year={2024},
eprint={2402.16829},
archivePrefix={arXiv},
primaryClass={cs.LG}
}