base_model: BAAI/bge-small-en-v1.5
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@5
  - cosine_ndcg@10
  - cosine_ndcg@100
  - cosine_mrr@5
  - cosine_mrr@10
  - cosine_mrr@100
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@5
  - dot_ndcg@10
  - dot_ndcg@100
  - dot_mrr@5
  - dot_mrr@10
  - dot_mrr@100
  - dot_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1606
  - loss:GISTEmbedLoss
widget:
  - source_sentence: Do the tours include visits to all the major ghats and Akhara camps?
    sentences:
      - >-
        Yes, many tours do cover all major ghats such as Sangam, Ram Ghat, and
        Dashashwamedh Ghat, along with visits to some of the most significant
        Akhara camps. These tours offer pilgrims a unique opportunity to witness
        the religious and cultural significance of these locations. However, we
        recommend reviewing the specific itinerary of your chosen tour for
        precise details.
      - >-
        Yes, many tours do cover all major ghats such as Sangam, Ram Ghat, and
        Dashashwamedh Ghat, along with visits to some of the most significant
        Akhara camps. These tours offer pilgrims a unique opportunity to witness
        the religious and cultural significance of these locations. However, we
        recommend reviewing the specific itinerary of your chosen tour for
        precise details.
      - >-
        The orchestra rehearsed late into the night, perfecting their
        performance for the upcoming concert. Each musician contributed their
        unique sound, creating a harmonious blend of instruments. The conductor
        insisted on precision and emotion, ensuring every note resonated with
        the audience's heart. Attendees can expect a captivating experience,
        filled with dynamic melodies and intricate crescendos that highlight the
        orchestra's talent and dedication. For a firsthand experience, consider
        arriving early to enjoy the pre-concert discussions.
  - source_sentence: What is the significance of the Naga Sadhus in the Shahi Snan?
    sentences:
      - >-
        The Naga Sadhus hold a significant place in the Shahi Snan during the
        Kumbh Mela as they are considered the guardians of faith and ancient
        traditions within Hinduism. Known for their ash-covered, unclothed
        bodies, long matted hair, and intense spiritual practices, the Naga
        Sadhus are the first to take the holy dip during the Shahi Snan,
        symbolizing purity, renunciation, and spiritual strength. Their
        participation is believed to purify the waters of the sacred rivers,
        making them spiritually potent for the millions of pilgrims who follow.
        The Naga Sadhus’ procession to the river, marked by their vibrant
        chants, tridents, and fearless demeanor, is one of the most
        awe-inspiring spectacles of the Kumbh Mela. Their presence represents
        the commitment to asceticism, devotion, and the protection of religious
        traditions, adding a deeper layer of spiritual intensity and
        significance to the Shahi Snan ritual.
      - >-
        During the processions of Peshwai and Shahi Snaans at the Maha Kumbh
        Mela, Mahamandaleshwaras play a unique and central role as the spiritual
        leaders of their Akharas. They lead their followers in grand, royal
        processions to the riverbanks for the Shahi Snan (royal bath),
        symbolizing the beginning of the holy ritual. Riding on beautifully
        decorated chariots, elephants, or horses, they lead the march with great
        reverence and authority, followed by their disciples, saints, and
        devotees. The presence of Mahamandaleshwaras in these processions
        signifies the spiritual sanctity and importance of the ritual, inspiring
        pilgrims to partake in the spiritual energy and blessings of the holy
        dip. Their leadership adds a sense of grandeur and divine significance
        to the Shahi Snaans, making them the focal point of the Kumbh Mela.
      - >-
        The vibrant world of reptiles is fascinating to explore, particularly
        focusing on the unique adaptations they possess for survival. Snakes,
        for instance, exhibit remarkable methods of locomotion, allowing them to
        navigate diverse terrains with ease. Some species are known for their
        ability to blend into their surroundings, employing camouflage
        techniques that render them nearly invisible to both predators and prey.
        Additionally, many reptiles display fascinating reproductive behaviors,
        with some laying eggs in protected environments while others give birth
        to live young. The intricate ecosystems that support these creatures
        highlight the interdependence between various species, illustrating the
        delicate balance of nature. Understanding these dynamics can enhance our
        appreciation for the biodiversity that exists in our world and the
        intricate roles each species plays within its habitat.
  - source_sentence: Are there any carpool or ride-sharing options to travel to Prayagraj?
    sentences:
      - >-
        In the realm of culinary experiences, exploring the myriad flavors of
        Italian cuisine can be quite delightful. One might consider the
        following aspects:<br><br>1. Pasta Varieties: There are numerous types
        of pasta, from spaghetti to fettuccine, each offering a distinct texture
        and taste in dishes.<br>2. Regional Sauces: Different areas of Italy are
        known for unique sauces, such as marinara, pesto, and Alfredo, which can
        transform a simple meal into a feast. Additionally, using fresh, local
        ingredients enhances the flavors.<br>3. Dining Etiquette: Understanding
        Italian dining customs, such as the significance of antipasti, can
        enrich one's experience while enjoying meals with family and friends.
      - >-
        Yes, there are multiple carpooling and ride-sharing options you can use
        to travel to Prayagraj. These include:<br><br>1. BlaBlaCar: This is a
        trusted community carpooling app where you can connect with people who
        are traveling in the same direction.<br>2. Uber and Ola Share: Both Uber
        and Ola offer ride-sharing options where you can share your ride with
        other passengers. Please note this might depend on the city you are
        traveling from.<br>3. Local Carpooling groups: There may be local
        carpooling groups on social media platforms like Facebook and WhatsApp
        where people share their travel plans.
      - >-
        The Kumbh Mela hosts a diverse array of spiritual gurus, each
        representing different spiritual traditions and philosophies within
        Hinduism. Prominent among them are the Mahamandaleshwaras of the various
        Akharas, who are highly respected for their deep knowledge of scriptures
        and spiritual leadership. Then there are the Naga Sadhus, known for
        their ascetic lifestyle and unique appearance, who represent intense
        spiritual discipline and renunciation. \n  \n The Acharyas and
        Prayagwals serve as guides and teachers for pilgrims, offering religious
        services and performing important rituals like Pind Daan and Shraadh.
        Additionally, there are Dandi Sanyasis who follow the path of austerity
        and renunciation, emphasizing self-discipline and simplicity.
  - source_sentence: What is the best train route to Prayagraj from Varanasi?
    sentences:
      - >-
        The best train route from Varanasi to Prayagraj is via the Indian
        Railways. There are multiple trains that operate on this route daily.
        <br><br>1. VBS BSB Express (14235)<br>2. Shiv Ganga Express
        (12559)<br>3. Mahanagri Express (11093)<br>4. Kashi Vishwanath Express
        (14257)<br>5. Vande Bharat <br><br>For the most accurate and up-to-date
        information on train timings to Prayagraj, please visit the IRCTC
        website <<u><a target='_blank'
        href='https://www.irctc.co.in/nget/'>https://www.irctc.co.in/nget/</a></u>>
      - >-
        Yes, towing services are available if your vehicle breaks down in the
        parking lot.
      - >-
        A delightful assortment of pastries can significantly enhance any
        gathering. Chocolate eclairs, fruit tarts, and macarons are popular
        choices among guests. <br><br>1. Lemon meringue tart<br>2. Almond
        croissant<br>3. Raspberry mille-feuille<br>4. Vanilla cream puff<br>5.
        Caramel flan <br><br>For an exquisite culinary experience, consider
        attending a pastry-making workshop for hands-on learning and tips from
        skilled bakers.
  - source_sentence: What does Deep Daan symbolize?
    sentences:
      - >-
        In the quiet corners of a bustling city, the sound of a distant siren
        punctuates the air, hinting at life’s unpredictability. A lone musician
        sets up his stand, strings resonating softly as pedestrians pass by,
        each lost in their own thoughts. The warmth of the sun flows over the
        pavement, while children chase after colorful kites soaring high above.
        Nearby, a group gathers for laughter and stories, each voice woven into
        a tapestry of community and connection. As day turns to dusk, the sky
        transforms into a palette of vibrant colors, inviting dreams and
        possibilities under the expansive canvas of the universe.
      - >-
        Deep Daan involves the ritual of lighting oil lamps (diyas) and floating
        them on the river as an offering to the divine. This act symbolizes the
        removal of darkness and ignorance, representing the soul’s journey
        towards enlightenment and spiritual awakening. The flickering lamps also
        signify hope, devotion, and a wish for divine blessings. During the
        Kumbh Mela, Deep Daan is considered a powerful ritual that purifies the
        mind and soul, bringing peace and fulfillment to the devotees performing
        it.
      - >-
        The duration of the tours typically ranges from 1-day to 3-day packages.
        Start times for the tours are usually early in the morning to ensure
        participants make the most of the day’s activities, which may include
        attending religious rituals, visiting temples, and sightseeing. Exact
        timings will be communicated to you once your booking is confirmed.
model-index:
  - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: val evaluator
          type: val_evaluator
        metrics:
          - type: cosine_accuracy@1
            value: 0.5621890547263682
            name: Cosine Accuracy@1
          - type: cosine_accuracy@5
            value: 0.9328358208955224
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9676616915422885
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5621890547263682
            name: Cosine Precision@1
          - type: cosine_precision@5
            value: 0.1865671641791045
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09676616915422885
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.5621890547263682
            name: Cosine Recall@1
          - type: cosine_recall@5
            value: 0.9328358208955224
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9676616915422885
            name: Cosine Recall@10
          - type: cosine_ndcg@5
            value: 0.7755192663647908
            name: Cosine Ndcg@5
          - type: cosine_ndcg@10
            value: 0.7872765799335859
            name: Cosine Ndcg@10
          - type: cosine_ndcg@100
            value: 0.7949599458501615
            name: Cosine Ndcg@100
          - type: cosine_mrr@5
            value: 0.7216832504145936
            name: Cosine Mrr@5
          - type: cosine_mrr@10
            value: 0.726826186527679
            name: Cosine Mrr@10
          - type: cosine_mrr@100
            value: 0.7287172339895628
            name: Cosine Mrr@100
          - type: cosine_map@100
            value: 0.7287172339895628
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.5621890547263682
            name: Dot Accuracy@1
          - type: dot_accuracy@5
            value: 0.9353233830845771
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 0.9676616915422885
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.5621890547263682
            name: Dot Precision@1
          - type: dot_precision@5
            value: 0.1870646766169154
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09676616915422885
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.5621890547263682
            name: Dot Recall@1
          - type: dot_recall@5
            value: 0.9353233830845771
            name: Dot Recall@5
          - type: dot_recall@10
            value: 0.9676616915422885
            name: Dot Recall@10
          - type: dot_ndcg@5
            value: 0.776654033153749
            name: Dot Ndcg@5
          - type: dot_ndcg@10
            value: 0.7875252591924246
            name: Dot Ndcg@10
          - type: dot_ndcg@100
            value: 0.795208625109
            name: Dot Ndcg@100
          - type: dot_mrr@5
            value: 0.7223880597014923
            name: Dot Mrr@5
          - type: dot_mrr@10
            value: 0.7271164021164023
            name: Dot Mrr@10
          - type: dot_mrr@100
            value: 0.7290074495782858
            name: Dot Mrr@100
          - type: dot_map@100
            value: 0.7290074495782857
            name: Dot Map@100

SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
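
The architecture above takes the [CLS] token embedding as the sentence vector (pooling_mode_cls_token is True) and then L2-normalizes it. A minimal NumPy sketch of those two steps follows, assuming random numbers in place of real BertModel outputs; the helper name cls_pool_and_normalize is hypothetical and not part of the library.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings):
    """Take the first ([CLS]) token of each sequence and scale it to unit length."""
    cls = token_embeddings[:, 0, :]                      # (batch, hidden)
    norms = np.linalg.norm(cls, axis=1, keepdims=True)   # (batch, 1)
    return cls / norms

# Stand-in for transformer outputs: 2 sequences, 5 tokens, 384 hidden units.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 5, 384))

emb = cls_pool_and_normalize(tokens)
norms = np.linalg.norm(emb, axis=1)

# For unit-length vectors the dot product and cosine similarity coincide.
dot = emb @ emb.T
cosine = dot / (norms[:, None] * norms[None, :])
```

Because the output vectors have unit length, dot-product and cosine scores should agree up to floating-point error, which is consistent with the near-identical cosine_* and dot_* values in the Evaluation section.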

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("himanshu23099/bge_embedding_finetune1")
# Run inference
sentences = [
    'What does Deep Daan symbolize?',
    'Deep Daan involves the ritual of lighting oil lamps (diyas) and floating them on the river as an offering to the divine. This act symbolizes the removal of darkness and ignorance, representing the soul’s journey towards enlightenment and spiritual awakening. The flickering lamps also signify hope, devotion, and a wish for divine blessings. During the Kumbh Mela, Deep Daan is considered a powerful ritual that purifies the mind and soul, bringing peace and fulfillment to the devotees performing it.',
    'In the quiet corners of a bustling city, the sound of a distant siren punctuates the air, hinting at life’s unpredictability. A lone musician sets up his stand, strings resonating softly as pedestrians pass by, each lost in their own thoughts. The warmth of the sun flows over the pavement, while children chase after colorful kites soaring high above. Nearby, a group gathers for laughter and stories, each voice woven into a tapestry of community and connection. As day turns to dusk, the sky transforms into a palette of vibrant colors, inviting dreams and possibilities under the expansive canvas of the universe.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
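
For semantic search, one embedding is treated as the query and the others as candidates ranked by similarity. The sketch below shows that ranking step only, assuming made-up 2-dimensional unit vectors in place of the output of model.encode; top_k is a hypothetical helper, not a library function.

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Return (index, score) pairs for the k highest-scoring documents."""
    scores = doc_vecs @ query_vec            # dot product; equals cosine for unit vectors
    order = np.argsort(-scores)[:k]          # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]

# Toy unit vectors standing in for real 384-dimensional embeddings.
docs = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query = np.array([0.8, 0.6])

results = top_k(query, docs, k=2)
```

With real embeddings, doc_vecs would be the encoded answer passages and query_vec the encoded question.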

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.5622
cosine_accuracy@5 0.9328
cosine_accuracy@10 0.9677
cosine_precision@1 0.5622
cosine_precision@5 0.1866
cosine_precision@10 0.0968
cosine_recall@1 0.5622
cosine_recall@5 0.9328
cosine_recall@10 0.9677
cosine_ndcg@5 0.7755
cosine_ndcg@10 0.7873
cosine_ndcg@100 0.795
cosine_mrr@5 0.7217
cosine_mrr@10 0.7268
cosine_mrr@100 0.7287
cosine_map@100 0.7287
dot_accuracy@1 0.5622
dot_accuracy@5 0.9353
dot_accuracy@10 0.9677
dot_precision@1 0.5622
dot_precision@5 0.1871
dot_precision@10 0.0968
dot_recall@1 0.5622
dot_recall@5 0.9353
dot_recall@10 0.9677
dot_ndcg@5 0.7767
dot_ndcg@10 0.7875
dot_ndcg@100 0.7952
dot_mrr@5 0.7224
dot_mrr@10 0.7271
dot_mrr@100 0.729
dot_map@100 0.729
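
The reported numbers are consistent with exactly one relevant passage per query: accuracy@k equals recall@k, and precision@k equals accuracy@k divided by k. Under that assumption, the per-query metrics reduce to the short sketch below; retrieval_metrics is a hypothetical helper written for illustration, not the evaluator used to produce the table.

```python
import math

def retrieval_metrics(ranked_ids, relevant_id, k):
    """Per-query metrics at cutoff k, assuming a single relevant document."""
    top = ranked_ids[:k]
    hit = relevant_id in top
    rank = top.index(relevant_id) + 1 if hit else None   # 1-based rank of the hit
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": (1.0 if hit else 0.0) / k,           # at most one hit in k slots
        "recall": 1.0 if hit else 0.0,
        "mrr": 1.0 / rank if hit else 0.0,
        # Ideal DCG is 1 when there is one relevant document, so nDCG = DCG.
        "ndcg": 1.0 / math.log2(rank + 1) if hit else 0.0,
    }

# The relevant passage "d1" is retrieved at rank 2.
m = retrieval_metrics(["d3", "d1", "d2"], "d1", k=5)

# Averaged over queries, precision@5 is accuracy@5 / 5, as in the table.
precision_at_5 = 0.9328358208955224 / 5
```

Averaging these per-query values over the 402 evaluation queries gives the corpus-level figures.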

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,606 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor        positive       negative
    type      string        string         string
    min       8 tokens      3 tokens       27 tokens
    mean      18.11 tokens  110.54 tokens  114.86 tokens
    max       32 tokens     504 tokens     424 tokens
  • Samples:
    Sample 1
    anchor: Why should one do the Prayagraj Panchkoshi Parikrama?
    positive: The Prayagraj Panchkoshi Parikrama is a deeply revered spiritual journey that offers multiple benefits to devotees. It is believed to grant blessings equivalent to visiting all sacred pilgrimage sites in India, providing divine grace and spiritual merit. The Parikrama route covers significant temples like the Dwadash Madhav temples, Akshayavat, and Mankameshwar, which are steeped in Hindu mythology and history, allowing pilgrims to connect with the spiritual and cultural heritage of Prayagraj. This circumambulation around sacred sites is also seen as a way to cleanse one's sins and progress towards Moksha (liberation from the cycle of birth and rebirth), making it a path of introspection and spiritual growth. The pilgrimage fosters unity among people from diverse backgrounds, offering a unique cultural exchange and shared spiritual experience. By participating, devotees also help revive an ancient tradition integral to the Kumbh Mela for centuries, reconnecting with age-old practices that have shaped the region's spiritual landscape. The Prayagraj Panchkoshi Parikrama is a profound journey of faith and devotion, enriching the spiritual lives of those who undertake it.
    negative: Elevators are remarkable inventions that revolutionized how we navigate tall buildings. They provide a swift, efficient means of transportation between floors, making urban life more accessible. These mechanical wonders operate on a system of pulleys and counterweights, enabling them to carry heavy loads effortlessly. Safety features like emergency brakes and backup power systems ensure that passengers remain secure during their journey. Various designs and styles can be seen in buildings around the world, from sleek modern glass models to vintage models that evoke nostalgia. Elevators also highlight the advancement of engineering and technology over time, evolving from rudimentary designs to sophisticated machines with smart technology. They are essential in various settings, including residential, commercial, and industrial spaces, offering convenience and practicality. Their presence also allows for the efficient use of vertical space, fostering creativity in architectural designs and city planning. Overall, elevators have become an essential part of contemporary infrastructure, enhancing the way we live and work.

    Sample 2
    anchor: Can I hire an E-Rickshaw for a specific duration or multiple stops within the Mela?
    positive: Yes, E-Rickshaws have designated pick-up points, and you can hire them for a specific duration or multiple stops depending on your needs and arrangements with the driver
    negative: The process of assigning roles in a theatrical production often involves extensive auditions and interviews. Each candidate brings unique skills, and the director must carefully consider how their abilities will fit into the overall vision for the performance. Team dynamics play a crucial role, as collaboration is essential for a successful show.

    Sample 3
    anchor: What are the best routes to avoid traffic while traveling from Prayagraj Junction to the Mela grounds?
    positive: The distance between Prayagraj Junction and the Mela Grounds during the Kumbh Mela in Prayagraj, India is approximately 5-7 kilometers. By bus, this could take anywhere from 20-40 minutes, depending on traffic and the specific route.
    negative: The ancient art of glassblowing has captivated artisans for centuries. Bubbles of molten glass are deftly shaped into exquisite forms, revealing the synergy between fire and craftsmanship. The process requires both skill and creativity, resulting in functional pieces or striking sculptures that bring vibrancy to any space. Each creation is unique, echoing the delicate dance of temperature and technique involved in the art form.
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.01}
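
The configuration above pairs the model being trained with a separate guide model and a temperature of 0.01. As a rough illustration of guided in-sample negative selection, the NumPy sketch below drops any in-batch candidate that the guide scores higher than the true positive, then applies a temperature-scaled cross-entropy over what remains. It is a simplified stand-in and not the sentence-transformers implementation: it scores anchor-positive pairs only, ignores the explicit negative column, and gist_style_loss is a hypothetical name.

```python
import numpy as np

def gist_style_loss(student_a, student_p, guide_a, guide_p, temperature=0.01):
    """Contrastive loss where the guide model masks likely false negatives.

    All inputs are (batch, dim) arrays of unit-length embeddings; row i of the
    anchor array is paired with row i of the positive array.
    """
    sim = student_a @ student_p.T               # scores from the model being trained
    guide_sim = guide_a @ guide_p.T             # scores from the guide model
    guide_pos = np.diag(guide_sim)[:, None]     # guide score of each true pair
    # Candidates the guide rates above the true positive are treated as
    # unreliable negatives and excluded; the diagonal is never masked.
    mask = guide_sim > guide_pos
    logits = np.where(mask, -np.inf, sim) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: each anchor is closer to the other row's positive than to its own.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.6, 0.8], [0.8, 0.6]])
identity = np.eye(2)

masked = gist_style_loss(a, p, guide_a=a, guide_p=p)                  # guide flags the confusers
unmasked = gist_style_loss(a, p, guide_a=identity, guide_p=identity)  # guide flags nothing
```

In the toy batch the guide removes the misleading in-batch candidate, so the masked loss is much smaller than the unmasked one. The low temperature sharpens the softmax, so any remaining hard negative contributes strongly.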
    

Evaluation Dataset

Unnamed Dataset

  • Size: 402 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 402 samples:
              anchor        positive       negative
    type      string        string         string
    min       8 tokens      3 tokens       30 tokens
    mean      17.98 tokens  111.36 tokens  116.68 tokens
    max       30 tokens     471 tokens     501 tokens
  • Samples:
    Sample 1
    anchor: What is the Female Helpline number?
    positive: The Women and Child Helpline number for assistance during the Maha Kumbh 2025 is 1091. This service is available for any support related to the safety and well-being of women and children.
    negative: The average lifespan of a species can vary significantly. In some cases, dolphins can live up to 60 years, while certain types of tortoises have been known to exceed 150 years. Understanding the factors that influence longevity is essential in the study of wildlife conservation.

    Sample 2
    anchor: What is the estimated travel time from the Airport to the Mela grounds during peak hours?
    positive: The estimated travel time from the Airport to the Mela grounds is about 1 hour on non-peak days. Travel times may vary significantly during peak hours due to traffic and road conditions.
    negative: The recipe for chocolate cake requires several key ingredients to achieve the perfect texture. Begin by preheating the oven to 350°F. Combine flour, sugar, cocoa powder, and eggs in a large mixing bowl, stirring until smooth. Baking can be an enjoyable process filled with delightful aromas and flavors.

    Sample 3
    anchor: How safe is it to travel by public transport from Prayagraj city to the Kumbh Mela at night?
    positive: There is no direct metro service to the Mela grounds from Prayagraj city. However, Govt operated dedicated shuttle buses are available within Prayagraj for transportation to the Mela. These buses operate on fixed routes and fixed times.
    negative: The fastest way to prepare a delicious apple pie starts with choosing the right variety of apples. Granny Smith apples are great for tartness, while Honeycrisp provides sweetness. After washing and peeling the apples, slice them into thin pieces, ensuring an even texture. Combine the apple slices with sugar, cinnamon, and a hint of lemon juice. Roll out your pie crust and fill it generously with the apple mixture, top it with another crust, and create small vents to allow steam to escape. Bake at 425°F until golden brown, and enjoy the fantastic aroma that fills your kitchen!
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.01}
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • gradient_accumulation_steps: 2
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 30
  • warmup_ratio: 0.1
  • load_best_model_at_end: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 30
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss val_evaluator_dot_map@100
0.1980 10 0.8028 0.4071 0.6415
0.3960 20 0.7561 0.3701 0.6406
0.5941 30 0.9729 0.3100 0.6415
0.7921 40 0.6137 0.2505 0.6478
0.9901 50 0.4747 0.1978 0.6501
1.1881 60 0.4595 0.1609 0.6541
1.3861 70 0.3862 0.1300 0.6570
1.5842 80 0.293 0.1003 0.6606
1.7822 90 0.2806 0.0760 0.6588
1.9802 100 0.1249 0.0586 0.6616
2.1782 110 0.2265 0.0503 0.6677
2.3762 120 0.1292 0.0482 0.6701
2.5743 130 0.1649 0.0448 0.6756
2.7723 140 0.1213 0.0442 0.6810
2.9703 150 0.1363 0.0419 0.6843
3.1683 160 0.0972 0.0376 0.6859
3.3663 170 0.1079 0.0326 0.6896
3.5644 180 0.1265 0.0293 0.6899
3.7624 190 0.0645 0.0279 0.6952
3.9604 200 0.1116 0.0272 0.6934
4.1584 210 0.0757 0.0258 0.6954
4.3564 220 0.1492 0.0248 0.6991
4.5545 230 0.0536 0.0246 0.6971
4.7525 240 0.0346 0.0248 0.6958
4.9505 250 0.0501 0.0247 0.6974
5.1485 260 0.0443 0.0248 0.6975
5.3465 270 0.0585 0.0245 0.6998
5.5446 280 0.0514 0.0246 0.7013
5.7426 290 0.0948 0.0244 0.7073
5.9406 300 0.054 0.0243 0.7049
6.1386 310 0.0317 0.0241 0.7069
6.3366 320 0.1327 0.0249 0.7061
6.5347 330 0.0665 0.0255 0.7073
6.7327 340 0.09 0.0257 0.7073
6.9307 350 0.111 0.0255 0.7067
7.1287 360 0.0473 0.0255 0.7096
7.3267 370 0.0429 0.0248 0.7063
7.5248 380 0.0686 0.0249 0.7087
7.7228 390 0.1096 0.0251 0.7113
7.9208 400 0.0794 0.0255 0.7083
8.1188 410 0.0354 0.0246 0.7094
8.3168 420 0.078 0.0239 0.7093
8.5149 430 0.091 0.0234 0.7057
8.7129 440 0.084 0.0236 0.7107
8.9109 450 0.0702 0.0235 0.7114
9.1089 460 0.0701 0.0233 0.7142
9.3069 470 0.0706 0.0231 0.7140
9.5050 480 0.029 0.0230 0.7125
9.7030 490 0.0411 0.0233 0.7107
9.9010 500 0.0691 0.0233 0.7140
10.0990 510 0.0421 0.0232 0.7165
10.2970 520 0.0497 0.0232 0.7200
10.4950 530 0.0639 0.0232 0.7188
10.6931 540 0.0201 0.0238 0.7161
10.8911 550 0.0833 0.0241 0.7170
11.0891 560 0.0266 0.0242 0.7197
11.2871 570 0.0472 0.0241 0.7220
11.4851 580 0.0614 0.0240 0.7234
11.6832 590 0.0507 0.0242 0.7243
11.8812 600 0.031 0.0239 0.7226
12.0792 610 0.0413 0.0239 0.7216
12.2772 620 0.0222 0.0230 0.7234
12.4752 630 0.0466 0.0221 0.7239
12.6733 640 0.0482 0.0219 0.7218
12.8713 650 0.0657 0.0218 0.7197
13.0693 660 0.0521 0.0218 0.7235
13.2673 670 0.051 0.0218 0.7234
13.4653 680 0.0674 0.0220 0.7243
13.6634 690 0.0477 0.0220 0.7232
13.8614 700 0.0827 0.0218 0.7232
14.0594 710 0.0501 0.0217 0.7247
14.2574 720 0.0278 0.0216 0.7233
14.4554 730 0.0162 0.0216 0.7201
14.6535 740 0.0515 0.0217 0.7219
14.8515 750 0.0514 0.0218 0.7256
15.0495 760 0.088 0.0217 0.7252
15.2475 770 0.0298 0.0217 0.7226
15.4455 780 0.0682 0.0217 0.7259
15.6436 790 0.0485 0.0217 0.7253
15.8416 800 0.0419 0.0217 0.7286
16.0396 810 0.0823 0.0216 0.7268
16.2376 820 0.0533 0.0215 0.7250
16.4356 830 0.0336 0.0215 0.7262
16.6337 840 0.0375 0.0214 0.7270
16.8317 850 0.0243 0.0213 0.7281
17.0297 860 0.0675 0.0212 0.7265
17.2277 870 0.0482 0.0211 0.7260
17.4257 880 0.0511 0.0211 0.7297
17.6238 890 0.0396 0.0211 0.7282
17.8218 900 0.0493 0.0211 0.7275
18.0198 910 0.0378 0.0210 0.7279
18.2178 920 0.0546 0.0210 0.7265
18.4158 930 0.0421 0.0209 0.7286
18.6139 940 0.0599 0.0208 0.7286
18.8119 950 0.0766 0.0205 0.7297
19.0099 960 0.0204 0.0205 0.7275
19.2079 970 0.0321 0.0205 0.7282
19.4059 980 0.0069 0.0204 0.7266
19.6040 990 0.0563 0.0205 0.7245
19.8020 1000 0.0575 0.0205 0.7236
20.0 1010 0.0207 0.0205 0.7261
20.1980 1020 0.03 0.0205 0.7253
20.3960 1030 0.0712 0.0205 0.7269
20.5941 1040 0.0482 0.0205 0.7277
20.7921 1050 0.05 0.0205 0.7283
20.9901 1060 0.0407 0.0205 0.7282
21.1881 1070 0.0591 0.0205 0.7286
21.3861 1080 0.0228 0.0205 0.7265
21.5842 1090 0.0318 0.0205 0.7264
21.7822 1100 0.0768 0.0205 0.7254
21.9802 1110 0.0415 0.0205 0.7264
22.1782 1120 0.0681 0.0205 0.7252
22.3762 1130 0.0622 0.0205 0.7255
22.5743 1140 0.0508 0.0205 0.7251
22.7723 1150 0.0642 0.0205 0.7237
22.9703 1160 0.0469 0.0206 0.7245
23.1683 1170 0.0172 0.0206 0.7256
23.3663 1180 0.055 0.0206 0.7255
23.5644 1190 0.0488 0.0206 0.7266
23.7624 1200 0.0208 0.0206 0.7243
23.9604 1210 0.0415 0.0206 0.7249
24.1584 1220 0.0804 0.0206 0.7264
24.3564 1230 0.0243 0.0205 0.7256
24.5545 1240 0.037 0.0205 0.7258
24.7525 1250 0.0604 0.0205 0.7284
24.9505 1260 0.0278 0.0205 0.7245
25.1485 1270 0.0317 0.0205 0.7235
25.3465 1280 0.0824 0.0205 0.7253
25.5446 1290 0.0639 0.0205 0.7258
25.7426 1300 0.0269 0.0205 0.7247
25.9406 1310 0.0429 0.0205 0.7278
26.1386 1320 0.0692 0.0205 0.7279
26.3366 1330 0.0771 0.0205 0.7301
26.5347 1340 0.0578 0.0205 0.7280
26.7327 1350 0.025 0.0205 0.7258
26.9307 1360 0.0414 0.0205 0.7286
27.1287 1370 0.0484 0.0205 0.7284
27.3267 1380 0.0581 0.0205 0.7294
27.5248 1390 0.069 0.0205 0.7288
27.7228 1400 0.0864 0.0205 0.7301
27.9208 1410 0.0605 0.0205 0.7285
28.1188 1420 0.0327 0.0205 0.7271
28.3168 1430 0.0789 0.0205 0.7258
28.5149 1440 0.056 0.0205 0.7276
28.7129 1450 0.0256 0.0205 0.7272
28.9109 1460 0.0316 0.0205 0.7273
29.1089 1470 0.0528 0.0205 0.7287
29.3069 1480 0.0552 0.0205 0.7274
29.5050 1490 0.0441 0.0205 0.7287
29.7030 1500 0.0246 0.0205 0.7290
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.5.0+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.1.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}