ModernBERT-base trained on GooAQ
This is a Cross Encoder model finetuned from answerdotai/ModernBERT-base on the gooaq dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: answerdotai/ModernBERT-base
- Maximum Sequence Length: 8192 tokens
- Number of Output Labels: 1 label
- Training Dataset: gooaq
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-ModernBERT-base-gooaq-cmnrl")
# Get scores for pairs of texts
pairs = [
['should you take ibuprofen with high blood pressure?', "In general, people with high blood pressure should use acetaminophen or possibly aspirin for over-the-counter pain relief. Unless your health care provider has said it's OK, you should not use ibuprofen, ketoprofen, or naproxen sodium. If aspirin or acetaminophen doesn't help with your pain, call your doctor."],
['how old do you have to be to work in sc?', 'The general minimum age of employment for South Carolina youth is 14, although the state allows younger children who are performers to work in show business. If their families are agricultural workers, children younger than age 14 may also participate in farm labor.'],
['how to write a topic proposal for a research paper?', "['Write down the main topic of your paper. ... ', 'Write two or three short sentences under the main topic that explain why you chose that topic. ... ', 'Write a thesis sentence that states the angle and purpose of your research paper. ... ', 'List the items you will cover in the body of the paper that support your thesis statement.']"],
['how much does aaf pay players?', 'These dates provided an opportunity for players cut at the NFL roster deadline, and each player signed a non-guaranteed three-year contract worth a total of $250,000 ($70,000 in 2019; $80,000 in 2020; $100,000 in 2021), with performance-based and fan-interaction incentives allowing for players to earn more.'],
['is jove and zeus the same?', 'Jupiter, or Jove, in Roman mythology is the king of the gods and the god of sky and thunder, equivalent to Zeus in Greek traditions.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'should you take ibuprofen with high blood pressure?',
[
"In general, people with high blood pressure should use acetaminophen or possibly aspirin for over-the-counter pain relief. Unless your health care provider has said it's OK, you should not use ibuprofen, ketoprofen, or naproxen sodium. If aspirin or acetaminophen doesn't help with your pain, call your doctor.",
'The general minimum age of employment for South Carolina youth is 14, although the state allows younger children who are performers to work in show business. If their families are agricultural workers, children younger than age 14 may also participate in farm labor.',
"['Write down the main topic of your paper. ... ', 'Write two or three short sentences under the main topic that explain why you chose that topic. ... ', 'Write a thesis sentence that states the angle and purpose of your research paper. ... ', 'List the items you will cover in the body of the paper that support your thesis statement.']",
'These dates provided an opportunity for players cut at the NFL roster deadline, and each player signed a non-guaranteed three-year contract worth a total of $250,000 ($70,000 in 2019; $80,000 in 2020; $100,000 in 2021), with performance-based and fan-interaction incentives allowing for players to earn more.',
'Jupiter, or Jove, in Roman mythology is the king of the gods and the god of sky and thunder, equivalent to Zeus in Greek traditions.',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
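The ranking result above is a list of dictionaries keyed by `corpus_id` and `score`. As a minimal sketch, the hypothetical helper below (`top_documents` is not part of the library) maps such a result back to the original texts. The scores in the example are made up for illustration and are not model outputs.

```python
def top_documents(ranks, documents, top_k=3):
    """Map rank()-style results back to their texts, best score first."""
    # Sort defensively in case the caller passes an unsorted list.
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [(documents[r["corpus_id"]], r["score"]) for r in ordered[:top_k]]


# Hand-written stand-in for a rank() result; these scores are invented.
example_ranks = [
    {"corpus_id": 2, "score": 0.10},
    {"corpus_id": 0, "score": 0.95},
    {"corpus_id": 1, "score": 0.02},
]
example_docs = [
    "answer about ibuprofen",
    "answer about working age",
    "answer about topic proposals",
]

best = top_documents(example_ranks, example_docs, top_k=2)
print(best)
```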
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10, "always_rerank_positives": true }
Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
---|---|---|---|
map | 0.4386 (-0.0510) | 0.3362 (+0.0752) | 0.5793 (+0.1597) |
mrr@10 | 0.4263 (-0.0512) | 0.5449 (+0.0451) | 0.5857 (+0.1590) |
ndcg@10 | 0.5101 (-0.0303) | 0.3597 (+0.0347) | 0.6474 (+0.1468) |
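For readers unfamiliar with the three metrics in this table, the sketch below computes textbook versions of them for a single query with binary relevance labels. The function names are hypothetical, and the evaluator's exact conventions (for example how it normalizes average precision or handles ties) may differ from this simplified version.

```python
import math


def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of precision values taken at each relevant position."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg_at_k(relevance, k=10):
    """Discounted cumulative gain divided by the gain of an ideal ordering."""
    def dcg(rels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


# One query: relevant documents were ranked 2nd and 4th.
ranked = [0, 1, 0, 1, 0]
print(mrr_at_k(ranked), average_precision(ranked), ndcg_at_k(ranked))
```

MAP is the mean of `average_precision` across queries; the reported mrr@10 and ndcg@10 are likewise averages over all queries in a dataset.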
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_R100_mean
- Evaluated with CrossEncoderNanoBEIREvaluator with these parameters: { "dataset_names": [ "msmarco", "nfcorpus", "nq" ], "rerank_k": 100, "at_k": 10, "always_rerank_positives": true }
Metric | Value |
---|---|
map | 0.4514 (+0.0613) |
mrr@10 | 0.5190 (+0.0510) |
ndcg@10 | 0.5057 (+0.0504) |
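The values in this table are consistent with an unweighted arithmetic mean of the three per-dataset scores reported under Cross Encoder Reranking. The short check below reproduces them from the numbers in that table; that the evaluator computes the mean exactly this way is an inference from the figures, not something the card states.

```python
# Per-dataset scores copied from the Cross Encoder Reranking table,
# in the order NanoMSMARCO_R100, NanoNFCorpus_R100, NanoNQ_R100.
per_dataset = {
    "map": [0.4386, 0.3362, 0.5793],
    "mrr@10": [0.4263, 0.5449, 0.5857],
    "ndcg@10": [0.5101, 0.3597, 0.6474],
}

# Unweighted mean across the three datasets, rounded like the table.
means = {name: round(sum(values) / len(values), 4) for name, values in per_dataset.items()}
print(means)
```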
Training Details
Training Dataset
gooaq
- Dataset: gooaq at b089f72
- Size: 99,000 training samples
- Columns: question and answer
- Approximate statistics based on the first 1000 samples:
Property | question | answer |
---|---|---|
type | string | string |
details | min: 17 characters, mean: 42.88 characters, max: 95 characters | min: 53 characters, mean: 251.42 characters, max: 398 characters |
- Samples:
question | answer |
---|---|
what are the 5 characteristics of a star? | Key Concept: Characteristics used to classify stars include color, temperature, size, composition, and brightness. |
are copic markers alcohol ink? | Copic Ink is alcohol-based and flammable. Keep away from direct sunlight and extreme temperatures. |
what is the difference between appellate term and appellate division? | Appellate terms An appellate term is an intermediate appellate court that hears appeals from the inferior courts within their designated counties or judicial districts, and are intended to ease the workload on the Appellate Division and provide a less expensive forum closer to the people. |
- Loss: CachedMultipleNegativesRankingLoss with these parameters: { "scale": 10.0, "num_negatives": 5, "activation_fct": "torch.nn.modules.activation.Sigmoid", "mini_batch_size": 16 }
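To make the loss parameters concrete, here is a simplified single-question sketch of a multiple-negatives ranking objective. It assumes the model scores one positive answer and `num_negatives` other answers, passes each raw score through a sigmoid, multiplies by `scale`, and applies cross-entropy with the positive as the target. The function name is hypothetical, and the gradient caching that `mini_batch_size` controls is not modeled here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ranking_loss(positive_logit, negative_logits, scale=10.0):
    """Cross-entropy over scaled sigmoid scores, positive at index 0."""
    # Squash each raw logit into (0, 1), then stretch by the scale factor.
    scaled = [scale * sigmoid(logit) for logit in [positive_logit, *negative_logits]]
    # Numerically stable log-sum-exp, then -log softmax(scaled)[0].
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return log_sum - scaled[0]


# An undecided model (all scores equal) with 5 negatives gives log(6).
print(ranking_loss(0.0, [0.0] * 5))
# A model that clearly separates the positive gives a loss near zero.
print(ranking_loss(8.0, [-8.0] * 5))
```

Because the sigmoid bounds every score to (0, 1), the scale factor sets how large the gap between the positive and the negatives can become, and therefore how close to zero the loss can get.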
Evaluation Dataset
gooaq
- Dataset: gooaq at b089f72
- Size: 1,000 evaluation samples
- Columns: question and answer
- Approximate statistics based on the first 1000 samples:
Property | question | answer |
---|---|---|
type | string | string |
details | min: 18 characters, mean: 43.05 characters, max: 88 characters | min: 51 characters, mean: 252.39 characters, max: 386 characters |
- Samples:
question | answer |
---|---|
should you take ibuprofen with high blood pressure? | In general, people with high blood pressure should use acetaminophen or possibly aspirin for over-the-counter pain relief. Unless your health care provider has said it's OK, you should not use ibuprofen, ketoprofen, or naproxen sodium. If aspirin or acetaminophen doesn't help with your pain, call your doctor. |
how old do you have to be to work in sc? | The general minimum age of employment for South Carolina youth is 14, although the state allows younger children who are performers to work in show business. If their families are agricultural workers, children younger than age 14 may also participate in farm labor. |
how to write a topic proposal for a research paper? | ['Write down the main topic of your paper. ... ', 'Write two or three short sentences under the main topic that explain why you chose that topic. ... ', 'Write a thesis sentence that states the angle and purpose of your research paper. ... ', 'List the items you will cover in the body of the paper that support your thesis statement.'] |
- Loss: CachedMultipleNegativesRankingLoss with these parameters: { "scale": 10.0, "num_negatives": 5, "activation_fct": "torch.nn.modules.activation.Sigmoid", "mini_batch_size": 16 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
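These settings determine the length of the run and the shape of the learning-rate curve. The sketch below works through the arithmetic, assuming a single device (the card does not state the device count) and a warmup step count rounded up from `warmup_ratio`. The `linear_lr` function is a hypothetical re-implementation of a linear warmup and decay schedule, not the trainer's own code.

```python
import math

train_samples = 99_000   # training set size from this card
batch_size = 64          # per_device_train_batch_size, single device assumed
epochs = 1               # num_train_epochs
warmup_ratio = 0.1
peak_lr = 2e-05          # learning_rate

# The last partial batch is kept (dataloader_drop_last is False).
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(steps_per_epoch, warmup_steps)
# Step 1500 as a fraction of one epoch, for comparison with the training logs.
print(round(1500 / steps_per_epoch, 4))
```

Under these assumptions one epoch is 1547 optimizer steps, which agrees with the epoch fractions listed in the training logs (for example, step 1500 at epoch 0.9696).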
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | - | 0.0077 (-0.5327) | 0.2528 (-0.0722) | 0.0311 (-0.4696) | 0.0972 (-0.3582) |
0.0006 | 1 | 2.0578 | - | - | - | - | - |
0.0646 | 100 | 1.241 | - | - | - | - | - |
0.1293 | 200 | 0.0547 | - | - | - | - | - |
0.1616 | 250 | - | 0.0271 | 0.4767 (-0.0637) | 0.3039 (-0.0212) | 0.5435 (+0.0429) | 0.4414 (-0.0140) |
0.1939 | 300 | 0.0203 | - | - | - | - | - |
0.2586 | 400 | 0.0122 | - | - | - | - | - |
0.3232 | 500 | 0.0094 | 0.0087 | 0.4937 (-0.0467) | 0.3334 (+0.0084) | 0.6134 (+0.1127) | 0.4802 (+0.0248) |
0.3878 | 600 | 0.0075 | - | - | - | - | - |
0.4525 | 700 | 0.0065 | - | - | - | - | - |
0.4848 | 750 | - | 0.0070 | 0.5089 (-0.0315) | 0.3458 (+0.0208) | 0.6809 (+0.1802) | 0.5119 (+0.0565) |
0.5171 | 800 | 0.0062 | - | - | - | - | - |
0.5818 | 900 | 0.0061 | - | - | - | - | - |
0.6464 | 1000 | 0.0062 | 0.0058 | 0.5470 (+0.0066) | 0.3589 (+0.0339) | 0.6427 (+0.1421) | 0.5162 (+0.0608) |
0.7111 | 1100 | 0.0055 | - | - | - | - | - |
0.7757 | 1200 | 0.0059 | - | - | - | - | - |
0.8080 | 1250 | - | 0.0055 | 0.5017 (-0.0388) | 0.3571 (+0.0321) | 0.6484 (+0.1478) | 0.5024 (+0.0470) |
0.8403 | 1300 | 0.0059 | - | - | - | - | - |
0.9050 | 1400 | 0.0049 | - | - | - | - | - |
0.9696 | 1500 | 0.0055 | 0.0096 | 0.5091 (-0.0313) | 0.3587 (+0.0337) | 0.6442 (+0.1435) | 0.5040 (+0.0486) |
-1 | -1 | - | - | 0.5101 (-0.0303) | 0.3597 (+0.0347) | 0.6474 (+0.1468) | 0.5057 (+0.0504) |
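Each metric cell in this table pairs a score with a signed difference in parentheses. Subtracting the difference from the score recovers the reference value it is measured against, and the sketch below shows that this reference is the same in every NanoMSMARCO_R100 row. The card does not say what the reference is; a plausible reading, offered only as an assumption, is the ranking quality of the candidate list before reranking. The `reference_value` helper is hypothetical.

```python
import re


def reference_value(cell):
    """Recover the reference score from a 'score (delta)' table cell."""
    match = re.match(r"\s*([\d.]+)\s*\(([+-][\d.]+)\)", cell)
    score, delta = (float(group) for group in match.groups())
    return round(score - delta, 4)


# NanoMSMARCO_R100_ndcg@10 cells copied from four rows of the table.
cells = [
    "0.0077 (-0.5327)",  # before training
    "0.4767 (-0.0637)",  # step 250
    "0.5470 (+0.0066)",  # step 1000
    "0.5101 (-0.0303)",  # final evaluation
]

references = [reference_value(cell) for cell in cells]
print(references)
```

Read this way, the model's NanoMSMARCO_R100 ndcg@10 exceeds the reference only at step 1000, while on NanoNQ_R100 it is above the reference at every evaluation after training starts.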
Framework Versions
- Python: 3.11.10
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.0
- Datasets: 2.21.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}