SentenceTransformer based on thomaskim1130/stella_en_400M_v5-FinanceRAG

This is a sentence-transformers model finetuned from thomaskim1130/stella_en_400M_v5-FinanceRAG. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: thomaskim1130/stella_en_400M_v5-FinanceRAG
Maximum Sequence Length: 512 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity
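
As a rough illustration of the similarity function listed above, cosine similarity can be sketched in plain Python. The toy 3-dimensional vectors stand in for the model's 1024-dimensional embeddings, and the helper name `cosine_similarity` is illustrative, not part of the library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors (assumption): real embeddings from this model have 1024 entries.
query = [1.0, 2.0, 0.0]
passage_a = [2.0, 4.0, 0.0]  # same direction as the query
passage_b = [0.0, 0.0, 3.0]  # orthogonal to the query

print(cosine_similarity(query, passage_a))  # approximately 1.0
print(cosine_similarity(query, passage_b))  # 0.0
```

Because the score depends only on direction, scaling a vector does not change it, which is one reason cosine and dot-product rankings can differ for unnormalized embeddings.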

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: NewModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 1024, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
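
The Pooling and Dense stages listed above can be sketched without any framework. This is a minimal illustration, assuming mean pooling over non-padding tokens followed by a linear layer with identity activation; the helper names, toy sizes, weights, and bias are hypothetical, and the real layers operate on 1024-dimensional tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def dense(vector, weight, bias):
    """Linear layer with identity activation: y = W x + b."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weight, bias)]

# Toy sizes (assumption): 3 token positions, the last one padding, 2 dimensions.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)   # [2.0, 3.0]

weight = [[1.0, 0.0], [0.0, 2.0]]  # hypothetical weights
bias = [0.5, 0.0]                  # hypothetical bias
print(dense(pooled, weight, bias)) # [2.5, 6.0]
```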

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    "Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Title: \nText: In the year with lowest amount of Deposits with banks Average volume, what's the increasing rate of Deposits with banks Average volume?",
    'Title: \nText: Additional Interest Rate Details Average Balances and Interest Ratesé\x88¥æ\x93\x9cssets(1)(2)(3)(4)\n|  | Average volume | Interest revenue | % Average rate |\n| In millions of dollars, except rates | 2015 | 2014 | 2013 | 2015 | 2014 | 2013 | 2015 | 2014 | 2013 |\n| Assets |  |  |  |  |  |  |  |  |  |\n| Deposits with banks-5 | $133,790 | $161,359 | $144,904 | $727 | $959 | $1,026 | 0.54% | 0.59% | 0.71% |\n| Federal funds sold and securities borrowed or purchased under agreements to resell-6 |  |  |  |  |  |  |  |  |  |\n| In U.S. offices | $150,359 | $153,688 | $158,237 | $1,211 | $1,034 | $1,133 | 0.81% | 0.67% | 0.72% |\n| In offices outside the U.S.-5 | 84,006 | 101,177 | 109,233 | 1,305 | 1,332 | 1,433 | 1.55 | 1.32 | 1.31 |\n| Total | $234,365 | $254,865 | $267,470 | $2,516 | $2,366 | $2,566 | 1.07% | 0.93% | 0.96% |\n| Trading account assets-7(8) |  |  |  |  |  |  |  |  |  |\n| In U.S. offices | $114,639 | $114,910 | $126,123 | $3,945 | $3,472 | $3,728 | 3.44% | 3.02% | 2.96% |\n| In offices outside the U.S.-5 | 103,348 | 119,801 | 127,291 | 2,141 | 2,538 | 2,683 | 2.07 | 2.12 | 2.11 |\n| Total | $217,987 | $234,711 | $253,414 | $6,086 | $6,010 | $6,411 | 2.79% | 2.56% | 2.53% |\n| Investments |  |  |  |  |  |  |  |  |  |\n| In U.S. offices |  |  |  |  |  |  |  |  |  |\n| Taxable | $214,714 | $188,910 | $174,084 | $3,812 | $3,286 | $2,713 | 1.78% | 1.74% | 1.56% |\n| Exempt from U.S. income tax | 20,034 | 20,386 | 18,075 | 443 | 626 | 811 | 2.21 | 3.07 | 4.49 |\n| In offices outside the U.S.-5 | 102,376 | 113,163 | 114,122 | 3,071 | 3,627 | 3,761 | 3.00 | 3.21 | 3.30 |\n| Total | $337,124 | $322,459 | $306,281 | $7,326 | $7,539 | $7,285 | 2.17% | 2.34% | 2.38% |\n| Loans (net of unearned income)(9) |  |  |  |  |  |  |  |  |  |\n| In U.S. 
offices | $354,439 | $361,769 | $354,707 | $24,558 | $26,076 | $25,941 | 6.93% | 7.21% | 7.31% |\n| In offices outside the U.S.-5 | 273,072 | 296,656 | 292,852 | 15,988 | 18,723 | 19,660 | 5.85 | 6.31 | 6.71 |\n| Total | $627,511 | $658,425 | $647,559 | $40,546 | $44,799 | $45,601 | 6.46% | 6.80% | 7.04% |\n| Other interest-earning assets-10 | $55,060 | $40,375 | $38,233 | $1,839 | $507 | $602 | 3.34% | 1.26% | 1.57% |\n| Total interest-earning assets | $1,605,837 | $1,672,194 | $1,657,861 | $59,040 | $62,180 | $63,491 | 3.68% | 3.72% | 3.83% |\n| Non-interest-earning assets-7 | $218,000 | $224,721 | $222,526 |  |  |  |  |  |  |\n| Total assets from discontinued operations | — | — | 2,909 |  |  |  |  |  |  |\n| Total assets | $1,823,837 | $1,896,915 | $1,883,296 |  |  |  |  |  |  |\nNet interest revenue includes the taxable equivalent adjustments related to the tax-exempt bond portfolio (based on the U. S.  federal statutory tax rate of 35%) of $487 million, $498 million and $521 million for 2015, 2014 and 2013, respectively.\nInterest rates and amounts include the effects of risk management activities associated with the respective asset categories.\nMonthly or quarterly averages have been used by certain subsidiaries where daily averages are unavailable.\nDetailed average volume, Interest revenue and Interest expense exclude Discontinued operations.\nSee Note 2 to the Consolidated Financial Statements.\nAverage rates reflect prevailing local interest rates, including inflationary effects and monetary corrections in certain countries.\nAverage volumes of securities borrowed or purchased under agreements to resell are reported net pursuant to ASC 210-20-45.\nHowever, Interest revenue excludes the impact of ASC 210-20-45.\nThe fair value carrying amounts of derivative contracts are reported net, pursuant to ASC 815-10-45, in Non-interest-earning assets and Other non-interest bearing liabilities.\nInterest expense on Trading account liabilities of ICG is reported as 
a reduction of Interest revenue.\nInterest revenue and Interest expense on cash collateral positions are reported in interest on Trading account assets and Trading account liabilities, respectively.\nIncludes cash-basis loans.\nIncludes brokerage receivables.\nDuring 2015, continued management actions, primarily the sale or transfer to held-for-sale of approximately $1.5 billion of delinquent residential first mortgages, including $0.9 billion in the fourth quarter largely associated with the transfer of CitiFinancial loans to held-for-sale referenced above, were the primary driver of the overall improvement in delinquencies within Citi Holdings\x80\x99 residential first mortgage portfolio.\nCredit performance from quarter to quarter could continue to be impacted by the amount of delinquent loan sales or transfers to held-for-sale, as well as overall trends in HPI and interest rates.\nNorth America Residential First Mortgages\x80\x94State Delinquency Trends The following tables set forth the six U. S.  
states and/or regions with the highest concentration of Citi\x80\x99s residential first mortgages.\n| In billions of dollars | December 31, 2015 | December 31, 2014 |\n| State-1 | ENR-2 | ENRDistribution | 90+DPD% | %LTV >100%-3 | RefreshedFICO | ENR-2 | ENRDistribution | 90+DPD% | %LTV >100%-3 | RefreshedFICO |\n| CA | $19.2 | 37% | 0.2% | 1% | 754 | $18.9 | 31% | 0.6% | 2% | 745 |\n| NY/NJ/CT-4 | 12.7 | 25 | 0.8 | 1 | 751 | 12.2 | 20 | 1.9 | 2 | 740 |\n| VA/MD | 2.2 | 4 | 1.2 | 2 | 719 | 3.0 | 5 | 3.0 | 8 | 695 |\n| IL-4 | 2.2 | 4 | 1.0 | 3 | 735 | 2.5 | 4 | 2.5 | 9 | 713 |\n| FL-4 | 2.2 | 4 | 1.1 | 4 | 723 | 2.8 | 5 | 3.0 | 14 | 700 |\n| TX | 1.9 | 4 | 1.0 | — | 711 | 2.5 | 4 | 2.7 | — | 680 |\n| Other | 11.0 | 21 | 1.3 | 2 | 710 | 18.2 | 30 | 3.3 | 7 | 677 |\n| Total-5 | $51.5 | 100% | 0.7% | 1% | 738 | $60.1 | 100% | 2.1% | 4% | 715 |\nNote: Totals may not sum due to rounding.\n(1) Certain of the states are included as part of a region based on Citi\x80\x99s view of similar HPI within the region.\n(2) Ending net receivables.\nExcludes loans in Canada and Puerto Rico, loans guaranteed by U. S.  
government agencies, loans recorded at fair value and loans subject to long term standby commitments (LTSCs).\nExcludes balances for which FICO or LTV data are unavailable.\n(3) LTV ratios (loan balance divided by appraised value) are calculated at origination and updated by applying market price data.\n(4) New York, New Jersey, Connecticut, Florida and Illinois are judicial states.\n(5) Improvement in state trends during 2015 was primarily due to the sale or transfer to held-for-sale of residential first mortgages, including the transfer of CitiFinancial residential first mortgages to held-for-sale in the fourth quarter of 2015.\nForeclosures A substantial majority of Citi\x80\x99s foreclosure inventory consists of residential first mortgages.\nAt December 31, 2015, Citi\x80\x99s foreclosure inventory included approximately $0.1 billion, or 0.2%, of the total residential first mortgage portfolio, compared to $0.6 billion, or 0.9%, at December 31, 2014, based on the dollar amount of ending net receivables of loans in foreclosure inventory, excluding loans that are guaranteed by U. S.  
government agencies and loans subject to LTSCs.\nNorth America Consumer Mortgage Quarterly Credit Trends \x80\x94Net Credit Losses and Delinquencies\x80\x94Home Equity Loans Citi\x80\x99s home equity loan portfolio consists of both fixed-rate home equity loans and loans extended under home equity lines of credit.\nFixed-rate home equity loans are fully amortizing.\nHome equity lines of credit allow for amounts to be drawn for a period of time with the payment of interest only and then, at the end of the draw period, the then-outstanding amount is converted to an amortizing loan (the interest-only payment feature during the revolving period is standard for this product across the industry).\nAfter conversion, the home equity loans typically have a 20-year amortization period.\nAs of December 31, 2015, Citi\x80\x99s home equity loan portfolio of $22.8 billion consisted of $6.3 billion of fixed-rate home equity loans and $16.5 billion of loans extended under home equity lines of credit (Revolving HELOCs).',
    'Title: \nText: Issuer Purchases of Equity Securities Repurchases of common stock are made to support the Company\x80\x99s stock-based employee compensation plans and for other corporate purposes.\nOn February 13, 2006, the Board of Directors authorized the purchase of $2.0 billion of the Company\x80\x99s common stock between February 13, 2006 and February 28, 2007.\nIn August 2006, 3M\x80\x99s Board of Directors authorized the repurchase of an additional $1.0 billion in share repurchases, raising the total authorization to $3.0 billion for the period from February 13, 2006 to February 28, 2007.\nIn February 2007, 3M\x80\x99s Board of Directors authorized a twoyear share repurchase of up to $7.0 billion for the period from February 12, 2007 to February 28, 2009.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
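
The example inputs above follow a fixed text layout: queries carry an `Instruct:` prefix and both queries and passages use `Title:` / `Text:` fields. A small sketch of helpers that reproduce this layout follows; `format_query` and `format_passage` are hypothetical names, not part of Sentence Transformers, and the layout is inferred only from the example strings in this card.

```python
INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query."
)

def format_query(question, title=""):
    """Build a query string in the layout used by the example above."""
    return f"Instruct: {INSTRUCTION}\nQuery: Title: {title}\nText: {question}"

def format_passage(text, title=""):
    """Build a passage string in the layout used by the example above."""
    return f"Title: {title}\nText: {text}"

query = format_query("What was the total amount of Deposits with banks in 2015?")
passage = format_passage("Deposits with banks | $133,790 | $161,359 | $144,904")
print(query)
print(passage)
```

The resulting strings could then be passed to `model.encode` in place of the hand-written entries in `sentences`.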

Evaluation

Metrics

Information Retrieval

Dataset: Evaluate
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.4636
cosine_accuracy@3	0.682
cosine_accuracy@5	0.7597
cosine_accuracy@10	0.8519
cosine_precision@1	0.4636
cosine_precision@3	0.2565
cosine_precision@5	0.1777
cosine_precision@10	0.1024
cosine_recall@1	0.4095
cosine_recall@3	0.6424
cosine_recall@5	0.7299
cosine_recall@10	0.8398
cosine_ndcg@10	0.6409
cosine_mrr@10	0.5902
cosine_map@100	0.5753
dot_accuracy@1	0.4393
dot_accuracy@3	0.6748
dot_accuracy@5	0.7354
dot_accuracy@10	0.8422
dot_precision@1	0.4393
dot_precision@3	0.25
dot_precision@5	0.1709
dot_precision@10	0.0998
dot_recall@1	0.3828
dot_recall@3	0.6338
dot_recall@5	0.7005
dot_recall@10	0.8224
dot_ndcg@10	0.6195
dot_mrr@10	0.5712
dot_map@100	0.5528
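
To make the metric names in the table above concrete, here is a minimal per-query sketch of accuracy@k, precision@k, recall@k and reciprocal rank, using the usual definitions. The document IDs and the ranking are made up for illustration; the reported values are averages over all evaluation queries.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if any relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant document in the top k, or 0.0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical ranking for one query with two relevant documents.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4"}

print(accuracy_at_k(ranked, relevant, 1))          # 0.0
print(accuracy_at_k(ranked, relevant, 3))          # 1.0
print(precision_at_k(ranked, relevant, 3))         # about 0.333
print(recall_at_k(ranked, relevant, 3))            # 0.5
print(reciprocal_rank_at_k(ranked, relevant, 10))  # 0.5
```

Under these definitions accuracy@1 and precision@1 coincide, which matches the table, while recall@1 can be lower when a query has more than one relevant passage.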

Training Details

Training Dataset

Unnamed Dataset

Size: 2,256 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 1000 samples:

	sentence_0	sentence_1
type	string	string
details	min: 28 tokens mean: 45.02 tokens max: 114 tokens	min: 23 tokens mean: 406.36 tokens max: 512 tokens

Samples:

sentence_0	sentence_1
`Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Title: Text: What do all Notional sum up, excluding those negative ones in 2008 for As of December 31, 2008 for Financial assets with interest rate risk? (in million)`	Title: Text: Cash Flows Our estimated future benefit payments for funded and unfunded plans are as follows (in millions): 1 The expected benefit payments for our other postretirement benefit plans are net of estimated federal subsidies expected to be received under the Medicare Prescription Drug, Improvement and Modernization Act of 2003. Federal subsidies are estimated to be $3 million for the period 2019-2023 and $2 million for the period 2024-2028. The Company anticipates making pension contributions in 2019 of $32 million, all of which will be allocated to our international plans. The majority of these contributions are required by funding regulations or law.
`Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Title: Text: what's the total amount of No surrender charge of 2010 Individual Fixed Annuities, Change in cash of 2008, and Total reserves of 2010 Individual Variable Annuities ?`	Title: Text: 2010 and 2009 Comparison Surrender rates have improved compared to the prior year for group retirement products, individual fixed annuities and individual variable annuities as surrenders have returned to more normal levels. Surrender rates for individual fixed annuities have decreased significantly in 2010 due to the low interest rate environment and the relative competitiveness of interest credited rates on the existing block of fixed annuities versus interest rates on alternative investment options available in the marketplace. Surrender rates for group retirement products are expected to increase in 2011 as certain large group surrenders are anticipated.2009 and 2008 Comparison Surrenders and other withdrawals increased in 2009 for group retirement products primarily due to higher large group surrenders. However, surrender rates and withdrawals have improved for individual fixed annuities and individual variable annuities. The following table presents reserves by surrender charge category and surrender rates:
`Instruct: Given a web search query, retrieve relevant passages that answer the query. Query: Title: Text: What was the total amount of elements for RevPAR excluding those elements greater than 150 in 2016 ?`	`Title: Text: 2016 Compared to 2015 Comparable?Company-Operated North American Properties`

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
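
The idea behind this loss with the parameters above can be sketched in plain Python: each query (`sentence_0`) should score highest against its own passage (`sentence_1`), with the other passages in the batch acting as negatives. This is an illustrative re-implementation under those assumptions, not the library's code; the toy 2-dimensional vectors are made up.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where row i's correct class is column i."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the scaled similarities.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch (assumption): two anchors and two candidate passages.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]  # each passage matches its own anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]  # each passage matches the other anchor

print(multiple_negatives_ranking_loss(anchors, aligned))  # close to 0
print(multiple_negatives_ranking_loss(anchors, swapped))  # close to 20
```

The `scale` factor sharpens the softmax: with cosine values bounded in [-1, 1], multiplying by 20 lets a well-separated pair drive the loss near zero.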

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 2
fp16: True
batch_sampler: no_duplicates
multi_dataset_batch_sampler: round_robin

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
batch_sampler: no_duplicates
multi_dataset_batch_sampler: round_robin
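
As a sketch of what `lr_scheduler_type: linear` with `learning_rate: 5e-05` and `warmup_steps: 0` implies, the learning rate decays in a straight line from its base value to zero. The function below is illustrative rather than the trainer's own code, and the total of 282 steps is an assumption taken from the final step in the training logs.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=0, total_steps=282):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 141, 282):
    print(step, linear_lr(step))
# 0   -> 5e-05
# 141 -> 2.5e-05
# 282 -> 0.0
```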

Training Logs

Epoch	Step	Evaluate_cosine_map@100
0	0	0.4564
1.0	141	0.5233
2.0	282	0.5753
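
The step counts in this log are consistent with the dataset size and batch size reported earlier. A quick check, assuming a single device, `gradient_accumulation_steps: 1`, and no samples dropped by the batch sampler:

```python
import math

train_samples = 2256  # from "Size: 2,256 training samples"
batch_size = 16       # per_device_train_batch_size
epochs = 2            # num_train_epochs

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 141 282

# Absolute gain in cosine MAP@100 between step 0 and step 282.
map_gain = 0.5753 - 0.4564
print(round(map_gain, 4))  # 0.1189
```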

Framework Versions

Python: 3.10.12
Sentence Transformers: 3.1.1
Transformers: 4.45.2
PyTorch: 2.5.1+cu121
Accelerate: 1.1.1
Datasets: 3.1.0
Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
