bge-large-en-v1.5-2024-12-10_07-12-15-quality-weight-1

This model is a fine-tuned version of BAAI/bge-large-en-v1.5 on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.0151
Spearman: 0.9383
Pearson: 0.9340
MSE: 0.0151
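The three metrics above can be reproduced from a list of predicted scores and a list of reference scores. The following is a minimal, dependency-free sketch of how they are conventionally defined, not the evaluation code used for this model; the helper names and the toy values are made up for illustration.

```python
import math


def mse(preds, labels):
    """Mean squared error between predictions and reference scores."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(labels)


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy values for illustration only; they are not taken from the evaluation set.
preds = [0.1, 0.4, 0.35, 0.8]
labels = [0.0, 0.5, 0.3, 1.0]
print(mse(preds, labels), pearson(preds, labels), spearman(preds, labels))
```

Because Loss and MSE are reported with the same value, the training objective was presumably mean squared error, although the card does not state this.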

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 64
total_train_batch_size: 256
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
num_epochs: 5
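Two of the values above are derived rather than set directly: the total batch size and the learning rate at any given step. The sketch below shows the arithmetic under stated assumptions. The single-device setting is inferred from 4 × 64 = 256 and is not confirmed by the card; `cosine_lr` is a hypothetical helper that mirrors the usual shape of a cosine schedule (linear warmup, then a half-cosine decay to zero), and the 1000-step run in the example is a made-up round number.

```python
import math

train_batch_size = 4
gradient_accumulation_steps = 64
num_devices = 1  # assumption: inferred from 4 * 64 = 256, not stated in the card

# Number of examples that contribute to each optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)


def cosine_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.05):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Hypothetical 1000-step run: the peak is reached after 5% of the steps.
for step in (0, 25, 50, 525, 1000):
    print(step, cosine_lr(step, total_steps=1000))
```

With gradient accumulation, each optimizer step sees 256 examples even though only 4 are held in memory per forward pass, which is the usual reason for combining a small per-device batch with a large accumulation count.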

Training results

Training Loss	Epoch	Step	Validation Loss	Spearman	Pearson	MSE
0.0301	0.0997	263	0.0276	0.8847	0.8733	0.0276
0.0296	0.1994	526	0.0279	0.8977	0.8830	0.0279
0.0314	0.2990	789	0.0236	0.9045	0.8946	0.0236
0.0228	0.3987	1052	0.0231	0.9065	0.8942	0.0231
0.0241	0.4984	1315	0.0217	0.9111	0.9031	0.0217
0.0162	0.5981	1578	0.0221	0.9114	0.9033	0.0221
0.0227	0.6978	1841	0.0203	0.9168	0.9101	0.0203
0.0203	0.7975	2104	0.0211	0.9181	0.9105	0.0211
0.0215	0.8971	2367	0.0199	0.9155	0.9102	0.0199
0.0203	0.9968	2630	0.0193	0.9204	0.9151	0.0193
0.0187	1.0963	2893	0.0188	0.9234	0.9151	0.0188
0.0192	1.1960	3156	0.0185	0.9240	0.9186	0.0185
0.0128	1.2956	3419	0.0195	0.9241	0.9177	0.0195
0.0128	1.3953	3682	0.0175	0.9261	0.9213	0.0175
0.0191	1.4950	3945	0.0177	0.9256	0.9206	0.0177
0.0129	1.5947	4208	0.0186	0.9246	0.9199	0.0186
0.0167	1.6944	4471	0.0179	0.9272	0.9223	0.0179
0.0098	1.7940	4734	0.0177	0.9282	0.9249	0.0177
0.0155	1.8937	4997	0.0173	0.9275	0.9239	0.0173
0.0153	1.9934	5260	0.0181	0.9300	0.9261	0.0181
0.0107	2.0929	5523	0.0167	0.9311	0.9267	0.0167
0.0126	2.1925	5786	0.0164	0.9306	0.9264	0.0164
0.0096	2.2922	6049	0.0164	0.9318	0.9273	0.0164
0.0120	2.3919	6312	0.0162	0.9311	0.9279	0.0162
0.0126	2.4916	6575	0.0170	0.9329	0.9285	0.0170
0.0086	2.5913	6838	0.0166	0.9323	0.9283	0.0166
0.0088	2.6910	7101	0.0160	0.9334	0.9295	0.0160
0.0088	2.7906	7364	0.0158	0.9339	0.9302	0.0158
0.0130	2.8903	7627	0.0158	0.9336	0.9299	0.0158
0.0073	2.9900	7890	0.0157	0.9346	0.9308	0.0157
0.0071	3.0894	8153	0.0155	0.9354	0.9317	0.0155
0.0081	3.1891	8416	0.0158	0.9360	0.9317	0.0158
0.0092	3.2888	8679	0.0155	0.9358	0.9316	0.0155
0.0088	3.3885	8942	0.0156	0.9361	0.9324	0.0156
0.0058	3.4882	9205	0.0153	0.9366	0.9329	0.0153
0.0061	3.5879	9468	0.0158	0.9367	0.9322	0.0158
0.0081	3.6875	9731	0.0154	0.9369	0.9333	0.0154
0.0053	3.7872	9994	0.0150	0.9369	0.9336	0.0150
0.0063	3.8869	10257	0.0149	0.9373	0.9341	0.0149
0.0060	3.9866	10520	0.0152	0.9375	0.9341	0.0152
0.0046	4.0860	10783	0.0150	0.9376	0.9345	0.0150
0.0044	4.1857	11046	0.0150	0.9376	0.9343	0.0150
0.0051	4.2854	11309	0.0151	0.9377	0.9343	0.0151
0.0062	4.3851	11572	0.0150	0.9378	0.9346	0.0150
0.0044	4.4848	11835	0.0150	0.9380	0.9346	0.0150
0.0052	4.5845	12098	0.0150	0.9378	0.9346	0.0150
0.0037	4.6841	12361	0.0151	0.9378	0.9345	0.0151
0.0031	4.7838	12624	0.0151	0.9378	0.9346	0.0151
0.0053	4.8835	12887	0.0150	0.9379	0.9346	0.0150
0.0046	4.9832	13150	0.0150	0.9379	0.9346	0.0150
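A log like the one above is easy to scan programmatically, for example to find the checkpoint with the lowest validation loss. The sketch below parses a four-row excerpt copied from the table; the column keys and the `parse_rows` helper are illustrative names, not part of any library.

```python
COLUMNS = ("train_loss", "epoch", "step", "val_loss", "spearman", "pearson", "mse")

# Four rows copied from the training results table.
LOG_EXCERPT = """
0.0301 0.0997 263 0.0276 0.8847 0.8733 0.0276
0.0153 1.9934 5260 0.0181 0.9300 0.9261 0.0181
0.0063 3.8869 10257 0.0149 0.9373 0.9341 0.0149
0.0046 4.9832 13150 0.0150 0.9379 0.9346 0.0150
"""


def parse_rows(text):
    """Turn whitespace-separated log lines into one dict per evaluation step."""
    rows = []
    for line in text.strip().splitlines():
        row = dict(zip(COLUMNS, (float(value) for value in line.split())))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(LOG_EXCERPT)

# Evaluation step with the lowest validation loss in the excerpt.
best = min(rows, key=lambda row: row["val_loss"])
print(best["step"], best["val_loss"])

# Validation loss and MSE coincide in every parsed row.
loss_matches_mse = all(row["val_loss"] == row["mse"] for row in rows)
print(loss_matches_mse)
```

In the full table the lowest validation loss is also 0.0149, at step 10257, and both correlation metrics are essentially flat from roughly epoch 4 onward.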

Framework versions

Transformers 4.47.0
PyTorch 2.5.1+cu124
Datasets 2.19.2
Tokenizers 0.21.0
