SentenceTransformer based on neuralmind/bert-large-portuguese-cased

This is a sentence-transformers model finetuned from neuralmind/bert-large-portuguese-cased. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: neuralmind/bert-large-portuguese-cased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
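
The Pooling module averages the token embeddings produced by the BERT encoder (pooling_mode_mean_tokens) into a single 1024-dimensional sentence vector. As an illustration only, the sketch below reproduces that mean pooling with the plain transformers API; loading the model through SentenceTransformer (see Usage) is the intended path, and loading the checkpoint with AutoModel is an assumption that holds for standard sentence-transformers repositories.

import torch
from transformers import AutoModel, AutoTokenizer

model_id = "SenhorDasMoscas/acho2-ptbr-e4-lr3e-05"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)  # the underlying BertModel

batch = tokenizer(["livro ficcao"], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # [1, seq_len, 1024]

# Mean pooling over tokens, ignoring padding positions via the attention mask
mask = batch["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 1024])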

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("SenhorDasMoscas/acho2-ptbr-e4-lr3e-05")
# Run inference
sentences = [
    'livro ficcao',
    'produto basico arroz feijao massa item mercearia snack alimento congelar dia dia situacoes emergencial',
    'produto voltar publico adulto brinquedo sexual jogo adulto',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
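
The same embeddings can also back a small semantic-search step, for example matching a query against a set of category descriptions. The query and corpus below reuse strings that appear in the training data later in this card and are purely illustrative.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("SenhorDasMoscas/acho2-ptbr-e4-lr3e-05")

query = "racao cachorro pedigree"
corpus = [
    "produto basico arroz feijao massa item mercearia snack alimento congelar dia dia situacoes emergencial",
    "tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual",
    "artigo esportivo bola raquete acessorio academia roupa esportiva equipamento esporte outdoor escalada ciclismo",
]

query_embedding = model.encode(query)
corpus_embeddings = model.encode(corpus)

# Rank the corpus entries by cosine similarity to the query
scores = model.similarity(query_embedding, corpus_embeddings)[0]
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.4f}  {corpus[idx]}")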

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.9024
spearman_cosine 0.8404
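
These values are the Pearson and Spearman correlations between the model's cosine similarity scores and the gold labels on held-out (text1, text2, label) pairs. A minimal sketch of that computation, using the three evaluation samples listed further below (scipy is an extra dependency here):

from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("SenhorDasMoscas/acho2-ptbr-e4-lr3e-05")

# (text1, text2, label) pairs taken from the evaluation samples below
pairs = [
    ("carvao", "tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual", 1.0),
    ("telha fibrocimento", "produto basico arroz feijao massa item mercearia snack alimento congelar dia dia situacoes emergencial", 0.1),
    ("racao cachorro pedigree", "loja decoracao baloe paineis decorativo item tematico casamento aniversario luminaria bandeirola vela acessorio transformar ambiente festa ocasioes especial", 0.1),
]

embeddings1 = model.encode([t1 for t1, _, _ in pairs])
embeddings2 = model.encode([t2 for _, t2, _ in pairs])
labels = [label for _, _, label in pairs]

# Cosine similarity of each aligned pair, then correlation with the gold labels
cosine_scores = model.similarity(embeddings1, embeddings2).diagonal().tolist()
print("pearson_cosine: ", pearsonr(cosine_scores, labels)[0])
print("spearman_cosine:", spearmanr(cosine_scores, labels)[0])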

Training Details

Training Dataset

Unnamed Dataset

  • Size: 10,822 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
    • text1: string; min: 3 tokens, mean: 7.16 tokens, max: 15 tokens
    • text2: string; min: 11 tokens, mean: 25.08 tokens, max: 36 tokens
    • label: float; min: 0.1, mean: 0.53, max: 1.0
  • Samples (text1 | text2 | label):
    • tenis nike | artigo esportivo bola raquete acessorio academia roupa esportiva equipamento esporte outdoor escalada ciclismo | 1.0
    • tapete Sao Carlos | tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual | 0.1
    • kit sensual lua Mel | produto voltar publico adulto brinquedo sexual jogo adulto | 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
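
CosineSimilarityLoss embeds text1 and text2, takes the cosine similarity of the two vectors, and regresses it onto label with the MSE objective above. A minimal sketch of how the training columns plug into it, using the three sample rows listed above (the real split has 10,822 pairs):

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CosineSimilarityLoss

train_dataset = Dataset.from_dict({
    "text1": ["tenis nike", "tapete Sao Carlos", "kit sensual lua Mel"],
    "text2": [
        "artigo esportivo bola raquete acessorio academia roupa esportiva equipamento esporte outdoor escalada ciclismo",
        "tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual",
        "produto voltar publico adulto brinquedo sexual jogo adulto",
    ],
    "label": [1.0, 0.1, 1.0],
})

model = SentenceTransformer("neuralmind/bert-large-portuguese-cased")
loss = CosineSimilarityLoss(model)  # loss_fct defaults to torch.nn.MSELoss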
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,203 evaluation samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
    • text1: string; min: 3 tokens, mean: 7.09 tokens, max: 14 tokens
    • text2: string; min: 11 tokens, mean: 25.62 tokens, max: 36 tokens
    • label: float; min: 0.1, mean: 0.57, max: 1.0
  • Samples (text1 | text2 | label):
    • carvao | tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual | 1.0
    • telha fibrocimento | produto basico arroz feijao massa item mercearia snack alimento congelar dia dia situacoes emergencial | 0.1
    • racao cachorro pedigree | loja decoracao baloe paineis decorativo item tematico casamento aniversario luminaria bandeirola vela acessorio transformar ambiente festa ocasioes especial | 0.1
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
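
The eval-similarity_spearman_cosine column in the training logs below most likely comes from an EmbeddingSimilarityEvaluator over these pairs (the column prefix matches an evaluator named "eval-similarity"); this is an inference from the metric name, not recorded in the card. A minimal sketch, using the three sample rows above:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("SenhorDasMoscas/acho2-ptbr-e4-lr3e-05")

dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["carvao", "telha fibrocimento", "racao cachorro pedigree"],
    sentences2=[
        "tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual",
        "produto basico arroz feijao massa item mercearia snack alimento congelar dia dia situacoes emergencial",
        "loja decoracao baloe paineis decorativo item tematico casamento aniversario luminaria bandeirola vela acessorio transformar ambiente festa ocasioes especial",
    ],
    scores=[1.0, 0.1, 0.1],
    name="eval-similarity",
)
print(dev_evaluator(model))  # reports pearson/spearman metrics prefixed with the evaluator name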
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 3e-05
  • weight_decay: 0.1
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • warmup_steps: 135
  • fp16: True
  • load_best_model_at_end: True
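
A minimal sketch of a training run that mirrors these settings with the sentence-transformers v3 trainer. The tiny inline datasets, the output_dir, and the eval/save cadence of 200 steps (the interval at which validation rows appear in the training logs below) are illustrative assumptions; the actual run used the 10,822/1,203-pair splits described above.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("neuralmind/bert-large-portuguese-cased")
loss = CosineSimilarityLoss(model)

# Tiny stand-ins for the real train/eval splits (text1, text2, label columns)
train_dataset = Dataset.from_dict({
    "text1": ["tenis nike", "tapete Sao Carlos"],
    "text2": [
        "artigo esportivo bola raquete acessorio academia roupa esportiva equipamento esporte outdoor escalada ciclismo",
        "tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual",
    ],
    "label": [1.0, 0.1],
})
eval_dataset = Dataset.from_dict({
    "text1": ["carvao"],
    "text2": ["tinta cimento ferramenta construcao material reforma piso azulejo equipamento protecao individual"],
    "label": [1.0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="acho2-ptbr-e4-lr3e-05",  # illustrative
    eval_strategy="steps",
    eval_steps=200,
    save_steps=200,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=3e-05,
    weight_decay=0.1,
    num_train_epochs=4,
    warmup_ratio=0.1,
    warmup_steps=135,
    fp16=True,  # requires a CUDA GPU
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()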

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.1
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 135
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss eval-similarity_spearman_cosine
0.0147 5 0.2323 - -
0.0295 10 0.2056 - -
0.0442 15 0.2203 - -
0.0590 20 0.1947 - -
0.0737 25 0.1811 - -
0.0885 30 0.1526 - -
0.1032 35 0.1511 - -
0.1180 40 0.1543 - -
0.1327 45 0.1529 - -
0.1475 50 0.1296 - -
0.1622 55 0.1212 - -
0.1770 60 0.1023 - -
0.1917 65 0.1011 - -
0.2065 70 0.1047 - -
0.2212 75 0.1077 - -
0.2360 80 0.0909 - -
0.2507 85 0.0913 - -
0.2655 90 0.1045 - -
0.2802 95 0.0761 - -
0.2950 100 0.0705 - -
0.3097 105 0.086 - -
0.3245 110 0.0753 - -
0.3392 115 0.0652 - -
0.3540 120 0.0663 - -
0.3687 125 0.0862 - -
0.3835 130 0.085 - -
0.3982 135 0.0803 - -
0.4130 140 0.088 - -
0.4277 145 0.0569 - -
0.4425 150 0.0689 - -
0.4572 155 0.0746 - -
0.4720 160 0.069 - -
0.4867 165 0.0665 - -
0.5015 170 0.0778 - -
0.5162 175 0.0513 - -
0.5310 180 0.0525 - -
0.5457 185 0.0817 - -
0.5605 190 0.0731 - -
0.5752 195 0.0704 - -
0.5900 200 0.0742 0.0651 0.8003
0.6047 205 0.0722 - -
0.6195 210 0.0894 - -
0.6342 215 0.0679 - -
0.6490 220 0.0532 - -
0.6637 225 0.0877 - -
0.6785 230 0.2859 - -
0.6932 235 0.3122 - -
0.7080 240 0.1166 - -
0.7227 245 0.0785 - -
0.7375 250 0.0636 - -
0.7522 255 0.0613 - -
0.7670 260 0.0648 - -
0.7817 265 0.0597 - -
0.7965 270 0.0597 - -
0.8112 275 0.0662 - -
0.8260 280 0.0581 - -
0.8407 285 0.0685 - -
0.8555 290 0.0629 - -
0.8702 295 0.0694 - -
0.8850 300 0.055 - -
0.8997 305 0.0647 - -
0.9145 310 0.0634 - -
0.9292 315 0.0724 - -
0.9440 320 0.0658 - -
0.9587 325 0.0594 - -
0.9735 330 0.053 - -
0.9882 335 0.0622 - -
1.0029 340 0.0622 - -
1.0177 345 0.0593 - -
1.0324 350 0.0541 - -
1.0472 355 0.0493 - -
1.0619 360 0.0504 - -
1.0767 365 0.0539 - -
1.0914 370 0.0439 - -
1.1062 375 0.0613 - -
1.1209 380 0.0432 - -
1.1357 385 0.0617 - -
1.1504 390 0.0546 - -
1.1652 395 0.0427 - -
1.1799 400 0.0674 0.0488 0.8279
1.1947 405 0.055 - -
1.2094 410 0.0393 - -
1.2242 415 0.0561 - -
1.2389 420 0.0531 - -
1.2537 425 0.0374 - -
1.2684 430 0.0374 - -
1.2832 435 0.0369 - -
1.2979 440 0.0408 - -
1.3127 445 0.0508 - -
1.3274 450 0.0558 - -
1.3422 455 0.0566 - -
1.3569 460 0.0466 - -
1.3717 465 0.0363 - -
1.3864 470 0.0489 - -
1.4012 475 0.0535 - -
1.4159 480 0.0502 - -
1.4307 485 0.0429 - -
1.4454 490 0.0541 - -
1.4602 495 0.057 - -
1.4749 500 0.0402 - -
1.4897 505 0.0464 - -
1.5044 510 0.0405 - -
1.5192 515 0.0469 - -
1.5339 520 0.0519 - -
1.5487 525 0.0338 - -
1.5634 530 0.0476 - -
1.5782 535 0.0385 - -
1.5929 540 0.0442 - -
1.6077 545 0.0379 - -
1.6224 550 0.0477 - -
1.6372 555 0.0525 - -
1.6519 560 0.0487 - -
1.6667 565 0.0499 - -
1.6814 570 0.0344 - -
1.6962 575 0.0503 - -
1.7109 580 0.0568 - -
1.7257 585 0.0465 - -
1.7404 590 0.0325 - -
1.7552 595 0.0479 - -
1.7699 600 0.046 0.0466 0.8309
1.7847 605 0.0482 - -
1.7994 610 0.0546 - -
1.8142 615 0.0465 - -
1.8289 620 0.049 - -
1.8437 625 0.0422 - -
1.8584 630 0.0358 - -
1.8732 635 0.0519 - -
1.8879 640 0.0416 - -
1.9027 645 0.0344 - -
1.9174 650 0.0339 - -
1.9322 655 0.0365 - -
1.9469 660 0.038 - -
1.9617 665 0.0417 - -
1.9764 670 0.0521 - -
1.9912 675 0.0242 - -
2.0059 680 0.0405 - -
2.0206 685 0.0233 - -
2.0354 690 0.0299 - -
2.0501 695 0.0194 - -
2.0649 700 0.0424 - -
2.0796 705 0.0245 - -
2.0944 710 0.0374 - -
2.1091 715 0.0295 - -
2.1239 720 0.0236 - -
2.1386 725 0.0477 - -
2.1534 730 0.0211 - -
2.1681 735 0.0306 - -
2.1829 740 0.0265 - -
2.1976 745 0.0398 - -
2.2124 750 0.0468 - -
2.2271 755 0.0252 - -
2.2419 760 0.0329 - -
2.2566 765 0.0317 - -
2.2714 770 0.035 - -
2.2861 775 0.0387 - -
2.3009 780 0.037 - -
2.3156 785 0.0285 - -
2.3304 790 0.0377 - -
2.3451 795 0.0344 - -
2.3599 800 0.0335 0.0431 0.8360
2.3746 805 0.0296 - -
2.3894 810 0.0357 - -
2.4041 815 0.0244 - -
2.4189 820 0.0373 - -
2.4336 825 0.0295 - -
2.4484 830 0.0353 - -
2.4631 835 0.0303 - -
2.4779 840 0.0206 - -
2.4926 845 0.0284 - -
2.5074 850 0.0293 - -
2.5221 855 0.035 - -
2.5369 860 0.0295 - -
2.5516 865 0.0349 - -
2.5664 870 0.0195 - -
2.5811 875 0.0265 - -
2.5959 880 0.0298 - -
2.6106 885 0.0321 - -
2.6254 890 0.0321 - -
2.6401 895 0.0299 - -
2.6549 900 0.0216 - -
2.6696 905 0.02 - -
2.6844 910 0.0277 - -
2.6991 915 0.0381 - -
2.7139 920 0.0296 - -
2.7286 925 0.0339 - -
2.7434 930 0.035 - -
2.7581 935 0.0293 - -
2.7729 940 0.038 - -
2.7876 945 0.0291 - -
2.8024 950 0.0411 - -
2.8171 955 0.0377 - -
2.8319 960 0.0282 - -
2.8466 965 0.0388 - -
2.8614 970 0.0286 - -
2.8761 975 0.0177 - -
2.8909 980 0.0352 - -
2.9056 985 0.0329 - -
2.9204 990 0.0265 - -
2.9351 995 0.0363 - -
2.9499 1000 0.021 0.0404 0.8374
2.9646 1005 0.0342 - -
2.9794 1010 0.0415 - -
2.9941 1015 0.0232 - -
3.0088 1020 0.0251 - -
3.0236 1025 0.0317 - -
3.0383 1030 0.0344 - -
3.0531 1035 0.021 - -
3.0678 1040 0.0271 - -
3.0826 1045 0.021 - -
3.0973 1050 0.0151 - -
3.1121 1055 0.0222 - -
3.1268 1060 0.0186 - -
3.1416 1065 0.0357 - -
3.1563 1070 0.0179 - -
3.1711 1075 0.0291 - -
3.1858 1080 0.0313 - -
3.2006 1085 0.0349 - -
3.2153 1090 0.0181 - -
3.2301 1095 0.0294 - -
3.2448 1100 0.0216 - -
3.2596 1105 0.0334 - -
3.2743 1110 0.0256 - -
3.2891 1115 0.026 - -
3.3038 1120 0.0176 - -
3.3186 1125 0.0231 - -
3.3333 1130 0.0164 - -
3.3481 1135 0.0226 - -
3.3628 1140 0.0286 - -
3.3776 1145 0.02 - -
3.3923 1150 0.0229 - -
3.4071 1155 0.0231 - -
3.4218 1160 0.0289 - -
3.4366 1165 0.0188 - -
3.4513 1170 0.0313 - -
3.4661 1175 0.0179 - -
3.4808 1180 0.0157 - -
3.4956 1185 0.0252 - -
3.5103 1190 0.019 - -
3.5251 1195 0.0251 - -
3.5398 1200 0.021 0.0399 0.8404
3.5546 1205 0.0154 - -
3.5693 1210 0.0187 - -
3.5841 1215 0.0221 - -
3.5988 1220 0.0148 - -
3.6136 1225 0.0168 - -
3.6283 1230 0.0236 - -
3.6431 1235 0.0194 - -
3.6578 1240 0.0245 - -
3.6726 1245 0.0171 - -
3.6873 1250 0.0235 - -
3.7021 1255 0.0243 - -
3.7168 1260 0.0325 - -
3.7316 1265 0.0196 - -
3.7463 1270 0.0362 - -
3.7611 1275 0.0188 - -
3.7758 1280 0.0151 - -
3.7906 1285 0.0189 - -
3.8053 1290 0.0286 - -
3.8201 1295 0.0266 - -
3.8348 1300 0.0216 - -
3.8496 1305 0.0218 - -
3.8643 1310 0.0214 - -
3.8791 1315 0.0224 - -
3.8938 1320 0.0213 - -
3.9086 1325 0.0302 - -
3.9233 1330 0.0196 - -
3.9381 1335 0.0218 - -
3.9528 1340 0.0226 - -
3.9676 1345 0.0204 - -
3.9823 1350 0.0215 - -
3.9971 1355 0.0258 - -
  • The saved checkpoint corresponds to the row at epoch 3.5398 (step 1200), which has the lowest validation loss (0.0399) and the spearman_cosine of 0.8404 reported under Evaluation above.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 2.14.4
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}