Commit fbb7c63, committed by luismsgomes
1 Parent(s): c88d963
added trained model
- 1_Pooling/config.json +10 -0
- README.md +129 -0
- config.json +38 -0
- config_sentence_transformers.json +9 -0
- eval/similarity_evaluation_assin-ptbr-test_results.csv +2 -0
- eval/similarity_evaluation_assin-ptpt-test_results.csv +2 -0
- eval/similarity_evaluation_assin2-test_results.csv +2 -0
- eval/similarity_evaluation_iris-sts-test_results.csv +2 -0
- eval/similarity_evaluation_stsb-multi-mt-pt-test_results.csv +2 -0
- eval/similarity_evaluation_validation_results.csv +101 -0
- model.safetensors +3 -0
- modules.json +14 -0
- sentence_bert_config.json +4 -0
- special_tokens_map.json +51 -0
- tokenizer.json +0 -0
- tokenizer_config.json +65 -0
- train-config.yaml +21 -0
1_Pooling/config.json
ADDED
@@ -0,0 +1,10 @@
+{
+  "word_embedding_dimension": 1536,
+  "pooling_mode_cls_token": false,
+  "pooling_mode_mean_tokens": true,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
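The flags in this pooling config enable exactly one strategy, mean pooling over token embeddings. As a minimal sketch of how to read that off the file, assuming the JSON is available as a string (the helper name `active_pooling_modes` is hypothetical and not part of sentence-transformers):

```python
import json

# Contents of 1_Pooling/config.json as added in this commit.
POOLING_CONFIG = """
{
  "word_embedding_dimension": 1536,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
"""

PREFIX = "pooling_mode_"


def active_pooling_modes(config):
    """Return the pooling_mode_* flags that are set to true (hypothetical helper)."""
    return [
        key[len(PREFIX):]
        for key, value in config.items()
        if key.startswith(PREFIX) and value is True
    ]


config = json.loads(POOLING_CONFIG)
print(active_pooling_modes(config))  # ['mean_tokens']
```

The output dimension (1536) matches `hidden_size` in `config.json`, since mean pooling averages token vectors without changing their width.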
README.md
CHANGED
@@ -1,3 +1,132 @@
 ---
+language: pt
 license: mit
+library_name: sentence-transformers
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+- transformers
+
 ---
+
+# Serafim 900m Portuguese (PT) Sentence Encoder
+
+This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 1536 dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+<!--- Describe your model here -->
+
+## Usage (Sentence-Transformers)
+
+Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
+
+```
+pip install -U sentence-transformers
+```
+
+Then you can use the model like this:
+
+```python
+from sentence_transformers import SentenceTransformer
+sentences = ["This is an example sentence", "Each sentence is converted"]
+
+model = SentenceTransformer('{MODEL_NAME}')
+embeddings = model.encode(sentences)
+print(embeddings)
+```
+
+
+
+## Usage (HuggingFace Transformers)
+Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
+
+```python
+from transformers import AutoTokenizer, AutoModel
+import torch
+
+
+#Mean Pooling - Take attention mask into account for correct averaging
+def mean_pooling(model_output, attention_mask):
+    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
+    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+# Sentences we want sentence embeddings for
+sentences = ['This is an example sentence', 'Each sentence is converted']
+
+# Load model from HuggingFace Hub
+tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+# Tokenize sentences
+encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+# Compute token embeddings
+with torch.no_grad():
+    model_output = model(**encoded_input)
+
+# Perform pooling. In this case, mean pooling.
+sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+print("Sentence embeddings:")
+print(sentence_embeddings)
+```
+
+
+
+## Evaluation Results
+
+<!--- Describe how your model was evaluated -->
+
+For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+## Training
+The model was trained with the parameters:
+
+**DataLoader**:
+
+`torch.utils.data.dataloader.DataLoader` of length 1183 with parameters:
+```
+{'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+```
+
+**Loss**:
+
+`sentence_transformers.losses.CoSENTLoss.CoSENTLoss` with parameters:
+```
+{'scale': 20.0, 'similarity_fct': 'pairwise_cos_sim'}
+```
+
+Parameters of the fit()-Method:
+```
+{
+    "epochs": 10,
+    "evaluation_steps": 119,
+    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
+    "max_grad_norm": 1,
+    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+    "optimizer_params": {
+        "lr": 1e-06
+    },
+    "scheduler": "WarmupLinear",
+    "steps_per_epoch": 1183,
+    "warmup_steps": 1183,
+    "weight_decay": 0.01
+}
+```
+
+
+## Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DebertaV2Model
+  (1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+)
+```
+
+## Citing & Authors
+
+<!--- Describe where people can find more information -->
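The README diff names `CoSENTLoss` with `scale` 20.0 and pairwise cosine similarity as the training objective. The following is a rough pure-Python sketch of the CoSENT ranking objective as it is commonly defined, not the library's code: the function name `cosent_loss` and the toy inputs are assumptions for illustration, and the actual sentence-transformers implementation is vectorized in PyTorch and may differ in detail.

```python
import math


def cosent_loss(cos_scores, gold_labels, scale=20.0):
    """CoSENT-style loss: log(1 + sum exp(scale * (cos_i - cos_j)))
    over all pairs (i, j) whose gold labels satisfy label_i < label_j.

    Illustrative sketch only; names and structure are assumptions.
    """
    # The leading 0.0 contributes exp(0) = 1, i.e. the "1 +" inside the log.
    terms = [0.0]
    for cos_i, label_i in zip(cos_scores, gold_labels):
        for cos_j, label_j in zip(cos_scores, gold_labels):
            if label_i < label_j:
                terms.append(scale * (cos_i - cos_j))
    # Numerically stable log-sum-exp.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Predicted cosine similarities agree with the gold ranking: loss near zero.
well_ordered = cosent_loss([0.9, 0.1], [1.0, 0.0])
# Predicted similarities contradict the gold ranking: large loss.
mis_ordered = cosent_loss([0.1, 0.9], [1.0, 0.0])
print(well_ordered, mis_ordered)
```

The loss only depends on the relative order of the predicted similarities, which is one reason the evaluation files in this commit report rank correlations (Spearman) alongside Pearson.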
config.json
ADDED
@@ -0,0 +1,38 @@
+{
+  "_name_or_path": "models/albertina-900m-ptpt-europarl-eubookshop-ted2020-tatoeba-ct1-nli-gist10-v1",
+  "architectures": [
+    "DebertaV2Model"
+  ],
+  "attention_head_size": 64,
+  "attention_probs_dropout_prob": 0.1,
+  "conv_act": "gelu",
+  "conv_kernel_size": 3,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1536,
+  "initializer_range": 0.02,
+  "intermediate_size": 6144,
+  "layer_norm_eps": 1e-07,
+  "max_position_embeddings": 512,
+  "max_relative_positions": -1,
+  "model_type": "deberta-v2",
+  "norm_rel_ebd": "layer_norm",
+  "num_attention_heads": 24,
+  "num_hidden_layers": 24,
+  "pad_token_id": 0,
+  "pooler_dropout": 0,
+  "pooler_hidden_act": "gelu",
+  "pooler_hidden_size": 1536,
+  "pos_att_type": [
+    "p2c",
+    "c2p"
+  ],
+  "position_biased_input": false,
+  "position_buckets": 256,
+  "relative_attention": true,
+  "share_att_key": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.39.3",
+  "type_vocab_size": 0,
+  "vocab_size": 128100
+}
config_sentence_transformers.json
ADDED
@@ -0,0 +1,9 @@
+{
+  "__version__": {
+    "sentence_transformers": "2.6.1",
+    "transformers": "4.39.3",
+    "pytorch": "2.2.2+cu121"
+  },
+  "prompts": {},
+  "default_prompt_name": null
+}
eval/similarity_evaluation_assin-ptbr-test_results.csv
ADDED
@@ -0,0 +1,2 @@
+epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+-1,-1,0.8075227156584123,0.7883413099648385,0.8180576858475002,0.7927229847323022,0.8174669419480555,0.7923730715175294,0.7051105842918381,0.689291436431814
eval/similarity_evaluation_assin-ptpt-test_results.csv
ADDED
@@ -0,0 +1,2 @@
+epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+-1,-1,0.8398807243271513,0.830669533954273,0.8471996412591254,0.8330789353909063,0.8469094638091792,0.8326100102419636,0.7494520951197198,0.7411218421717549
eval/similarity_evaluation_assin2-test_results.csv
ADDED
@@ -0,0 +1,2 @@
+epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+-1,-1,0.8548813206256356,0.8266509159141969,0.8373929869094534,0.8254115917892378,0.8374704310841169,0.8252176122026239,0.770906756889177,0.7278639880863395
eval/similarity_evaluation_iris-sts-test_results.csv
ADDED
@@ -0,0 +1,2 @@
+epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+-1,-1,0.8279941050190001,0.823119444971123,0.8041795163558138,0.8097763184725875,0.8051128154098245,0.8105488384195639,0.8057909117487693,0.8149271728899469
eval/similarity_evaluation_stsb-multi-mt-pt-test_results.csv
ADDED
@@ -0,0 +1,2 @@
+epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+-1,-1,0.8474849678461291,0.8570183131973625,0.8452275645972888,0.8568781860925376,0.8452748867543359,0.8571309389539689,0.7492848737757264,0.7435566736867906
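Each test-set result file above holds one header row and one data row, with `epoch` and `steps` both set to -1 (which, as far as I know, is how the sentence-transformers evaluators mark a run outside the training loop). A small sketch of reading such a file with the standard library, using the ASSIN2 row from this commit as inline data; the helper name `read_scores` is hypothetical:

```python
import csv
import io

# Contents of eval/similarity_evaluation_assin2-test_results.csv from this commit.
ASSIN2_CSV = (
    "epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,"
    "euclidean_spearman,manhattan_pearson,manhattan_spearman,"
    "dot_pearson,dot_spearman\n"
    "-1,-1,0.8548813206256356,0.8266509159141969,0.8373929869094534,"
    "0.8254115917892378,0.8374704310841169,0.8252176122026239,"
    "0.770906756889177,0.7278639880863395\n"
)


def read_scores(csv_text):
    """Return the metrics of the last row as floats (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    last = rows[-1]
    return {
        name: float(value)
        for name, value in last.items()
        if name not in ("epoch", "steps")
    }


scores = read_scores(ASSIN2_CSV)
print(f"ASSIN2 cosine Spearman: {scores['cosine_spearman']:.4f}")
```

The same function would work on the other single-row files, and on the validation file it would return the final checkpoint's row.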
eval/similarity_evaluation_validation_results.csv
ADDED
@@ -0,0 +1,101 @@
+epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+0,119,0.8346352333520197,0.8422258392177346,0.8197328311746798,0.8367438852730853,0.8196068984821245,0.8366298806343659,0.7999211730641367,0.8053265672590242
+0,238,0.8415745890153112,0.8496095291374381,0.8274763650212377,0.8454146873497612,0.8274548823651928,0.8453744878235535,0.7969709739096635,0.8028039583453915
+0,357,0.8471888539359285,0.8555908909768408,0.8352050721932037,0.8528823194369696,0.835280067861715,0.8530515634727752,0.780989840038815,0.7879134941576926
+0,476,0.8507833456286397,0.8585549655097531,0.840200684023137,0.8575144771585445,0.8403271432910826,0.8576170544545164,0.762348014249639,0.7703305106263088
+0,595,0.853728637707974,0.8611992964914437,0.8420733708416612,0.8600249158181528,0.842197972954118,0.8601328503801203,0.7594614594636205,0.7688144711990734
+0,714,0.8581341835739743,0.8649379219352153,0.8461227843405976,0.8635347248304177,0.8462742004065475,0.8636965138901219,0.7592863294609045,0.7694674305360168
+0,833,0.8617022624175867,0.8675011960829289,0.8490680130964816,0.8656953192577258,0.8492312160377112,0.8658860896842852,0.7636658012413412,0.7730869159500267
+0,952,0.8643163567507588,0.8706952625463907,0.8519765861516742,0.8689336743383054,0.8521915507762666,0.8691745592674796,0.7592215659881358,0.769186321249198
+0,1071,0.8611884373197293,0.8689202700885262,0.8487978530766281,0.8669072933524347,0.8489373361025635,0.8671607146801986,0.7582760453988152,0.7713158768684139
+0,-1,0.8625732361590839,0.8719521910950583,0.8505907673710995,0.8687591440294259,0.8507990574257344,0.8690258019410102,0.7733986406344333,0.7851179681140572
+1,119,0.8637581854219967,0.8727700044319258,0.8516982658195925,0.8684338468122845,0.8518921945540673,0.8686887630550413,0.7803127775757297,0.7901394351735449
+1,238,0.8682053463581425,0.8765962894990316,0.8558395863225453,0.8718997524767961,0.8560405579464444,0.872161713939156,0.7903410027509883,0.7983628762795825
+1,357,0.8708807373918951,0.8784204184626634,0.8591423211215432,0.8746097281513557,0.8594010444976521,0.8749954074656185,0.7839380734149561,0.7914107028700469
+1,476,0.8741631123660174,0.8793750571771543,0.861003977756639,0.8770255475070269,0.8612162538427957,0.8772978152802219,0.7791966451332369,0.7878095913800425
+1,595,0.8756930592886156,0.8803242602839599,0.8618262658846538,0.877805489760628,0.8620093635714191,0.8780396434712808,0.7812351760461747,0.7893244160597023
+1,714,0.8758144408693579,0.8808607952554944,0.8605710330861767,0.8772149578087245,0.8607473511262962,0.8774595099086204,0.7860915777429817,0.795473981985746
+1,833,0.8776072052673576,0.881278524199399,0.8624933340459493,0.8787030385135469,0.8626351249862101,0.8789840724569133,0.7766840101481022,0.7871081705935022
+1,952,0.8769741665043229,0.8824471439922139,0.8619313234505173,0.8776593891366289,0.8621291750253715,0.8779535378381056,0.7922799721222221,0.8009354162292346
+1,1071,0.8775980441756146,0.8827541900015038,0.8625238402655603,0.8797365984212887,0.862664721843668,0.8800151030124124,0.7770746783577623,0.7889645158344328
+1,-1,0.877099109875496,0.8833609213732116,0.8609816375695695,0.878285892053751,0.8610893950283386,0.8784517206813098,0.7801227547503152,0.7937268453513722
+2,119,0.8797123417342431,0.883311542932101,0.8625163465488785,0.8793169305755694,0.8625877386285254,0.8795218643704542,0.7769360986154548,0.7894635798554434
+2,238,0.8774215776264491,0.8839346833608172,0.8620022096288248,0.8797857728271842,0.8621243517355406,0.8800591535743469,0.7793757375983782,0.7921988557357168
+2,357,0.8774541296526093,0.8836527851793334,0.8596102951576453,0.877332053722741,0.8597705124184118,0.8775777323366196,0.791813183311673,0.8034857727347607
+2,476,0.8754756764039029,0.884149068654327,0.8609488577232387,0.8797482487391428,0.8610905041763258,0.8800249949037978,0.7760560198605742,0.7910944848244663
+2,595,0.8777657495486987,0.8851085015033453,0.8621205431833266,0.8809319597588805,0.8622656419255181,0.8811721520501965,0.7777457906056856,0.7914322148177159
+2,714,0.8785240118960488,0.8846726918792049,0.8600507271518938,0.8791553550801946,0.8601082907322292,0.879325222267365,0.7875005659126604,0.7997218082510967
+2,833,0.8791941692079219,0.8856305505816707,0.8618978820190925,0.8806042705185254,0.8619686380064013,0.8808193890646552,0.7844849303036342,0.7967633009057592
+2,952,0.8803283497877149,0.8857125268640175,0.8627878619812928,0.8813356706792251,0.8628858761188877,0.8815478540953914,0.7831892307080908,0.7951059592833541
+2,1071,0.8795404929338702,0.8844312344596341,0.8599202912664154,0.8789285212854102,0.8598858754843276,0.8790520758238188,0.7922701916057839,0.8042356729678226
+2,-1,0.880508979314413,0.885095687429474,0.8616143053434878,0.8809830411647824,0.8616187501402122,0.8811695572916177,0.7874118084250354,0.7997450242457712
+3,119,0.8816533382802142,0.8851243056867188,0.8628425007521099,0.880829025420507,0.8628125648500715,0.8809692507712359,0.7869744646248878,0.7979916755416128
+3,238,0.8807309692032356,0.8850163773393138,0.8624790157595976,0.8799381539498488,0.8624312380559872,0.8800304126407217,0.7945512813551645,0.8046794677197014
+3,357,0.8826232640116277,0.8857770158049765,0.8639310978192669,0.8817881834140852,0.86387934916352,0.8818851073711856,0.7906760859386978,0.8015165708711791
+3,476,0.8809344351088982,0.8857802143851341,0.8597534631974171,0.879375764731323,0.8596971458337169,0.8794561503894729,0.8028357247959041,0.8144060060713446
+3,595,0.8820522308796299,0.8859536709804566,0.8620460165163851,0.8823521967598943,0.8620117777847519,0.8824593546641254,0.7994128547215231,0.8093649925971239
+3,714,0.8826480264525399,0.8865490022745225,0.863193631181805,0.8829999412689435,0.8631559253089283,0.883053446050108,0.7961183141584264,0.8057997663159502
+3,833,0.882009894592221,0.8864467897201691,0.8610185339633059,0.8816839405077725,0.8609406967187891,0.8817475524124645,0.800585953549725,0.8119971965609749
+3,952,0.8823437458230885,0.8873805139753449,0.8614310414878082,0.8815176517993175,0.8613949983418825,0.8816575030505889,0.7936419123889158,0.8076496676030118
+3,1071,0.8826260168164414,0.8876498625323778,0.8622013126142408,0.8815402460589877,0.8622150789268221,0.8817652464424495,0.7940618231781008,0.8071859811738558
+3,-1,0.8836393190295158,0.888473308146647,0.8633960055371852,0.8827903189384879,0.8634147656281557,0.8829850701103715,0.7890017808734735,0.8029192274151917
+4,119,0.8823269095622938,0.8866172371904456,0.859286368297807,0.8789092916130744,0.8592201961602011,0.8790749602279632,0.7991227979721551,0.8119059356362098
+4,238,0.8826294305368632,0.887586021684644,0.8607354869574075,0.880994037596667,0.8606673790052873,0.8811584876478844,0.7981880899663201,0.8115542102434757
+4,357,0.8832933726454343,0.8882327497282719,0.8622748213094523,0.8822333902142013,0.8621851085676328,0.882355549696812,0.7988963332996472,0.8124587315554166
+4,476,0.881840606772165,0.8871740242898601,0.8597749785578143,0.8801527492219774,0.8596811583195798,0.8801590286792192,0.7969672061105048,0.8117635173621588
+4,595,0.8825416511317475,0.8880577247333887,0.8607539589011415,0.8814134240194618,0.8607086725511335,0.8814996409846403,0.7961649450250078,0.8103670019003154
+4,714,0.8827719191295197,0.8881860868030793,0.8611783953057693,0.8825743918676177,0.8611376073518966,0.8827259527265117,0.7927346101697337,0.8078758159577673
+4,833,0.8832817154391954,0.8873959595747313,0.859775482425091,0.879869875494685,0.8596985017910792,0.8799332015315074,0.8015401529412554,0.8149507340611569
+4,952,0.8839285784973908,0.8882458968476794,0.8617608864496966,0.8819075612947938,0.8616972691840576,0.8820104514776882,0.7993808818218207,0.812585045222275
+4,1071,0.8842443693402884,0.8884481320355871,0.861411652254568,0.8819965111408936,0.8613699357899787,0.8821318238594213,0.8010105017393705,0.8142833242064953
+4,-1,0.8835761328285844,0.8875401487658368,0.8607692480192544,0.8816097204116775,0.8606935887981578,0.8817094983258242,0.8029179148892821,0.8163325954535507
+5,119,0.8819248884341266,0.8868404163950547,0.8570311377998923,0.8787894475696096,0.8569700814480326,0.8789334641619787,0.809271447266343,0.8233360073648849
+5,238,0.8828858577177023,0.8873538955526303,0.8585903109915216,0.8803086938842148,0.8585284373368823,0.8803969069408362,0.8057578109521051,0.8196105353498617
+5,357,0.8832909519288645,0.8872992374274631,0.8583917444932727,0.8797038148166145,0.8583140564407189,0.8798148754120191,0.8055051212877938,0.8196555967170814
+5,476,0.8826094514670383,0.8867466022826724,0.8573749245221443,0.8788982614268408,0.8572875550370438,0.878961852484204,0.8067103466288865,0.820547481298486
+5,595,0.8824653848000071,0.8868049693628222,0.8576239395487223,0.8789344326737744,0.8575386219495266,0.8789623101275904,0.80491169635693,0.8192953636418171
+5,714,0.883827984557394,0.887059438655981,0.8593328578430289,0.8802883691732354,0.8592383748304541,0.8803542548868963,0.8016058901768601,0.815977015849128
+5,833,0.8844118179914054,0.8877861807360763,0.8603227262765712,0.8803764558810377,0.8602683815895195,0.880439845012194,0.8035038719008838,0.8167024769486264
+5,952,0.8841082371446042,0.8878693971567,0.8597206326772422,0.8795947385235381,0.8597097120275098,0.8797125141547529,0.806073907418141,0.8188202620523802
+5,1071,0.883779381056941,0.8879402321779829,0.8594641306620686,0.879248078430363,0.8594511971950967,0.8793885937042077,0.8113494252090672,0.8235114341956302
+5,-1,0.8847414960923969,0.8883022169509511,0.8606873758387314,0.8820278066203547,0.8606306678543,0.8820793670697799,0.8017729510049806,0.8160117533648213
+6,119,0.8838288422287383,0.8873597927326926,0.8583880856333336,0.8789969305127278,0.8583360432437586,0.8791727559910513,0.809000328260038,0.821975775738682
+6,238,0.8836114667450747,0.88790223343508,0.8589272216695568,0.8798240396096403,0.8588412675704384,0.8799039937040818,0.8080929006253283,0.821721847487454
+6,357,0.883778420023108,0.8882032634209196,0.8596298895772982,0.880196829766678,0.8595364536914696,0.8802463482974427,0.8073354999592192,0.8207415217726155
+6,476,0.8841043256772667,0.888041408229941,0.859810104100427,0.8801869754264525,0.8596809935676518,0.8801790758462332,0.8049252203380058,0.818654908198364
+6,595,0.8838454589666223,0.8872387446887711,0.8584521556400968,0.8792383996269925,0.8582673995747374,0.8791877355608364,0.8058480323319779,0.819487011995362
+6,714,0.8833172986319796,0.8876791168185859,0.8581522916922523,0.8797672422330824,0.8580179727965334,0.8797845301675117,0.8036633399921781,0.818298359326293
+6,833,0.8834612655017161,0.8882526297591219,0.8592653110475658,0.8802309563347771,0.8591876346502277,0.8802985138290014,0.8027752682307142,0.816891303715587
+6,952,0.8835073686420667,0.8876633606655318,0.8577806834391506,0.8791748627714069,0.8577173082663588,0.8792212806970685,0.8079631141408513,0.8217472929064212
+6,1071,0.8842506598446916,0.8883416521022558,0.8590551448577368,0.8807189197133856,0.8589967741515393,0.8807289235962532,0.8063848226510055,0.8202184974150694
+6,-1,0.8844600757143403,0.8881849263171692,0.8587697220019439,0.8806486650130695,0.858697821712032,0.8807337831296924,0.8054941978241436,0.8199172072087788
+7,119,0.8843233564651158,0.8878668561914071,0.8583754601948382,0.8794236243971868,0.8582777784974844,0.8794019523685322,0.8067345312512814,0.8208734165053734
+7,238,0.8842069123767062,0.8876958700011693,0.8576908275191834,0.8786038773476983,0.8576162374293321,0.8785831681578846,0.8112058763968272,0.8248139182215718
+7,357,0.8843568838718368,0.8874532187170571,0.8576078949060202,0.8789180326460854,0.857505217930445,0.8789497578711735,0.809849416186635,0.8235332397101185
+7,476,0.8840960442473929,0.88758882230927,0.8577984557372981,0.8789836071140122,0.8577025219199634,0.8790212363859548,0.8109781202622401,0.8244100257497375
+7,595,0.8839522818886445,0.8878792664378807,0.8577508869451539,0.8789612189619493,0.8576695096590751,0.8790398088348405,0.8114132404949927,0.8251376104265605
+7,714,0.8845789341200696,0.8878316390152882,0.8577039062047612,0.8795504665148022,0.8576005066352508,0.8795613353697498,0.8101185439459966,0.8243093738847095
+7,833,0.8844863367228244,0.8877441555281138,0.857209933409016,0.8786124252989105,0.857133630519639,0.8787347231438426,0.81327057733651,0.8268425028082523
+7,952,0.884990469341495,0.8883032929352225,0.8585746339320629,0.8801830536048518,0.8584889202814387,0.8801856656920507,0.8103449022001461,0.8239891033797256
+7,1071,0.8850047552231747,0.8882237708987581,0.8584780288032094,0.8797523045928094,0.8584021942420996,0.8797210860496824,0.8128707366250155,0.8260361309332872
+7,-1,0.8850130416456771,0.8882085392118093,0.8583691365371056,0.8795345430182983,0.8582834204985644,0.879553317951403,0.8147244503581379,0.8276986779576448
+8,119,0.8855986538452559,0.8890111343134921,0.859481644376079,0.8807434561263918,0.8594233704068954,0.8807969959149904,0.8109033875801263,0.8243871522369683
+8,238,0.8848267279333828,0.8884737905882877,0.8581581033252191,0.8793356550120777,0.858088377724103,0.8793829772364761,0.813149448497148,0.8267203223021778
+8,357,0.884131504672114,0.8881623430585593,0.8571323348237995,0.8786245273869394,0.857058528214968,0.8786646439724858,0.8137684876575587,0.8277266315676831
+8,476,0.8844227251532278,0.8881091748831389,0.8570469050793121,0.8788083094401644,0.8569683400073297,0.8788021063771873,0.8134265280692008,0.8275774171869791
+8,595,0.8849894433902971,0.8884472111574989,0.8579427362418912,0.8796785160873225,0.8578700389859438,0.8797504118833454,0.8125462042885104,0.8266702584294826
+8,714,0.885436741163783,0.8885093734904793,0.8584871714459472,0.8798006293770824,0.8584068641570854,0.8798711497148297,0.8127096678528773,0.8265193950775724
+8,833,0.8849984170629683,0.8882580609531054,0.8577326127814072,0.8794075516493508,0.8576477441659175,0.8794500064166123,0.8132890048973692,0.8274101791184788
+8,952,0.8845661125214329,0.888062058103838,0.8572723544827137,0.8792775550907799,0.8571972402011276,0.8792914426273238,0.813280453345221,0.8276583986861932
+8,1071,0.8847988095756709,0.8881979938737582,0.8576123917582051,0.8792309837031971,0.8575387320390747,0.8792823921909539,0.8144436921534146,0.828347426711952
+8,-1,0.885035579557929,0.8883907300155452,0.858015997657079,0.8794273474961103,0.857943885268162,0.8794874571401071,0.8133686556708982,0.8272294589572149
+9,119,0.8851013154381319,0.8883719988726287,0.8577841020119,0.8789955702633683,0.8577237377708655,0.8790972244681775,0.8149674818850363,0.8285548551734055
+9,238,0.8851983236326592,0.8885050681662259,0.8579398453081819,0.8792140615845516,0.8578933578605225,0.8793023945506666,0.8153075127117759,0.8288012065142186
+9,357,0.8851902701287351,0.8886245192451891,0.8580723149102902,0.8794392667136579,0.8580226964018515,0.8795110527045215,0.8147058523539643,0.8283904352688051
+9,476,0.8852587044988647,0.8885275318024204,0.8580862697499411,0.8795488907424688,0.8580276149758482,0.8796409828051247,0.8145297144961017,0.8282917085555715
+9,595,0.8853092023594366,0.8885406616358956,0.8582555040090838,0.8799553935316201,0.8581934604223546,0.880051702411222,0.8133260547126236,0.8273292862167865
+9,714,0.885259239382198,0.8885171080665215,0.8581607689075803,0.8799298029678154,0.8581027687689353,0.8799861700994119,0.8135668662502087,0.8275243398979024
+9,833,0.8851742516515474,0.888485349797145,0.8579674650502147,0.8796152756792894,0.8579118189362843,0.8797145710793993,0.8144297643257481,0.8283031321447352
+9,952,0.8851121365078664,0.8884794837703002,0.85783832539016,0.8794322215708568,0.8577845693011349,0.8795155229614361,0.8147876641430668,0.8285994550694681
+9,1071,0.8851464311449804,0.8884921129348036,0.8578738562486644,0.8794391359760069,0.8578212878659055,0.8795307830357721,0.8148702579699264,0.8286361820023991
+9,-1,0.8851496072546251,0.8884768692344156,0.857861213926136,0.8794319947766976,0.8578082485108272,0.8795180610062464,0.8148881308081785,0.8286586758055441
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f805cc49fd4e2c2a9dd448b842929d9fc889b72feb934bde8cec25849750803f
+size 3538419000
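The LFS pointer's `size` field, together with `torch_dtype: float32` from `config.json`, allows a back-of-the-envelope parameter count. This is only an estimate: a safetensors file also carries a small JSON header, so the true count is slightly lower, and the variable names below are illustrative.

```python
# "size" from the Git LFS pointer for model.safetensors, in bytes.
SAFETENSORS_BYTES = 3_538_419_000
# config.json reports torch_dtype "float32", i.e. 4 bytes per weight.
BYTES_PER_FLOAT32 = 4

# Upper bound on the parameter count (ignores the safetensors header).
approx_params = SAFETENSORS_BYTES // BYTES_PER_FLOAT32
print(f"about {approx_params / 1e6:.0f}M parameters")
```

The result, roughly 885 million, is in line with the "900m" in the model name.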
modules.json
ADDED
@@ -0,0 +1,14 @@
+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  }
+]
sentence_bert_config.json
ADDED
@@ -0,0 +1,4 @@
+{
+  "max_seq_length": 128,
+  "do_lower_case": false
+}
special_tokens_map.json
ADDED
@@ -0,0 +1,51 @@
+{
+  "bos_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,65 @@
+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128000": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "[CLS]",
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_lower_case": false,
+  "eos_token": "[SEP]",
+  "mask_token": "[MASK]",
+  "max_length": 128,
+  "model_max_length": 512,
+  "pad_to_multiple_of": null,
+  "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "[SEP]",
+  "sp_model_kwargs": {},
+  "split_by_punct": false,
+  "stride": 0,
+  "tokenizer_class": "DebertaV2Tokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "[UNK]",
+  "vocab_type": "spm"
+}
train-config.yaml
ADDED
@@ -0,0 +1,21 @@
+trainer: "sts"
+model_name: "albertina-900m-ptpt-europarl-eubookshop-ted2020-tatoeba-ct1-nli-gist10-sts-cosent20-v1"
+base_model_name: "albertina-900m-ptpt-europarl-eubookshop-ted2020-tatoeba-ct1-nli-gist10-v1"
+loss_function: "cosent"
+seed: 1
+learning_rate: 1e-6
+warmup_ratio: 0.1
+weight_decay: 0.01
+batch_size: 16
+use_amp: True
+epochs: 10
+validations_per_epoch: 10
+
+# HPs used by JRodrigues to train albertina-100m-portuguese-ptpt-encoder:
+# learning_rate 1e-5
+# lr_scheduler_type linear
+# weight_decay 0.01
+# per_device_train_batch_size 192
+# gradient_accumulation_steps 1
+# num_train_epochs 150
+# num_warmup_steps 10000
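The values in `train-config.yaml` line up with the `fit()` parameters recorded in the README diff (`steps_per_epoch` 1183, `warmup_steps` 1183, `evaluation_steps` 119). The training script itself is not part of this commit, so the formulas below are an assumption that happens to reproduce those numbers, not a description of the author's code:

```python
import math

# From the README diff (DataLoader length) and train-config.yaml.
steps_per_epoch = 1183
epochs = 10
warmup_ratio = 0.1
validations_per_epoch = 10

# Assumed derivation: warm up over a fixed fraction of all training steps.
total_steps = steps_per_epoch * epochs
warmup_steps = round(warmup_ratio * total_steps)

# Assumed derivation: split each epoch into equal evaluation intervals.
evaluation_steps = math.ceil(steps_per_epoch / validations_per_epoch)

# Steps inside one epoch at which an evaluation would fire.
in_epoch_evals = list(range(evaluation_steps, steps_per_epoch + 1, evaluation_steps))

print(warmup_steps, evaluation_steps, in_epoch_evals)
```

Under these assumptions there are nine in-epoch evaluations (steps 119 through 1071) plus one at the end of each epoch (logged as step -1), which matches the 100 data rows of the validation results file over 10 epochs.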