ighina committed on
Commit b1b75b1 · 1 Parent(s): 4a4fa94

Upload 13 files

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+     "word_embedding_dimension": 768,
+     "pooling_mode_cls_token": true,
+     "pooling_mode_mean_tokens": false,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false
+ }
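These flags are what sentence-transformers reads back to rebuild the pooling layer: with only `pooling_mode_cls_token` enabled, the sentence embedding is simply the 768-d hidden state of the first (CLS) token. A minimal sketch of the equivalent constructor call, assuming sentence-transformers is installed:

```python
from sentence_transformers import models

# Mirrors 1_Pooling/config.json: keep only the CLS token's 768-d hidden state.
pooling = models.Pooling(
    word_embedding_dimension=768,
    pooling_mode_cls_token=True,
    pooling_mode_mean_tokens=False,
    pooling_mode_max_tokens=False,
    pooling_mode_mean_sqrt_len_tokens=False,
)
```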
README.md CHANGED
@@ -1,8 +1,122 @@
  ---
- license: cc-by-3.0
- language:
- - en
- library_name: sentence-transformers
  pipeline_tag: sentence-similarity
  ---
- A roberta model further fine-tuned for encoding sentences from the same section in wikipedia articles closer in space. The model was fine-tuned on the training set of wikisection english dataset.
  ---
  pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
  ---
+
+ # {MODEL_NAME}
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ <!--- Describe your model here -->
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('{MODEL_NAME}')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
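A pair of embeddings from `model.encode` can be compared directly with cosine similarity, which is what the `sentence-similarity` pipeline tag implies. A minimal sketch using the `util.cos_sim` helper from sentence-transformers, with `{MODEL_NAME}` left as the placeholder used throughout this card:

```python
from sentence_transformers import SentenceTransformer, util

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('{MODEL_NAME}')  # placeholder model id, as above
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities; scores[0][1] compares the two example sentences.
scores = util.cos_sim(embeddings, embeddings)
print(scores)
```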
+
+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, you pass your input through the transformer model, then you have to apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ def cls_pooling(model_output, attention_mask):
+     return model_output[0][:, 0]
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+ model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, CLS pooling.
+ sentence_embeddings = cls_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
+
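As a quick sanity check that these CLS vectors behave like sentence embeddings, the short continuation below (it reuses `sentence_embeddings` from the snippet above) scores the two example sentences against each other:

```python
import torch.nn.functional as F

# sentence_embeddings has shape (2, 768); compare the two rows.
similarity = F.cosine_similarity(sentence_embeddings[0], sentence_embeddings[1], dim=0)
print(f"cosine similarity: {similarity.item():.4f}")
```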
+ ## Evaluation Results
+
+ <!--- Describe how your model was evaluated -->
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+ ## Training
+ The model was trained with the parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 6708 with parameters:
+ ```
+ {'batch_size': 128, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
+
+ Parameters of the fit()-Method:
+ ```
+ {
+     "epochs": 10,
+     "evaluation_steps": 0,
+     "evaluator": "sentence_transformers.evaluation.BinaryClassificationEvaluator.BinaryClassificationEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'transformers.optimization.AdamW'>",
+     "optimizer_params": {
+         "lr": 2e-05
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": null,
+     "warmup_steps": 10000,
+     "weight_decay": 0.01
+ }
+ ```
+
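Taken together, these settings describe a standard sentence-transformers fine-tuning run: sentence pairs scored with CosineSimilarityLoss and validated with a BinaryClassificationEvaluator. A minimal sketch of how that run would be wired up, assuming labelled pairs (1.0 for sentences from the same section, 0.0 otherwise); the pair construction itself is not part of this commit and the example pairs below are purely illustrative:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Base encoder per config.json, with the CLS pooling defined in 1_Pooling/config.json.
word_embedding_model = models.Transformer('roberta-base', max_seq_length=32)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode='cls')
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Illustrative training pairs; the real ones would come from the WikiSection training split.
train_examples = [
    InputExample(texts=["sentence from section A", "another sentence from section A"], label=1.0),
    InputExample(texts=["sentence from section A", "sentence from section B"], label=0.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=128)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=10,
    warmup_steps=10000,
    optimizer_params={'lr': 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
    scheduler='WarmupLinear',
)
```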
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 32, 'do_lower_case': False}) with Transformer model: RobertaModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+ )
+ ```
+
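One practical consequence of this architecture is the 32-token sequence limit: longer inputs are truncated before pooling. Loading the model (placeholder id again) makes the limit easy to check:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('{MODEL_NAME}')  # placeholder model id
print(model)                 # prints the Transformer + Pooling stack shown above
print(model.max_seq_length)  # 32: longer inputs are truncated to the first 32 tokens
```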
+ ## Citing & Authors
+
+ <!--- Describe where people can find more information -->
config.json ADDED
@@ -0,0 +1,27 @@
+ {
+     "_name_or_path": "roberta-base",
+     "architectures": [
+         "RobertaModel"
+     ],
+     "attention_probs_dropout_prob": 0.1,
+     "bos_token_id": 0,
+     "classifier_dropout": null,
+     "eos_token_id": 2,
+     "hidden_act": "gelu",
+     "hidden_dropout_prob": 0.1,
+     "hidden_size": 768,
+     "initializer_range": 0.02,
+     "intermediate_size": 3072,
+     "layer_norm_eps": 1e-05,
+     "max_position_embeddings": 514,
+     "model_type": "roberta",
+     "num_attention_heads": 12,
+     "num_hidden_layers": 12,
+     "pad_token_id": 1,
+     "position_embedding_type": "absolute",
+     "torch_dtype": "float32",
+     "transformers_version": "4.18.0",
+     "type_vocab_size": 1,
+     "use_cache": true,
+     "vocab_size": 50265
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+     "__version__": {
+         "sentence_transformers": "2.2.0",
+         "transformers": "4.18.0",
+         "pytorch": "1.11.0"
+     }
+ }
eval/binary_classification_evaluation_Valid_Topic_Boundaries_results.csv ADDED
@@ -0,0 +1,11 @@
+ epoch,steps,cossim_accuracy,cossim_accuracy_threshold,cossim_f1,cossim_precision,cossim_recall,cossim_f1_threshold,cossim_ap,manhatten_accuracy,manhatten_accuracy_threshold,manhatten_f1,manhatten_precision,manhatten_recall,manhatten_f1_threshold,manhatten_ap,euclidean_accuracy,euclidean_accuracy_threshold,euclidean_f1,euclidean_precision,euclidean_recall,euclidean_f1_threshold,euclidean_ap,dot_accuracy,dot_accuracy_threshold,dot_f1,dot_precision,dot_recall,dot_f1_threshold,dot_ap
+ 0,-1,0.688875678312298,0.21451625227928162,0.7007406801162173,0.6340813152632748,0.7830620084141211,0.011116713285446167,0.7527617912617275,0.6873208950673739,446.39990234375,0.7296517172539196,0.6258807617634868,0.8746722760807268,463.9516906738281,0.752832196770017,0.6722151088348272,19.273948669433594,0.6973663671030038,0.5497503691723508,0.9533565026522773,25.77716064453125,0.7326285132766597,0.6885860618254984,47.748291015625,0.700788257678717,0.6322526852714699,0.7859886592280958,1.2547117471694946,0.7602589761489239
+ 1,-1,0.6977775745381379,0.20492404699325562,0.7134806284713263,0.6432891740333521,0.8008658008658008,0.01354411244392395,0.7591359016016286,0.6690902993719895,335.69403076171875,0.7013278618713948,0.6080418876283815,0.8284250960307298,386.7208557128906,0.7241603451150835,0.6556307542223035,16.894609451293945,0.6970770151636074,0.5897868748681157,0.8520821901103591,18.752403259277344,0.7152351172613437,0.6981129199439059,23.636016845703125,0.71349794909407,0.6433960415441897,0.8007438570818852,1.8228182792663574,0.7716774359749061
+ 2,-1,0.6902018169623804,0.179021954536438,0.7095585781598304,0.6311053068977799,0.8102859581732821,-0.048351287841796875,0.7566907740053774,0.6613621120663374,340.5686340332031,0.7034177615242224,0.5584829910026168,0.9499420767026401,464.56915283203125,0.7183696833593858,0.6543198585452107,16.773998260498047,0.693946160214757,0.5895643277346747,0.8432412657764771,19.22742462158203,0.710098062689827,0.6899731723675385,21.468889236450195,0.7100117533924565,0.6318041359638698,0.8103164441192611,-6.466011047363281,0.7660399369870547
+ 3,-1,0.6868940918236693,0.13593482971191406,0.7039844614098033,0.6313141087671763,0.7955612462654716,-0.0446164608001709,0.7513436926039477,0.6648070239619536,364.3100280761719,0.7190925154769113,0.604612987012987,0.8870495701481617,441.24981689453125,0.7237040102988117,0.6609200658496434,17.429285049438477,0.6928780512965995,0.6003977920307506,0.8190354246692275,20.231002807617188,0.7127326391214277,0.6855069812816291,20.047109603881836,0.7042868820604907,0.6283446115587863,0.8011096884336321,-9.032946586608887,0.7599840429875714
+ 4,-1,0.6790287177611122,0.12285339832305908,0.6975741095468367,0.6221595660796636,0.7937930613986952,-0.08657693862915039,0.7471036028656264,0.6705536247789768,422.10455322265625,0.7195281627838396,0.5868239278086699,0.9297908664105847,477.7283935546875,0.7297992216447178,0.6612249253094323,18.67955780029297,0.6909119826112311,0.611151492414828,0.7946161819401256,21.062088012695312,0.7158698879917538,0.6772757758673251,22.645580291748047,0.6985589225589225,0.6256996718780158,0.7906225230168892,-14.085229873657227,0.7481621125333402
+ 5,-1,0.6718645204560697,0.1337960958480835,0.6940711254745858,0.6098298589468338,0.8053167489787209,-0.13094359636306763,0.7394516922830743,0.6854307664166819,426.9678955078125,0.7215379685998078,0.6224487539526348,0.858148893360161,468.9588623046875,0.7344527601862045,0.6613468690933479,20.207061767578125,0.6924590016522981,0.548437639099083,0.9390585939881715,26.321395874023438,0.7174525809361008,0.6701420645082616,16.273883819580078,0.6948475724014939,0.6030642230646717,0.8195841716968477,-35.22602081298828,0.7388797703282758
+ 6,-1,0.6762544966770319,0.14582541584968567,0.6955985407207159,0.6106682027649769,0.8079690262788854,-0.1332850456237793,0.7428792113476992,0.6900341442594964,438.0541076660156,0.720180161524125,0.6257478296073051,0.8481799890250594,478.80511474609375,0.7372861116264031,0.6660874336930674,20.88856315612793,0.7004425809457255,0.5667382864035584,0.916712395585635,25.949804306030273,0.7210590113180488,0.6736174623498568,22.276962280273438,0.6967376917287378,0.6065521915951197,0.8184257057496495,-35.493534088134766,0.7411595217234634
+ 7,-1,0.6667886104505822,0.09249621629714966,0.6922657589980189,0.613855549529581,0.7936406316688007,-0.11688518524169922,0.7342520760074628,0.6866959331748064,441.6366882324219,0.7141725421849006,0.6298910005589715,0.8244924089994512,483.64276123046875,0.7366585594845865,0.6607219072007804,21.191844940185547,0.7012119567141457,0.5720392013400851,0.9057374550332297,26.606861114501953,0.7193225608617343,0.6650204255838059,22.81573486328125,0.6929348906890106,0.6130909090909091,0.796689226266691,-30.088247299194336,0.7300973533703405
+ 8,-1,0.6664837509907933,0.10211536288261414,0.6907975298563472,0.6079962915749217,0.7997073349186026,-0.14499104022979736,0.7307157363107932,0.6851563929028718,431.011474609375,0.7106677910715228,0.6258296588535623,0.8221145052130968,485.1802062988281,0.7370754199630012,0.659426254496677,21.681676864624023,0.6997913901005121,0.572470232323624,0.899945125297238,26.773082733154297,0.7186558847939153,0.6642887628803121,18.20148468017578,0.6913789285853837,0.6035474701534963,0.8091274922260838,-42.23976135253906,0.726631100097931
+ 9,-1,0.6630998109871349,0.12585166096687317,0.6890432923570283,0.6133498263475903,0.7860496311200537,-0.12174004316329956,0.7282460206472967,0.6822754710078653,437.261962890625,0.7058481156659656,0.6234373174436266,0.8133650387171514,488.2479248046875,0.7359740092524213,0.6575208828729956,21.085512161254883,0.6997576090460782,0.575146721102322,0.893299189073837,26.86428451538086,0.717956315362813,0.6615602707152003,20.969688415527344,0.6900005285132921,0.6089035026351383,0.7960185354551552,-36.937259674072266,0.7237710727313599
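This CSV is the per-epoch validation log from the BinaryClassificationEvaluator named in the README (one row per epoch, `steps = -1` meaning end of epoch). A small sketch for picking the strongest checkpoint by cosine-similarity average precision, assuming the file has been downloaded locally and pandas is available:

```python
import pandas as pd

# Path as committed in this repository; adjust if stored elsewhere.
df = pd.read_csv("eval/binary_classification_evaluation_Valid_Topic_Boundaries_results.csv")

# Rank epochs by average precision of the cosine-similarity score.
best = df.sort_values("cossim_ap", ascending=False).iloc[0]
print(f"best epoch: {int(best['epoch'])}, cossim_ap: {best['cossim_ap']:.4f}")
```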
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+     {
+         "idx": 0,
+         "name": "0",
+         "path": "",
+         "type": "sentence_transformers.models.Transformer"
+     },
+     {
+         "idx": 1,
+         "name": "1",
+         "path": "1_Pooling",
+         "type": "sentence_transformers.models.Pooling"
+     }
+ ]
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:85a51b49cf59abc469ae56a0129609737eefd48008c71e35962e2400551d0a28
+ size 498652017
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+     "max_seq_length": 32,
+     "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"errors": "replace", "bos_token": "<s>", "eos_token": "</s>", "sep_token": "</s>", "cls_token": "<s>", "unk_token": "<unk>", "pad_token": "<pad>", "mask_token": "<mask>", "add_prefix_space": false, "trim_offsets": true, "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "roberta-base", "tokenizer_class": "RobertaTokenizer"}
vocab.json ADDED
The diff for this file is too large to render. See raw diff