d1mitriz committed
Commit 10df9cf · 1 Parent(s): 22a6274

Fixed model card, try 5

Files changed (1):
  1. README.md (+39 -24)
README.md CHANGED
@@ -14,20 +14,26 @@ metrics:
model-index:
- name: st-greek-media-bert-base-uncased
  results: [
-  task:
-    name: STS Benchmark
-    type: sentence-similarity
-  metrics:
-    accuracy_cosinus: 0.9563965089445283
-    accuracy_euclidean: 0.9566394253292384
-    accuracy_manhattan: 0.9565353183072198
-  dataset:
-    name: all_custom_greek_media_triplets
-    type: sentence-pair
+  {
+    "task": {
+      "name": "STS Benchmark",
+      "type": "sentence-similarity"
+    },
+    "metrics": {
+      "accuracy_cosinus": 0.9563965089445283,
+      "accuracy_euclidean": 0.9566394253292384,
+      "accuracy_manhattan": 0.9565353183072198
+    },
+    "dataset": {
+      "name": "all_custom_greek_media_triplets",
+      "type": "sentence-pair"
+    },
+  }
  ]
---

sentence_transformers.losses.TripletLoss.TripletLoss` with parameters:
+
```
{'distance_metric': 'TripletDistanceMetric.EUCLIDEAN', 'triplet_margin': 5}

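The `accuracy_cosinus` / `accuracy_euclidean` / `accuracy_manhattan` values added in this hunk look like the scores reported by sentence-transformers' `TripletEvaluator` (the share of triplets where the anchor embedding lands closer to the positive than to the negative under each distance). A minimal sketch, assuming that evaluator; the triplet sentences below are placeholders, not the actual all_custom_greek_media_triplets data:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

# Placeholder triplets standing in for the all_custom_greek_media_triplets set.
anchors = ["Η ομάδα κέρδισε τον αγώνα."]
positives = ["Η ομάδα πήρε τη νίκη στο ματς."]
negatives = ["Οι τιμές των καυσίμων αυξήθηκαν."]

model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")
evaluator = TripletEvaluator(anchors, positives, negatives, name="all_custom_greek_media_triplets")
print(evaluator(model))  # accuracy: share of triplets ranked correctly
```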
@@ -43,7 +49,9 @@ This is a [sentence-transformers](https://www.SBERT.net) based on the [Greek Med
Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:

```
+
pip install -U sentence-transformers
+
```

Then you can use the model like this:
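For context, the "use the model like this" snippet that the next hunk's context lines (`embeddings = model.encode(sentences)`) belong to is the standard sentence-transformers pattern; a complete version with placeholder sentences:

```python
from sentence_transformers import SentenceTransformer

sentences = ["Αυτή είναι μια πρόταση.", "Αυτή είναι μια άλλη πρόταση."]

model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")
embeddings = model.encode(sentences)
print(embeddings)
```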
@@ -57,19 +65,20 @@ embeddings = model.encode(sentences)
print(embeddings)
```

-
-
## Usage (HuggingFace Transformers)
-Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
+
+Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input
+through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word
+embeddings.

```python
from transformers import AutoTokenizer, AutoModel
import torch


-#Mean Pooling - Take attention mask into account for correct averaging
+# Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
-    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
+    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

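The hunk only touches the edges of the HuggingFace Transformers example, so the middle of the snippet (tokenisation, forward pass, pooling call) is not visible. For readability, a complete version following the standard mean-pooling template; the example sentences are placeholders:

```python
import torch
from transformers import AutoTokenizer, AutoModel


# Mean pooling: average token embeddings, weighting by the attention mask.
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element holds all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


sentences = ["Αυτή είναι μια πρόταση.", "Αυτή είναι μια άλλη πρόταση."]

tokenizer = AutoTokenizer.from_pretrained("dimitriz/st-greek-media-bert-base-uncased")
model = AutoModel.from_pretrained("dimitriz/st-greek-media-bert-base-uncased")

encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    model_output = model(**encoded_input)

sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
print("Sentence embeddings:")
print(sentence_embeddings)
```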
@@ -95,20 +104,23 @@ print("Sentence embeddings:")
print(sentence_embeddings)
```

-
-
## Evaluation Results

<!--- Describe how your model was evaluated -->

-For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=dimitriz/st-greek-media-bert-base-uncased)
-
+For an automated evaluation of this model, see the *Sentence Embeddings
+Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=dimitriz/st-greek-media-bert-base-uncased)

## Training
-The model was trained on a custom dataset containing triplets from the **combined** Greek 'internet', 'social-media' and 'press' domains, described in the paper [DACL](https://...).
-- The dataset was created by sampling triplets of sentences from the same domain, where the first two sentences are more similar than the third one.
-- Training objective was to maximize the similarity between the first two sentences and minimize the similarity between the first and the third sentence.
-- The model was trained for 3 epochs with a batch size of 16 and a maximum sequence length of 512 tokens.
+
+The model was trained on a custom dataset containing triplets from the **combined** Greek 'internet', 'social-media'
+and 'press' domains, described in the paper [DACL](https://...).
+
+- The dataset was created by sampling triplets of sentences from the same domain, where the first two sentences are more
+  similar than the third one.
+- Training objective was to maximize the similarity between the first two sentences and minimize the similarity between
+  the first and the third sentence.
+- The model was trained for 3 epochs with a batch size of 16 and a maximum sequence length of 512 tokens.
- The model was trained on a single NVIDIA RTX A6000 GPU with 48GB of memory.

The model was trained with the parameters:
@@ -116,6 +128,7 @@ The model was trained with the parameters:
**DataLoader**:

`torch.utils.data.dataloader.DataLoader` of length 10807 with parameters:
+
```
{'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
```
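A DataLoader of length 10807 at batch size 16 corresponds to roughly 173k training triplets. A sketch of how such a loader is typically assembled for sentence-transformers training; the triplet strings are invented placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample

# Placeholder triplets: (anchor, positive, negative) drawn from the same domain,
# with the anchor and positive more similar to each other than to the negative.
triplets = [
    ("πρόταση-άγκυρα", "παρόμοια πρόταση", "λιγότερο σχετική πρόταση"),
]
train_examples = [InputExample(texts=[a, p, n]) for a, p, n in triplets]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)  # shuffle=True -> RandomSampler
```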
@@ -123,11 +136,13 @@ The model was trained with the parameters:
**Loss**:

`sentence_transformers.losses.TripletLoss.TripletLoss` with parameters:
+
```
{'distance_metric': 'TripletDistanceMetric.EUCLIDEAN', 'triplet_margin': 5}
```

Parameters of the fit()-Method:
+
```
{
  "epochs": 3,
@@ -145,8 +160,8 @@ Parameters of the fit()-Method:
}
```

-
## Full Model Architecture
+
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
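The architecture listing is cut off by the hunk, so only module (0) is visible here. The full module stack (including the pooling layer) can be inspected directly:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")
print(model)                 # prints every module, including the pooling layer not shown above
print(model.max_seq_length)  # 512, matching the Transformer config in the listing
```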
 