d1mitriz committed
Commit · 10df9cf · 1 Parent(s): 22a6274
Fixed model card, try 5
README.md CHANGED
@@ -14,20 +14,26 @@ metrics:
 model-index:
 - name: st-greek-media-bert-base-uncased
   results: [
-
-
-
-
-
-
-
-
-
-
+  {
+    "task": {
+      "name": "STS Benchmark",
+      "type": "sentence-similarity"
+    },
+    "metrics": {
+      "accuracy_cosinus": 0.9563965089445283,
+      "accuracy_euclidean": 0.9566394253292384,
+      "accuracy_manhattan": 0.9565353183072198
+    },
+    "dataset": {
+      "name": "all_custom_greek_media_triplets",
+      "type": "sentence-pair"
+    },
+  }
 ]
 ---
 
 sentence_transformers.losses.TripletLoss.TripletLoss` with parameters:
+
 ```
 {'distance_metric': 'TripletDistanceMetric.EUCLIDEAN', 'triplet_margin': 5}
 
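The commit message ("try 5") suggests repeated attempts at getting this `model-index` front matter to validate. One way to see whether the YAML parses after pushing is to load the card with `huggingface_hub`; a minimal sketch, not part of the card, assuming the Hub id `dimitriz/st-greek-media-bert-base-uncased`:

```python
from huggingface_hub import ModelCard

# Fetch the README from the Hub and parse its YAML front matter.
card = ModelCard.load("dimitriz/st-greek-media-bert-base-uncased")

# card.data holds the parsed metadata; inspect it to see whether the
# model-index block came through as intended.
print(card.data)
```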
@@ -43,7 +49,9 @@ This is a [sentence-transformers](https://www.SBERT.net) based on the [Greek Med
 Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
 
 ```
+
 pip install -U sentence-transformers
+
 ```
 
 Then you can use the model like this:
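The usage snippet that follows this line sits between hunks, so the diff shows only its tail (`embeddings = model.encode(sentences)` and `print(embeddings)` appear as context in the next hunk). It follows the stock sentence-transformers pattern; a minimal sketch, with the Hub id assumed from the card:

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

# Load the model from the Hub and embed the sentences.
model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")
embeddings = model.encode(sentences)
print(embeddings)
```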
@@ -57,19 +65,20 @@ embeddings = model.encode(sentences)
 print(embeddings)
 ```
 
-
-
 ## Usage (HuggingFace Transformers)
-
+
+Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input
+through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word
+embeddings.
 
 ```python
 from transformers import AutoTokenizer, AutoModel
 import torch
 
 
-#Mean Pooling - Take attention mask into account for correct averaging
+# Mean Pooling - Take attention mask into account for correct averaging
 def mean_pooling(model_output, attention_mask):
-    token_embeddings = model_output[0]
+    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
 
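The rest of this snippet is also elided between hunks; the visible context in the next hunk (`print("Sentence embeddings:")`, `print(sentence_embeddings)`) indicates it is the stock HuggingFace pattern that pairs with `mean_pooling` above. A sketch of that elided portion, continuing the imports from the block above, with the Hub id assumed from the card:

```python
sentences = ["This is an example sentence", "Each sentence is converted"]

# Load model and tokenizer from the Hub.
tokenizer = AutoTokenizer.from_pretrained("dimitriz/st-greek-media-bert-base-uncased")
model = AutoModel.from_pretrained("dimitriz/st-greek-media-bert-base-uncased")

# Tokenize, run the forward pass, then mean-pool over the token embeddings.
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    model_output = model(**encoded_input)
sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])

print("Sentence embeddings:")
print(sentence_embeddings)
```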
@@ -95,20 +104,23 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 
-
-
 ## Evaluation Results
 
 <!--- Describe how your model was evaluated -->
 
-For an automated evaluation of this model, see the *Sentence Embeddings
-
+For an automated evaluation of this model, see the *Sentence Embeddings
+Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=dimitriz/st-greek-media-bert-base-uncased)
 
 ## Training
-
-
-
-
+
+The model was trained on a custom dataset containing triplets from the **combined** Greek 'internet', 'social-media'
+and 'press' domains, described in the paper [DACL](https://...).
+
+- The dataset was created by sampling triplets of sentences from the same domain, where the first two sentences are more
+similar than the third one.
+- Training objective was to maximize the similarity between the first two sentences and minimize the similarity between
+the first and the third sentence.
+- The model was trained for 3 epochs with a batch size of 16 and a maximum sequence length of 512 tokens.
 - The model was trained on a single NVIDIA RTX A6000 GPU with 48GB of memory.
 
 The model was trained with the parameters:
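The triplet setup those bullets describe maps directly onto the sentence-transformers training API. A minimal sketch of how such triplets are fed in (the sentences and the base-model id are placeholders, not taken from the card):

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer

# Each training example is an (anchor, positive, negative) triplet
# sampled from the same domain, per the description above.
train_examples = [
    InputExample(texts=["anchor sentence", "more similar sentence", "less similar sentence"]),
    # ... one InputExample per triplet
]

# Placeholder base-model id; the card names a Greek Media BERT base model.
model = SentenceTransformer("dimitriz/greek-media-bert-base-uncased")
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
```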
@@ -116,6 +128,7 @@ The model was trained with the parameters:
 **DataLoader**:
 
 `torch.utils.data.dataloader.DataLoader` of length 10807 with parameters:
+
 ```
 {'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
 ```
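The DataLoader length also pins down the dataset size: 10807 batches at `batch_size=16` means between 172,897 and 172,912 triplets, depending on whether the last batch is full. A quick check of that arithmetic:

```python
import math

batches, batch_size = 10807, 16

# Any dataset size in this range yields exactly 10807 batches.
low, high = (batches - 1) * batch_size + 1, batches * batch_size
assert math.ceil(low / batch_size) == batches
assert math.ceil(high / batch_size) == batches
print(low, high)  # 172897 172912
```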
@@ -123,11 +136,13 @@ The model was trained with the parameters:
 **Loss**:
 
 `sentence_transformers.losses.TripletLoss.TripletLoss` with parameters:
+
 ```
 {'distance_metric': 'TripletDistanceMetric.EUCLIDEAN', 'triplet_margin': 5}
 ```
 
 Parameters of the fit()-Method:
+
 ```
 {
     "epochs": 3,
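These loss and `fit()` parameters translate one-to-one into sentence-transformers calls. A sketch continuing the training setup from the previous sketch (`model` and `train_dataloader` from there); only `epochs` is visible in this diff, so the remaining `fit()` keys are omitted rather than guessed:

```python
from sentence_transformers import losses
from sentence_transformers.losses import TripletDistanceMetric

# TripletLoss exactly as the card reports it: Euclidean distance, margin 5.
train_loss = losses.TripletLoss(
    model=model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=3,
)
```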
@@ -145,8 +160,8 @@ Parameters of the fit()-Method:
 }
 ```
 
-
 ## Full Model Architecture
+
 ```
 SentenceTransformer(
   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
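The architecture block is cut off after the Transformer module; the full printout can be reproduced by loading the model and printing it (Hub id assumed from the card):

```python
from sentence_transformers import SentenceTransformer

# str(model) lists every module: the BertModel transformer shown above
# plus whatever pooling layer the truncated card output omits.
model = SentenceTransformer("dimitriz/st-greek-media-bert-base-uncased")
print(model)
```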