dleemiller committed
Commit 4154380 · verified · 1 Parent(s): 0b502c4

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -43,8 +43,8 @@ pretraining from my much larger semi-synthetic dataset `dleemiller/wiki-sim` tha
  | Model | STS-B Test Pearson | STS-B Test Spearman | Context Length | Parameters | Speed |
  |--------------------------------|--------------------|---------------------|----------------|------------|---------|
  | **ModernCE-base-sts** | **0.9162** | **0.9122** | **8192** | 149M | **Fast** |
- | `roberta-large-stsb` | 0.9147 | - | 512 | 355M | Slow |
- | `distilroberta-base-stsb` | 0.8792 | - | 512 | 66M | Fast |
+ | `stsb-roberta-large` | 0.9147 | - | 512 | 355M | Slow |
+ | `stsb-distilroberta-base` | 0.8792 | - | 512 | 66M | Fast |
 
 
  ---
@@ -78,8 +78,8 @@ The model returns similarity scores in the range `[0, 1]`, where higher scores i
 
  ### Pretraining
  The model was pretrained on the `pair-score-sampled` subset of the [`dleemiller/wiki-sim`](https://huggingface.co/datasets/dleemiller/wiki-sim) dataset. This dataset provides diverse sentence pairs with semantic similarity scores, helping the model build a robust understanding of relationships between sentences.
- - **Classifier Dropout:** 0.3, to introduce regularization and reduce overfitting.
- - **Objective:** STS-B scores from `roberta-large-stsb`.
+ - **Classifier Dropout:** a somewhat large classifier dropout of 0.3, to reduce overreliance on teacher scores.
+ - **Objective:** STS-B scores from `cross-encoder/stsb-roberta-large`.
 
  ### Fine-Tuning
  Fine-tuning was performed on the [`sentence-transformers/stsb`](https://huggingface.co/datasets/sentence-transformers/stsb) dataset.
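
Reviewer note: the table touched by this commit reports STS-B Test Pearson and Spearman correlations. As a reminder of what those two columns measure, here is a minimal standard-library sketch. It is not the evaluation code used for the README; the `gold` and `pred` scores are made-up illustrative values, and the helper names (`pearson`, `average_ranks`, `spearman`) are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Illustrative (invented) similarity scores in [0, 1], not real STS-B data.
gold = [0.0, 0.2, 0.5, 0.8, 1.0]
pred = [0.1, 0.15, 0.6, 0.7, 0.95]

print(f"Pearson:  {pearson(gold, pred):.4f}")
print(f"Spearman: {spearman(gold, pred):.4f}")
```

Pearson rewards predictions that track the gold scores linearly, while Spearman only checks that the ordering of pairs is preserved, which is why a README may report both.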