dleemiller
committed on
Update README.md
README.md CHANGED
@@ -43,8 +43,8 @@ pretraining from my much larger semi-synthetic dataset `dleemiller/wiki-sim` tha
 | Model                          | STS-B Test Pearson | STS-B Test Spearman | Context Length | Parameters | Speed    |
 |--------------------------------|--------------------|---------------------|----------------|------------|---------|
 | **ModernCE-base-sts**          | **0.9162**         | **0.9122**          | **8192**       | 149M       | **Fast** |
-| `roberta-large
-| `distilroberta-base
+| `stsb-roberta-large`           | 0.9147             | -                   | 512            | 355M       | Slow     |
+| `stsb-distilroberta-base`      | 0.8792             | -                   | 512            | 66M        | Fast     |
 
 
 ---
@@ -78,8 +78,8 @@ The model returns similarity scores in the range `[0, 1]`, where higher scores i
 
 ### Pretraining
 The model was pretrained on the `pair-score-sampled` subset of the [`dleemiller/wiki-sim`](https://huggingface.co/datasets/dleemiller/wiki-sim) dataset. This dataset provides diverse sentence pairs with semantic similarity scores, helping the model build a robust understanding of relationships between sentences.
-- **Classifier Dropout:** 0.3, to
-- **Objective:** STS-B scores from `roberta-large
+- **Classifier Dropout:** a somewhat large classifier dropout of 0.3, to reduce overreliance on teacher scores.
+- **Objective:** STS-B scores from `cross-encoder/stsb-roberta-large`.
 
 ### Fine-Tuning
 Fine-tuning was performed on the [`sentence-transformers/stsb`](https://huggingface.co/datasets/sentence-transformers/stsb) dataset.
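
For context on the comparison table and the `[0, 1]` score range mentioned in the hunk headers, here is a minimal usage sketch. It assumes the `sentence-transformers` `CrossEncoder` API and the repository id `dleemiller/ModernCE-base-sts` (inferred from this README's title, not stated in the diff itself).

```python
from sentence_transformers import CrossEncoder

# Assumed repository id for the model this README describes.
model = CrossEncoder("dleemiller/ModernCE-base-sts")

pairs = [
    ("A man is playing a guitar.", "Someone is strumming an instrument."),
    ("A man is playing a guitar.", "A chef is chopping vegetables."),
]

# predict() returns one similarity score per pair in [0, 1]
# (higher = more semantically similar).
scores = model.predict(pairs)
for (a, b), score in zip(pairs, scores):
    print(f"{score:.3f}  {a} <-> {b}")
```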
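The updated pretraining bullets describe a distillation-style objective: the student regresses onto similarity scores produced by the `cross-encoder/stsb-roberta-large` teacher, with a relatively high classifier dropout of 0.3. The sketch below only illustrates that idea; it is not the author's training code, and the ModernBERT-base backbone, example pairs, and single MSE step are assumptions.

```python
import torch
from sentence_transformers import CrossEncoder
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# Teacher named in the diff: produces STS-B-style similarity scores.
teacher = CrossEncoder("cross-encoder/stsb-roberta-large")

# Hypothetical pairs standing in for the `pair-score-sampled` wiki-sim subset.
pairs = [
    ("The cat sat on the mat.", "A cat is resting on a rug."),
    ("The cat sat on the mat.", "Stock markets fell sharply today."),
]
teacher_scores = teacher.predict(pairs)  # regression targets for the student

# Student: assumed ModernBERT-base backbone with a 1-label regression head
# and the classifier dropout of 0.3 mentioned in the README.
backbone = "answerdotai/ModernBERT-base"
config = AutoConfig.from_pretrained(backbone, num_labels=1, classifier_dropout=0.3)
student = AutoModelForSequenceClassification.from_pretrained(backbone, config=config)
tokenizer = AutoTokenizer.from_pretrained(backbone)

batch = tokenizer(
    [a for a, _ in pairs],
    [b for _, b in pairs],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
targets = torch.tensor(teacher_scores, dtype=torch.float32)

# One illustrative step: MSE between the student's scalar logits and the
# teacher's scores. Fine-tuning on sentence-transformers/stsb would follow
# the same pattern, using the dataset's gold scores instead of teacher scores.
logits = student(**batch).logits.squeeze(-1)
loss = torch.nn.functional.mse_loss(logits, targets)
loss.backward()
```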