dleemiller
committed on
Update README.md
README.md CHANGED
@@ -43,8 +43,8 @@ pretraining from my much larger semi-synthetic dataset `dleemiller/wiki-sim` tha
 | Model                          | STS-B Test Pearson | STS-B Test Spearman | Context Length | Parameters | Speed    |
 |--------------------------------|--------------------|---------------------|----------------|------------|---------|
 | **ModernCE-base-sts**          | **0.9162**         | **0.9122**          | **8192**       | 149M       | **Fast** |
-| `roberta-large
-| `distilroberta-base
+| `stsb-roberta-large`           | 0.9147             | -                   | 512            | 355M       | Slow     |
+| `stsb-distilroberta-base`      | 0.8792             | -                   | 512            | 66M        | Fast     |
 
 
 ---
@@ -78,8 +78,8 @@ The model returns similarity scores in the range `[0, 1]`, where higher scores i
 
 ### Pretraining
 The model was pretrained on the `pair-score-sampled` subset of the [`dleemiller/wiki-sim`](https://huggingface.co/datasets/dleemiller/wiki-sim) dataset. This dataset provides diverse sentence pairs with semantic similarity scores, helping the model build a robust understanding of relationships between sentences.
-- **Classifier Dropout:** 0.3, to
-- **Objective:** STS-B scores from `roberta-large
+- **Classifier Dropout:** a somewhat large classifier dropout of 0.3, to reduce overreliance on teacher scores.
+- **Objective:** STS-B scores from `cross-encoder/stsb-roberta-large`.
 
 ### Fine-Tuning
 Fine-tuning was performed on the [`sentence-transformers/stsb`](https://huggingface.co/datasets/sentence-transformers/stsb) dataset.
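
For context on the comparison table and the `[0, 1]` score range mentioned in the hunk headers, here is a minimal usage sketch. It assumes the `sentence-transformers` `CrossEncoder` API and the repository id `dleemiller/ModernCE-base-sts` (inferred from this README's title, not stated in the diff itself).

```python
from sentence_transformers import CrossEncoder

# Assumed repository id for the model this README describes.
model = CrossEncoder("dleemiller/ModernCE-base-sts")

pairs = [
    ("A man is playing a guitar.", "Someone is strumming an instrument."),
    ("A man is playing a guitar.", "A chef is chopping vegetables."),
]

# predict() returns one similarity score per pair in [0, 1]
# (higher = more semantically similar).
scores = model.predict(pairs)
for (a, b), score in zip(pairs, scores):
    print(f"{score:.3f}  {a} <-> {b}")
```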
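The updated pretraining bullets describe a distillation-style objective: the student regresses onto similarity scores produced by the `cross-encoder/stsb-roberta-large` teacher, with a relatively high classifier dropout of 0.3. The sketch below only illustrates that idea; it is not the author's training code, and the ModernBERT-base backbone, example pairs, and single MSE step are assumptions.

```python
import torch
from sentence_transformers import CrossEncoder
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# Teacher named in the diff: produces STS-B-style similarity scores.
teacher = CrossEncoder("cross-encoder/stsb-roberta-large")

# Hypothetical pairs standing in for the `pair-score-sampled` wiki-sim subset.
pairs = [
    ("The cat sat on the mat.", "A cat is resting on a rug."),
    ("The cat sat on the mat.", "Stock markets fell sharply today."),
]
teacher_scores = teacher.predict(pairs)  # regression targets for the student

# Student: assumed ModernBERT-base backbone with a 1-label regression head
# and the classifier dropout of 0.3 mentioned in the README.
backbone = "answerdotai/ModernBERT-base"
config = AutoConfig.from_pretrained(backbone, num_labels=1, classifier_dropout=0.3)
student = AutoModelForSequenceClassification.from_pretrained(backbone, config=config)
tokenizer = AutoTokenizer.from_pretrained(backbone)

batch = tokenizer(
    [a for a, _ in pairs],
    [b for _, b in pairs],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
targets = torch.tensor(teacher_scores, dtype=torch.float32)

# One illustrative step: MSE between the student's scalar logits and the
# teacher's scores. Fine-tuning on sentence-transformers/stsb would follow
# the same pattern, using the dataset's gold scores instead of teacher scores.
logits = student(**batch).logits.squeeze(-1)
loss = torch.nn.functional.mse_loss(logits, targets)
loss.backward()
```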