hellonlp committed (verified)
Commit 61070a0 · 1 Parent(s): 561a030

Update README.md

Files changed (1):
  1. README.md +22 -12
README.md CHANGED
@@ -8,18 +8,6 @@ pipeline_tag: sentence-similarity
 # SimCSE(sup)
 
 
-## Model List
-The evaluation dataset is in Chinese, and we used the same language model **RoBERTa base** on different methods.
-| Model | STS-B(w-avg) | ATEC | BQ | LCQMC | PAWSX | Avg. |
-|:-----------------------:|:------------:|:-----------:|:----------|:-------------|:------------:|:----------:|
-| BERT-Whitening | 65.27| -| -| -| -| -|
-| SimBERT | 70.01| -| -| -| -| -|
-| SBERT-Whitening | 71.75| -| -| -| -| -|
-| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | 78.61| -| -| -| -| -|
-| [hellonlp/simcse-base-zh(sup)](https://huggingface.co/hellonlp/simcse-roberta-base-zh) | **80.96**| -| -| -| -| -|
-
-
-
 ## Data List
 The following datasets are all in Chinese.
 | Data | size(train) | size(valid) | size(test) |
@@ -35,6 +23,28 @@ The following datasets are all in Chinese.
 
 
 
+## Model List
+The evaluation datasets are in Chinese, and we used the same language model, **RoBERTa base**, with each method. Because the test sets of some datasets are small, which can cause large variance in the evaluation scores, the evaluation here uses the train, valid and test splits together, and the final result is reported as a **weighted average (w-avg)**.
+
+| Model | STS-B(w-avg) | ATEC | BQ | LCQMC | PAWSX | Avg. |
+|:-----------------------:|:------------:|:-----------:|:----------|:----------|:----------:|:----------:|
+| [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | 78.61| -| -| -| -| -|
+| [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5) | 79.07| -| -| -| -| -|
+| [hellonlp/simcse-large-zh](https://huggingface.co/hellonlp/simcse-roberta-large-zh) | 81.32| -| -| -| -| -|
+
+
+| Model | STS-B(w-avg) | ATEC | BQ | LCQMC | PAWSX | Avg. |
+|:-----------------------:|:------------:|:-----------:|:----------|:-------------|:------------:|:----------:|
+| BERT-Whitening | 65.27| -| -| -| -| -|
+| SimBERT | 70.01| -| -| -| -| -|
+| SBERT-Whitening | 71.75| -| -| -| -| -|
+| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | 78.61| -| -| -| -| -|
+| [hellonlp/simcse-base-zh(sup)](https://huggingface.co/hellonlp/simcse-roberta-base-zh) | **80.96**| -| -| -| -| -|
+
+
+
+
+
 ## Uses
 You can use our model to encode sentences into embeddings.
  ```python
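# NOTE: the README's code snippet is truncated in this diff view; the
# sketch below is our illustration, not the original code. A typical
# loading pattern (an assumption — it needs network access and the
# `transformers` package, so it is left commented out here) would be:
#   from transformers import AutoTokenizer, AutoModel
#   tokenizer = AutoTokenizer.from_pretrained("hellonlp/simcse-roberta-base-zh")
#   model = AutoModel.from_pretrained("hellonlp/simcse-roberta-base-zh")
# Once the model has produced sentence embeddings, they are usually
# compared with cosine similarity, shown below on toy vectors.
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real RoBERTa-base embeddings are 768-d):
emb1 = [0.1, 0.9, 0.4]
emb2 = [0.1, 0.8, 0.5]
print(round(cosine_similarity(emb1, emb2), 4))
```

Sentence pairs whose embeddings have a cosine similarity close to 1 are treated as semantically similar.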