yuz9yuz committed
Commit 58c927e
1 Parent(s): 1f22ab1

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -12,14 +12,14 @@ We release four variants of SciMult here:
  **scimult_moe_pmcpatients_par.ckpt**
  **scimult_moe_pmcpatients_ppr.ckpt**
 
- **scimult_vanilla.ckpt** and **scimult_moe.ckpt** can be used for various scientific literature understanding tasks in general. Their difference is that **scimult_vanilla.ckpt** adopts a typical 12-layer Transformer architecture (i.e., the same as [BERT base](https://huggingface.co/bert-base-uncased)), whereas **scimult_moe.ckpt** adopts a Mixture-of-Experts Transformer architecture with task-specific multi-head attention (MHA) sublayers. Experimental results show that **scimult_moe.ckpt** achieves better performance in general.
+ **scimult_vanilla.ckpt** and **scimult_moe.ckpt** can be used for various scientific literature understanding tasks. Their difference is that **scimult_vanilla.ckpt** adopts a typical 12-layer Transformer architecture (i.e., the same as [BERT base](https://huggingface.co/bert-base-uncased)), whereas **scimult_moe.ckpt** adopts a Mixture-of-Experts Transformer architecture with task-specific multi-head attention (MHA) sublayers. Experimental results show that **scimult_moe.ckpt** achieves better performance in general.
 
  **scimult_moe_pmcpatients_par.ckpt** and **scimult_moe_pmcpatients_ppr.ckpt** are initialized from **scimult_moe.ckpt** and continuously pre-trained on the training sets of [PMC-Patients](https://github.com/pmc-patients/pmc-patients) patient-to-article retrieval and patient-to-patient retrieval tasks, respectively. As of October 2023, these two models rank 1st in their corresponding tasks on the [PMC-Patients Leaderboard](https://pmc-patients.github.io/).
 
 
  ## Pre-training Data
  SciMult is pre-trained on the following data:
- [MAPLE](https://github.com/yuzhimanhua/MAPLE) for paper classification
+ [MAPLE](https://zenodo.org/records/7611544) for paper classification
  [Citation Prediction Triplets](https://huggingface.co/datasets/allenai/scirepeval/viewer/cite_prediction) for link prediction
  [SciRepEval-Search](https://huggingface.co/datasets/allenai/scirepeval/viewer/search) for literature retrieval
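
For context on the checkpoints described in the diff above, here is a minimal sketch of fetching one variant from the Hub and inspecting its contents. Both the repo id `yuz9yuz/SciMult` (inferred from the commit author, not stated in the diff) and the assumption that each `.ckpt` file is a plain PyTorch checkpoint readable by `torch.load` are unverified; the SciMult codebase may provide its own loading utilities.

```python
# Minimal sketch: download one SciMult checkpoint and inspect it.
# Assumptions (not confirmed by the diff): repo id "yuz9yuz/SciMult",
# and the .ckpt files being ordinary PyTorch checkpoints.
from huggingface_hub import hf_hub_download
import torch

# Fetch the MoE variant from the Hugging Face Hub (cached locally).
ckpt_path = hf_hub_download(repo_id="yuz9yuz/SciMult", filename="scimult_moe.ckpt")

# Load on CPU and print a few top-level keys to see what the checkpoint
# contains before wiring it into any model code.
state = torch.load(ckpt_path, map_location="cpu")
print(type(state))
if isinstance(state, dict):
    for key in list(state)[:10]:
        print(key)
```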