---
license: mit
---

# SciMult

SciMult is a pre-trained language model for scientific literature understanding. It is pre-trained on data from (extreme multi-label) paper classification, citation prediction, and literature retrieval tasks via a multi-task contrastive learning framework. For more details, please refer to the [paper](https://arxiv.org/abs/2305.14232).
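At a high level, each pre-training task is cast as matching a query text against a positive candidate while contrasting it against in-batch negatives. Below is a minimal sketch of such an in-batch contrastive (InfoNCE-style) objective, assuming a bi-encoder that produces one embedding per text; the function and variable names are illustrative and not part of the official SciMult codebase.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_emb: torch.Tensor,
                              pos_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE-style loss: each query's positive is the matching row of pos_emb;
    all other rows in the batch serve as negatives. Shapes: [batch_size, dim]."""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(pos_emb, dim=-1)
    logits = q @ p.T / temperature                        # [batch, batch] cosine similarities
    targets = torch.arange(q.size(0), device=q.device)    # diagonal entries are the positives
    return F.cross_entropy(logits, targets)
```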

We release four variants of SciMult here:
- **scimult_vanilla.ckpt**
- **scimult_moe.ckpt**
- **scimult_moe_pmcpatients_par.ckpt**
- **scimult_moe_pmcpatients_ppr.ckpt**

**scimult_vanilla.ckpt** and **scimult_moe.ckpt** can be used for various scientific literature understanding tasks. Their difference is that **scimult_vanilla.ckpt** adopts a typical 12-layer Transformer architecture (i.e., the same as [BERT base](https://huggingface.co/bert-base-uncased)), whereas **scimult_moe.ckpt** adopts a Mixture-of-Experts Transformer architecture with task-specific multi-head attention (MHA) sublayers. Experimental results show that **scimult_moe.ckpt** achieves better performance in general.
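The checkpoints are released as PyTorch `.ckpt` files rather than standard `transformers` model repositories, so they are intended to be loaded through the official SciMult codebase. As a minimal sketch, the snippet below only downloads a checkpoint and inspects its contents; the `repo_id` is an assumption and may need adjusting to the repository actually hosting these files.

```python
import torch
from huggingface_hub import hf_hub_download

# Assumed repo_id; replace with the Hugging Face repository that hosts these checkpoints.
ckpt_path = hf_hub_download(repo_id="yuzhimanhua/SciMult", filename="scimult_moe.ckpt")

# The .ckpt files are plain PyTorch checkpoints; listing the top-level keys
# shows what the SciMult loading code expects to find.
state = torch.load(ckpt_path, map_location="cpu")
print(list(state.keys()) if isinstance(state, dict) else type(state))
```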

**scimult_moe_pmcpatients_par.ckpt** and **scimult_moe_pmcpatients_ppr.ckpt** are initialized from **scimult_moe.ckpt** and continually pre-trained on the training sets of the [PMC-Patients](https://github.com/pmc-patients/pmc-patients) patient-to-article retrieval and patient-to-patient retrieval tasks, respectively. As of December 2023, these two models rank 1st and 2nd on their corresponding tasks on the [PMC-Patients Leaderboard](https://pmc-patients.github.io/).


## Pre-training Data
SciMult is pre-trained on the following data:
- [MAPLE](https://zenodo.org/records/7611544) for paper classification
- [Citation Prediction Triplets](https://huggingface.co/datasets/allenai/scirepeval/viewer/cite_prediction) for link prediction
- [SciRepEval-Search](https://huggingface.co/datasets/allenai/scirepeval/viewer/search) for literature retrieval
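If you want to examine the contrastive pre-training corpora, the two SciRepEval subsets above can be loaded with the Hugging Face `datasets` library (MAPLE is downloaded separately from Zenodo). A minimal sketch, assuming the configuration names match the viewer links above:

```python
from datasets import load_dataset

# Configuration names follow the viewer links above ("cite_prediction" and "search").
cite_triplets = load_dataset("allenai/scirepeval", "cite_prediction")
search = load_dataset("allenai/scirepeval", "search")

print(cite_triplets)  # DatasetDict with the available splits
print(search)
```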


## Citation
If you find SciMult useful in your research, please cite the following paper:
```
@inproceedings{zhang2023pre,
  title={Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding},
  author={Zhang, Yu and Cheng, Hao and Shen, Zhihong and Liu, Xiaodong and Wang, Ye-Yi and Gao, Jianfeng},
  booktitle={Findings of EMNLP'23},
  pages={12259--12275},
  year={2023}
}
```