SciMult

SciMult is a pre-trained language model for scientific literature understanding. It is pre-trained on data from (extreme multi-label) paper classification, citation prediction, and literature retrieval tasks via a multi-task contrastive learning framework. For more details, please refer to the paper.
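The core of such a framework is an in-batch contrastive objective: each query (e.g., a paper) is pulled toward its positive counterpart (e.g., a cited paper or a relevant document) and pushed away from the other examples in the batch. The sketch below is a generic InfoNCE-style loss in numpy, for illustration only; the function name, temperature value, and shapes are assumptions, not SciMult's actual training code.

```python
import numpy as np

def info_nce_loss(query_emb, pos_emb, temperature=0.05):
    """Generic in-batch contrastive (InfoNCE) loss.

    Row i of pos_emb is the positive for row i of query_emb; every other
    row in the batch serves as an in-batch negative. `temperature` is an
    illustrative value, not SciMult's actual hyperparameter.
    """
    # L2-normalize so the dot product is cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pos_emb / np.linalg.norm(pos_emb, axis=1, keepdims=True)
    logits = q @ p.T / temperature            # (batch, batch) similarities
    # Log-softmax over each row; the diagonal entries are the positives
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(q))
    return -log_probs[idx, idx].mean()
```

In a multi-task setup, batches from classification, citation prediction, and retrieval would each be fed through this same objective with task-appropriate (query, positive) pairs.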

We release four variants of SciMult here:
scimult_vanilla.ckpt
scimult_moe.ckpt
scimult_moe_pmcpatients_par.ckpt
scimult_moe_pmcpatients_ppr.ckpt

scimult_vanilla.ckpt and scimult_moe.ckpt can be used for various scientific literature understanding tasks. Their difference is that scimult_vanilla.ckpt adopts a typical 12-layer Transformer architecture (i.e., the same as BERT base), whereas scimult_moe.ckpt adopts a Mixture-of-Experts Transformer architecture with task-specific multi-head attention (MHA) sublayers. Experimental results show that scimult_moe.ckpt achieves better performance in general.
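To make the architectural difference concrete, the toy layer below sketches the Mixture-of-Experts idea: one multi-head-attention weight set per task (the "experts", routed by a hard task label), with the remaining sublayers shared. Everything here, class name, task labels, single-head attention, and dimensions, is an illustrative assumption, not SciMult's actual implementation.

```python
import numpy as np

TASKS = ("classification", "link_prediction", "retrieval")

class MoETransformerLayer:
    """Toy MoE Transformer layer: task-specific attention projections,
    shared feed-forward sublayer. Single-head and simplified shapes,
    purely for illustration (SciMult's real layers differ)."""

    def __init__(self, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # One (Wq, Wk, Wv, Wo) expert per task
        self.attn = {
            t: [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(4)]
            for t in TASKS
        }
        # Feed-forward weights shared across all tasks
        self.ffn = rng.normal(scale=0.1, size=(dim, dim))

    def forward(self, x, task):
        Wq, Wk, Wv, Wo = self.attn[task]      # hard routing by task label
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(x.shape[1])
        weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
        h = x + (weights @ v) @ Wo            # attention + residual
        return h + np.maximum(h @ self.ffn, 0.0)  # shared FFN + residual
```

Because only the attention projections are swapped per task, the same input produces different representations depending on which task expert is selected, while most parameters remain shared.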

scimult_moe_pmcpatients_par.ckpt and scimult_moe_pmcpatients_ppr.ckpt are initialized from scimult_moe.ckpt and continually pre-trained on the training sets of the PMC-Patients patient-to-article retrieval and patient-to-patient retrieval tasks, respectively. As of December 2023, these two models rank 1st and 2nd on their corresponding tasks, respectively, on the PMC-Patients Leaderboard.

Pre-training Data

SciMult is pre-trained on the following data:
MAPLE for paper classification
Citation Prediction Triplets for link prediction
SciRepEval-Search for literature retrieval

Citation

If you find SciMult useful in your research, please cite the following paper:

@inproceedings{zhang2023pre,
  title={Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding},
  author={Zhang, Yu and Cheng, Hao and Shen, Zhihong and Liu, Xiaodong and Wang, Ye-Yi and Gao, Jianfeng},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  pages={12259--12275},
  year={2023}
}