---
license: mit
---
|
|
|
<span style="color:blue">**Note: please check [DeepKPG](https://github.com/uclanlp/DeepKPG#scibart) for instructions on using this model with Hugging Face, including setting up the newly trained tokenizer.**</span>
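
For orientation, below is a minimal loading and generation sketch. It assumes the tokenizer has already been set up following the DeepKPG instructions and that the standard `transformers` Auto classes apply; the model identifier is a placeholder, not a confirmed Hub path.

```python
# Minimal usage sketch (assumes the tokenizer setup from DeepKPG has been done).
# MODEL_ID is a placeholder; replace it with this model's actual Hub path.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "<this-model's-hub-path>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

text = "Title of a scientific paper. Abstract of the paper ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```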
|
|
|
Paper: [Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study](https://arxiv.org/abs/2212.10233)
|
|
|
```
@article{https://doi.org/10.48550/arxiv.2212.10233,
  doi = {10.48550/ARXIV.2212.10233},
  url = {https://arxiv.org/abs/2212.10233},
  author = {Wu, Di and Ahmad, Wasi Uddin and Chang, Kai-Wei},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
|
|
|
Pre-training Corpus: [S2ORC (titles and abstracts)](https://github.com/allenai/s2orc)
|
|
|
Pre-training Details:

- **Pre-trained from scratch with a science-specific vocabulary**
- Batch size: 2048
- Total steps: 250k
- Learning rate: 3e-4
- LR schedule: polynomial with 10k warmup steps
- Masking ratio: 30%, Poisson lambda = 3.5 (illustrated by the sketch below)
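
The masking configuration follows BART-style text infilling: spans with lengths drawn from Poisson(3.5) are replaced by single mask tokens until roughly 30% of the input tokens are covered. The sketch below only illustrates that noising scheme under these assumptions; it is not the actual pre-training code.

```python
# Illustration of BART-style text infilling with the ratio and lambda listed above.
import numpy as np

MASK_TOKEN = "<mask>"
MASK_RATIO = 0.30      # mask roughly 30% of the original tokens
POISSON_LAMBDA = 3.5   # mean length of each masked span

def infill(tokens, seed=0):
    """Replace random spans (lengths ~ Poisson(3.5)) with a single <mask> each
    until about 30% of the original tokens are covered."""
    rng = np.random.default_rng(seed)
    tokens = list(tokens)
    budget = int(round(len(tokens) * MASK_RATIO))
    covered = 0
    while covered < budget:
        span = int(min(rng.poisson(POISSON_LAMBDA), budget - covered))
        start = int(rng.integers(0, max(len(tokens) - span, 0) + 1))
        tokens[start:start + span] = [MASK_TOKEN]  # a length-0 span just inserts a mask
        covered += max(span, 1)                    # count at least 1 to guarantee progress
    return tokens

print(infill("language models generate keyphrases from scientific abstracts".split()))
```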