|
--- |
|
library_name: transformers |
|
license: mit |
|
datasets: |
|
- allenai/peer_read |
|
language: |
|
- en |
|
metrics: |
|
- accuracy |
|
- f1 |
|
--- |
|
|
|
# Model Card for PaperPub |
|
|
|
*Paper pub*lication prediction based on English computer science abstracts. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
PaperPub is a SciBERT ([Beltagy et al. 2019](https://arxiv.org/abs/1903.10676)) model fine-tuned to predict paper acceptance from computer science abstracts. |
|
Acceptance is modeled as a binary decision of accept or reject. |
|
The training and evaluation data is based on the arXiv subsection of PeerRead ([Kang et al. 2018](https://aclanthology.org/N18-1149/)). |
|
Our main use case for PaperPub is research into how attribution scores derived from acceptance predictions can inform reflection on the content and writing quality of abstracts. |
|
|
|
- **Developed by:** Semantic Computing Group, Bielefeld University, in particular Jan-Philipp Töberg, Christoph Düsing, Jonas Belouadi and Matthias Orlikowski |
|
- **Model type:** BERT for binary classification |
|
- **Language(s) (NLP):** English |
|
- **License:** MIT |
|
- **Finetuned from model:** SciBERT |
|
|
|
### Model Sources |
|
|
|
We plan to add a public demo of PaperPub: an application that uses attribution scores to highlight the words in an abstract that contribute to acceptance or rejection predictions. |
|
|
|
- **Repository:** tba |
|
- **Demo:** tba |
|
|
|
## Uses |
|
|
|
PaperPub can only be meaningfully used in a research setting. The model should not be used for any consequential paper quality judgements. |
|
|
|
### Direct Use |
|
|
|
The intended use is research into how attribution scores computed from paper acceptance decisions reflect the content quality of an abstract. |
|
|
|
### Out-of-Scope Use |
|
|
|
This model must not be used as part of any paper quality judgement, and in particular not in a peer review process. PaperPub is explicitly not meant to automate paper acceptance decisions. |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
Bias, risks, and limitations mainly stem from the dataset used. In addition to the limitations that apply to the SciBERT pre-training corpus, our training data represents only a very specific subset of papers. PaperPub was trained in a hackathon-like setting, so performance is not optimized and was not our main goal. |
|
|
|
### Recommendations |
|
|
|
Users should be aware that the dataset used for fine-tuning (computer science arXiv preprints from a specific period) represents a very specific idea of what papers, and in particular papers fit for publication, look like. |
|
|
|
## How to Get Started with the Model |
|
|
|
tba |
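
Until the official instructions are published, the following is a minimal sketch of how a fine-tuned sequence classifier of this kind is typically loaded with `transformers`. The repository name `your-namespace/paperpub` is a hypothetical placeholder (the card does not yet name a repository), and the label order (0 = reject, 1 = accept) is an assumption that should be checked against `model.config.id2label` on the released checkpoint.

```python
import math

# Hypothetical placeholder: the card does not yet name a Hub repository.
MODEL_ID = "your-namespace/paperpub"
# Assumed label order; verify with `model.config.id2label` on the real checkpoint.
ID2LABEL = {0: "reject", 1: "accept"}


def softmax(logits):
    """Turn raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def predict_acceptance(abstract, model_id=MODEL_ID):
    """Classify one abstract; requires `transformers` and `torch`."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    # BERT-style encoders accept at most 512 tokens, so long inputs are truncated.
    inputs = tokenizer(abstract, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode(logits)
```

Calling `predict_acceptance("We propose ...")` would return a `(label, probability)` pair once a real repository name is substituted for the placeholder. As stated under Out-of-Scope Use, such predictions are for research only.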
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
Custom stratified split of the arXiv subsection of PeerRead ([Kang et al. 2018](https://aclanthology.org/N18-1149/)). We use the data from their GitHub repository, not the Hugging Face Hub version. |
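
To illustrate what a stratified split means here, the sketch below divides examples so that each class keeps the same proportion in both parts. It is not the script used for PaperPub: the 20% test fraction, the seed, and the toy labels are assumptions chosen only for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) that preserve class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels (75 "reject", 25 "accept"), not the real PeerRead data.
labels = ["reject"] * 75 + ["accept"] * 25
train_idx, test_idx = stratified_split(labels)
```

With these toy labels, both the training and the test part keep the 3:1 ratio of the full set, which is the property that keeps evaluation comparable across classes.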
|
|
|
### Training Procedure |
|
|
|
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
- **Training regime:** bf16 mixed precision |
|
- **Epochs:** 2 |
|
- **Initial Learning Rate:** 2e-5 |
|
|
|
## Evaluation |
|
|
|
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
#### Testing Data |
|
|
|
Custom stratified split of the arXiv subsection of PeerRead ([Kang et al. 2018](https://aclanthology.org/N18-1149/)). We use the data from their GitHub repository, not the Hugging Face Hub version. |
|
|
|
#### Factors |
|
|
|
Model type: we compare PaperPub to a naive most-frequent-class (majority) baseline. |
|
|
|
#### Metrics |
|
|
|
Accuracy and macro-averaged F1. |
|
|
|
### Results |
|
|
|
- Majority Baseline |
|
  - Accuracy: 0.75 |
|
  - Macro F1: 0.43 |
|
- PaperPub |
|
  - Accuracy: 0.82 |
|
  - Macro F1: 0.76 |
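
The baseline numbers can be sanity-checked by hand. A majority baseline with accuracy 0.75 implies that the majority class makes up 75% of the test set, so the sketch below uses 100 synthetic labels with a 75/25 split (not the actual test set; the label encoding 0/1 is arbitrary) and computes both metrics in plain Python.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1; a class with no hits scores 0."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Synthetic labels with a 75/25 class split; the baseline always predicts class 0.
y_true = [0] * 75 + [1] * 25
y_pred = [0] * 100

print(round(accuracy(y_true, y_pred), 2))  # 0.75
print(round(macro_f1(y_true, y_pred), 2))  # 0.43
```

The majority class reaches an F1 of about 0.86 while the minority class scores 0, so the macro average lands at about 0.43. This is why macro F1 is the more informative metric on imbalanced data: it exposes that the baseline never predicts the minority class, even though its accuracy looks reasonable.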
|
|
|
|
|
## Environmental Impact |
|
|
|
|
|
|
- **Hardware Type:** 1xA40 |
|
- **Hours used:** 0.3 |
|
- **Cloud Provider:** Private Infrastructure |
|
- **Compute Region:** Europe |
|
|
|
## Technical Specifications |
|
|
|
### Compute Infrastructure |
|
|
|
We use an internal SLURM cluster with A40 GPUs. |
|
|
|
## Citation |
|
|
|
tba |
|
|
|
## Model Card Contact |
|
|
|
[Matthias Orlikowski](https://orlikow.ski) |