TimSchopf committed
Commit 0250b1e · verified · 1 Parent(s): 131e7dc

Update README.md

Files changed (1): README.md (+18 -0)

README.md CHANGED
@@ -14,6 +14,24 @@ pipeline_tag: text-classification
 This is a fine-tuned BERT-based language model to classify NLP-related research papers as "survey" or "non-survey" papers. The model is fine-tuned on a dataset of 787 survey and 11,805 non-survey papers from the ACL Anthology and the arXiv cs.CL category. Prior to fine-tuning, the model is initialized with weights from [malteos/scincl](https://huggingface.co/malteos/scincl).

+ ## How to use the fine-tuned model
+
+ ```python
+ from transformers import pipeline
+
+ # load the fine-tuned survey classifier; inputs are truncated/padded to the 512-token BERT limit
+ classifier = pipeline("text-classification", model="TimSchopf/nlp_survey_classifier", truncation=True, max_length=512, padding=True)
+
+ # prepare data
+ papers = [{'title': 'Attention Is All You Need', 'abstract': 'The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.'},
+ {'title': 'SimCSE: Simple Contrastive Learning of Sentence Embeddings', 'abstract': 'This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive objective, with only standard dropout used as noise. This simple method works surprisingly well, performing on par with previous supervised counterparts. We find that dropout acts as minimal data augmentation, and removing it leads to a representation collapse. Then, we propose a supervised approach, which incorporates annotated pairs from natural language inference datasets into our contrastive learning framework by using "entailment" pairs as positives and "contradiction" pairs as hard negatives. We evaluate SimCSE on standard semantic textual similarity (STS) tasks, and our unsupervised and supervised models using BERT base achieve an average of 76.3% and 81.6% Spearmans correlation respectively, a 4.2% and 2.2% improvement compared to the previous best results. We also show -- both theoretically and empirically -- that the contrastive learning objective regularizes pre-trained embeddings anisotropic space to be more uniform, and it better aligns positive pairs when supervised signals are available.'},
+ {'title': 'A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI', 'abstract': 'Recently, artificial intelligence and machine learning in general have demonstrated remarkable performances in many tasks, from image processing to natural language processing, especially with the advent of deep learning. Along with research progress, they have encroached upon many different fields and disciplines. Some of them require high level of accountability and thus transparency, for example the medical sector. Explanations for machine decisions and predictions are thus needed to justify their reliability. This requires greater interpretability, which often means we need to understand the mechanism underlying the algorithms. Unfortunately, the blackbox nature of the deep learning is still unresolved, and many machine decisions are still poorly understood. We provide a review on interpretabilities suggested by different research works and categorize them. The different categories show different dimensions in interpretability research, from approaches that provide "obviously" interpretable information to the studies of complex patterns. By applying the same categorization to interpretability in medical research, it is hoped that (1) clinicians and practitioners can subsequently approach these methods with caution, (2) insights into interpretability will be born with more considerations for medical practices, and (3) initiatives to push forward data-based, mathematically- and technically-grounded medical education are encouraged.'}]
+
+ # concatenate title and abstract with the tokenizer's [SEP] token
+ title_abs = [d['title'] + classifier.tokenizer.sep_token + (d.get('abstract') or '') for d in papers]
+
+ # classify papers
+ classifier(title_abs)
+ ```
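
For context, the snippet added in this commit goes through the `pipeline` wrapper, which returns a list of `{'label', 'score'}` dictionaries. Below is a minimal sketch (not part of the commit) of the same classification done without the wrapper; it assumes the checkpoint is a standard sequence-classification model and that its `id2label` config carries meaningful label names.

```python
# Minimal sketch (assumption, not from the commit): manual inference without the pipeline wrapper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "TimSchopf/nlp_survey_classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# title and abstract concatenated with the tokenizer's [SEP] token, as in the snippet above
# (placeholder abstract text used here for brevity)
title_abs = [
    "A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI"
    + tokenizer.sep_token
    + "Recently, artificial intelligence and machine learning in general have demonstrated remarkable performances ..."
]

# tokenize with the same truncation/padding settings as the pipeline example
inputs = tokenizer(title_abs, truncation=True, max_length=512, padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# map predicted class ids to label names via the model config (label names are checkpoint-defined)
pred_ids = logits.argmax(dim=-1)
print([model.config.id2label[i.item()] for i in pred_ids])
```

The sketch only makes explicit what the pipeline does internally (tokenization, truncation to 512 tokens, and the id-to-label lookup); the committed `pipeline` example remains the documented way to use the model.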

 ## Evaluation Results