BART fine-tuned for keyphrase generation

This is the bart-base (Lewis et al.. 2019) model finetuned for the keyphrase generation task (Glazkova & Morozov, 2023) on the fragments of the following corpora:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("aglazkova/bart_finetuned_keyphrase_extraction")
model = AutoModelForSeq2SeqLM.from_pretrained("aglazkova/bart_finetuned_keyphrase_extraction")

text = "In this paper, we investigate cross-domain limitations of keyphrase generation using the models for abstractive text summarization.\
        We present an evaluation of BART fine-tuned for keyphrase generation across three types of texts, \
        namely scientific texts from computer science and biomedical domains and news texts. \
        We explore the role of transfer learning between different domains to improve the model performance on small text corpora."

tokenized_text = tokenizer.prepare_seq2seq_batch([text], return_tensors='pt')
translation = model.generate(**tokenized_text)
translated_text = tokenizer.batch_decode(translation, skip_special_tokens=True)[0]
print(translated_text)

Training Hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-5
  • train_batch_size: 8
  • optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
  • num_epochs: 6

BibTeX:

@InProceedings{10.1007/978-3-031-67826-4_19,
author="Glazkova, Anna
and Morozov, Dmitry",
title="Cross-Domain Robustness of Transformer-Based Keyphrase Generation",
booktitle="Data Analytics and Management in Data Intensive Domains",
year="2024",
publisher="Springer Nature Switzerland",
address="Cham",
pages="249--265"
}
Downloads last month
100,347
Inference API

Datasets used to train aglazkova/bart_finetuned_keyphrase_extraction