Hungarian Abstractive Summarization BART model

For further models, scripts and details, see our repository or our demo site.

  • BART base model (BART-base-512 in the Results table below):
    • Pretrained on Webcorpus 2.0
    • Finetuned on the NOL corpus (nol.hu)
      • Segments: 397,343

Limitations

  • The model expects pre-tokenized input text (tokenizer: HuSpaCy); see the usage sketch after this list
  • max_source_length = 512
  • max_target_length = 256
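
A minimal usage sketch under the constraints above. The checkpoint name is a hypothetical placeholder (this card does not state the Hub identifier), and the huspacy calls assume its default Hungarian pipeline is installed; treat this as an illustration rather than the official pipeline.

```python
# Minimal sketch, not the official pipeline. Assumptions not stated in this
# card: the checkpoint name is a placeholder, and HuSpaCy's default Hungarian
# model is installed (pip install huspacy; then huspacy.download()).
import huspacy
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "path-or-hub-id-of-this-model"  # hypothetical placeholder

# Pre-tokenize with HuSpaCy: the model expects whitespace-separated tokens.
nlp = huspacy.load()
article = "Ide kerül az összefoglalandó magyar nyelvű cikk szövege."
pretokenized = " ".join(token.text for token in nlp(article))

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Respect the documented limits: at most 512 source tokens in, 256 out.
# Beam-search settings are illustrative, not taken from the paper.
inputs = tokenizer(pretokenized, max_length=512, truncation=True,
                   return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=256, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```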

Results

ROUGE-1/ROUGE-2/ROUGE-L scores on the HI and NOL test sets:

Model                        HI                  NOL
BART-base-512 (this model)   30.18/13.86/22.92   46.48/32.40/39.45
BART-base-1024               31.86/14.59/23.79   47.01/32.91/39.97

Citation

If you use this model, please cite the following paper:

@inproceedings{yang-bart,
    title = {{BARTerezzünk! - Messze, messze, messze a világtól, - BART kísérleti modellek magyar nyelvre}},
    booktitle = {XVIII. Magyar Számítógépes Nyelvészeti Konferencia},
    year = {2022},
    publisher = {Szegedi Tudományegyetem, Informatikai Intézet},
    address = {Szeged, Magyarország},
    author = {Yang, Zijian Győző},
    pages = {15--29}
}