---
language:
- de
tags:
- medical
- ggponc
widget:
- text: "Vitamin C, E und A"
  example_title: "Forward Ellipsis"
- text: "Chemo- und Strahlentherapie"
  example_title: "Backward Ellipsis"
- text: "HPV-16- und/oder -18-Positivität"
  example_title: "Complex Ellipsis"
---

## Model

Fine-tuned [mt5-base](https://huggingface.co/google/mt5-base) model for resolving elliptical coordinated compound noun phrases (ECCNPs) in German text.

ECCNPs are a special type of coordination ellipsis in which part of a compound noun is omitted due to coordination (e.g., "and", "or", "/").
For instance, *Chemo- und Strahlentherapie* (chemo- and radiotherapy) is the elliptical form of *Chemotherapie und Strahlentherapie* (chemotherapy and radiotherapy).

## Dataset

The model has been fine-tuned on a subset of sentences from [GGPONC 2.0](https://huggingface.co/datasets/bigbio/ggponc2) that contain manually annotated ECCNPs and their resolutions.

The annotated dataset is available on Zenodo: https://zenodo.org/records/12529883

## Usage

The model can be loaded as a `Text2TextGenerationPipeline`:

```
from transformers import pipeline

pipe = pipeline(model="phlobo/german-ellipses-resolver-mt5-base")
```

```
pipe("Chemo- und Strahlentherapie")
>>> [{'generated_text': 'Chemotherapie und Strahlentherapie'}]
```

```
pipe("Vitamin C, E und A")
>>> [{'generated_text': 'Vitamin C, Vitamin E und Vitamin A'}]
```

It is recommended to set `max_length` to control the maximum output length. For most German sentences, a value of `256` should be enough:

```
pipe = pipeline(model="phlobo/german-ellipses-resolver-mt5-base", max_length=256)
```

## Paper

Our approach and its evaluation have been published at the ACL BioNLP'23 workshop.
Please cite the following paper if you find our model useful:

```bibtex
@inproceedings{kammer-etal-2023-resolving,
    title = "Resolving Elliptical Compounds in {G}erman Medical Text",
    author = "Kammer, Niklas and
      Borchert, Florian and
      Winkler, Silvia and
      de Melo, Gerard and
      Schapranow, Matthieu-P.",
    editor = "Demner-fushman, Dina and
      Ananiadou, Sophia and
      Cohen, Kevin",
    booktitle = "The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.bionlp-1.26",
    doi = "10.18653/v1/2023.bionlp-1.26",
    pages = "292--305"
}
```
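
## Example: resolving several phrases

The Usage section shows single calls that return a list containing one dict with a `generated_text` key. The following is a minimal sketch of how that output could be unpacked for several phrases in a row. The helper names `resolve_phrases` and `load_resolver` are hypothetical and not part of the model or of `transformers`. The sketch assumes every call returns the same structure as the outputs shown in the Usage section.

```python
from typing import Callable, Iterable, List


def load_resolver(max_length: int = 256):
    """Load the pipeline as shown in the Usage section (downloads the model)."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import pipeline

    return pipeline(
        model="phlobo/german-ellipses-resolver-mt5-base",
        max_length=max_length,
    )


def resolve_phrases(pipe: Callable, phrases: Iterable[str]) -> List[str]:
    """Call `pipe` once per phrase and collect the generated strings.

    Assumption: `pipe(text)` returns a list whose first element is a dict
    with a 'generated_text' key, as in the outputs in the Usage section.
    """
    resolved = []
    for phrase in phrases:
        outputs = pipe(phrase)
        resolved.append(outputs[0]["generated_text"])
    return resolved
```

A call such as `resolve_phrases(load_resolver(), ["Chemo- und Strahlentherapie", "Vitamin C, E und A"])` would then return plain strings instead of the nested list of dicts. Processing one phrase per call keeps the sketch close to the documented usage; it is not tuned for throughput.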