---
datasets:
- HiTZ/AbstRCT-ES
language:
- es
- en
pipeline_tag: token-classification
widget:
- text: >-
    The dysuria resolved faster in patients implanted with 103Pd but was
    unaffected by the use of supplemental radiotherapy and/or androgen
    deprivation therapy.
- text: >-
    La disuria se resolvió más rápidamente en los pacientes implantados con
    103Pd, pero no se vio afectada por el uso de radioterapia suplementaria
    y/o terapia de privación de andrógenos.
---
Cross-lingual Argument Mining in the Medical Domain
This model is a fine-tuned version of mBERT for the argument mining task using AbstRCT data in English and Spanish.
The dataset consists of abstracts of 5 disease types for argument component detection and argument relation classification:
- neoplasm: 350 train, 100 dev and 50 test abstracts
- glaucoma_test: 100 abstracts
- mixed_test: 100 abstracts (20 on glaucoma, 20 on neoplasm, 20 on diabetes, 20 on hypertension, 20 on hepatitis)
The results (macro-averaged F1 at token level) achieved on each test set:
Test | F1-macro | F1-Claim | F1-Premise |
---|---|---|---|
Neoplasm | 82.36 | 74.89 | 89.07 |
Glaucoma | 80.52 | 75.22 | 84.86 |
Mixed | 81.69 | 75.06 | 88.57 |
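To make the metric concrete, here is a minimal sketch of macro-averaged F1 computed over token labels in plain Python. The label names (`Claim`, `Premise`, `O`) and the toy sequences are illustrative assumptions, and this is not the evaluation script used to produce the table. Note that the F1-macro column is not the plain mean of the Claim and Premise columns, so the average presumably covers at least one more label; which one is not stated here.

```python
from typing import Dict, Sequence


def per_label_f1(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """Token-level F1 for every label that appears in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")
    scores: Dict[str, float] = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    scores = per_label_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Toy example: one label per token (hypothetical data).
gold = ["Claim", "Claim", "Premise", "Premise", "O", "O"]
pred = ["Claim", "Premise", "Premise", "Premise", "O", "Claim"]

print(per_label_f1(gold, pred))
print(round(macro_f1(gold, pred), 4))
```

Because every label contributes equally to the mean, a rare label with a low score pulls the macro average down as much as a frequent one would.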
You can find more information here:
- 📖 Paper: Cross-lingual Argument Mining in the Medical Domain
- 💻 Code: https://github.com/ragerri/abstrct-projections
You can load the model as follows:
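from transformers import AutoTokenizer, AutoModelForTokenClassification
# The model tags individual tokens (pipeline_tag: token-classification), so load the token-classification head.
tokenizer = AutoTokenizer.from_pretrained('HiTZ/mbert-argument-mining-es')
model = AutoModelForTokenClassification.from_pretrained('HiTZ/mbert-argument-mining-es')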
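Since the card's pipeline tag is token-classification, inference can also be sketched with the `transformers` pipeline API. The helper names `group_by_label` and `tag_arguments` are hypothetical, and the label strings the model actually emits are an assumption until checked against its configuration.

```python
from typing import Dict, List


def group_by_label(predictions: List[dict]) -> Dict[str, List[str]]:
    """Group aggregated pipeline predictions by their predicted label."""
    grouped: Dict[str, List[str]] = {}
    for pred in predictions:
        grouped.setdefault(pred["entity_group"], []).append(pred["word"])
    return grouped


def tag_arguments(
    text: str, model_id: str = "HiTZ/mbert-argument-mining-es"
) -> Dict[str, List[str]]:
    """Run the token-classification pipeline and group the spans it finds."""
    # Imported lazily so group_by_label works without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        # Merge word pieces and adjacent tokens that share a label into spans.
        aggregation_strategy="simple",
    )
    return group_by_label(tagger(text))


# Usage (downloads the model on first call):
# print(tag_arguments("The dysuria resolved faster in patients implanted with 103Pd."))
```

With `aggregation_strategy="simple"`, each prediction is a dictionary with `entity_group`, `score`, `word`, `start` and `end` keys, which is why the grouping helper reads `entity_group` and `word`.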
Citation
@misc{yeginbergen2024crosslingual,
title={Cross-lingual Argument Mining in the Medical Domain},
author={Anar Yeginbergen and Rodrigo Agerri},
year={2024},
eprint={2301.10527},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Contact: Anar Yeginbergen and Rodrigo Agerri, HiTZ Center - Ixa, University of the Basque Country UPV/EHU