---
datasets:
- HiTZ/AbstRCT-ES
language:
- es
- en
pipeline_tag: token-classification
widget:
- text: >-
    In the comparison of responders versus patients with both SD (6m) and PD,
    responders indicated better physical well-being (P=.004) and mood (P=.02)
    at month 3.
  example_title: English
- text: >-
    En la comparación de los que respondieron frente a los pacientes tanto con
    SD (6m) como con EP, los que respondieron indicaron un mejor bienestar
    físico (P=.004) y estado de ánimo (P=.02) en el mes 3
  example_title: Spanish
---
Cross-lingual Argument Mining in the Medical Domain
This model is a fine-tuned version of mBERT for the argument mining task using AbstRCT data in English and Spanish.
The dataset consists of abstracts of 5 disease types for argument component detection and argument relation classification:
- neoplasm: 350 train, 100 dev and 50 test abstracts
- glaucoma_test: 100 abstracts
- mixed_test: 100 abstracts (20 on glaucoma, 20 on neoplasm, 20 on diabetes, 20 on hypertension, 20 on hepatitis)
The results (F1 macro averaged at token level) achieved for each test set:
Test | F1-macro | F1-Claim | F1-Premise |
---|---|---|---|
Neoplasm | 82.36 | 74.89 | 89.07 |
Glaucoma | 80.52 | 75.22 | 84.86 |
Mixed | 81.69 | 75.06 | 88.57 |
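
To make the metric concrete, here is a minimal sketch of a token-level macro-averaged F1 in plain Python. It is an illustration only, not the evaluation script behind the table: the label names (`Claim`, `Premise`, `O`), the toy sequences, and the helper `token_macro_f1` are assumptions made for this example, and the card does not state which labels enter the reported average.

```python
from collections import Counter


def token_macro_f1(gold, pred):
    """Return (macro_f1, per_label_f1) for two aligned label sequences.

    Scores are fractions in [0, 1]; multiply by 100 to compare with the table.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # token labelled correctly
        else:
            fp[p] += 1      # predicted label was wrong for this token
            fn[g] += 1      # gold label was missed for this token

    per_label = {}
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_label[label] = 2 * tp[label] / denom if denom else 0.0

    # Macro average: every label counts equally, regardless of frequency.
    macro = sum(per_label.values()) / len(per_label) if per_label else 0.0
    return macro, per_label


# Toy example with hypothetical labels (one label per token).
gold = ["Claim", "Claim", "Premise", "Premise", "O", "O"]
pred = ["Claim", "Premise", "Premise", "Premise", "O", "Claim"]
macro, per_label = token_macro_f1(gold, pred)
print(round(100 * macro, 2))  # 65.56
```

Because the macro average weights every label equally, a rare label with a low score pulls the overall number down as much as a frequent one would.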
More information is available here:
- 📖 Paper: Cross-lingual Argument Mining in the Medical Domain
- Code: https://github.com/ragerri/abstrct-projections
You can load the model as follows:
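# The card's pipeline_tag is token-classification, so load a token-classification head.
from transformers import AutoModelForTokenClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('HiTZ/mbert-argument-mining-es')
model = AutoModelForTokenClassification.from_pretrained('HiTZ/mbert-argument-mining-es')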
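
Since the card declares `pipeline_tag: token-classification`, predictions can presumably also be obtained through the `transformers` pipeline API. The sketch below is one possible way to do that, not a documented recipe for this model: the helper names `tag_text` and `spans_by_label` are hypothetical, and the label names that come back depend on the model's configuration, which this card does not list.

```python
from collections import defaultdict

MODEL_ID = "HiTZ/mbert-argument-mining-es"


def tag_text(text, model_id=MODEL_ID):
    """Tag a text with argument components (downloads the model on first use)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into labelled spans
    )
    return tagger(text)


def spans_by_label(entities):
    """Group aggregated pipeline output (list of dicts) by predicted label."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Usage (requires network access and the transformers package):
# entities = tag_text("Responders indicated better physical well-being (P=.004).")
# print(spans_by_label(entities))
```

With `aggregation_strategy="simple"`, each returned dictionary carries `entity_group`, `score`, `word`, `start` and `end`, so the character offsets can be used to highlight spans in the original abstract.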
Citation
@misc{yeginbergen2024crosslingual,
title={Cross-lingual Argument Mining in the Medical Domain},
author={Anar Yeginbergen and Rodrigo Agerri},
year={2024},
eprint={2301.10527},
archivePrefix={arXiv},
primaryClass={cs.CL}
}