---
datasets:
  - HiTZ/AbstRCT-ES
language:
  - es
  - en
pipeline_tag: token-classification
widget:
  - text: >-
      In the comparison of responders versus patients with both SD (6m) and PD,
      responders indicated better physical well-being (P=.004) and mood (P=.02)
      at month 3.
  - text: >-
      En la comparación de los que respondieron frente a los pacientes tanto con
      SD (6m) como con EP, los que respondieron indicaron un mejor bienestar
      físico (P=.004) y estado de ánimo (P=.02) en el mes 3
---

# Cross-lingual Argument Mining in the Medical Domain

This model is a fine-tuned version of mBERT for the argument mining task using AbstRCT data in English and Spanish.
The dataset consists of abstracts covering five disease types, annotated for argument component detection and argument relation classification:

- neoplasm: 350 train, 100 dev and 50 test abstracts
- glaucoma_test: 100 abstracts
- mixed_test: 100 abstracts (20 each on glaucoma, neoplasm, diabetes, hypertension and hepatitis)

The results (macro-averaged F1 at token level) achieved on each test set are:

| Test set | F1-macro | F1-Claim | F1-Premise |
|----------|----------|----------|------------|
| Neoplasm | 82.36    | 74.89    | 89.07      |
| Glaucoma | 80.52    | 75.22    | 84.86      |
| Mixed    | 81.69    | 75.06    | 88.57      |
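
As a rough illustration of this metric (not the exact evaluation script used for the paper, and the label set shown is an assumption), token-level macro F1 can be computed by flattening the gold and predicted per-token labels and taking the unweighted mean of the per-class F1 scores:

```python
from sklearn.metrics import f1_score

# Hypothetical per-token gold and predicted argument-component labels.
gold = ["O", "Claim", "Claim", "O", "Premise", "Premise", "Premise"]
pred = ["O", "Claim", "O",     "O", "Premise", "Premise", "Claim"]

# Macro-average: unweighted mean of the per-label F1 scores.
print(f1_score(gold, pred, average="macro"))
# Per-label F1 (order follows the sorted label names).
print(f1_score(gold, pred, average=None))
```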

You can find more information in the paper cited below.

You can load the model as follows:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('HiTZ/mbert-argument-mining-es')
model = AutoModelForTokenClassification.from_pretrained('HiTZ/mbert-argument-mining-es')
```
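
As a minimal inference sketch (assuming the checkpoint ships its own tokenizer and token-level component labels; the aggregation strategy and printed fields are illustrative, not prescribed by the model card), you can also run the model through the token-classification pipeline:

```python
from transformers import pipeline

# Token-classification pipeline over the fine-tuned checkpoint; "simple"
# aggregation merges sub-word pieces into word-level spans.
arg_miner = pipeline(
    "token-classification",
    model="HiTZ/mbert-argument-mining-es",
    aggregation_strategy="simple",
)

text = (
    "In the comparison of responders versus patients with both SD (6m) and PD, "
    "responders indicated better physical well-being (P=.004) and mood (P=.02) at month 3."
)

for span in arg_miner(text):
    # Each span carries the predicted component label, a confidence score and character offsets.
    print(span["entity_group"], round(span["score"], 3), text[span["start"]:span["end"]])
```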

## Citation

```bibtex
@misc{yeginbergen2024crosslingual,
      title={Cross-lingual Argument Mining in the Medical Domain},
      author={Anar Yeginbergen and Rodrigo Agerri},
      year={2024},
      eprint={2301.10527},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

Contact: Anar Yeginbergen and Rodrigo Agerri, HiTZ Center - Ixa, University of the Basque Country UPV/EHU.