|
# Argument Quality Model |
|
|
|
We reproduce the argument quality prediction model from [Gretz et al. (2020)](https://ojs.aaai.org/index.php/AAAI/article/view/6285), which has been accessible for some years as part of IBM's Project Debater ([Bar-Haim et al. 2021](https://aclanthology.org/2021.emnlp-demo.31.pdf)). |
|
|
|
We provide two models, each trained for 2 epochs. As described in Gretz et al.'s paper, the two versions differ in the scoring function used to aggregate the manual annotations: weighted average (WA) or MACE-P ([Hovy et al. 2013](https://aclanthology.org/N13-1132/)). |
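
To illustrate the idea behind a weighted-average score, here is a minimal sketch. It assumes binary annotator labels (1 = acceptable, 0 = not) and one reliability weight per annotator. The function name and the example values are hypothetical. See Gretz et al. (2020) for the exact definitions of WA and of the reliability weights.

```python
def weighted_average_score(labels, weights):
    """Aggregate binary annotator labels into one score in [0, 1].

    labels  -- one 0/1 judgment per annotator (assumed format)
    weights -- one non-negative reliability weight per annotator (assumed)
    """
    if not labels or len(labels) != len(weights):
        raise ValueError("labels and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Each label counts in proportion to its annotator's weight.
    return sum(l * w for l, w in zip(labels, weights)) / total_weight


# Hypothetical example: three annotators, the third is weighted twice as much.
score = weighted_average_score([1, 0, 1], [1.0, 1.0, 2.0])
print(score)
```

With equal weights, this reduces to the plain fraction of positive labels.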
|
|
|
The code for retraining the models is available in [this repository](https://github.com/webis-de/argmining25-reproducing-ibm-arg-quality-api). |
|
|
|
The WA model's predictions correlate more strongly with the original model's predictions than those of the MACE-P model, so we recommend using the WA model. |
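
A comparison of this kind can be sketched as follows. The snippet assumes Pearson correlation as the measure and uses made-up scores purely for illustration. The measure and the results actually used are described in the paper cited below.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up scores for five arguments (illustrative only, not real results).
original_scores = [0.91, 0.42, 0.67, 0.15, 0.78]
wa_scores = [0.88, 0.45, 0.70, 0.20, 0.74]
mace_p_scores = [0.80, 0.60, 0.55, 0.30, 0.90]

print("WA vs. original:    ", round(pearson(original_scores, wa_scores), 3))
print("MACE-P vs. original:", round(pearson(original_scores, mace_p_scores), 3))
```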
|
|
|
If you use the models or the code in your research, please cite the following paper describing the retraining and evaluation process: |
|
> Ines Zelch, Matthias Hagen, Benno Stein, and Johannes Kiesel. [Reproducing the Argument Quality Prediction of Project Debater](https://webis.de/publications.html#zelch_2025b). In _Proceedings of the 12th Workshop on Argument Mining_, July 2025. |