Update README.md
README.md
CHANGED
@@ -128,6 +128,8 @@ base_model:
 News topic classification model based on [`xlm-roberta-large`](https://huggingface.co/FacebookAI/xlm-roberta-large)
 and fine-tuned on a [news corpus in 4 languages](http://hdl.handle.net/11356/1991) (Croatian, Slovenian, Catalan and Greek), annotated with the [top-level IPTC
 Media Topic NewsCodes labels](https://www.iptc.org/std/NewsCodes/treeview/mediatopic/mediatopic-en-GB.html).
+The development and evaluation of the model is described in the paper
+[LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification](https://doi.org/10.1109/ACCESS.2025.3544814) (Kuzman and Ljubešić, 2025).
 
 The model can be used for classification into topic labels from the
 [IPTC NewsCodes schema](https://iptc.org/std/NewsCodes/guidelines/#_what_are_the_iptc_newscodes) and can be
@@ -316,15 +318,19 @@ model_args ={
 
 ## Citation
 
-
+If you use the model, please cite [this paper](https://doi.org/10.1109/ACCESS.2025.3544814):
 
 ```
-@
-
-
-
-
-}
+@ARTICLE{10900365,
+author={Kuzman, Taja and Ljubešić, Nikola},
+journal={IEEE Access},
+title={LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification},
+year={2025},
+volume={},
+number={},
+pages={1-1},
+keywords={Data models;Annotations;Media;Manuals;Multilingual;Computational modeling;Training;Training data;Transformers;Text categorization;Multilingual text classification;IPTC;large language models;LLMs;news topic;topic classification;training data preparation;data annotation},
+doi={10.1109/ACCESS.2025.3544814}}
 ```
 
 ## Funding