metadata
license: mit
datasets:
- ljvmiranda921/tlunified-ner
language:
- tl
metrics:
- f1
library_name: spacy
pipeline_tag: token-classification
model-index:
- name: tl_gliner_small
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: tlunified-ner
      name: TLUnified-NER
      split: test
      revision: 3f7dab9d232414ec6204f8d6934b9a35f90a254f
    metrics:
    - type: f1
      value: 0.8483
      name: F1
GLiNER (small) model finetuned on Tagalog data
This model was finetuned using the GLiNER v2.5 suite of models. The training pipeline is available on GitHub, where you can also replicate it.
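A minimal sketch of how a finetuned GLiNER checkpoint like this one might be queried with the `gliner` Python package. The repository id `ljvmiranda921/tl_gliner_small` is an assumption pieced together from the dataset namespace and the model name in the metadata, and the label strings are placeholders: check the training pipeline for the exact labels the model was finetuned with. `load_model` and `extract_entities` are hypothetical helper names, not part of any library.

```python
from typing import Any, List, Optional, Sequence, Tuple

# Assumption: Hub repository id inferred from the model name in this card.
MODEL_ID = "ljvmiranda921/tl_gliner_small"

# Assumption: placeholder label strings; the finetuned model may expect others.
DEFAULT_LABELS = ("person", "organization", "location")


def load_model(repo_id: str = MODEL_ID) -> Any:
    """Load a GLiNER checkpoint from the Hub (requires `pip install gliner`)."""
    # Imported lazily so that extract_entities() has no hard dependency.
    from gliner import GLiNER

    return GLiNER.from_pretrained(repo_id)


def extract_entities(
    model: Any,
    text: str,
    labels: Optional[Sequence[str]] = None,
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return (span text, label) pairs predicted for `text`."""
    chosen = list(labels) if labels is not None else list(DEFAULT_LABELS)
    # predict_entities returns dicts with "text", "label", "start", "end", "score".
    predictions = model.predict_entities(text, chosen, threshold=threshold)
    return [(p["text"], p["label"]) for p in predictions]
```

With `gliner` installed, `extract_entities(load_model(), "Si Jose Rizal ay ipinanganak sa Calamba, Laguna.")` would return the predicted spans. No output is shown here because it depends on the checkpoint and the chosen threshold.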
Citation
Please cite the following papers when using these models:
@misc{zaratiana2023gliner,
title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
year={2023},
eprint={2311.08526},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@inproceedings{miranda-2023-calamancy,
title = "calaman{C}y: A {T}agalog Natural Language Processing Toolkit",
author = "Miranda, Lester James",
booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
month = dec,
year = "2023",
address = "Singapore, Singapore",
publisher = "Empirical Methods in Natural Language Processing",
url = "https://aclanthology.org/2023.nlposs-1.1",
pages = "1--7",
}
If you're using the NER dataset:
@inproceedings{miranda-2023-developing,
title = "Developing a Named Entity Recognition Dataset for {T}agalog",
author = "Miranda, Lester James",
booktitle = "Proceedings of the First Workshop in South East Asian Language Processing",
month = nov,
year = "2023",
address = "Nusa Dua, Bali, Indonesia",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.sealp-1.2",
doi = "10.18653/v1/2023.sealp-1.2",
pages = "13--20",
}