---
license: mit
datasets:
- ljvmiranda921/tlunified-ner
language:
- tl
metrics:
- f1
library_name: spacy
pipeline_tag: token-classification
model-index:
- name: tl_gliner_small
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: tlunified-ner
      name: TLUnified-NER
      split: test
      revision: 3f7dab9d232414ec6204f8d6934b9a35f90a254f
    metrics:
    - type: f1
      value: 0.8483
      name: F1
---

# GLiNER (small) model finetuned on Tagalog data

This model was finetuned using the [GLiNER v2.5 suite](https://github.com/urchade/GLiNER) of models.
You can find and replicate the training pipeline on [GitHub](https://github.com/ljvmiranda921/calamanCy/tree/master/models/v0.1.0-gliner).

## Citation

Please cite the following papers when using these models:

```
@misc{zaratiana2023gliner,
    title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
    author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
    year={2023},
    eprint={2311.08526},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

```
@inproceedings{miranda-2023-calamancy,
    title = "calaman{C}y: A {T}agalog Natural Language Processing Toolkit",
    author = "Miranda, Lester James",
    booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
    month = dec,
    year = "2023",
    address = "Singapore, Singapore",
    publisher = "Empirical Methods in Natural Language Processing",
    url = "https://aclanthology.org/2023.nlposs-1.1",
    pages = "1--7",
}
```

If you're using the NER dataset:

```
@inproceedings{miranda-2023-developing,
    title = "Developing a Named Entity Recognition Dataset for {T}agalog",
    author = "Miranda, Lester James",
    booktitle = "Proceedings of the First Workshop in South East Asian Language Processing",
    month = nov,
    year = "2023",
    address = "Nusa Dua, Bali, Indonesia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.sealp-1.2",
    doi = "10.18653/v1/2023.sealp-1.2",
    pages = "13--20",
}
```