marianaossilva/LitBERT-CRF


marianaossilva committed on Jul 12, 2024

Commit 30b438d · verified · 1 Parent(s): a3d069a

Update README.md

Files changed (1)

README.md +83 -3


@@ -1,3 +1,83 @@
----
-license: mit
----

+---
+license: mit
+language:
+- pt
+metrics:
+- name: Precision
+  type: Precision
+  value: 0.783
+- name: Recall
+  type: Recall
+  value: 0.774
+- name: F1-Score
+  type: F1-Score
+  value: 0.779
+library_name: transformers
+pipeline_tag: token-classification
+tags:
+- BERT
+- CRF
+- NER
+- Portuguese
+- Literature
+---
+# LitBERT-CRF
+<!-- Provide a quick summary of what the model is/does. -->
+LitBERT-CRF is a fine-tuned BERT-CRF model designed for Named Entity Recognition (NER) in Portuguese-language literature.
+## Model Details
+### Model Description
+LitBERT-CRF uses a BERT-CRF architecture that was pre-trained on the brWaC corpus and fine-tuned on the HAREM dataset for Portuguese NER.
+It also incorporates domain-specific literary data through Masked Language Modeling (MLM), which makes it well suited to identifying named entities in literary texts.
+- **Model type:** BERT-CRF for NER
+- **Language:** Portuguese
+- **Fine-tuned from model:** BERT-CRF pre-trained on brWaC and fine-tuned on HAREM
+## Evaluation
+### Testing Data, Factors & Metrics
+#### Testing Data
+PPORTAL_ner dataset
+#### Metrics
+- **Precision**: 0.783
+- **Recall**: 0.774
+- **F1-score**: 0.779
+## Citation
+**BibTeX:**
+```
+@inproceedings{silva-moro-2024-evaluating,
+    title = "Evaluating Pre-training Strategies for Literary Named Entity Recognition in {P}ortuguese",
+    author = "Silva, Mariana O.  and
+      Moro, Mirella M.",
+    editor = "Gamallo, Pablo  and
+      Claro, Daniela  and
+      Teixeira, Ant{\'o}nio  and
+      Real, Livy  and
+      Garcia, Marcos  and
+      Oliveira, Hugo Gon{\c{c}}alo  and
+      Amaro, Raquel",
+    booktitle = "Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1",
+    month = mar,
+    year = "2024",
+    address = "Santiago de Compostela, Galicia/Spain",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.propor-1.39",
+    pages = "384--393",
+}
+```
+**APA:**
+Mariana O. Silva and Mirella M. Moro. 2024. Evaluating Pre-training Strategies for Literary Named Entity Recognition in Portuguese. In Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1, pages 384–393, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics.
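
Review note: the three scores added under Evaluation can be sanity-checked against each other. The sketch below assumes the reported F1 is the harmonic mean of the reported precision and recall; the README does not state the averaging scheme (micro, macro, or per-entity), so this is only a consistency check on the rounded values, not a reproduction of the authors' evaluation. The helper name `f1_score` is introduced here for illustration and is not part of the model's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the README's Evaluation section.
precision, recall, reported_f1 = 0.783, 0.774, 0.779

recomputed = f1_score(precision, recall)
print(f"recomputed F1 = {recomputed:.4f}, reported F1 = {reported_f1}")

# Precision and recall are themselves rounded to three decimals, so the
# comparison allows a tolerance of one unit in the third decimal place.
assert abs(recomputed - reported_f1) < 1e-3
```

Under that assumption the recomputed value is about 0.778, which agrees with the reported 0.779 to within the rounding of the inputs. A larger gap would suggest the F1 was averaged differently from the precision and recall shown.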