---
license: mit
language:
- en
library_name: transformers
pipeline_tag: token-classification
inference:
  parameters:
    aggregation_strategy: "first"
widget:
- text: "Prince Raoden went to Elantris."
---

# bert-base-cased-literary-NER

A NER model trained on a literary dataset consisting of the first chapter of 40 novels. The model supports the following NER classes: `PER`, `ORG` and `LOC`.

If you use the model in a Hugging Face pipeline, pass `aggregation_strategy="first"`.

## Dataset

We corrected the dataset of Dekker et al. (2019) and added LOC and ORG annotations.

## Citation

If you use this model in your research, please cite:

```bibtex
@InProceedings{amalvy:hal-03972448,
  title       = {{Data Augmentation for Robust Character Detection in Fantasy Novels}},
  author      = {Amalvy, Arthur and Labatut, Vincent and Dufour, Richard},
  url         = {https://hal.science/hal-03972448},
  booktitle   = {{Workshop on Computational Methods in the Humanities 2022}},
  year        = {2022},
  hal_id      = {hal-03972448},
  hal_version = {v1},
}
```

The dataset was originally published and annotated by Dekker et al. (2019):

```bibtex
@Article{dekker-2019-evaluation_ner_social_networks_novels,
  author  = {Dekker, N. and Kuhn, T. and van Erp, M.},
  journal = {PeerJ Computer Science},
  title   = {Evaluating named entity recognition tools for extracting social networks from novels},
  year    = {2019},
  pages   = {e189},
  volume  = {5},
  doi     = {10.7717/peerj-cs.189},
}
```
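
## Usage

The pipeline advice above (pass `aggregation_strategy="first"`) can be sketched as follows. This is a minimal, non-authoritative example: `MODEL_ID` is an assumption, since this card does not state the full Hub repository ID, and the helper names `build_ner_pipeline`, `group_by_label` and `main` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict

# Assumption: replace with the model's full Hub repository ID (namespace included);
# the card only gives the model name.
MODEL_ID = "bert-base-cased-literary-NER"


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Create a token-classification pipeline with aggregation_strategy="first"."""
    # Imported lazily so group_by_label can be used without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="first",
    )


def group_by_label(entities):
    """Group aggregated pipeline output by entity label (PER, ORG or LOC).

    With an aggregation strategy set, each item returned by the pipeline is a
    dict that includes the keys "entity_group" and "word".
    """
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


def main():
    # Downloads the model on first use, so MODEL_ID must resolve on the Hub.
    ner = build_ner_pipeline()
    entities = ner("Prince Raoden went to Elantris.")
    print(group_by_label(entities))
```

Calling `main()` runs the model on the widget sentence from this card. `group_by_label` only reshapes the pipeline output, so it can be reused on any list of aggregated entities.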