---
annotations_creators:
- machine-generated
language_creators:
- machine-generated
widget:
- text: George Washington went to Washington.
- text: What is the seventh tallest mountain in North America?
tags:
- named-entity-recognition
- sequence-tagger-model
datasets:
- Babelscape/cner
language:
- en
pretty_name: cner-model
source_datasets:
- original
task_categories:
- structure-prediction
task_ids:
- named-entity-recognition
---

# CNER: Concept and Named Entity Recognition

This is the model card for the NAACL 2024 paper [CNER: Concept and Named Entity Recognition](https://aclanthology.org/2024.naacl-long.461/).

We fine-tuned a language model (DeBERTa-v3-base) for 1 epoch on our [CNER dataset](https://huggingface.co/datasets/Babelscape/cner) using the default Hugging Face hyperparameters, optimizer, and architecture. The results of this model may therefore differ from those presented in the paper.

The resulting CNER model jointly identifies and classifies concepts and named entities with fine-grained tags.

**If you use the model, please reference this work in your paper**:

```bibtex
@inproceedings{martinelli-etal-2024-cner,
    title = "{CNER}: Concept and Named Entity Recognition",
    author = "Martinelli, Giuliano  and
      Molfese, Francesco  and
      Tedeschi, Simone  and
      Fern{\'a}ndez-Castro, Alberte  and
      Navigli, Roberto",
    editor = "Duh, Kevin  and
      Gomez, Helena  and
      Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.461",
    pages = "8329--8344",
}
```

The original repository for the paper can be found at [https://github.com/Babelscape/cner](https://github.com/Babelscape/cner).

## How to use

You can use this model with the Transformers NER *pipeline*.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("Babelscape/cner-model")
model = AutoModelForTokenClassification.from_pretrained("Babelscape/cner-model")

# aggregation_strategy="simple" merges sub-word tokens into whole spans;
# it replaces the deprecated grouped_entities=True argument.
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
example = "What is the seventh tallest mountain in North America?"

# Each result is a dict describing one predicted span.
ner_results = nlp(example)
print(ner_results)
```
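
When spans are aggregated, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start`, and `end`. The sketch below shows one possible way to filter that list by confidence and group the spans by tag. The helper `group_spans_by_tag`, its `min_score` parameter, and the sample data are illustrative assumptions and not part of the CNER release; in particular, `TAG_A` and `TAG_B` are placeholder labels and the scores are made up, not real model output.

```python
from collections import defaultdict


def group_spans_by_tag(ner_results, min_score=0.0):
    """Group aggregated pipeline spans by predicted tag (hypothetical helper)."""
    grouped = defaultdict(list)
    for span in ner_results:
        # Keep only spans whose confidence reaches the threshold.
        if span["score"] >= min_score:
            grouped[span["entity_group"]].append(
                (span["word"], span["start"], span["end"])
            )
    return dict(grouped)


# Hand-written sample in the shape the pipeline returns.
# Tags and scores are placeholders, not real model predictions.
sample_results = [
    {"entity_group": "TAG_A", "score": 0.91, "word": "mountain", "start": 28, "end": 36},
    {"entity_group": "TAG_B", "score": 0.97, "word": "North America", "start": 40, "end": 53},
    {"entity_group": "TAG_A", "score": 0.42, "word": "seventh", "start": 12, "end": 19},
]

print(group_spans_by_tag(sample_results, min_score=0.5))
```

With real predictions, pass the list returned by the pipeline in place of `sample_results`. The `start` and `end` values are character offsets into the input string, so `example[start:end]` recovers the original surface form of each span.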

## Classes

<img src="https://cdn-uploads.huggingface.co/production/uploads/65e9ccd84ce78d665a50f78b/2K3NZ79go3Zjf3qFeHO0O.png" alt="drawing" />

## Licensing Information

The contents of this repository are restricted to non-commercial research purposes under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright of the dataset contents and models belongs to the original copyright holders.

`microsoft/deberta-v3-base` is released under the [MIT license](https://choosealicense.com/licenses/mit/).