---
annotations_creators:
- machine-generated
language_creators:
- machine-generated
widget:
- text: George Washington went to Washington.
- text: What is the seventh tallest mountain in North America?
tags:
- named-entity-recognition
- sequence-tagger-model
datasets:
- Babelscape/cner
language:
- en
pretty_name: cner-model
source_datasets:
- original
task_categories:
- structure-prediction
task_ids:
- named-entity-recognition
---

# CNER: Concept and Named Entity Recognition
This is the model card for the NAACL 2024 paper [CNER: Concept and Named Entity Recognition](https://aclanthology.org/2024.naacl-long.461/). 
We fine-tuned a language model (DeBERTa-v3-base) for 1 epoch on our [CNER dataset](https://huggingface.co/datasets/Babelscape/cner) using the default hyperparameters, optimizer, and architecture of Hugging Face. The results of this model may therefore differ from those presented in the paper.
The resulting CNER model jointly identifies and classifies concepts and named entities with fine-grained tags.

**If you use the model, please reference this work in your paper**:

```bibtex
@inproceedings{martinelli-etal-2024-cner,
    title = "{CNER}: Concept and Named Entity Recognition",
    author = "Martinelli, Giuliano  and
      Molfese, Francesco  and
      Tedeschi, Simone  and
      Fern{\'a}ndez-Castro, Alberte  and
      Navigli, Roberto",
    editor = "Duh, Kevin  and
      Gomez, Helena  and
      Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.461",
    pages = "8329--8344",
}
```
    
The original repository for the paper can be found at [https://github.com/Babelscape/cner](https://github.com/Babelscape/cner).

## How to use

You can use this model with the Transformers NER *pipeline*.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("Babelscape/cner-model")
model = AutoModelForTokenClassification.from_pretrained("Babelscape/cner-model")

# aggregation_strategy="simple" replaces the deprecated grouped_entities=True
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
example = "What is the seventh tallest mountain in North America?"

ner_results = nlp(example)
print(ner_results)
```
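
When the pipeline groups tokens into spans, each result is a dictionary with keys such as `entity_group`, `score`, `word`, `start`, and `end`. The snippet below is a minimal sketch of one way to post-process that list by collecting the predicted spans per label. The helper `group_by_label`, its `min_score` parameter, and the sample data (including the placeholder labels `LABEL_A` and `LABEL_B`) are illustrative assumptions, not part of the CNER release or real model predictions.

```python
from collections import defaultdict


def group_by_label(ner_results, min_score=0.0):
    """Collect predicted span texts per label, dropping spans scored below min_score."""
    grouped = defaultdict(list)
    for span in ner_results:
        if span["score"] >= min_score:
            grouped[span["entity_group"]].append(span["word"])
    return dict(grouped)


# Hypothetical, hand-written results for illustration only (not real model output).
# The label names are placeholders; the actual tag set is shown under "Classes".
sample_results = [
    {"entity_group": "LABEL_A", "score": 0.98, "word": "mountain", "start": 28, "end": 36},
    {"entity_group": "LABEL_B", "score": 0.95, "word": "North America", "start": 40, "end": 53},
    {"entity_group": "LABEL_A", "score": 0.40, "word": "seventh", "start": 12, "end": 19},
]

print(group_by_label(sample_results, min_score=0.5))
```

In practice you would pass `ner_results` from the pipeline call instead of `sample_results`, and pick a `min_score` threshold suited to your application.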

## Classes
<img src="https://cdn-uploads.huggingface.co/production/uploads/65e9ccd84ce78d665a50f78b/2K3NZ79go3Zjf3qFeHO0O.png" alt="drawing" />

## Licensing Information
The contents of this repository are restricted to non-commercial research purposes only, under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright of the dataset contents and models belongs to the original copyright holders.

`microsoft/deberta-v3-base` is released under the [MIT license](https://choosealicense.com/licenses/mit/).