Update README.md
README.md
CHANGED
@@ -14,12 +14,17 @@ tags:
 - zero-shot
 ---
 
-
+NuNER Zero is a zero-shot Entity Recognition Model. For few-shot learning, we recommend fine-tuning the original [NuNER](https://huggingface.co/collections/numind/nuner-token-classification-and-ner-backbones-65e1f6e14639e2a465af823b).
 
-
-
-
-
+NuNER Zero uses a variation of the [GLiNER](https://huggingface.co/papers/2311.08526) architecture and takes the same input (a concatenation of entity types and text).
+
+Unlike GLiNER, NuNER Zero is a token classifier: it returns the inferred probability that each token belongs to each entity type. You need to merge consecutive tokens that have a probability >50% to obtain the entities.
+
+NuNER Zero was trained on the NuNER v2.0 dataset, which combines subsets of Pile and C4 annotated via LLMs using [NuNER's procedure](https://huggingface.co/papers/2402.15343).
+
+Key benefits of using NuNER Zero compared to GLiNER:
+* Ability to **detect arbitrarily long entities**, as NuNER Zero is a token classifier and does not return spans.
+* Surpasses GLiNER-large-v2.1 by **+3.1% on average** on its own benchmark.
 
 <p align="center">
 <img src="zero_shot_performance_unzero_token.png">
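The added README text says that entities are obtained by merging consecutive tokens whose probability exceeds 50%. The following is a minimal sketch of that merging step in plain Python, not the model's actual post-processing code. The function name `merge_token_predictions`, the input layout (one row of probabilities per token, one column per entity type), and the choice to merge runs separately for each entity type are all assumptions made for illustration, and the probabilities in the example are made up.

```python
from typing import List, Sequence, Tuple


def merge_token_predictions(
    tokens: Sequence[str],
    probs: Sequence[Sequence[float]],
    entity_types: Sequence[str],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Merge runs of consecutive tokens scoring above `threshold`.

    Hypothetical helper: `probs[i][j]` is assumed to be the probability
    that token i belongs to entity type j. Runs are built independently
    for each entity type (an assumption, not confirmed by the README).
    """
    entities: List[Tuple[str, str]] = []
    for type_idx, entity_type in enumerate(entity_types):
        run: List[str] = []  # tokens of the entity currently being built
        for token, token_probs in zip(tokens, probs):
            if token_probs[type_idx] > threshold:
                run.append(token)
            elif run:
                # The run was interrupted: emit it and start over.
                entities.append((" ".join(run), entity_type))
                run = []
        if run:
            # Emit a run that extends to the last token.
            entities.append((" ".join(run), entity_type))
    return entities


# Illustrative inputs only; these probabilities are not model output.
tokens = ["Barack", "Obama", "visited", "New", "York", "City"]
entity_types = ["person", "location"]
probs = [
    [0.97, 0.01],
    [0.95, 0.02],
    [0.02, 0.03],
    [0.01, 0.91],
    [0.02, 0.88],
    [0.01, 0.76],
]

entities = merge_token_predictions(tokens, probs, entity_types)
print(entities)
```

Because a run can grow for as long as tokens stay above the threshold, this kind of merging places no fixed limit on entity length, which matches the first listed benefit.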