etiennebcp committed on
Commit 94adae0 · verified · 1 Parent(s): efee1b3

Update README.md

Files changed (1)
  1. README.md +10 -5
README.md CHANGED
@@ -14,12 +14,17 @@ tags:
  - zero-shot
  ---
 
- NuNerZero - is the family of Zero-Shot Entity Recognition models inspired by [GLiNER](https://huggingface.co/papers/2311.08526) and built with insights we gathered throughout our work on [NuNER](https://huggingface.co/collections/numind/nuner-token-classification-and-ner-backbones-65e1f6e14639e2a465af823b).
 
- The key differences between NuNerZero Long in comparison to GLiNER are:
- * The possibility to **detect entities that are longer than 12 tokens**, as NuZero Token operates on the token level rather than on the span level.
- * a more powerful version of GLiNER-large-v2.1, surpassing it by **+3.1% on average**
- * NuNerZero family is trained on the **diverse dataset tailored for real-life use cases** - NuNER v2.0 dataset
+ NuNER Zero is a zero-shot Entity Recognition model. For few-shot learning, we recommend fine-tuning the original [NuNER](https://huggingface.co/collections/numind/nuner-token-classification-and-ner-backbones-65e1f6e14639e2a465af823b).
 
+ NuNER Zero uses a variation of the [GLiNER](https://huggingface.co/papers/2311.08526) architecture and takes the same input (a concatenation of entity types and text).
+
+ Unlike GLiNER, NuNER Zero is a token classifier: it returns the inferred probability that each token belongs to each entity type. To obtain the entities, merge consecutive tokens whose probability exceeds 50%.
+
+ NuNER Zero was trained on the NuNER v2.0 dataset, which combines subsets of Pile and C4 annotated via LLMs using [NuNER's procedure](https://huggingface.co/papers/2402.15343).
+
+ Key benefits of using NuNER Zero compared to GLiNER:
+ * Ability to **detect arbitrarily long entities**, as NuNER Zero is a token classifier and does not return spans.
+ * Surpasses GLiNER-large-v2.1 by **+3.1% on average** on its own benchmark.
 
  <p align="center">
  <img src="zero_shot_performance_unzero_token.png">
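
Note on the merging step described in the added README text: the rule "merge consecutive tokens that have a probability >50%" can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the model's or any library's actual API: the function name `merge_tokens`, the input format (one dict of label probabilities per token), and joining tokens with a single space are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical helper: the input format (a per-token dict of label
# probabilities) is an assumption for illustration, not a library API.
def merge_tokens(
    tokens: List[str],
    probs: List[Dict[str, float]],
    threshold: float = 0.5,
) -> List[Tuple[str, str, int, int]]:
    """Merge runs of consecutive tokens whose probability for a label
    is strictly above `threshold`.

    Returns (label, text, start_token, end_token) tuples, where
    end_token is exclusive, sorted by start position.
    """
    entities = []
    labels = sorted({label for p in probs for label in p})
    for label in labels:
        start = None  # index where the current run began, if any
        for i, p in enumerate(probs):
            active = p.get(label, 0.0) > threshold
            if active and start is None:
                start = i  # a new run begins
            elif not active and start is not None:
                # the run ended at the previous token
                entities.append((label, " ".join(tokens[start:i]), start, i))
                start = None
        if start is not None:
            # the run extends to the last token
            entities.append(
                (label, " ".join(tokens[start:]), start, len(tokens))
            )
    return sorted(entities, key=lambda e: e[2])


tokens = ["Marie", "Curie", "worked", "in", "Paris"]
probs = [
    {"person": 0.9, "location": 0.1},
    {"person": 0.8, "location": 0.05},
    {"person": 0.1, "location": 0.1},
    {"person": 0.0, "location": 0.2},
    {"person": 0.1, "location": 0.97},
]
entities = merge_tokens(tokens, probs)
```

Because runs are built token by token, nothing in this sketch caps the entity length, which is the property the README highlights for a token classifier.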