Fusion NER Models

The table below lists our NER models:

Model name	Model description	Model path	Datasets	Link to dataset	Base model
Basic	Basic training on IAHALT	FusioNER/Basic_IAHALT	IAHALT	FusioNER/Basic	HeRo
Vitaly	Vitaly training on IAHALT (with BI-BI problem[3])	FusioNER/Vitaly_NER	IAHALT	FusioNER/Vitaly	HeRo
Name-Sentences	Training on IAHALT + Name-Sentences[1]	FusioNER/Name-Sentences	IAHALT	FusioNER/Name_Sentences	HeRo
Entity-Injection	Training on IAHALT + Entity-Injection[2]	FusioNER/Entity-Injection	IAHALT	FusioNER/Entity_Injection	HeRo
Smart_Injection	Training on IAHALT + Name-Sentences[1] + Entity-Injection[2]	FusioNER/Smart_Injection	IAHALT	FusioNER/Smart_Injection	HeRo
NEMO	Basic training on NEMO dataset	FusioNER/Nemo	NEMO	FusioNER/NEMO	HeRo
IAHALT_and_NEMO	Basic training on IAHALT + NEMO	FusioNER/IAHALT_and_NEMO	IAHALT + NEMO	FusioNER/IAHALT_and_NEMO	HeRo
IAHALT_and_NEMO_PP	Training on IAHALT + NEMO + Name-Sentences[1] + Entity-Injection[2]	FusioNER/IAHALT_and_NEMO_and_PP	IAHALT + NEMO	FusioNER/IAHALT_and_NEMO_PP	HeRo
Animals	Training on IAHALT + Entity-Injection[2] (of animal names as PER entities)	FusioNER/Animals	IAHALT	FusioNER/Animals	HeRo
PRS-Injection	Training on IAHALT + Entity-Injection[2] (of PRS names as PER entities)	FusioNER/PRS-Injection	IAHALT	FusioNER/PRS_locations	HeRo
DICTA_Basic	Training the DICTA model on the basic IAHALT dataset	FusioNER/Dicta_Small_Basic	IAHALT	FusioNER/Smart_Injection	DICTA
DICTA_Small_Smart	Training the DICTA model on the IAHALT + Name-Sentences[1] + Entity-Injection[2] dataset	FusioNER/Dicta_Small_Smart	IAHALT	FusioNER/Smart_Injection	DICTA
DICTA_basic_NER	Training the DICTA-ner model on the basic IAHALT dataset	FusioNER/DICTA_basic	IAHALT	FusioNER/Basic	DICTA-ner
DICTA_smart_NER	Training the DICTA-ner model on the IAHALT + Name-Sentences[1] + Entity-Injection[2] dataset	FusioNER/DICTA_Smart	IAHALT	FusioNER/Smart_Injection	DICTA-ner
DICTA_Large_Smart	Training the DICTA Large model on the IAHALT + Name-Sentences[1] + Entity-Injection[2] dataset	FusioNER/Dicta_Large_Smart	IAHALT	FusioNER/Smart_Injection	DICTA Large

[1] Name-Sentences: Add sentences to the corpus that consist only of the entity we want the network to learn.
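As a rough illustration of Name-Sentences, the sketch below appends entity-only sentences to a small corpus. It assumes sentences are stored as lists of (token, tag) pairs in a BIO scheme; the helper names `name_sentence` and `add_name_sentences` and the English sample data are hypothetical and are not taken from the FusioNER code.

```python
# Hypothetical helpers: a sentence is a list of (token, BIO tag) pairs.
def name_sentence(entity_tokens, label):
    """Build a sentence that consists only of one entity."""
    tags = [f"B-{label}"] + [f"I-{label}"] * (len(entity_tokens) - 1)
    return list(zip(entity_tokens, tags))


def add_name_sentences(corpus, entities):
    """Return a new corpus with one entity-only sentence per (tokens, label)."""
    return corpus + [name_sentence(tokens, label) for tokens, label in entities]


# Made-up sample data, for illustration only.
corpus = [[("Dana", "B-PER"), ("lives", "O"), ("in", "O"), ("Haifa", "B-LOC")]]
augmented = add_name_sentences(corpus, [(["Ron", "Weasley"], "PER")])
print(augmented[-1])
```

The original corpus is left unchanged; the augmented copy simply gains one extra sentence per entity.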

[2] Entity-Injection: Replace a tagged entity in the original corpus with a new entity. With this method, the model can learn new entities (not new labels!) that it did not extract before.
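A minimal sketch of Entity-Injection, under the same assumption of (token, tag) pairs in a BIO scheme: the first tagged span is swapped for a new entity while its label and the surrounding context are kept. The function name `inject_entity` and the sample sentence are hypothetical; the actual injection procedure used for these models may differ.

```python
def inject_entity(sentence, new_tokens):
    """Replace the first tagged entity span with new_tokens, keeping its label."""
    # Locate the first B- tag; return a copy if the sentence has no entity.
    start = next((i for i, (_, tag) in enumerate(sentence) if tag.startswith("B-")), None)
    if start is None:
        return list(sentence)
    label = sentence[start][1][2:]
    # Extend the span over the I- tags that continue the same entity.
    end = start + 1
    while end < len(sentence) and sentence[end][1] == f"I-{label}":
        end += 1
    tags = [f"B-{label}"] + [f"I-{label}"] * (len(new_tokens) - 1)
    return sentence[:start] + list(zip(new_tokens, tags)) + sentence[end:]


# Made-up sample data, for illustration only.
original = [("Harry", "B-PER"), ("Potter", "I-PER"), ("visited", "O"), ("London", "B-LOC")]
injected = inject_entity(original, ["Hedwig"])
print(injected)
```

The label set stays the same; only the surface form of the entity changes, which is what lets the model see new entities in familiar contexts.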

[3] BI-BI Problem: The training corpus is built so that consecutive entities of the same type are labeled as continuations of one another.

For example, the text "הארי פוטר ורון וויזלי" ("Harry Potter and Ron Weasley") would be tagged as a SINGLE entity. This problem prevents the model from extracting entities correctly.
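The effect of the BI-BI problem can be sketched by decoding the same four tokens under two tag sequences. This assumes whitespace tokenization and BIO tags; the `bio_spans` decoder is a hypothetical helper written for this illustration, not part of the released models.

```python
def bio_spans(tokens, tags):
    """Group tokens into (label, text) entities following BIO tags."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity.
            current = [tag[2:], [token]]
            spans.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            # An I- tag extends the currently open entity of the same label.
            current[1].append(token)
        else:
            current = None
    return [(label, " ".join(words)) for label, words in spans]


tokens = ["הארי", "פוטר", "ורון", "וויזלי"]
correct = ["B-PER", "I-PER", "B-PER", "I-PER"]  # two separate PER entities
bi_bi = ["B-PER", "I-PER", "I-PER", "I-PER"]    # second name labeled as a continuation

print(bio_spans(tokens, correct))  # two entities
print(bio_spans(tokens, bi_bi))    # one merged entity
```

With the second tag sequence the decoder has no way to recover the boundary between the two names, so a model trained on such labels learns to merge them.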

Results

We tested our models on the IAHALT test set. We also evaluated other models, such as DictaBert and HeBert. The performance results are:

Model name	Precision	Recall	F1 score	Time (in seconds)
IAHALT_and_NEMO_PP	0.714	0.353	0.461	83.128
HeBert	0.574	0.474	0.494	86.483
NEMO	0.553	0.51	0.525	81.422
IAHALT_and_NEMO	0.692	0.678	0.684	83.702
Vitaly	0.883	0.794	0.836	83.773
DictaBert	0.916	0.834	0.872	70.465
DICTA_large	0.917	0.845	0.879	206.251
Name-Sentences	0.895	0.865	0.879	82.674
Basic	0.897	0.866	0.881	84.479
Smart_Injection	0.898	0.867	0.881	82.253
DICTA_Basic	0.903	0.875	0.888	69.419
DICTA_Large_Smart	0.904	0.875	0.889	204.324
DICTA_Small_Smart	0.904	0.875	0.889	70.29

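According to the results, we recommend the DICTA_Small_Smart model: it ties with DICTA_Large_Smart for the highest F1 score (0.889) while running roughly three times faster (70.29 vs. 204.324 seconds).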
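A possible way to try the recommended model is sketched below. It assumes the "Model path" entries in the first table are Hugging Face Hub identifiers that load with the `transformers` token-classification pipeline, which this document does not state explicitly. The helper names `load_ner` and `extract_entities` are hypothetical.

```python
def load_ner(model_path="FusioNER/Dicta_Small_Smart"):
    """Build a token-classification pipeline for one of the listed model paths.

    Assumption: model_path is a Hugging Face Hub identifier.
    """
    # Imported lazily so the helpers can be defined without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" groups word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_path,
        aggregation_strategy="simple",
    )


def extract_entities(ner, text):
    """Return (label, text) pairs from the aggregated pipeline output."""
    return [(ent["entity_group"], ent["word"]) for ent in ner(text)]
```

Calling `extract_entities(load_ner(), text)` on a Hebrew sentence would download the model on first use and return the detected entities with their labels.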

Hebrew NLP models

The table below lists Hebrew NLP models:

Model name	Link	Creator
HeNLP/HeRo	https://huggingface.co/HeNLP/HeRo	Vitaly Shalumov and Harel Haskey
dicta-il/dictabert	https://huggingface.co/dicta-il/dictabert	Shaltiel Shmidman and Avi Shmidman and Moshe Koppel
dicta-il/dictabert-large	https://huggingface.co/dicta-il/dictabert-large	Shaltiel Shmidman and Avi Shmidman and Moshe Koppel
avichr/heBERT	https://huggingface.co/avichr/heBERT	Avihay Chriqui and Inbal Yahav

MIT License