Update README.md

README.md

Here you can find a description of each of our models.

| **DICTA_smart_NER** | Training the [DICTA-ner](https://huggingface.co/dicta-il/dictabert-ner) model on the IAHALT + Name-Sentences[1] + Entity-Injection[2] [dataset](https://huggingface.co/datasets/FusioNER/Smart_Injection) | [FusioNER/DICTA_Smart](https://huggingface.co/FusioNER/DICTA_Smart) | IAHALT | [FusioNER/Smart_Injection](https://huggingface.co/datasets/FusioNER/Smart_Injection) | [DICTA-ner](https://huggingface.co/dicta-il/dictabert-ner) | classic[4] |
| **DICTA_Large_Smart** | Training the [DICTA Large](https://huggingface.co/dicta-il/dictabert-large) model on the IAHALT + Name-Sentences[1] + Entity-Injection[2] [dataset](https://huggingface.co/datasets/FusioNER/Smart_Injection) | [FusioNER/Dicta_Large_Smart](https://huggingface.co/FusioNER/Dicta_Large_Smart) | IAHALT | [FusioNER/Smart_Injection](https://huggingface.co/datasets/FusioNER/Smart_Injection) | [DICTA Large](https://huggingface.co/dicta-il/dictabert-large) | classic[4] |
| **TEC_NER** | Basic technology NER model | model path | TEC_NER | [FusioNER/tec_ner](https://huggingface.co/datasets/FusioNER/tec_ner/tree/main) | base model | technology |

# Results
We evaluate our models on the **IAHALT test set**, and also compare other models, such as [DictaBert](https://huggingface.co/dicta-il/dictabert) and [HeBert](https://huggingface.co/avichr/heBERT). These are the performance results:

According to the results, we recommend using the [**DICTA_Small_Smart**](https://huggingface.co/FusioNER/Dicta_Small_Smart) model.

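As a quick start, here is a minimal usage sketch for the recommended model, assuming it ships a standard Hugging Face token-classification head (the label names and scores depend on the model's own config):

```python
# Minimal sketch, assuming FusioNER/Dicta_Small_Smart exposes a standard
# token-classification head; entity labels come from the model's config.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="FusioNER/Dicta_Small_Smart",
    aggregation_strategy="simple",  # merge word pieces into whole entities
)

text = "讛讗专讬 驻讜讟专 讜专讜谉 讜讜讬讝诇讬 谞驻讙砖讜 讘诇讜谞讚讜谉"  # "Harry Potter and Ron Weasley met in London"
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```
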
# Hebrew NLP models
The table below lists Hebrew NLP models:

| dicta-il/dictabert-large | [https://huggingface.co/dicta-il/dictabert-large](https://huggingface.co/dicta-il/dictabert-large) | Shaltiel Shmidman and Avi Shmidman and Moshe Koppel |
| avichr/heBERT | [https://huggingface.co/avichr/heBERT](https://huggingface.co/avichr/heBERT) | Avihay Chriqui and Inbal Yahav |
# Footnotes
#### [1] **Name-Sentences**:
Adding sentences to the corpus that contain only the entity we want the model to learn.
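A minimal sketch of this augmentation, assuming a simple BIO tagging scheme (the entity list here is hypothetical):

```python
# Hypothetical sketch of Name-Sentence augmentation: each added "sentence"
# contains nothing but the entity string the model should learn.
def make_name_sentences(entities, label="PER"):
    examples = []
    for entity in entities:
        tokens = entity.split()
        tags = ["B-" + label] + ["I-" + label] * (len(tokens) - 1)
        examples.append({"tokens": tokens, "tags": tags})
    return examples

# hypothetical target entities
print(make_name_sentences(["讛讗专讬 驻讜讟专", "专讜谉 讜讜讬讝诇讬"]))
```
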
#### [2] **Entity-Injection**:
Replacing a tagged entity in the original corpus with a new entity. Using this method, the model can learn new entities (not new labels!) that it could not extract before.
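A minimal sketch of the idea, assuming BIO-tagged sentences (the helper and example below are hypothetical, not the dataset's actual build script):

```python
import random

# Hypothetical sketch of Entity-Injection: replace one tagged entity span
# in a BIO-tagged sentence with a new surface form, keeping the same label.
def inject_entity(tokens, tags, new_entity):
    starts = [i for i, tag in enumerate(tags) if tag.startswith("B-")]
    if not starts:
        return tokens, tags
    start = random.choice(starts)
    end = start + 1
    while end < len(tags) and tags[end].startswith("I-"):
        end += 1
    label = tags[start][2:]
    new_tokens = new_entity.split()
    new_tags = ["B-" + label] + ["I-" + label] * (len(new_tokens) - 1)
    return tokens[:start] + new_tokens + tokens[end:], tags[:start] + new_tags + tags[end:]

tokens = ["讛讗专讬", "驻讜讟专", "讟住", "诇诇讜谞讚讜谉"]
tags = ["B-PER", "I-PER", "O", "O"]
print(inject_entity(tokens, tags, "讛专诪讬讜谞讬 讙专讬讬谞讙'专"))
```
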
#### [3] **BI-BI Problem**:
Building a training corpus in which entities of the same type appear in sequence, labeled as continuations of one another. For example, the text "讛讗专讬 驻讜讟专 讜专讜谉 讜讜讬讝诇讬" ("Harry Potter and Ron Weasley") would be tagged as a **SINGLE** entity. This problem prevents the model from extracting entities correctly.
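In BIO terms, the contrast looks roughly like this (the tags below are a hypothetical illustration):

```python
# Hypothetical BIO tags for "讛讗专讬 驻讜讟专 讜专讜谉 讜讜讬讝诇讬" ("Harry Potter and Ron Weasley").
tokens = ["讛讗专讬", "驻讜讟专", "讜专讜谉", "讜讜讬讝诇讬"]

# BI-BI problem: the second person continues the first span, so both
# names collapse into a single PER entity.
bi_bi_tags = ["B-PER", "I-PER", "I-PER", "I-PER"]

# Intended labeling: each person opens its own span with a fresh B- tag.
correct_tags = ["B-PER", "I-PER", "B-PER", "I-PER"]
```
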
#### [4] **Classic**:
The classic NER types:

| entity type | full name | examples |
|:-----------:|:----------| --------:|
| **PER** | Person | 讗讚讜诇祝 讛讬讟诇专, 专讜讚讜诇祝 讛住, 诪专讚讻讬 讗谞讬诇讘讬抓 |
| **GPE** | Geopolitical Entity | 讙专诪谞讬讛, 驻讜诇讬谉, 讘专诇讬谉, 讜讜专砖讛 |
| **LOC** | Location | 诪讝专讞 讗讬专讜驻讛, 讗讙谉 讛讬诐 讛转讬讻讜谉, 讛讙诇讬诇 |
| **FAC** | Facility | 讗讜讜砖讜讜讬抓, 诪讙讚诇讬 讛转讗讜诪讬诐, 谞转讘"讙 2000, 专讞讜讘 拽驻诇谉 |
| **ORG** | Organization | 讛诪驻诇讙讛 讛谞讗爪讬转, 讞讘专转 讙讜讙诇, 诪诪砖诇转 讞讜祝 讛砖谞讛讘 |
| **TIMEX** | Time Expression | 1945, 砖谞转 1993, 讬讜诐 讛砖讜讗讛, 砖谞讜转 讛-90 |
| **EVE** | Event | 讛砖讜讗讛, 诪诇讞诪转 讛注讜诇诐 讛砖谞讬讬讛, 砖诇讟讜谉 讛讗驻专讟讛讬讬讚 |
| **TTL** | Title | 驻讬讛专专, 拽讬住专, 诪谞讻"诇 |
| **ANG** | Language | 注讘专讬转, 注专讘讬转, 讙专诪谞讬转 |
| **DUC** | Product | 驻讬讬住讘讜拽, F-16, 转谞讜讘讛 |
| **WOA** | Work of Art | 讚讜"讞 诪讘拽专 讛诪讚讬谞讛, 注讬转讜谉 讛讗专抓, 讛讗专讬 驻讜讟专, 转讬拽 2000 |
| **MISC** | Miscellaneous | 拽讜专讜谞讛, 讛转讜 讛讬专讜拽, 诪讚诇讬转 讝讛讘, 讘讬讟拽讜讬谉 |
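For configuring training or evaluation against these types, here is a sketch of the corresponding BIO label map (the exact label strings the models use are an assumption):

```python
# Sketch of a BIO label map for the classic types above; the exact label
# strings in the FusioNER model configs are an assumption, not confirmed.
ENTITY_TYPES = ["PER", "GPE", "LOC", "FAC", "ORG", "TIMEX",
                "EVE", "TTL", "ANG", "DUC", "WOA", "MISC"]

labels = ["O"] + [f"{p}-{t}" for t in ENTITY_TYPES for p in ("B", "I")]
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}
```
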
**MIT License**
|