---
language:
- ar
thumbnail: url to a thumbnail used in social sharing
tags:
- ner
- token-classification
- Arabic-NER
metrics:
- accuracy
- f1
- precision
- recall
widget:
- text: النجم محمد صلاح لاعب المنتخب المصري يعيش في مصر بالتحديد من نجريج, الشرقية
  example_title: Mohamed Salah
- text: انا ساكن في حدايق الزتون و بدرس في جامعه عين شمس
  example_title: Egyptian Dialect
- text: يقع نهر الأمازون في قارة أمريكا الجنوبية
  example_title: Standard Arabic
datasets:
- Fine-grained-Arabic-Named-Entity-Corpora
pipeline_tag: token-classification
---
# Arabic Named Entity Recognition
This project aims to enrich Arabic Named Entity Recognition (ANER). Arabic is a challenging language to process and poses many difficulties.
We built a model based on AraBERT that supports 50 entity types.
# Paper:
This is the paper for the system, where you can find all the details: https://arxiv.org/abs/2308.14669
# Dataset
- [Fine-grained Arabic Named Entity Corpora](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx)
# Evaluation results
The model achieves the following results:
| Dataset | Recall | Precision | F1 |
|:-------------:|:------:|:---------:|:----:|
| WikiFANE Gold | 87.0 | 90.5 | 88.7 |
| NewsFANE Gold | 78.1 | 77.4 | 77.7 |
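As a quick sanity check on the table above, the sketch below recomputes F1 from the reported precision and recall. It assumes the reported F1 is the plain harmonic mean of those two numbers, which the card does not state explicitly; `f1_score` is an illustrative helper, not part of this project.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the evaluation table
reported = {
    "WikiFANE Gold": (90.5, 87.0),
    "NewsFANE Gold": (77.4, 78.1),
}

for name, (p, r) in reported.items():
    # Assumption: F1 is derived directly from the reported precision and recall
    print(f"{name}: F1 = {f1_score(p, r):.1f}")
```

Under that assumption the recomputed values round to 88.7 and 77.7, matching the reported F1 scores.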
# Usage
The model is available on the Hugging Face model hub as [boda/ANER](https://huggingface.co/boda/ANER). Checkpoints are currently available only in PyTorch.
### Use in Python:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Download (or load from the local cache) the tokenizer and the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained("boda/ANER")
model = AutoModelForTokenClassification.from_pretrained("boda/ANER")
```
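To go from loading the model to actual predictions, one option is the generic `transformers` token-classification pipeline. The following is a minimal sketch, not the authors' own inference code: `tag_text` and `format_entities` are hypothetical helper names, and it assumes raw text can be passed in directly (as in the widget examples) and that the label names come from the model's config.

```python
from typing import Dict, List


def format_entities(entities: List[Dict]) -> List[str]:
    """Render aggregated pipeline output as 'word -> label (score)' strings."""
    return [
        f"{ent['word']} -> {ent['entity_group']} ({float(ent['score']):.2f})"
        for ent in entities
    ]


def tag_text(text: str, model_name: str = "boda/ANER") -> List[Dict]:
    """Run token classification on `text` and return grouped entity spans."""
    # Imported lazily so format_entities can be used without loading transformers
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_name,
        aggregation_strategy="simple",  # merge word pieces into whole entity spans
    )
    return ner(text)
```

Calling `print("\n".join(format_entities(tag_text("يقع نهر الأمازون في قارة أمريكا الجنوبية"))))` downloads the checkpoint on first use and prints one line per detected entity. For repeated calls, building the pipeline once and reusing it avoids reloading the model each time.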
# Acknowledgments
Thanks to [AraBERT](https://github.com/aub-mind/arabert) for providing the Arabic BERT model, which we used as the base model for our work.
We would also like to thank [Prof. Fahd Saleh S Alotaibi](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx) of the Faculty of Computing and Information Technology, King Abdulaziz University, for providing the dataset we used to train our model.
# Contacts
**Abdelrahman Atef**
- [LinkedIn](https://linkedin.com/in/boda-sadalla)
- [GitHub](https://github.com/BodaSadalla98)
- <[email protected]>