---
language:
  - ar

thumbnail: "url to a thumbnail used in social sharing"
tags:
  - ner
  - token-classification
  - Arabic-NER

metrics:
  - accuracy
  - f1
  - precision
  - recall

widget:
  - text: "النجم محمد صلاح لاعب المنتخب المصري يعيش في مصر بالتحديد من نجريج, الشرقية"
    example_title: "Mohamed Salah"
  - text: "انا ساكن في حدايق الزتون و بدرس في جامعه عين شمس"
    example_title: "Egyptian Dialect"
  - text: "يقع نهر الأمازون في قارة أمريكا الجنوبية"
    example_title: "Standard Arabic"

datasets:
  - Fine-grained-Arabic-Named-Entity-Corpora
---

# Arabic Named Entity Recognition

This project aims to enrich Arabic Named Entity Recognition (ANER). Arabic is a challenging language to process and poses many difficulties.
We built a model based on AraBERT that supports 50 entity types.

## Paper

Here's the paper containing the full details of our model, our approach, and the training results:

- [ANER Paper](https://drive.google.com/file/d/1jJn3iWqOeLzaNvO-6aKfgidzJlWOtvti/view?usp=sharing)

# Usage

The model is available on the Hugging Face Hub under the name [boda/ANER](https://huggingface.co/boda/ANER). Checkpoints are currently available only in PyTorch.

### Use in Python

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("boda/ANER")

model = AutoModelForTokenClassification.from_pretrained("boda/ANER")
```
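Token-classification models like this one emit a per-token tag (typically in BIO format) rather than ready-made entity spans. As a minimal sketch of the post-processing step, the helper below groups BIO-tagged tokens into entities; note the tag names in the example are illustrative placeholders, not necessarily the exact label set used by boda/ANER.

```python
def group_bio(tokens, tags):
    """Group parallel lists of tokens and BIO tags into (entity_text, entity_type) spans."""
    entities, current_tokens, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag starts a new entity; flush any span in progress first.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the current span.
            current_tokens.append(token)
        else:
            # "O" (or a mismatched I- tag) ends the current span.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Illustrative tags only -- the model's real label set may differ.
tokens = ["محمد", "صلاح", "يعيش", "في", "مصر"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(group_bio(tokens, tags))  # [('محمد صلاح', 'PER'), ('مصر', 'LOC')]
```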

# Dataset

- [Fine-grained Arabic Named Entity Corpora](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx)

# Acknowledgments

Thanks to [AraBERT](https://github.com/aub-mind/arabert) for providing the Arabic BERT model, which we used as the base model for our work.

We would also like to thank [Prof. Fahd Saleh S Alotaibi](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx) of the Faculty of Computing and Information Technology, King Abdulaziz University, for providing the dataset we used to train our model.

# Contacts

**Abdelrahman Atef**

- [LinkedIn](https://linkedin.com/in/boda-sadalla)
- [Github](https://github.com/BodaSadalla98)
- <[email protected]>