---
language:
- ar
thumbnail: url to a thumbnail used in social sharing
tags:
- ner
- token-classification
- Arabic-NER
metrics:
- accuracy
- f1
- precision
- recall
widget:
- text: النجم محمد صلاح لاعب المنتخب المصري يعيش في مصر بالتحديد من نجريج, الشرقية
  example_title: Mohamed Salah
- text: انا ساكن في حدايق الزتون و بدرس في جامعه عين شمس
  example_title: Egyptian Dialect
- text: يقع نهر الأمازون في قارة أمريكا الجنوبية
  example_title: Standard Arabic
datasets:
- Fine-grained-Arabic-Named-Entity-Corpora
pipeline_tag: token-classification
---

# Arabic Named Entity Recognition

This project aims to advance Arabic Named Entity Recognition (ANER). Arabic is a difficult language to process and poses many challenges. We built a model based on AraBERT that supports 50 entity types.

# Paper

The system is described in detail in the following paper: https://arxiv.org/abs/2308.14669

# Dataset

- [Fine-grained Arabic Named Entity Corpora](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx)

# Evaluation results

The model achieves the following results:

| Dataset | Recall | Precision | F1 |
|:-------------:|:------:|:---------:|:----:|
| WikiFANE Gold | 87.0 | 90.5 | 88.7 |
| NewsFANE Gold | 78.1 | 77.4 | 77.7 |

# Usage

The model is available on the Hugging Face model page under the name [boda/ANER](https://huggingface.co/boda/ANER). Checkpoints are currently available only in PyTorch.

### Use in Python:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the tokenizer and the fine-tuned token-classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("boda/ANER")
model = AutoModelForTokenClassification.from_pretrained("boda/ANER")
```

# Acknowledgments

Thanks to [AraBERT](https://github.com/aub-mind/arabert) for providing the Arabic BERT model, which we used as the base model for our work.

We would also like to thank [Prof. Fahd Saleh S Alotaibi](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx) of the Faculty of Computing and Information Technology, King Abdulaziz University, for providing the dataset we used to train our model.

# Contacts

**Abdelrahman Atef**

- [LinkedIn](https://linkedin.com/in/boda-sadalla)
- [Github](https://github.com/BodaSadalla98)
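
# Example: running inference

The snippet in the Usage section only loads the tokenizer and model. As a minimal sketch of one way to go from there to actual predictions, the code below wraps them in the `transformers` token-classification pipeline. The helper names `build_ner_pipeline` and `summarize_entities` and the `min_score` threshold are hypothetical and not part of the model. Using `aggregation_strategy="simple"` is an assumption; how cleanly entities are grouped depends on the model's label scheme, which is documented in the paper rather than here.

```python
from typing import Dict, List, Tuple

MODEL_ID = "boda/ANER"  # model name taken from the Usage section


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Load the tokenizer and model and wrap them in a token-classification pipeline."""
    # Imported lazily so summarize_entities can be used without transformers installed.
    from transformers import (
        AutoModelForTokenClassification,
        AutoTokenizer,
        pipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    # Assumption: "simple" aggregation merges sub-word pieces into entity spans.
    return pipeline(
        "token-classification",
        model=model,
        tokenizer=tokenizer,
        aggregation_strategy="simple",
    )


def summarize_entities(
    entities: List[Dict], min_score: float = 0.5
) -> List[Tuple[str, str]]:
    """Hypothetical helper: keep (text, label) pairs scoring at least min_score."""
    return [
        (ent["word"], ent["entity_group"])
        for ent in entities
        if float(ent["score"]) >= min_score
    ]


# Example usage (downloads the model on first call):
#   ner = build_ner_pipeline()
#   print(summarize_entities(ner("يقع نهر الأمازون في قارة أمريكا الجنوبية")))
```

With aggregation enabled, each pipeline result is a dictionary containing `entity_group`, `word`, `score`, `start`, and `end`, so the helper only needs the first three. The `0.5` default threshold is arbitrary and should be tuned for the use case.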