
	---
	language:
	- ar
	thumbnail: url to a thumbnail used in social sharing
	tags:
	- ner
	- token-classification
	- Arabic-NER
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	widget:
	- text: النجم محمد صلاح لاعب المنتخب المصري يعيش في مصر بالتحديد من نجريج, الشرقية
	  example_title: Mohamed Salah
	- text: انا ساكن في حدايق الزتون و بدرس في جامعه عين شمس
	  example_title: Egyptian Dialect
	- text: يقع نهر الأمازون في قارة أمريكا الجنوبية
	  example_title: Standard Arabic
	datasets:
	- Fine-grained-Arabic-Named-Entity-Corpora
	pipeline_tag: token-classification
	---





	# Arabic Named Entity Recognition

	This project aims to advance Arabic Named Entity Recognition (ANER). Arabic is a challenging language for NLP and poses many difficulties.
	We built a model based on AraBERT that supports 50 entity types.

	# Paper

	The system is described in detail in the paper: https://arxiv.org/abs/2308.14669


	# Dataset

	- [Fine-grained Arabic Named Entity Corpora](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx)


	# Evaluation results

	The model achieves the following results:

	| Dataset | Recall | Precision | F1 |
	|:-------------:|:------:|:---------:|:----:|
	| WikiFANE Gold | 87.0 | 90.5 | 88.7 |
	| NewsFANE Gold | 78.1 | 77.4 | 77.7 |
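As a quick sanity check on the numbers above, the sketch below recomputes F1 from the reported precision and recall. It assumes the reported F1 is the harmonic mean of the reported precision and recall; the averaging scheme used in the evaluation is not stated here, so treat this as an illustration only. The helper name `f1_score` is ours.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the results table (percent).
reported = {
    "WikiFANE Gold": {"recall": 87.0, "precision": 90.5, "f1": 88.7},
    "NewsFANE Gold": {"recall": 78.1, "precision": 77.4, "f1": 77.7},
}

for name, m in reported.items():
    # Round to one decimal place to match the precision of the table.
    computed = round(f1_score(m["precision"], m["recall"]), 1)
    print(f"{name}: computed F1 = {computed}, reported F1 = {m['f1']}")
```

Under that assumption, the computed values agree with the reported F1 for both datasets after rounding to one decimal place.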


	# Usage

	The model is available on the Hugging Face Hub as [boda/ANER](https://huggingface.co/boda/ANER). Checkpoints are currently available only in PyTorch.

	### Use in Python

	```python
	from transformers import AutoTokenizer, AutoModelForTokenClassification

	tokenizer = AutoTokenizer.from_pretrained("boda/ANER")

	model = AutoModelForTokenClassification.from_pretrained("boda/ANER")
	```
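Once the model loads, one way to run it on text is the `transformers` token-classification pipeline. The following is a minimal sketch, not the authors' own inference code: the helper names `group_by_label` and `tag` are ours, and the choice of `aggregation_strategy="simple"` is an assumption about how you want word pieces merged into entity spans.

```python
from typing import Dict, List


def group_by_label(entities: List[Dict]) -> Dict[str, List[str]]:
    """Collect entity surface forms under their predicted label.

    Expects the list of dicts returned by a token-classification pipeline
    run with an aggregation strategy, where each dict carries the keys
    "entity_group" and "word".
    """
    grouped: Dict[str, List[str]] = {}
    for ent in entities:
        grouped.setdefault(ent["entity_group"], []).append(ent["word"])
    return grouped


def tag(text: str) -> Dict[str, List[str]]:
    """Run the boda/ANER checkpoint on `text` and group the entities found."""
    # Imported lazily so group_by_label works without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model="boda/ANER",
        aggregation_strategy="simple",  # merge word pieces into entity spans
    )
    return group_by_label(ner(text))


# Example call (downloads the checkpoint on first use):
# tag("يقع نهر الأمازون في قارة أمريكا الجنوبية")
```

The label names in the result come from the model's own configuration, so the keys of the returned dictionary depend on the checkpoint.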


	# Acknowledgments

	Thanks to [AraBERT](https://github.com/aub-mind/arabert) for providing the Arabic BERT model, which we used as the base model for our work.

	We would also like to thank [Prof. Fahd Saleh S Alotaibi](https://fsalotaibi.kau.edu.sa/Pages-Arabic-NE-Corpora.aspx) of the Faculty of Computing and Information Technology, King Abdulaziz University, for providing the dataset we used to train our model.

	# Contacts

	Abdelrahman Atef

	- [LinkedIn](https://linkedin.com/in/boda-sadalla)
	- [GitHub](https://github.com/BodaSadalla98)
	- <[email protected]>