Fine-Tuned Arabic Sentiment Analysis with BERT 🚀

This repository contains a fine-tuned BERT model for sentiment analysis of Arabic reviews. The model was fine-tuned on the Arabic 100k Reviews dataset and classifies each review into one of three sentiment categories: Positive, Negative, or Mixed.

Author 🧑‍💻

Khaled Soudy
GitHub: khaledsoudy-1

Source Code 💻

You can find the source code and full implementation of this project on my GitHub repository.

The repository contains the Google Colab notebook, dataset, and scripts used to fine-tune the model for Arabic sentiment analysis.

How to Use the Model

1. Install Required Libraries

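Make sure the transformers and tensorflow libraries are installed. The leading ! is for notebook environments such as Google Colab; omit it when running pip in a terminal: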

!pip install transformers

!pip install tensorflow

2. Load the Fine-Tuned Model

You can load the fine-tuned model and tokenizer directly from Hugging Face using the following code:

from transformers import TFBertForSequenceClassification, BertTokenizer

# Load model and tokenizer from Hugging Face
model_name = "khaledsoudy/arabic-sentiment-bert-model"

# Load model
model = TFBertForSequenceClassification.from_pretrained(model_name)

# Load tokenizer
tokenizer = BertTokenizer.from_pretrained(model_name)

3. Use the Model for Prediction

To run sentiment analysis on a piece of Arabic text:

import tensorflow as tf

# Sample Arabic text for sentiment prediction
# (English: "The hotel is wonderful and the service is excellent")
text = "الفندق رائع و الخدمة ممتازة"

# Tokenize the input text; truncate long reviews to the standard
# BERT limit of 512 tokens so they do not exceed the model's input size
inputs = tokenizer(text, return_tensors="tf", truncation=True, max_length=512)

# Get the model's prediction
outputs = model(**inputs)

# Pick the class index with the highest logit (3 classes)
predicted_class = tf.argmax(outputs.logits, axis=-1).numpy()

# Map the predicted class index to a sentiment label
# (index order: 0 = Mixed, 1 = Negative, 2 = Positive)
sentiment_labels = ['Mixed', 'Negative', 'Positive']
print(f"Predicted sentiment: {sentiment_labels[predicted_class[0]]}")
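
If a confidence score is useful alongside the label, the logits can be converted to probabilities with a softmax. The following is a minimal sketch in plain Python, not part of the original project: the helper names softmax and logits_to_sentiments are hypothetical, the logits in the example are made up, and the label order is simply the one used in the prediction snippet (0 = Mixed, 1 = Negative, 2 = Positive).

```python
import math

# Label order copied from the prediction snippet in this README (an assumption
# about the model's class indices, not read from the model config).
SENTIMENT_LABELS = ["Mixed", "Negative", "Positive"]


def softmax(row):
    """Convert one row of logits into probabilities that sum to 1."""
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_sentiments(logits, labels=SENTIMENT_LABELS):
    """Return a (label, probability) pair for each row of logits."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((labels[best], probs[best]))
    return results


# Example with made-up logits for two reviews; with the real model you could
# pass outputs.logits.numpy().tolist() instead.
if __name__ == "__main__":
    for label, prob in logits_to_sentiments([[0.1, -1.2, 2.3], [0.4, 1.9, -0.7]]):
        print(f"{label}: {prob:.2f}")
```

To score several reviews at once, the tokenizer can be given a list of strings with padding=True so that all sequences in the batch share one length; each row of the resulting logits then corresponds to one review.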

4. Input Format

The model expects Arabic text as input. For better results, preprocess the text to remove diacritics and other unnecessary characters before tokenizing it.
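
One possible way to do this cleanup is sketched below. It is an illustration under stated assumptions, not the preprocessing used during training: the function name clean_arabic_text is hypothetical, and the choice of characters to strip (the tashkeel marks, the superscript alef, and the tatweel elongation character) is an assumption about what counts as unnecessary here.

```python
import re

# Arabic diacritics (tashkeel, U+064B..U+0652), the superscript alef (U+0670),
# and the tatweel elongation character (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")
_WHITESPACE = re.compile(r"\s+")


def clean_arabic_text(text):
    """Strip diacritics and tatweel, then collapse runs of whitespace."""
    text = _DIACRITICS.sub("", text)
    return _WHITESPACE.sub(" ", text).strip()


# Example: fully vocalized words with stray spacing.
if __name__ == "__main__":
    print(clean_arabic_text("  الفُنْدُقُ   رَائِعٌ "))
```

The cleaned string can then be passed to the tokenizer in place of the raw text.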

5. Sentiment Labels

The model classifies the sentiment into three categories:

Positive 🌟
Negative 😠
Mixed 🤔

Model Details

Model Name: khaledsoudy/arabic-sentiment-bert-model
Model Type: TFBertForSequenceClassification
Language: Arabic
Sentiment Classes: Positive, Negative, Mixed

How to Fine-Tune This Model

You can fine-tune this model further using your own dataset. Check out the source code and related notebooks on my GitHub for detailed steps and guidance.

License 📜

This model is licensed under the MIT License.

Acknowledgments 🙏

Hugging Face for providing the platform to host models.
Google BERT for the pre-trained model.
Kaggle for the Arabic 100k Reviews dataset.

