	---
	pipeline_tag: text-classification
	tags:
	- sentiment-analysis
	- Moroccan-Darija
	- MSA
	base_model: CAMeL-Lab/bert-base-arabic-camelbert-da-sentiment
	metrics:
	- Accuracy
	- Precision
	- Recall
	- F1-Score
	language:
	- ar
	---

	# Fine-tuned CAMeL-BERT Model for Sentiment Analysis in Moroccan Darija

	- Model Name: CAMeL-BERT Fine-Tuned for Moroccan Darija Sentiment Analysis
	- Model ID: `NerdyPy/fine_tuned_model_sentiment_analysis`
	- Language: Arabic (Modern Standard Arabic and Moroccan Darija)
	- Task: Sentiment Analysis (Negative, Neutral, Positive)

	---

	## Model Description

	This model is a fine-tuned version of [CAMeL-Lab/bert-base-arabic-camelbert-da-sentiment](https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-da-sentiment), adapted for sentiment analysis in Moroccan Darija, a highly under-resourced Arabic dialect. It classifies Arabic text, both Modern Standard Arabic (MSA) and Moroccan Darija, into three sentiment categories:

	- Negative
	- Neutral
	- Positive

	By focusing on Moroccan Darija, this model addresses the scarcity of NLP resources for this dialect. It is aimed at the mix of MSA and Darija common in Moroccan user-generated content; text that mixes in other languages is covered under Limitations and Risks.
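
To illustrate the three-way output, here is a minimal sketch that turns a vector of three raw scores (logits) into a label using a softmax. The `ID2LABEL` order and the example logits are assumptions made for this illustration, not values taken from the model; the real index order should be read from `model.config.id2label` after loading the model.

```python
import math

# Hypothetical index-to-label order, for illustration only.
# Read the real mapping from `model.config.id2label` after loading the model.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits standing in for one model output row.
label, prob = predict_label([-1.2, 0.3, 2.1])
print(label, round(prob, 3))
```

Keeping the mapping as a parameter means the same helper works unchanged once the actual `id2label` dictionary is passed in.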

	---

	## Intended Use

	### Primary Use Case

	- Sentiment analysis of user-generated content, such as comments and reviews, in Moroccan Darija and MSA.

	### Applications

	- Analyzing public opinion on social media platforms and online newspapers.
	- Assisting researchers in understanding societal attitudes and trends.
	- Supporting policymakers and organizations in gauging public sentiment.
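
For opinion analysis of this kind, per-comment predictions are usually aggregated into an overall distribution. The sketch below shows one simple way to do that; the function name `sentiment_distribution` and the sample predictions are hypothetical and only serve the illustration.

```python
from collections import Counter


def sentiment_distribution(labels):
    """Return the share of each label in a list of predicted labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Made-up predictions for five comments, for illustration only.
predicted = ["Positive", "Negative", "Negative", "Neutral", "Negative"]
shares = sentiment_distribution(predicted)
print(shares)
```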

	### Users

	- Researchers and data scientists in NLP.
	- Organizations analyzing Arabic-language social media.
	- Developers building sentiment analysis tools for Arabic dialects.

	---

	## Limitations and Risks

	### Dialectal Variations

	- Performance may vary on other Arabic dialects not represented in the training data.

	### Data Bias

	- The model may reflect biases present in the training datasets.

	### Language Mixing (Code-Switching)

	The model may struggle with text that heavily mixes Moroccan Darija with other languages (e.g., French, English, Spanish), which can reduce the accuracy of sentiment classification. For example:

	"واش كتفهم le français؟" ("Do you understand French?")

	Here the speaker switches from Moroccan Darija to French mid-sentence. Because the model was trained primarily on Arabic text, it may misjudge the sentiment of such input due to unfamiliarity with the non-Arabic portion.

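One possible way to act on this limitation is to flag inputs with a large share of Latin-script letters before trusting the prediction. The sketch below is a rough heuristic, not part of the model; the helper names and the `0.3` threshold are assumptions chosen for illustration and would need tuning on real data.

```python
import unicodedata


def latin_letter_ratio(text):
    """Fraction of alphabetic characters that belong to the Latin script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    latin = [ch for ch in letters if unicodedata.name(ch, "").startswith("LATIN")]
    return len(latin) / len(letters)


def is_heavily_code_switched(text, threshold=0.3):
    """Flag text whose Latin-letter share exceeds a hypothetical threshold."""
    return latin_letter_ratio(text) > threshold


sample = "واش كتفهم le français؟"
print(latin_letter_ratio(sample), is_heavily_code_switched(sample))
```

Flagged texts could be routed to manual review or reported separately, so that code-switched input does not silently distort aggregate results.
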
	### Generalization

	- Limited performance on topics or vocabulary outside the training data.
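
Before applying the model to a new domain, it can help to score it on a small hand-labeled sample from that domain. The sketch below computes per-class precision, recall, and F1 in plain Python; the function name `per_class_f1`, the label strings, and the sample data are assumptions for illustration and do not reflect results reported for this model.

```python
def per_class_f1(y_true, y_pred, labels=("Negative", "Neutral", "Positive")):
    """Compute precision, recall, and F1 for each label from paired lists."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Made-up gold labels and predictions, for illustration only.
gold = ["Positive", "Negative", "Neutral", "Negative"]
pred = ["Positive", "Negative", "Negative", "Negative"]
scores = per_class_f1(gold, pred)
macro_f1 = sum(s["f1"] for s in scores.values()) / len(scores)
print(scores, macro_f1)
```

A low score for one class on the new sample is a sign that the topic or vocabulary falls outside what the model saw in training.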

	---

	## How to Use

	You can use this model with the Hugging Face Transformers library:

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	# Load the tokenizer and model
	tokenizer = AutoTokenizer.from_pretrained("NerdyPy/fine_tuned_model_sentiment_analysis")
	model = AutoModelForSequenceClassification.from_pretrained("NerdyPy/fine_tuned_model_sentiment_analysis")

	# Example text in Arabic
	text = "العمل في هذا المكان كان رائعاً، ولكن شي مرات ما كاينش التنظيم"