---
license: mit
datasets:
- HausaNLP/NaijaSenti-Twitter
language:
- ha
metrics:
- accuracy
- f1
- precision
- recall
base_model: google-bert/bert-base-cased
pipeline_tag: text-classification
library_name: transformers
tags:
- NLP
- sentiment-analysis
- hausa
---

**Model Name**: Hausa Sentiment Analysis  
**Model ID**: `Kumshe/Hausa-sentiment-analysis`  
**Language**: Hausa

---

### **Model Description**
This is a BERT-based model fine-tuned for sentiment analysis in Hausa. It classifies social media text into one of three sentiment categories: positive, negative, or neutral.

### **Intended Use**
- **Primary Use Case**: Sentiment analysis for Hausa social media content, such as tweets or Facebook posts.
- **Target Users**: NLP researchers, businesses analyzing social media, and developers building sentiment analysis tools for Hausa-language content.
- **Example Usage**: 
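  ```python
  import torch
  from transformers import AutoTokenizer, AutoModelForSequenceClassification

  # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
  tokenizer = AutoTokenizer.from_pretrained("Kumshe/Hausa-sentiment-analysis")
  model = AutoModelForSequenceClassification.from_pretrained("Kumshe/Hausa-sentiment-analysis")

  # Encode the input text, truncating to the model's maximum length
  inputs = tokenizer("Your Hausa text here", return_tensors="pt", truncation=True)

  # Run inference without tracking gradients
  with torch.no_grad():
      outputs = model(**inputs)

  # Map the highest-scoring class index to its label name from the model config
  predicted_id = outputs.logits.argmax(dim=-1).item()
  print(model.config.id2label[predicted_id])
  ```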
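The model returns raw logits, one per sentiment class, rather than probabilities. As a minimal sketch of how those logits could be turned into a label with a confidence score, the snippet below applies a softmax in plain Python. The label order in `ID2LABEL` and the example logits are assumptions for illustration; the real mapping should be read from `model.config.id2label`.

```python
import math

# Hypothetical label order; check model.config.id2label for the real mapping.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, id2label=ID2LABEL):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits standing in for outputs.logits[0].tolist()
label, confidence = predict_label([-1.2, 0.3, 2.1])
print(label, round(confidence, 3))
```

A confidence threshold on the returned probability is one way to route uncertain predictions to manual review.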

### **Model Architecture**
- **Base Model**: BERT (Bidirectional Encoder Representations from Transformers)
- **Pre-trained Model**: `google-bert/bert-base-cased` from the Hugging Face Hub.
- **Fine-Tuned Model**: Fine-tuned for 40 epochs on a Hausa sentiment dataset.

### **Training Data**
- **Data Source**: The model was trained on a dataset containing 35,000 examples from social media platforms such as Twitter and Facebook.
- **Data Split**: 
  - **Training Set**: 80% of the data
  - **Validation Set**: 20% of the data
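The card does not say how the 80/20 split was produced. The following is a minimal sketch of one common approach, a seeded random shuffle, using placeholder data; the function name `train_val_split`, the seed, and the absence of stratification are all assumptions.

```python
import random


def train_val_split(examples, val_fraction=0.2, seed=42):
    """Shuffle examples and split them into training and validation lists."""
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_val = round(len(examples) * val_fraction)
    val = [examples[i] for i in indices[:n_val]]
    train = [examples[i] for i in indices[n_val:]]
    return train, val


# Placeholder data standing in for the 35,000 labeled posts
examples = [(f"text {i}", i % 3) for i in range(35_000)]
train_set, val_set = train_val_split(examples)
print(len(train_set), len(val_set))
```

With 35,000 examples this yields 28,000 training and 7,000 validation examples.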

### **Training Details**
- **Number of Epochs**: 40
- **Batch Size**: 
  - Per device training batch size: 32
  - Per device evaluation batch size: 64
- **Warm-up Steps**: 10
- **Weight Decay**: 0.01
- **Optimizer**: AdamW
- **Training Hardware**: Trained on Kaggle using 2 NVIDIA T4 GPUs.
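To put these settings in context, the sketch below estimates how many optimizer steps the run would take. It assumes that 80% of the 35,000 examples were used for training, that both GPUs ran in data-parallel mode, and that no gradient accumulation was used; none of these is confirmed by the card.

```python
import math

# Values taken from the card
num_examples = 35_000
train_fraction = 0.8
per_device_train_batch_size = 32
num_epochs = 40
warmup_steps = 10

# Assumptions: both GPUs are used in data-parallel mode and there is
# no gradient accumulation
num_gpus = 2

train_examples = round(num_examples * train_fraction)
effective_batch_size = per_device_train_batch_size * num_gpus
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_share = warmup_steps / total_steps

print(f"effective batch size: {effective_batch_size}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"total steps:          {total_steps}")
print(f"warm-up share:        {warmup_share:.4%}")
```

Under these assumptions the 10 warm-up steps cover well under 1% of training, so the learning rate reaches its peak almost immediately.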

### **Evaluation Metrics**
- **Evaluation Loss**: 0.6265
- **Accuracy**: 73.47%
- **F1 Score**: 73.47%
- **Precision**: 73.54%
- **Recall**: 73.47%
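The card does not state how precision, recall, and F1 were averaged across the three classes. Recall matching accuracy exactly is what support-weighted averaging always produces, so the sketch below assumes that scheme. It is a plain-Python illustration on toy labels, not the evaluation code used for this model.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Compute accuracy and support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = actual / n  # each class counts in proportion to its support
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only, not data from the model
y_true = ["pos", "pos", "neg", "neg", "neu", "neu", "neu", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos", "neu", "pos"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

On the toy data, weighted recall equals accuracy while weighted precision differs, the same pattern as in the reported figures.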

### **Model Performance**
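The model reaches 73.47% accuracy on the evaluation data, with precision (73.54%), recall (73.47%), and F1 (73.47%) within 0.1 percentage points of one another. This balance suggests that it does not strongly favor precision over recall, which makes it suitable for general sentiment analysis of Hausa text.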

### **Limitations**
- The model may not generalize well to other types of Hausa text outside of social media (e.g., formal writing or literature).
- Performance may degrade on text containing slang or regional dialects not well-represented in the training data.
- The model inherits any biases present in its training data, which may skew its predictions.

### **Ethical Considerations**
- Sentiment analysis models can potentially amplify biases present in the training data.
- Use cautiously in sensitive applications to avoid unintended consequences.
- Consider the impact on privacy and data protection laws, especially when analyzing social media content.

### **License**
- MIT

### **Citation**
If you use this model in your work, please cite it as follows:
```bibtex
@misc{Kumshe2024HausaSentimentAnalysis,
  author = {Umar Muhammad Mustapha Kumshe},
  title = {Hausa Sentiment Analysis},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Kumshe/Hausa-sentiment-analysis}},
}
```

### **Contributions**
This model was fine-tuned by Umar Muhammad Mustapha Kumshe. Feel free to contribute, provide feedback, or raise issues on the [model repository](https://huggingface.co/Kumshe/Hausa-sentiment-analysis).