Model Name: Hausa Sentiment Analysis
Model ID: Kumshe/Hausa-sentiment-analysis
Language: Hausa


Model Description

This is a BERT-based model fine-tuned for sentiment analysis in Hausa. It classifies social media text into one of three sentiment categories: positive, negative, or neutral.

Intended Use

  • Primary Use Case: Sentiment analysis for Hausa social media content, such as tweets or Facebook posts.
  • Target Users: NLP researchers, businesses analyzing social media, and developers building sentiment analysis tools for Hausa language content.
  • Example Usage:
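    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    
    # Load the model and tokenizer
    tokenizer = AutoTokenizer.from_pretrained("Kumshe/Hausa-sentiment-analysis")
    model = AutoModelForSequenceClassification.from_pretrained("Kumshe/Hausa-sentiment-analysis")
    
    # Encode the input text, truncating to the model's maximum input length
    inputs = tokenizer("Your Hausa text here", return_tensors="pt", truncation=True)
    
    # Get model predictions without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
    
    # Map the highest-scoring logit to its label name from the model config
    # (names may be generic, e.g. LABEL_0, if none were saved with the model)
    predicted_id = outputs.logits.argmax(dim=-1).item()
    print(model.config.id2label[predicted_id])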
    

Model Architecture

  • Base Model: BERT (Bidirectional Encoder Representations from Transformers)
  • Pre-trained Model: bert-base-cased from the Hugging Face Transformers library.
  • Fine-Tuned Model: Fine-tuned for 40 epochs on a Hausa sentiment dataset.

Training Data

  • Data Source: The model was trained on a dataset containing 35,000 examples from social media platforms such as Twitter and Facebook.
  • Data Split:
    • Training Set: 80% of the data
    • Validation Set: 20% of the data
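
The 80/20 split above could be reproduced along the following lines. This is a minimal sketch, not the author's preprocessing code: the function name `split_dataset`, the fixed seed, and the placeholder rows are assumptions made for illustration. Only the 35,000 example count and the 80/20 ratio come from this card.

```python
import random


def split_dataset(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy of the examples and split it into (train, validation)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder rows standing in for the 35,000 (text, label) pairs.
examples = [(f"text {i}", i % 3) for i in range(35_000)]
train_set, validation_set = split_dataset(examples)
print(len(train_set), len(validation_set))  # 28000 7000
```

A stratified split would additionally preserve the label proportions in both subsets; the card does not say which approach was used.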

Training Details

  • Number of Epochs: 40
  • Batch Size:
    • Per device training batch size: 32
    • Per device evaluation batch size: 64
  • Warm-up Steps: 10
  • Weight Decay: 0.01
  • Optimizer: AdamW
  • Training Hardware: Trained on Kaggle using 2 NVIDIA T4 GPUs.
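
To make the scale of this configuration concrete, the sketch below collects the hyperparameters listed above and derives the number of optimizer steps. The dictionary keys mirror parameter names of `transformers.TrainingArguments`, but whether the author used the `Trainer` API is an assumption. So are data-parallel training across both GPUs, no gradient accumulation, and a 28,000-example training set (80% of 35,000).

```python
import math

# Hyperparameters as listed in this card; keys follow TrainingArguments naming.
hyperparameters = {
    "num_train_epochs": 40,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "warmup_steps": 10,
    "weight_decay": 0.01,
}

# Assumptions (not stated in the card): both T4 GPUs run in data-parallel mode,
# there is no gradient accumulation, and the training set has 28,000 examples.
num_gpus = 2
train_examples = 28_000

effective_batch_size = hyperparameters["per_device_train_batch_size"] * num_gpus
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * hyperparameters["num_train_epochs"]
warmup_share = hyperparameters["warmup_steps"] / total_steps

print(effective_batch_size, steps_per_epoch, total_steps)
print(f"warm-up covers {warmup_share:.2%} of training")
```

Under these assumptions the 10 warm-up steps are a very small fraction of the run, so the learning rate would reach its peak almost immediately.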

Evaluation Metrics

  • Evaluation Loss: 0.6265
  • Accuracy: 73.47%
  • F1 Score: 73.47%
  • Precision: 73.54%
  • Recall: 73.47%
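
The card does not say how precision, recall, and F1 were averaged across the three classes. The reported recall equals the accuracy, which is what support-weighted averaging produces, so the sketch below assumes that mode. The function name `weighted_metrics` and the toy labels are hypothetical; this is not the author's evaluation code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / total     # class share of the evaluation set
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (not taken from the model's dataset).
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

With support weighting, recall is always identical to accuracy, while precision and F1 can differ from it, as they do on the toy labels here.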

Model Performance

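On this dataset the model reaches 73.47% accuracy, with precision, recall, and F1 score all within 0.1 percentage points of one another. This balance suggests it does not strongly trade precision for recall, which makes it a reasonable baseline for general sentiment analysis of Hausa social media text.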

Limitations

  • The model may not generalize well to other types of Hausa text outside of social media (e.g., formal writing or literature).
  • Performance may degrade on text containing slang or regional dialects that are not well represented in the training data.
  • The model reflects the distribution of its training data, so any biases in that data may carry over into its predictions.

Ethical Considerations

  • Sentiment analysis models can potentially amplify biases present in the training data.
  • Use the model cautiously in sensitive applications to avoid unintended consequences.
  • Consider privacy and data protection laws, especially when analyzing social media content.

License

Citation

If you use this model in your work, please cite it as follows:

@misc{Kumshe2024HausaSentimentAnalysis,
  author = {Umar Muhammad Mustapha Kumshe},
  title = {Hausa Sentiment Analysis},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Kumshe/Hausa-sentiment-analysis}},
}

Contributions

This model was fine-tuned by Umar Muhammad Mustapha Kumshe. Feel free to contribute, provide feedback, or raise issues on the model repository.
