---
datasets:
- stanfordnlp/sst2
language:
- en
metrics:
- accuracy: 0.91789
---

# Fine-Tuned RoBERTa Model for Sentiment Analysis

## Overview

This is a [RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta) model fine-tuned for sentiment analysis on the [SST-2 dataset](https://huggingface.co/datasets/stanfordnlp/sst2). It classifies text into two sentiment categories:

- **0**: Negative
- **1**: Positive

The model achieves an accuracy of **91.789%** on the SST-2 test set, which makes it a reasonable baseline for binary sentiment classification of English text.

---

## Model Details

- **Model architecture**: RoBERTa
- **Dataset**: `stanfordnlp/sst2`
- **Language**: English
- **Model size**: 125 million parameters
- **Precision**: FP32
- **File format**: [Safetensors](https://github.com/huggingface/safetensors)
- **Tags**: Text Classification, Transformers, Safetensors, SST-2, English, RoBERTa, Inference Endpoints

---

## Usage

### Installation

Ensure you have the necessary libraries installed:

```bash
pip install transformers torch safetensors
```

### Loading the Model

The model can be loaded with the Hugging Face `transformers` library as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and model
model_name = "syedkhalid076/RoBERTa-Sentimental-Analysis-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example text
text = "This is an amazing product!"

# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Perform inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = logits.argmax(dim=-1).item()

# Map the prediction to sentiment
sentiments = {0: "Negative", 1: "Positive"}
print(f"Sentiment: {sentiments[predicted_class]}")
```
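
The snippet above prints only the most likely label. To also report a confidence score, the two logits can be passed through a softmax. The following is a minimal, dependency-free sketch; the helper names `softmax` and `label_with_confidence` and the example logit values are illustrative assumptions, not part of the model's API.

```python
import math

# Label mapping taken from this model card: 0 = Negative, 1 = Positive
SENTIMENTS = {0: "Negative", 1: "Positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SENTIMENTS[best], probs[best]


# Hypothetical logits, e.g. obtained via outputs.logits[0].tolist()
label, confidence = label_with_confidence([-2.0, 3.0])
print(f"Sentiment: {label} (confidence {confidence:.3f})")
```

With PyTorch available, `torch.softmax(logits, dim=-1)` yields the same probabilities directly on the logits tensor.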

---

## Performance

### Dataset

The model was fine-tuned and evaluated on **SST-2**, the binary version of the Stanford Sentiment Treebank and a widely used benchmark for sentiment analysis built from movie-review sentences.

### Metrics

| Metric   | Value   |
|----------|---------|
| Accuracy | 91.789% |
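
Accuracy here is the fraction of examples whose predicted label matches the reference label. A minimal sketch of that computation follows; the function name `accuracy` and the toy predictions and labels are made up for illustration and are not the evaluation data behind the reported figure.

```python
def accuracy(predictions, references):
    """Fraction of positions where the prediction equals the reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy example (hypothetical labels): 0 = Negative, 1 = Positive
toy_predictions = [1, 0, 1, 1, 0]
toy_references = [1, 0, 0, 1, 0]
score = accuracy(toy_predictions, toy_references)
print(f"Accuracy: {score:.1%}")
```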

---

## Deployment

The model is hosted on the Hugging Face Hub and can be deployed through Hugging Face [Inference Endpoints](https://huggingface.co/inference-endpoints).
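
Once an endpoint has been created, it can typically be queried over HTTPS with a JSON payload. The sketch below only builds the request and parses a response; it makes no network call when run. The endpoint URL, the token, the `LABEL_0`/`LABEL_1` label names, and the sample response are placeholder assumptions: check the endpoint's own page and the model's `config.json` for the real values and response shape.

```python
import json
import urllib.request

# Placeholder values (assumptions): replace with your endpoint URL and access token
ENDPOINT_URL = "https://example.invalid/your-endpoint"
API_TOKEN = "hf_xxx"

# Assumed label names; the deployed model's config may define different ones
LABEL_NAMES = {"LABEL_0": "Negative", "LABEL_1": "Positive"}


def build_request(text, url=ENDPOINT_URL, token=API_TOKEN):
    """Create a POST request carrying the text as a JSON 'inputs' field."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def top_sentiment(response_body):
    """Pick the highest-scoring entry from a list of {'label', 'score'} dicts."""
    scores = json.loads(response_body)
    # Some deployments wrap the per-input list in an outer list; unwrap it if so
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    best = max(scores, key=lambda item: item["score"])
    return LABEL_NAMES.get(best["label"], best["label"]), best["score"]


request = build_request("This is an amazing product!")
# To send it: urllib.request.urlopen(request).read()

# Hypothetical response body, used here only to exercise the parser
sample = '[{"label": "LABEL_1", "score": 0.98}, {"label": "LABEL_0", "score": 0.02}]'
print(top_sentiment(sample))
```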

---

## Applications

This model can be used in a variety of applications, such as:

- Customer feedback analysis
- Social media sentiment monitoring
- Product review classification
- Opinion mining for research purposes
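
For use cases such as customer feedback analysis or review classification, per-text predictions are usually aggregated into a summary. Here is a minimal sketch, assuming the class ids have already been predicted by the model; the function name `summarize` and the sample predictions are hypothetical.

```python
from collections import Counter

# Label mapping from this model card: 0 = Negative, 1 = Positive
SENTIMENTS = {0: "Negative", 1: "Positive"}


def summarize(predicted_classes):
    """Count predictions per sentiment and compute the share of positives."""
    counts = Counter(SENTIMENTS[c] for c in predicted_classes)
    total = sum(counts.values())
    positive_share = counts["Positive"] / total if total else 0.0
    return {
        "negative": counts["Negative"],
        "positive": counts["Positive"],
        "positive_share": positive_share,
    }


# Hypothetical predictions for six reviews
summary = summarize([1, 1, 0, 1, 0, 1])
print(summary)
```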

---

## Limitations

While the model performs well on the SST-2 dataset, consider these limitations:

1. SST-2 consists of movie-review sentences, so the model may not generalize well to domains whose language or sentiment cues differ from the training data.
2. It supports only binary sentiment classification (positive/negative); it has no neutral or mixed class.

To fine-tune on a custom dataset or add labels, refer to the [Hugging Face documentation](https://huggingface.co/docs/transformers/training).

---

## Model Card

| **Feature**     | **Details**      |
|-----------------|------------------|
| **Language**    | English          |
| **Model size**  | 125M parameters  |
| **File format** | Safetensors      |
| **Precision**   | FP32             |
| **Dataset**     | stanfordnlp/sst2 |
| **Accuracy**    | 91.789%          |

---

## Contributing

Contributions to improve the model or extend its capabilities are welcome. Fork this repository, make your changes, and submit a pull request.

---

## Acknowledgments

- The [Hugging Face Transformers library](https://github.com/huggingface/transformers) for model implementation and fine-tuning utilities.
- The [Stanford Sentiment Treebank 2 (SST-2)](https://huggingface.co/datasets/stanfordnlp/sst2) dataset for providing high-quality sentiment analysis data.

---

**Author**: Syed Khalid Hussain