---
library_name: transformers
license: apache-2.0
base_model:
- openai-community/gpt2
---
# Fine-Tuned GPT-2 for Sentiment Analysis
## Model Description
This is a fine-tuned version of the GPT-2 model for sentiment analysis on tweets. The model has been trained on the `mteb/tweet_sentiment_extraction` dataset to classify tweets into three sentiment categories: **Positive**, **Neutral**, and **Negative**. It uses the Hugging Face Transformers library and achieves an evaluation accuracy of **76%**.
---
## Model Details
### Developer/Owner
- **Creator**: KUSURU SRESHTA
- **Contact**: [email protected]
### Model Type
- **Architecture**: GPT-2
- **Fine-Tuned Task**: Sentiment Analysis
### Dataset
- **Name**: `mteb/tweet_sentiment_extraction`
- **Description**: A dataset for extracting and classifying sentiment in tweets.
- **Language**: English
- **Size**: 1,000 samples used for training and 1,000 for evaluation (a loading sketch follows below).
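The dataset can be pulled directly from the Hub with the `datasets` library. A minimal loading sketch; the split names and field names (`text`, `label`, `label_text`) are taken from the dataset card and should be verified:

```python
from datasets import load_dataset

# Load the tweet sentiment dataset (published with train/test splits on the Hub)
ds = load_dataset("mteb/tweet_sentiment_extraction")

print(ds)              # split names and sizes
print(ds["train"][0])  # e.g. {'id': ..., 'text': ..., 'label': 0-2, 'label_text': ...}
```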
### Training Configuration
- **Tokenizer**: GPT-2 Tokenizer (with EOS token as pad token)
- **Optimizer**: AdamW
- **Learning Rate**: 1e-5
- **Epochs**: 3
- **Batch Size**: 1
- **Hardware Used**: A100 GPU (a replication sketch follows below)
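A minimal sketch of how this configuration could be reproduced with the `Trainer` API. The preprocessing, the 1,000-sample subsets, and the three-label head setup are assumptions based on the details above, not the original training script:

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# GPT-2 ships without a pad token, so the EOS token is reused (as noted above)
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Sequence-classification head with three labels: negative, neutral, positive
model = AutoModelForSequenceClassification.from_pretrained(
    "openai-community/gpt2", num_labels=3
)
model.config.pad_token_id = tokenizer.pad_token_id

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

ds = load_dataset("mteb/tweet_sentiment_extraction")
train_ds = ds["train"].shuffle(seed=42).select(range(1000)).map(tokenize, batched=True)
eval_ds = ds["test"].shuffle(seed=42).select(range(1000)).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="gpt2-tweet-sentiment",
    learning_rate=1e-5,             # as listed above
    num_train_epochs=3,             # as listed above
    per_device_train_batch_size=1,  # as listed above
    optim="adamw_torch",            # AdamW optimizer
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```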
### Performance
- **Accuracy**: 76%
- **Evaluation Metric**: Accuracy
- **Validation Split**: 10% of the dataset (an evaluation sketch follows below).
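One plausible way to reproduce the accuracy figure is with the `evaluate` library; the original evaluation script is not published, so this is a sketch:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Convert logits to class predictions, then score against the gold labels
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

# Pass as Trainer(..., compute_metrics=compute_metrics); trainer.evaluate()
# then reports eval_accuracy, which the model card states is about 0.76.
```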
---
## Usage
This model is designed for sentiment analysis of tweets or other short social media text. Given an input text, it predicts the sentiment as Positive, Neutral, or Negative.
### Example Code
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("charlie1898/gpt2_finetuned_twitter_sentiment_analysis")
model = AutoModelForSequenceClassification.from_pretrained("charlie1898/gpt2_finetuned_twitter_sentiment_analysis")

# Example input
text = "I love using Hugging Face models!"
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

predicted_class = torch.argmax(outputs.logits, dim=-1).item()
print(f"Predicted sentiment class: {predicted_class}")
```
---
## Limitations
- **Bias**: The dataset may contain biased or harmful text, which can influence the model's predictions.
- **Domain Limitations**: The model is optimized for English tweets; performance may degrade on other text types or languages.
---
## Ethical Considerations
This model should be used responsibly. Be aware of biases in the training data and avoid deploying the model in sensitive or high-stakes applications without further validation.
---
## Acknowledgments
- Hugging Face Transformers library
- `mteb/tweet_sentiment_extraction` dataset