---
library_name: transformers
license: apache-2.0
base_model:
- openai-community/gpt2
---
# Fine-Tuned GPT-2 for Sentiment Analysis
## Model Description
This is a fine-tuned version of the GPT-2 model for sentiment analysis on tweets. The model has been trained on the `mteb/tweet_sentiment_extraction` dataset to classify tweets into three sentiment categories: **Positive**, **Neutral**, and **Negative**. It uses the Hugging Face Transformers library and achieves an evaluation accuracy of **76%**.
---
## Model Details
### Developer/Owner
- **Creator**: KUSURU SRESHTA
- **Contact**: [email protected]
### Model Type
- **Architecture**: GPT-2
- **Fine-Tuned Task**: Sentiment Analysis
### Dataset
- **Name**: `mteb/tweet_sentiment_extraction`
- **Description**: A dataset for extracting and classifying sentiment in tweets.
- **Language**: English
- **Size**: 1,000 samples used for training and 1,000 for evaluation.
### Training Configuration
- **Tokenizer**: GPT-2 Tokenizer (with EOS token as pad token)
- **Optimizer**: AdamW
- **Learning Rate**: 1e-5
- **Epochs**: 3
- **Batch Size**: 1
- **Hardware Used**: A100
### Performance
- **Accuracy**: 76%
- **Evaluation Metric**: Accuracy
- **Validation Split**: 10% of the dataset.
---
## Usage
This model is designed for sentiment analysis of tweets or other short social media text. Given an input text, it predicts the sentiment as Positive, Neutral, or Negative.
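The model's raw output is a vector of three logits, one per class, so the predicted index still has to be mapped to a label name. The sketch below shows that step in plain Python. The `ID2LABEL` mapping (0 = Negative, 1 = Neutral, 2 = Positive) is an assumption based on the label order of `mteb/tweet_sentiment_extraction`; it is worth checking against `model.config.id2label` for this checkpoint. The helper name `logits_to_sentiment` and the sample logits are hypothetical.

```python
import math

# ASSUMPTION: label order follows the mteb/tweet_sentiment_extraction dataset.
# Verify against model.config.id2label for the actual checkpoint.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def logits_to_sentiment(logits):
    """Convert a list of raw class logits into (label, probability)."""
    # Subtract the max logit for numerical stability before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted class is the index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, e.g. what outputs.logits[0].tolist() might return.
label, prob = logits_to_sentiment([-1.2, 0.3, 2.5])
print(f"{label} ({prob:.2f})")
```

Reporting the softmax probability alongside the label makes it easier to spot low-confidence predictions, which may matter given the reported 76% accuracy.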
### Example Code
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_id = "charlie1898/gpt2_finetuned_twitter_sentiment_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

# Example input
text = "I love using Hugging Face models!"
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class index along the label dimension
predicted_class = torch.argmax(outputs.logits, dim=-1).item()
print(f"Predicted sentiment class: {predicted_class}")
```
## Limitations
- **Bias**: The dataset may contain biased or harmful text, potentially influencing predictions.
- **Domain Limitations**: Optimized for English tweets; performance may degrade on other text types or languages.
## Ethical Considerations
This model should be used responsibly. Be aware of biases in the training data and avoid deploying the model in sensitive or high-stakes applications without further validation.
## Acknowledgments
- Hugging Face Transformers library
- mteb/tweet_sentiment_extraction dataset