# FinBERT Sentiment Analysis on English/Quotes Dataset
## 📌 Overview
This repository hosts a FinBERT model fine-tuned for sentiment analysis on the English/Quotes dataset. The model classifies text as positive, negative, or neutral.
## 🏗 Model Details
- **Model Architecture:** FinBERT (BERT-based model for sentiment analysis)
- **Task:** Sentiment Analysis
- **Dataset:** English/Quotes dataset
- **Fine-tuning Framework:** Hugging Face Transformers
## 🚀 Usage
### Installation
```bash
pip install transformers torch
```
### Loading the Model
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "Aventiq-AI/finbert-english/quotes"
model = BertForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = BertTokenizer.from_pretrained(model_name)
```
### Sentiment Classification Inference
```python
def predict_sentiment(text):
    # Tokenize and pad/truncate to a fixed length of 128 tokens
    inputs = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="pt")
    inputs = {key: val.to(device) for key, val in inputs.items()}  # Move inputs to device
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    prediction = torch.argmax(logits, dim=-1).item()
    label_map = {0: "negative", 1: "neutral", 2: "positive"}
    return label_map[prediction]

# Test on the original 5 quotes
original_quotes = [
    "“Be yourself; everyone else is already taken.”",
    "“I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.”",
    "“Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.”",
    "“So many books, so little time.”",
    "“A room without books is like a body without a soul.”",
]
print("Predictions for original quotes:")
for quote in original_quotes:
    pred = predict_sentiment(quote)
    print(f"Quote: {quote}\nPredicted Sentiment: {pred}\n")

# Test on a new example
new_quote = "Life is beautiful when you smile."
print("Prediction for new quote:")
print(f"Quote: {new_quote}\nPredicted Sentiment: {predict_sentiment(new_quote)}")
```
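The inference function above returns only the top label. When a confidence score is also useful, a softmax over the logits gives class probabilities. The sketch below is a minimal illustration on plain Python floats, so it runs without loading the model. The helper name `label_with_confidence` and the example logits are hypothetical, and the label order is assumed to match the `label_map` used in the inference code.

```python
import math

# Assumption: same label order as the label_map in the inference code.
LABEL_MAP = {0: "negative", 1: "neutral", 2: "positive"}

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_with_confidence(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_MAP[best], probs[best]

# Hypothetical logits for a single input, e.g. outputs.logits[0].tolist()
label, confidence = label_with_confidence([-1.2, 0.3, 2.5])
print(f"{label} ({confidence:.2f})")
```

With the real model, `torch.softmax(outputs.logits, dim=-1)` performs the same computation directly on the output tensor.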
## 📊 Evaluation Metrics: Accuracy & F1 Score
For sentiment analysis, accuracy and F1-score are key evaluation metrics. The model achieves:
- **Accuracy:** 88%
- **F1 Score:** 0.85
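
As a rough illustration of how these two metrics are computed, the sketch below calculates accuracy and macro-averaged F1 in plain Python. Macro averaging is an assumption, since the averaging method behind the reported F1 score is not specified. The function names and toy labels are hypothetical and are not the model's evaluation data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels for illustration only (not the model's evaluation data).
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
labels = ["negative", "neutral", "positive"]

print(f"Accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"Macro F1: {macro_f1(y_true, y_pred, labels):.2f}")
```

If scikit-learn is available, `accuracy_score` and `f1_score(..., average="macro")` from `sklearn.metrics` compute the same quantities.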
## 📂 Repository Structure
```
.
├── model/               # Contains the fine-tuned model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Model weights
└── README.md            # Model documentation
```
## ⚠️ Limitations
- The model may struggle with ambiguous phrases.
- Performance might vary across different domains and terminologies.
- The dataset primarily contains English text, making it less effective for multilingual applications.
## 🀝 Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.