FinBERT Sentiment Analysis on English/Quotes Dataset

📌 Overview

This repository hosts a FinBERT model fine-tuned for sentiment analysis on the English/Quotes dataset. The model assigns each input text one of three sentiment labels: negative, neutral, or positive.

πŸ— Model Details

  • Model Architecture: FinBERT (BERT-based model pre-trained on financial text for sentiment analysis)
  • Task: Sentiment Analysis
  • Dataset: English/quotes dataset
  • Fine-tuning Framework: Hugging Face Transformers

🚀 Usage

Installation

pip install transformers torch

Loading the Model

from transformers import BertTokenizer, BertForSequenceClassification
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

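# NOTE: Hub repository IDs take the form "namespace/repo_name"; the ID below is kept as published, so adjust it if loading fails.
model_name = "Aventiq-AI/finbert-english/quotes"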
model = BertForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = BertTokenizer.from_pretrained(model_name)

Sentiment Classification Inference

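def predict_sentiment(text):
    """Return the predicted sentiment label for a single string."""
    # Tokenize, truncating or padding to 128 tokens
    inputs = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="pt")
    inputs = {key: val.to(device) for key, val in inputs.items()}  # Move inputs to device
    with torch.no_grad():  # Inference only, so skip gradient tracking
        outputs = model(**inputs)
    logits = outputs.logits
    prediction = torch.argmax(logits, dim=-1).item()  # Index of the highest-scoring class
    # Class-index order as given in this card; compare with model.config.id2label if in doubt
    label_map = {0: "negative", 1: "neutral", 2: "positive"}
    return label_map[prediction]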
 
# Test on the original 5 quotes
original_quotes = [
    "“Be yourself; everyone else is already taken.”",
    "“I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.”",
    "“Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.”",
    "“So many books, so little time.”",
    "“A room without books is like a body without a soul.”"
]
 
print("Predictions for original quotes:")
for quote in original_quotes:
    pred = predict_sentiment(quote)
    print(f"Quote: {quote}\nPredicted Sentiment: {pred}\n")
 
# Test on a new example
new_quote = "Life is beautiful when you smile."
print("Prediction for new quote:")
print(f"Quote: {new_quote}\nPredicted Sentiment: {predict_sentiment(new_quote)}")
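
If a confidence score is useful alongside the label, the logits can be passed through a softmax. The following is a minimal sketch in plain Python, assuming the same class order as the label_map above. The helper logits_to_probs and the example logits are hypothetical; in practice the input would come from something like outputs.logits[0].tolist().

```python
import math

# Class order copied from the label_map in the inference function (an assumption
# about this checkpoint; model.config.id2label is the authoritative source).
LABELS = ["negative", "neutral", "positive"]

def logits_to_probs(logits):
    """Convert raw logits for one input into a {label: probability} dict via softmax."""
    peak = max(logits)  # Subtract the max logit for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}

# Hypothetical logits for a single sentence (not real model output)
example_logits = [-1.2, 0.3, 2.1]
probs = logits_to_probs(example_logits)
best = max(probs, key=probs.get)
print(f"Predicted Sentiment: {best} (p={probs[best]:.3f})")
```

With tensors, torch.softmax(logits, dim=-1) performs the same computation directly on the model output.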

📊 Evaluation Metrics: Accuracy & F1 Score

For sentiment analysis, accuracy and F1-score are key evaluation metrics. The model achieves:

  • Accuracy: 88%
  • F1 Score: 0.85
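
For readers who want to compute the same kind of metrics on their own labelled data, here is a minimal sketch in plain Python. It assumes macro-averaged F1 over the three classes; this card does not state which averaging was used for the reported score, and the toy labels below are made up for illustration, not taken from the evaluation set.

```python
LABELS = ["negative", "neutral", "positive"]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption here)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy data for illustration only
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]

print(f"Accuracy: {accuracy(y_true, y_pred):.3f}")
print(f"Macro F1: {macro_f1(y_true, y_pred):.3f}")
```

In practice, y_pred would be built by calling predict_sentiment on each held-out example.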

📂 Repository Structure

.
├── model/               # Contains the fine-tuned model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Model weights
└── README.md            # Model documentation

⚠️ Limitations

  • The model may struggle with ambiguous phrases.
  • Performance might vary across different domains and terminologies.
  • The dataset primarily contains English text, so the model is less effective for multilingual applications.
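
One way to handle ambiguous phrases in an application is to abstain when the model is not confident. The sketch below assumes class probabilities are already available as a dict (for example from a softmax over the logits). The function name label_or_abstain and the 0.6 threshold are hypothetical choices, not values tuned for this model.

```python
def label_or_abstain(probs, threshold=0.6):
    """Return the most probable label, or "uncertain" if its probability is below threshold."""
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else "uncertain"

# Made-up probabilities for illustration
confident = {"negative": 0.05, "neutral": 0.15, "positive": 0.80}
ambiguous = {"negative": 0.40, "neutral": 0.35, "positive": 0.25}

print(label_or_abstain(confident))
print(label_or_abstain(ambiguous))
```

A suitable threshold depends on the application and is best chosen on held-out data.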

🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
