asier-gutierrez
committed on
Commit
·
a96d54e
1
Parent(s):
f77e7fe
Create README.md
README.md
ADDED
@@ -0,0 +1,37 @@
|
1 |
+
---
|
2 |
+
language:
|
3 |
+
- en
|
4 |
+
license: apache-2.0
|
5 |
+
tags:
|
6 |
+
- "finance"
|
7 |
+
- "sentiment analysis"
|
8 |
+
- "regression"
|
9 |
+
- "sentence bert"
|
10 |
+
datasets:
|
11 |
+
- "RavenPack"
|
12 |
+
metrics:
|
13 |
+
- "rmse"
|
14 |
+
|
15 |
+
---
|
16 |
+
|
17 |
+
# SentenceBERT for Financial News Sentiment Regression
|
18 |
+
|
19 |
+
## Introduction
|
20 |
+
Analyzing the sentiment of financial news is a complex task that requires a deep understanding of financial jargon, knowledge of each company's context, and an awareness of how the whole economy interacts as an ecosystem.
|
21 |
+
|
22 |
+
The [FinBERT](https://huggingface.co/ProsusAI/finbert) model treats sentiment as a classification problem, assigning discrete labels such as positive or negative. However, discrete classification is too simplistic and does not reflect reality.
|
23 |
+
|
24 |
+
RavenPack has an excellent, large, hand-labelled dataset with a continuous sentiment label ranging from -1 to 1. We collected data from the previous two years for training and tested on data from the following two weeks. Additionally, we took one-year and six-month subsamples of the dataset to see how the model scales with more data and whether more data helps at all.
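
A time-based split of this kind can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the `time_split` helper, the cut-off date, and the choice to take each subsample as the window immediately before the test period are all hypothetical and not taken from the original code.

```python
from datetime import date, timedelta


def time_split(records, test_start, train_days, test_days=14):
    """Split (date, text, score) records into train and test sets by date.

    Train covers the `train_days` days immediately before `test_start`;
    test covers the `test_days` days starting at `test_start`.
    """
    train_start = test_start - timedelta(days=train_days)
    test_end = test_start + timedelta(days=test_days)
    train = [r for r in records if train_start <= r[0] < test_start]
    test = [r for r in records if test_start <= r[0] < test_end]
    return train, test


# Hypothetical toy records: (publication date, headline, sentiment in [-1, 1]).
records = [
    (date(2020, 1, 10), "Headline A", -0.40),
    (date(2021, 3, 5), "Headline B", 0.15),
    (date(2021, 11, 20), "Headline C", 0.80),
    (date(2022, 1, 3), "Headline D", -0.10),
]
test_start = date(2022, 1, 1)  # assumed cut-off, not taken from the source

# Two-year, one-year, and six-month training windows share the same test set.
two_years, test = time_split(records, test_start, train_days=730)
one_year, _ = time_split(records, test_start, train_days=365)
six_months, _ = time_split(records, test_start, train_days=182)
```

Keeping the test window fixed while shrinking the training window is what makes the three models comparable when checking whether more data helps.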
|
25 |
+
|
26 |
+
In this repository, you can find the different models by changing the branch name. The main branch holds the model trained on the whole dataset. We also uploaded the FinBERT regressor to the Hub:
|
27 |
+
|
28 |
+
## Evaluation
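
The metadata of this model card lists RMSE as the metric. As a minimal sketch of how that metric is computed for sentiment scores in the range -1 to 1, the snippet below uses made-up labels and predictions; the values are purely illustrative and are not results of this model.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of floats."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical labels and predictions, for illustration only.
labels = [-0.5, 0.0, 0.5, 1.0]
preds = [-0.3, 0.1, 0.4, 0.6]
score = rmse(labels, preds)
print(f"RMSE: {score:.4f}")
```

Because the labels span a range of width 2, an RMSE should be read relative to that scale.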
|
29 |
+
|
30 |
+
|
31 |
+
## Code
|
32 |
+
You can find the code for this model at the following link: https://github.com/lhf-labs/finance-news-analysis-bert
|
33 |
+
|
34 |
+
## Citation
|
35 |
+
```
|
36 |
+
TBA
|
37 |
+
```
|