Create README.md
README.md
ADDED
@@ -0,0 +1,19 @@
# Danish Sentiment Analysis
## Information
- Dataset: [DDSC/angry-tweets](https://huggingface.co/datasets/DDSC/angry-tweets)
- Base model: [Danish BERT (BotXO)](https://huggingface.co/Maltehb/danish-bert-botxo)

## Approach
- Preprocessing
  - Removing the `@USER` and `[LINK]` placeholders that stand in for usernames and links
  - Removing hashtags, as they generally do not contribute to sentiment
  - Removing emoji, as the models used in this notebook do not take emoji into account (replacing them with their meaning could also be tested)
  - Lowercasing the text
  - Removing stopwords, using the Danish stopword list from NLTK

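The preprocessing steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `preprocess`, the regular expressions, and the small `DANISH_STOPWORDS` set are assumptions made for this example. The stopword set is only a tiny subset of the NLTK Danish list, so the sketch runs without downloads, and the emoji pattern is an approximation that covers common ranges only.

```python
import re

# Illustrative subset of NLTK's Danish stopword list (assumption: the real
# pipeline would load the full list with nltk.corpus.stopwords.words("danish")
# after running nltk.download("stopwords")).
DANISH_STOPWORDS = {"og", "i", "jeg", "det", "at", "en", "den", "til", "er", "som", "på"}

# Placeholders that stand in for usernames and links in the tweets.
PLACEHOLDER_RE = re.compile(r"@USER|\[LINK\]")

# A hashtag is a '#' followed by word characters.
HASHTAG_RE = re.compile(r"#\w+")

# Approximate emoji coverage; not an exhaustive definition of emoji.
EMOJI_RE = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicators (flags)
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)


def preprocess(text: str) -> str:
    """Apply the cleaning steps in the order listed in the README."""
    text = PLACEHOLDER_RE.sub(" ", text)  # drop @USER and [LINK]
    text = HASHTAG_RE.sub(" ", text)      # drop hashtags
    text = EMOJI_RE.sub(" ", text)        # drop emoji
    text = text.lower()                   # lowercase
    # Whitespace tokenization: punctuation attached to a word is kept as-is.
    tokens = [tok for tok in text.split() if tok not in DANISH_STOPWORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("@USER Det er en dårlig dag 😡 #mandag [LINK]"))
```

The placeholders are removed before lowercasing because the patterns are case-sensitive. Note that stopword lists often contain negations, so it may be worth checking whether stopword removal actually helps sentiment accuracy.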
- Training with the Hugging Face `Trainer`
- Training with a custom PyTorch loop
- Uploading the model to the Hugging Face Hub
- Serving the model through a FastAPI endpoint
- Packaging the API service as a Docker container
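As an illustration of the FastAPI endpoint step, one possible shape for the service is sketched below. The names `predict_sentiment`, `create_app`, `SentimentRequest`, and the `/predict` route are hypothetical and not taken from the repository. The classifier is passed in as a plain callable, assumed to return a list of `{"label", "score"}` dictionaries, which is the shape a `transformers` text-classification pipeline produces.

```python
from typing import Callable, Dict, List

# Assumption: any callable that maps a list of texts to a list of
# {"label": ..., "score": ...} dicts can act as the classifier.
Classifier = Callable[[List[str]], List[Dict[str, object]]]


def predict_sentiment(text: str, classifier: Classifier) -> Dict[str, object]:
    """Classify a single text and return a JSON-friendly dictionary."""
    result = classifier([text])[0]
    return {"text": text, "label": result["label"], "score": float(result["score"])}


def create_app(classifier: Classifier):
    """Build the FastAPI app around a classifier.

    FastAPI and pydantic are imported here so that predict_sentiment
    stays usable (and testable) without the web dependencies installed.
    """
    from fastapi import FastAPI
    from pydantic import BaseModel

    class SentimentRequest(BaseModel):
        text: str  # the tweet or sentence to classify

    app = FastAPI(title="Danish Sentiment Analysis")

    @app.post("/predict")  # hypothetical route name
    def predict(payload: SentimentRequest):
        return predict_sentiment(payload.text, classifier)

    return app
```

With `transformers` installed, the classifier could be a `pipeline("text-classification", model=...)` object pointing at the uploaded model, and the app returned by `create_app` could be served with `uvicorn`. Keeping the prediction logic in a plain function means it can be exercised with a stub classifier, without loading the model or starting a server.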