lukasweber committed
Commit 2ba71c4 · 1 parent: a585556
Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
  WG-BERT (Warranty and Goodwill) is a pretrained encoder based model to analyze automotive entities in automotive-related texts. WG-BERT is trained by continually
  pretraining the BERT language model in the automotive domain by using a corpus of automotive (workshop feedback) texts via the masked language modeling (MLM) approach.
  WG-BERT is further fine-tuned for automotive entity recognition (subtask of Named Entity Recognition (NER)) to extract components and their complaints out of automotive texts.
- The dataset for continual pretraining consists of ~
+ The dataset for continual pretraining consists of 1.8 million workshop feedback texts which contain ~4 million sentences.
  The dataset for fine-tuning consists of ~5.500 gold annotated sentences by automotive domain experts.
  We choose as the training architecture the BERT-base-uncased version.
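The updated README describes two training stages: continual MLM pretraining on workshop feedback texts, followed by fine-tuning for automotive entity recognition. The sketches below illustrate both stages with the Hugging Face transformers library; the repository id "lukasweber/WG-BERT", the toy corpus, and the entity label names are assumptions for illustration, not confirmed details of the released model.

A minimal continual-pretraining sketch, assuming bert-base-uncased as the starting checkpoint and a placeholder corpus standing in for the ~4 million real sentences:

```python
from datasets import Dataset
from transformers import (
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from the base checkpoint named in the model card.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Placeholder corpus; the real corpus is ~1.8 million workshop feedback texts.
corpus = Dataset.from_dict(
    {"text": ["Customer reports a rattling noise from the rear axle."]}
)
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Dynamic token masking for the MLM objective (standard 15% masking rate).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="wg-bert-mlm", per_device_train_batch_size=16),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

For the fine-tuned NER model, a hedged inference sketch using the token-classification pipeline; the checkpoint id and the component/complaint label names it prints are hypothetical:

```python
from transformers import pipeline

# Hypothetical repository id -- replace with the actual WG-BERT NER checkpoint.
ner = pipeline(
    "token-classification",
    model="lukasweber/WG-BERT",
    aggregation_strategy="simple",
)

text = "The brake pedal vibrates strongly when braking at high speed."
for entity in ner(text):
    # entity_group is expected to be a component or complaint label.
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```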