lukasweber committed on
Commit
2ba71c4
·
1 Parent(s): a585556

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 WG-BERT (Warranty and Goodwill) is a pretrained encoder-based model to analyze automotive entities in automotive-related texts. WG-BERT is trained by continually
 pretraining the BERT language model in the automotive domain using a corpus of automotive (workshop feedback) texts via the masked language modeling (MLM) approach.
 WG-BERT is further fine-tuned for automotive entity recognition (a subtask of Named Entity Recognition (NER)) to extract components and their complaints from automotive texts.
-The dataset for continual pretraining consists of ~4 million sentences.
+The dataset for continual pretraining consists of 1.8 million workshop feedback texts, which contain ~4 million sentences.
 The dataset for fine-tuning consists of ~5,500 sentences gold-annotated by automotive domain experts.
 We choose the BERT-base-uncased version as the training architecture.
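
The README in this commit describes continual pretraining of bert-base-uncased with the MLM objective on workshop feedback texts. Below is a minimal sketch of that step with the Hugging Face transformers Trainer, assuming a one-sentence-per-line text corpus; the file name `workshop_feedback.txt`, the hyperparameters, and the output directory are placeholders, not taken from this commit.

```python
# Minimal continual-pretraining sketch (MLM) under assumed inputs;
# nothing here is confirmed by the commit itself.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One sentence per line; "workshop_feedback.txt" is a hypothetical file.
dataset = load_dataset("text", data_files="workshop_feedback.txt")["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Standard MLM objective: the collator masks a fraction of tokens
# (15% here) for the model to predict.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="wg-bert-mlm", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```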
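The fine-tuned model extracts components and their complaints as a token-classification task. A minimal inference sketch with the transformers pipeline follows; the repo id "lukasweber/WG-BERT" and the entity labels in the example are assumptions, not confirmed by this commit.

```python
# Minimal inference sketch for a token-classification (NER) model;
# the repo id below is an assumed placeholder.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="lukasweber/WG-BERT",      # assumed repo id, not from the commit
    aggregation_strategy="simple",   # merge word pieces into entity spans
)

# Example workshop-feedback sentence; the component/complaint label
# names the model emits are assumptions.
for entity in ner("The brake pads squeal when braking at low speed."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```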