erikycd committed
Commit 5323a2b · 1 Parent(s): f2aba79

Update README.md

Files changed (1):
  1. README.md +66 -0
README.md CHANGED
@@ -1,3 +1,69 @@
 ---
 license: gpl-3.0
 ---
+ ---
+ language: en
+ tags:
+ - exbert
+ license:
+ datasets:
+ - bookcorpus
+ - wikipedia
+ ---
+
+ # BERT base model (uncased)
+
+ Pretrained model on the English language using a masked language modeling (MLM) objective.
+
+ ## Model description
+
+ BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it
+ was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of
+ publicly available data), with an automatic process to generate inputs and labels from those texts. More precisely, it
+ was pretrained with two objectives:
+
+ - Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs
+ the entire masked sentence through the model and has to predict the masked words (a toy sketch of this step follows
+ the list). This is different from traditional recurrent neural networks (RNNs), which usually see the words one after
+ the other, or from autoregressive models like GPT, which internally mask the future tokens. It allows the model to
+ learn a bidirectional representation of the sentence.
+ - Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes
+ they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to
+ predict whether the two sentences were following each other or not.
+
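+ As a toy illustration of the MLM masking step (not the actual pretraining code, which also sometimes replaces a chosen
+ word with a random one or leaves it unchanged), one could mask tokens like this:
+
+ ```python
+ import random
+ from transformers import BertTokenizer
+
+ tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+ tokens = tokenizer.tokenize("the quick brown fox jumps over the lazy dog")
+ # Hide roughly 15% of the tokens with the [MASK] token; the model must then predict them
+ masked = [tokenizer.mask_token if random.random() < 0.15 else tok for tok in tokens]
+ print(masked)
+ ```
+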
+ This way, the model learns an inner representation of the English language that can then be used to extract features
+ useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard
+ classifier using the features produced by the BERT model as inputs.
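+
+ A minimal, hypothetical sketch of that idea, feeding BERT's pooled sentence features to a scikit-learn classifier (the
+ sentences and labels below are made up purely for illustration):
+
+ ```python
+ import torch
+ from sklearn.linear_model import LogisticRegression
+ from transformers import BertTokenizer, BertModel
+
+ tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+ model = BertModel.from_pretrained('bert-base-uncased')
+
+ # Made-up example sentences with sentiment labels (1 = positive, 0 = negative)
+ sentences = ["I really enjoyed this movie.", "This film was a waste of time."]
+ labels = [1, 0]
+
+ with torch.no_grad():
+     encoded = tokenizer(sentences, padding=True, return_tensors='pt')
+     features = model(**encoded).pooler_output.numpy()  # one feature vector per sentence
+
+ clf = LogisticRegression().fit(features, labels)
+ print(clf.predict(features))
+ ```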
+
+ ### How to use
+
+ You can use this model directly with a pipeline for masked language modeling:
+
+ ```python
+ >>> from transformers import pipeline
+ >>> unmasker = pipeline('fill-mask', model='bert-base-uncased')
+ >>> unmasker("Hello I'm a [MASK] model.")
+ ```
+
+ Here is how to use this model to get the features of a given text in PyTorch:
+
+ ```python
+ from transformers import BertTokenizer, BertModel
+ tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+ model = BertModel.from_pretrained('bert-base-uncased')
+ text = "Replace me with any text you'd like."
+ encoded_input = tokenizer(text, return_tensors='pt')
+ output = model(**encoded_input)
+ ```
+
+ and in TensorFlow:
+
+ ```python
+ from transformers import BertTokenizer, TFBertModel
+ tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+ model = TFBertModel.from_pretrained('bert-base-uncased')
+ text = "Replace me with any text you'd like."
+ encoded_input = tokenizer(text, return_tensors='tf')
+ output = model(encoded_input)
+ ```
+
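+ In both the PyTorch and TensorFlow examples above, the per-token features are returned as `output.last_hidden_state`;
+ a quick way to inspect them:
+
+ ```python
+ # One hidden-state vector per input token; the hidden size is 768 for this base model
+ print(output.last_hidden_state.shape)
+ ```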