Update README.md
README.md
CHANGED
@@ -19,6 +19,15 @@ What is BERTweet-FA?
BERTweet-FA is a transformer-based model trained on 20,665,964 Persian tweets. The model was trained on this data for only one epoch (322,906 steps), yet it can recognize the meaning of most conversational sentences used in Farsi. Note that the architecture of this model follows the original BERT.
+How to use the Model
+---
+```
+import torch
+from transformers import BertForMaskedLM, BertTokenizer
+model = BertForMaskedLM.from_pretrained('arm-on/BERTweet-FA')
+tokenizer = BertTokenizer.from_pretrained('arm-on/BERTweet-FA')
+```
+
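The loading snippet above stops once the model and tokenizer are in memory. As a rough sketch of a possible next step, the following shows one way to ask the model for candidates at a masked position. The helper names `mask_word` and `predict_masked_token` are hypothetical and not part of the README. The `[MASK]` default assumes the tokenizer uses the standard BERT mask token, and the Persian sentence in the comment is only an illustration.

```python
from typing import List


def mask_word(text: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `word` in `text` with the mask token.

    Hypothetical helper; "[MASK]" is assumed to match the tokenizer's mask token.
    """
    if word not in text:
        raise ValueError(f"{word!r} does not occur in the input text")
    return text.replace(word, mask_token, 1)


def predict_masked_token(text: str, top_k: int = 5,
                         model_name: str = "arm-on/BERTweet-FA") -> List[str]:
    """Return the top_k candidate tokens for the first mask position in `text`."""
    # Heavy dependencies are imported lazily so mask_word works without them.
    import torch
    from transformers import BertForMaskedLM, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForMaskedLM.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt")
    # Find where the mask token ended up in the encoded sequence.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_positions = is_mask.nonzero(as_tuple=True)[0]
    if mask_positions.numel() == 0:
        raise ValueError("the input text contains no mask token")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # Rank the vocabulary at the first masked position.
    top_ids = torch.topk(logits[0, mask_positions[0]], k=top_k).indices.tolist()
    return tokenizer.convert_ids_to_tokens(top_ids)


# Example (downloads the model on first use; no output is shown because it
# depends on the trained weights):
# masked = mask_word("من کتاب می خوانم", "کتاب")
# print(predict_masked_token(masked))
```

Loading the model inside the function keeps the sketch short. For repeated queries it would likely be better to load the model and tokenizer once and pass them in.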
The Training Data
---
The first version of the model was trained on the "[Large Scale Colloquial Persian Dataset](https://iasbs.ac.ir/~ansari/lscp/)", which contains more than 20 million tweets in Farsi. The dataset was gathered by Khojasteh et al. and published in 2020.