Update README.md
README.md
CHANGED
@@ -18,6 +18,8 @@ tags:
 ### Model Description
 The `PersianBPETokenizer` is a custom tokenizer specifically designed for the Persian (Farsi) language. It leverages the Byte-Pair Encoding (BPE) algorithm to create a robust vocabulary that can effectively handle the unique characteristics of Persian text. This tokenizer is optimized for use with advanced language models like BERT and RoBERTa, making it a valuable tool for various Persian NLP tasks.
 
+
+
 ### Model Type
 - **Tokenization Algorithm**: Byte-Pair Encoding (BPE)
 - **Normalization**: NFD, StripAccents, Lowercase, Strip, Replace (ZWNJ)
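
The normalization chain listed under Model Type can be approximated with the Python standard library alone. The sketch below is not the tokenizer's actual code: the step names appear to correspond to normalizers in the Hugging Face `tokenizers` library, but the function `normalize`, the parameter `zwnj_replacement`, and the choice to replace ZWNJ with a space are all assumptions made for illustration, since the README does not say what ZWNJ is replaced with.

```python
import unicodedata

ZWNJ = "\u200c"  # zero-width non-joiner, common inside Persian compound words


def normalize(text: str, zwnj_replacement: str = " ") -> str:
    """Approximate the chain: NFD, StripAccents, Lowercase, Strip, Replace (ZWNJ)."""
    # NFD: canonical decomposition, splitting base letters from combining marks.
    text = unicodedata.normalize("NFD", text)
    # StripAccents: drop nonspacing combining marks (Unicode category Mn).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    # Lowercase: no effect on Persian script, but folds embedded Latin text.
    text = text.lower()
    # Strip: remove leading and trailing whitespace.
    text = text.strip()
    # Replace (ZWNJ): the replacement string is an assumption, not from the README.
    return text.replace(ZWNJ, zwnj_replacement)


if __name__ == "__main__":
    # Meem, Farsi yeh, ZWNJ, reh, waw, meem, then a Latin word.
    sample = "  \u0645\u06cc\u200c\u0631\u0648\u0645 BERT  "
    print(normalize(sample))
```

One consequence worth knowing in this sketch: because NFD decomposes alef with madda (U+0622) into alef (U+0627) plus a combining maddah, the accent-stripping step reduces it to a plain alef, and short-vowel diacritics are removed the same way.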