mshojaei77/PersianBPETokenizer

mshojaei77 committed on Feb 11

Commit 4a2317f (verified) · 1 Parent(s): 2aa89c1

Update README.md

Files changed (1)

README.md +2 -0

@@ -18,6 +18,8 @@ tags:
 ### Model Description
 The `PersianBPETokenizer` is a custom tokenizer specifically designed for the Persian (Farsi) language. It leverages the Byte-Pair Encoding (BPE) algorithm to create a robust vocabulary that can effectively handle the unique characteristics of Persian text. This tokenizer is optimized for use with advanced language models like BERT and RoBERTa, making it a valuable tool for various Persian NLP tasks.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6556b1bb85d43542fa1a8f91/lZJKqsi4BZ8mJiY_I-vhA.png)
 ### Model Type
 - **Tokenization Algorithm**: Byte-Pair Encoding (BPE)
 - **Normalization**: NFD, StripAccents, Lowercase, Strip, Replace (ZWNJ)
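
The normalization chain listed in the README (NFD, StripAccents, Lowercase, Strip, Replace (ZWNJ)) can be approximated with the Python standard library alone. The sketch below is illustrative and is not the tokenizer's actual configuration: the function name `normalize_persian` is hypothetical, treating "accents" as nonspacing combining marks (category `Mn`) is an approximation, and replacing the zero-width non-joiner with a space is an assumption, since the README does not say what ZWNJ is replaced with.

```python
import unicodedata

ZWNJ = "\u200c"  # zero-width non-joiner, common inside Persian compound words


def normalize_persian(text: str, zwnj_replacement: str = " ") -> str:
    """Approximate the README's normalization steps, in the listed order.

    Hypothetical helper. `zwnj_replacement` defaults to a space, which is
    an assumption; the README does not state the replacement string.
    """
    # 1. NFD: canonical decomposition (base letters followed by their marks).
    text = unicodedata.normalize("NFD", text)
    # 2. StripAccents: drop nonspacing combining marks, which includes
    #    Arabic-script diacritics such as kasra (U+0650).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    # 3. Lowercase: affects only cased scripts (e.g. embedded Latin text).
    text = text.lower()
    # 4. Strip: remove leading and trailing whitespace.
    text = text.strip()
    # 5. Replace (ZWNJ): substitute the zero-width non-joiner.
    return text.replace(ZWNJ, zwnj_replacement)


if __name__ == "__main__":
    print(normalize_persian("  Caf\u00e9  "))
    print(normalize_persian("\u0645\u06cc\u200c\u0631\u0648\u0645"))
```

One consequence worth noting: because NFD splits precomposed letters before the marks are stripped, a chain like this also removes short-vowel diacritics from Persian text, so vocalized and unvocalized spellings of a word normalize to the same string.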