abdulhade committed on
Commit f434b1d · verified · 1 Parent(s): 5f145ff

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -1,3 +1,6 @@
+---
+pipeline_tag: feature-extraction
+---
 # Kurmanji Tokenizer
 
 This repository contains the Kurmanji Tokenizer trained on a 50 million token text corpus. The tokenizer was specifically developed to support the Kurmanji dialect of Kurdish, ensuring accurate and efficient tokenization for natural language processing tasks in this language.
@@ -34,4 +37,4 @@ tokenizer = PreTrainedTokenizerFast.from_pretrained("asosoft/KurmanjiTokenizer-W
 # Example usage
 text = "Navê min Ali ye."
 tokens = tokenizer.encode(text)
-print(tokens)
+print(tokens)
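
The first hunk adds a metadata block (the lines between the two `---` delimiters) ahead of the README heading. As a rough illustration of how such a block can be separated from the Markdown body, here is a minimal standard-library sketch. The helper name `split_front_matter` is hypothetical, and the sketch assumes only the flat `key: value` form seen in this commit; it is not how the Hub itself parses model cards.

```python
# Minimal sketch: split a README's leading metadata block from its body.
# Assumption: metadata is flat `key: value` pairs, as in this commit.
# Nested YAML is NOT handled; `split_front_matter` is a hypothetical helper.

def split_front_matter(readme_text):
    """Return (metadata dict, body text) for a README that may start with '---'."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No metadata block at the top: everything is body.
        return {}, readme_text
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: the rest is the Markdown body.
            return metadata, "\n".join(lines[index + 1:])
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()
    # Opening delimiter without a closing one: treat the text as plain body.
    return {}, readme_text


readme = """---
pipeline_tag: feature-extraction
---
# Kurmanji Tokenizer
"""

metadata, body = split_front_matter(readme)
print(metadata["pipeline_tag"])
print(body)
```

For anything beyond flat keys, a real YAML parser would be the safer choice.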