Update README.md
README.md
CHANGED
@@ -6,9 +6,13 @@ library_name: transformers
|
|
6 |
pipeline_tag: fill-mask
|
7 |
datasets:
|
8 |
- tahrirchi/uzbek-corpus
|
9 |
---
|
10 |
|
11 |
-
#
|
12 |
|
13 |
TahrirchiBERT-base is an encoder-only Transformer text model with 110 million parameters.
|
14 |
It is a model pretrained on the Uzbek language (Latin script) using a masked language modeling (MLM) objective. This model is case-sensitive: it distinguishes between uzbek and Uzbek.
|
@@ -19,7 +23,7 @@ For full details of this model please read our paper (coming soon!) and [release
|
|
19 |
|
20 |
This model is part of the family of **TahrirchiBERT models**, which are trained with different numbers of parameters and will be continuously expanded in the future.
|
21 |
|
22 |
-
| Model |
|
23 |
|------------------------|--------------------------------|-------|-------|
|
24 |
| [`tahrirchi-bert-small`](https://huggingface.co/tahrirchi/tahrirchi-bert-small) | 67M | Uzbek | Latin
|
25 |
| [`tahrirchi-bert-base`](https://huggingface.co/tahrirchi/tahrirchi-bert-base) | 110M | Uzbek | Latin
|
@@ -60,7 +64,7 @@ You can use this model directly with a pipeline for masked language modeling:
|
|
60 |
'sequence': 'Alisher Navoiy – ulug‘ o‘zbek va boshqa turkiy xalqlarning farzandi, mutafakkiri va davlat arbobi bo‘lgan.'}]
|
61 |
|
62 |
|
63 |
-
>>> unmasker(
|
64 |
|
65 |
[{'score': 0.1740381121635437,
|
66 |
'token': 12571,
|
|
|
6 |
pipeline_tag: fill-mask
|
7 |
datasets:
|
8 |
- tahrirchi/uzbek-corpus
|
9 |
+
tags:
|
10 |
+
- bert
|
11 |
+
widget:
|
12 |
+
- text: "Alisher Navoiy – ulug‘ o‘zbek va boshqa turkiy xalqlarning <mask>, mutafakkiri va davlat arbobi bo‘lgan."
|
13 |
---
|
14 |
|
15 |
+
# TahrirchiBERT base model
|
16 |
|
17 |
TahrirchiBERT-base is an encoder-only Transformer text model with 110 million parameters.
|
18 |
It is a model pretrained on the Uzbek language (Latin script) using a masked language modeling (MLM) objective. This model is case-sensitive: it distinguishes between uzbek and Uzbek.
|
|
|
23 |
|
24 |
This model is part of the family of **TahrirchiBERT models**, which are trained with different numbers of parameters and will be continuously expanded in the future.
|
25 |
|
26 |
+
| Model | Number of parameters | Language | Script
|
27 |
|------------------------|--------------------------------|-------|-------|
|
28 |
| [`tahrirchi-bert-small`](https://huggingface.co/tahrirchi/tahrirchi-bert-small) | 67M | Uzbek | Latin
|
29 |
| [`tahrirchi-bert-base`](https://huggingface.co/tahrirchi/tahrirchi-bert-base) | 110M | Uzbek | Latin
|
|
|
64 |
'sequence': 'Alisher Navoiy – ulug‘ o‘zbek va boshqa turkiy xalqlarning farzandi, mutafakkiri va davlat arbobi bo‘lgan.'}]
|
65 |
|
66 |
|
67 |
+
>>> unmasker("Egiluvchan boʻgʻinlari va <mask>, yarim bukilgan tirnoqlari tik qiyaliklar hamda daraxtlarga oson chiqish imkonini beradi.")
|
68 |
|
69 |
[{'score': 0.1740381121635437,
|
70 |
'token': 12571,
|