dkagramanyan committed on
Commit · 25af56c
1 Parent(s): 13c4e3e
Update README.md
README.md CHANGED
@@ -3,4 +3,16 @@ datasets:
 - armvectores/hy_wikipedia_2023
 library_name: fasttext
 pipeline_tag: feature-extraction
----
+---
+
+414M tokens
+1) 73M hy wikipedia
+2) 341M arlis database
+
+74951 unique words
+
+3-5 ngrams
+
+minimum number of words 150
+
+
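The added README lines list two settings without saying how they are applied. As a rough illustration only, the sketch below assumes that "3-5 ngrams" means character n-grams of length 3 to 5 (in the style of fastText subwords) and that "minimum number of words 150" is a minimum word-frequency threshold for the vocabulary. Both readings are assumptions, and the helper names `char_ngrams` and `build_vocab` are hypothetical; this is a simplified standard-library sketch, not the training code behind this model.

```python
from collections import Counter


def char_ngrams(word, minn=3, maxn=5):
    """Return character n-grams of `word` with lengths minn..maxn.

    The word is wrapped in '<' and '>' boundary markers first, as fastText
    does for subwords. Assumption: "3-5 ngrams" refers to this range.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        # Slide a window of width n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def build_vocab(tokens, min_count=150):
    """Keep only words seen at least `min_count` times.

    Assumption: "minimum number of words 150" is a frequency cutoff
    like this one.
    """
    counts = Counter(tokens)
    return {word: count for word, count in counts.items() if count >= min_count}
```

For example, `char_ngrams("հայ")` yields six n-grams, from `<հա` up to the full `<հայ>`. With a cutoff of 150, any word that appears fewer than 150 times in the corpus would be left out of the vocabulary, which would be consistent with a relatively small vocabulary for a large corpus.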