ClassCat committed on
Commit
6202d6e
1 Parent(s): f17c859

Create README.md

Files changed (1)
  1. README.md +37 -0
README.md ADDED
@@ -0,0 +1,37 @@
+ ---
+ language: eu
+ license: cc-by-sa-4.0
+ datasets:
+ - cc100
+ widget:
+ - text: "Euria egingo <mask> gaur ?"
+ - text: "<mask> umeari liburua eman dio."
+ - text: "Zein da zure <mask> ?"
+ ---
+
+ ## RoBERTa Basque x-small model (Uncased)
+
+ ### Prerequisites
+
+ transformers==4.19.2
+
+ ### Model architecture
+
+ This model uses settings half the size of those of RoBERTa base.
+
+ ### Tokenizer
+
+ The model uses a BPE tokenizer with a vocabulary size of 50,000.
+
+ ### Training Data
+
+ * Subset of [CC-100/eu](https://data.statmt.org/cc-100/): Monolingual Datasets from Web Crawl Data
+
+ ### Usage
+
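+ ```python
+ from transformers import pipeline
+
+ # Load a fill-mask pipeline backed by this model (downloaded from the Hub on first use).
+ unmasker = pipeline('fill-mask', model='ClassCat/roberta-xsmall-basque')
+ # Return the candidate fills for the <mask> position.
+ unmasker("Zein da zure <mask> ?")
+ ```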
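
The usage snippet in the README calls the pipeline but does not show how to read what comes back. The sketch below is one possible way to do that, assuming the standard `fill-mask` output for a single `<mask>`: a list of dicts with `score`, `token_str`, and `sequence` keys. The helper names `top_predictions` and `run_demo` are hypothetical and not part of the README; `run_demo` needs `transformers` installed and network access to download the model.

```python
from typing import Dict, List, Tuple


def top_predictions(results: List[Dict], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token, score) pairs from fill-mask output.

    Hypothetical helper: `results` is assumed to be the list of dicts that a
    fill-mask pipeline returns for an input containing exactly one <mask>.
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    # BPE tokens may carry a leading space; strip it for display.
    return [(r["token_str"].strip(), round(float(r["score"]), 4)) for r in ranked[:k]]


def run_demo() -> None:
    """Query the model from the README and print its top candidates."""
    # Imported here so the helper above stays usable without transformers.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="ClassCat/roberta-xsmall-basque")
    for token, score in top_predictions(unmasker("Zein da zure <mask> ?")):
        print(f"{token}\t{score}")
```

Calling `run_demo()` prints one candidate token per line with its score. Keeping the ranking logic in a separate pure function means it can be checked without loading the model.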