fathan committed
Commit
678080b
1 Parent(s): 2fd04d1

Update README.md

Files changed (1):
  1. README.md +14 -10
README.md CHANGED
@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 # Code-mixed IJEBERT
 
 ## About
-Code-mixed IJEBERT is a pre-trained maksed language model for code-mixed Indonesian-Javanese-English tweets data.
-This model is trained based on [BERT](https://huggingface.co/bert-base-multilingual-cased) model utilizing
+Code-mixed IJEBERT is a pre-trained masked language model for code-mixed Indonesian-Javanese-English tweets data.
+This model is trained based on [BERT](https://arxiv.org/abs/1810.04805) model utilizing
 Hugging Face's [Transformers]((https://huggingface.co/transformers)) library.
 
 ## Pre-training Data
@@ -45,12 +45,20 @@ In the second stage pre-processing, we do the following pre-processing tasks:
 - convert ‘@username’ to ‘@USER’,
 - convert URL to HTTPURL.
 
-Finally, we have 28,121,693 sentences for our pre-training task.
+Finally, we have 28,121,693 sentences for the training process.
 
 ## Model
-| Model name           | #params | Arch. | Size of training data | Size of validation data |
-|----------------------|---------|-------|-----------------------|-------------------------|
-| `code-mixed-ijebert` |         | BERT  | 2.24 GB of text       | 249 MB of text          |
+| Model name           | Architecture | Size of training data | Size of validation data |
+|----------------------|--------------|-----------------------|-------------------------|
+| `code-mixed-ijebert` | BERT         | 2.24 GB of text       | 249 MB of text          |
+
+## Evaluation Results
+We train the model for 3 epochs (296,598 total steps) over 12 days.
+The following results were obtained from training:
+
+| train loss | valid loss | perplexity |
+|------------|------------|------------|
+| 3.5057     | 3.0559     | 21.2398    |
 
 ### Training hyperparameters
@@ -63,10 +71,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 3.0
 
-### Training results
-
-
-
 ### Framework versions
 
 - Transformers 4.26.0
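
The two pre-processing conversions described in the diff (‘@username’ → ‘@USER’, URL → HTTPURL) can be sketched with regular expressions. The authors' actual patterns are not shown in the commit, so the patterns and function name below are assumptions, not the repository's code:

```python
import re

# Hypothetical sketch of the two normalisation steps named in the README;
# the exact patterns the authors used are not given in the diff.
USER_RE = re.compile(r"@\w+")          # user mentions such as @username
URL_RE = re.compile(r"https?://\S+")   # http(s) URLs

def normalize_tweet(text: str) -> str:
    """Replace user mentions with @USER and URLs with HTTPURL."""
    text = USER_RE.sub("@USER", text)
    text = URL_RE.sub("HTTPURL", text)
    return text

print(normalize_tweet("cek @fathan123 iki https://example.com/post lho"))
# → cek @USER iki HTTPURL lho
```

This anonymisation convention (`@USER`, `HTTPURL`) follows common practice for Twitter-domain language models.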
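
The perplexity added in the evaluation table can be cross-checked against the validation loss: for a masked language model, perplexity is conventionally exp(cross-entropy loss), and exp(3.0559) lands on the reported value up to rounding of the published loss:

```python
import math

valid_loss = 3.0559              # validation loss from the evaluation table
perplexity = math.exp(valid_loss)
print(round(perplexity, 2))      # ≈ 21.24, consistent with the reported 21.2398
```

The small residual difference from 21.2398 is expected, since the table's loss is itself rounded to four decimal places.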