Update README.md

# Code-mixed IJEBERT

## About

Code-mixed IJEBERT is a pre-trained masked language model for code-mixed Indonesian-Javanese-English tweet data.
This model is trained based on the [BERT](https://arxiv.org/abs/1810.04805) model using
Hugging Face's [Transformers](https://huggingface.co/transformers) library.
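
As a quick usage sketch, the model can be loaded through the Transformers fill-mask pipeline. The repository id and example sentence below are placeholders for illustration, not values confirmed by this card:

```python
from transformers import pipeline

# Placeholder model id; substitute the actual Hugging Face Hub path of this model.
fill_mask = pipeline("fill-mask", model="code-mixed-ijebert")

# Illustrative code-mixed tweet with a masked token ([MASK] is BERT's mask token).
for prediction in fill_mask("aku lagi [MASK] banget hari ini"):
    print(prediction["token_str"], prediction["score"])
```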

## Pre-training Data

In the second stage pre-processing, we do the following pre-processing tasks (sketched below):

- convert ‘@username’ to ‘@USER’,
- convert URLs to HTTPURL.

Finally, we have 28,121,693 sentences for the training process.
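
The exact rules are not spelled out in this excerpt, but a minimal sketch of the two substitutions above could look like this (the regular expressions are assumptions, not the authors' actual code):

```python
import re

def normalize_tweet(text: str) -> str:
    # Replace user mentions (e.g. @budi71) with the @USER placeholder.
    text = re.sub(r"@\w+", "@USER", text)
    # Replace URLs with the HTTPURL placeholder.
    text = re.sub(r"https?://\S+", "HTTPURL", text)
    return text

print(normalize_tweet("makasih @budi71, infone apik https://example.com/post"))
# makasih @USER, infone apik HTTPURL
```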

## Model

| Model name           | Architecture | Size of training data | Size of validation data |
|----------------------|--------------|-----------------------|-------------------------|
| `code-mixed-ijebert` | BERT         | 2.24 GB of text       | 249 MB of text          |

## Evaluation Results

We train the model for 3 epochs (296,598 steps in total), taking 12 days.
The following results were obtained from the training:

| train loss | valid loss | perplexity |
|------------|------------|------------|
| 3.5057     | 3.0559     | 21.2398    |
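
As a sanity check on the numbers above, the reported perplexity is the exponential of the validation cross-entropy loss:

```python
import math

# exp(valid loss) gives the reported perplexity.
print(math.exp(3.0559))  # ≈ 21.24, matching the table above
```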

### Training hyperparameters

The following hyperparameters were used during training:

- lr_scheduler_type: linear
- num_epochs: 3.0
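
For context, these two settings correspond to the following `TrainingArguments` fields in Transformers. This is only an illustrative sketch covering the subset of hyperparameters listed here, not the authors' actual training script:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="code-mixed-ijebert",  # placeholder output directory
    lr_scheduler_type="linear",       # linear learning-rate schedule
    num_train_epochs=3.0,             # the card's "num_epochs: 3.0"
)
```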

### Framework versions

- Transformers 4.26.0