ijelid-bert-base-multilingual

This model is a fine-tuned version of the BERT multilingual base model (cased), trained on the Indonesian-Javanese-English code-mixed Twitter dataset.

Label IDs and their corresponding names:

| Label ID | Label Name |
|----------|------------|
| LABEL_0 | English (EN) |
| LABEL_1 | Indonesian (ID) |
| LABEL_2 | Javanese (JV) |
| LABEL_3 | Mixed Indonesian-English (MIX-ID-EN) |
| LABEL_4 | Mixed Indonesian-Javanese (MIX-ID-JV) |
| LABEL_5 | Mixed Javanese-English (MIX-JV-EN) |
| LABEL_6 | Other (O) |
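
Because the model emits generic `LABEL_n` IDs, downstream code usually needs to translate them back to language tags. The following is a minimal sketch of that lookup. It assumes predictions arrive as dictionaries with `word` and `entity` fields, which is the shape the Transformers token-classification pipeline produces; the card itself does not state how the model is served, and the helper name `to_readable` and the sample input are hypothetical.

```python
# Mapping from the generic label IDs listed above to their short language tags.
ID2TAG = {
    "LABEL_0": "EN",
    "LABEL_1": "ID",
    "LABEL_2": "JV",
    "LABEL_3": "MIX-ID-EN",
    "LABEL_4": "MIX-ID-JV",
    "LABEL_5": "MIX-JV-EN",
    "LABEL_6": "O",
}


def to_readable(predictions):
    """Pair each predicted word with its language tag.

    `predictions` is assumed to be a list of dicts carrying at least
    a "word" and an "entity" key (hypothetical input shape).
    """
    return [(p["word"], ID2TAG[p["entity"]]) for p in predictions]


# Hand-written illustration only; this is NOT real model output.
sample = [
    {"word": "hello", "entity": "LABEL_0"},
    {"word": "terima", "entity": "LABEL_1"},
    {"word": "!", "entity": "LABEL_6"},
]
print(to_readable(sample))
```

An unknown label ID raises a `KeyError` here, which is deliberate in this sketch: a silent fallback would hide a mismatch between the model's label set and the table.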

It achieves the following results on the evaluation set:

  • Loss: 0.3553
  • Precision: 0.9189
  • Recall: 0.9188
  • F1: 0.9187
  • Accuracy: 0.9451

It achieves the following results on the test set:

  • Overall Precision: 0.9249
  • Overall Recall: 0.9251
  • Overall F1: 0.925
  • Overall Accuracy: 0.951

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
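
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not mention warmup) and takes 386 steps per epoch from the step count logged at epoch 1.0 in the training results; `linear_lr` is a hypothetical helper written for illustration, not the library's scheduler.

```python
BASE_LR = 2e-05        # learning_rate from the list above
NUM_EPOCHS = 20        # num_epochs from the list above
STEPS_PER_EPOCH = 386  # step count logged at epoch 1.0 in the training results
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = 0       # assumption: warmup is not stated in the card


def linear_lr(step):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the start of training, at two intermediate points, and at the end.
for epoch in (0, 5, 10, 20):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate falls to half its initial value by the end of epoch 10 and reaches zero at the final step.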

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---------------|-------|------|-----------------|-----------|--------|--------|----------|
| No log | 1.0 | 386 | 0.2340 | 0.8956 | 0.8507 | 0.8715 | 0.9239 |
| 0.3379 | 2.0 | 772 | 0.2101 | 0.9057 | 0.8904 | 0.8962 | 0.9342 |
| 0.1603 | 3.0 | 1158 | 0.2231 | 0.9252 | 0.8896 | 0.9065 | 0.9367 |
| 0.1079 | 4.0 | 1544 | 0.2013 | 0.9272 | 0.8902 | 0.9070 | 0.9420 |
| 0.1079 | 5.0 | 1930 | 0.2179 | 0.9031 | 0.9179 | 0.9103 | 0.9425 |
| 0.0701 | 6.0 | 2316 | 0.2330 | 0.9075 | 0.9165 | 0.9114 | 0.9435 |
| 0.051 | 7.0 | 2702 | 0.2433 | 0.9117 | 0.9190 | 0.9150 | 0.9432 |
| 0.0384 | 8.0 | 3088 | 0.2545 | 0.9001 | 0.9167 | 0.9078 | 0.9439 |
| 0.0384 | 9.0 | 3474 | 0.2629 | 0.9164 | 0.9159 | 0.9158 | 0.9444 |
| 0.0293 | 10.0 | 3860 | 0.2881 | 0.9263 | 0.9096 | 0.9178 | 0.9427 |
| 0.022 | 11.0 | 4246 | 0.2882 | 0.9167 | 0.9222 | 0.9191 | 0.9450 |
| 0.0171 | 12.0 | 4632 | 0.3028 | 0.9203 | 0.9152 | 0.9177 | 0.9447 |
| 0.0143 | 13.0 | 5018 | 0.3236 | 0.9155 | 0.9167 | 0.9158 | 0.9440 |
| 0.0143 | 14.0 | 5404 | 0.3301 | 0.9237 | 0.9163 | 0.9199 | 0.9444 |
| 0.0109 | 15.0 | 5790 | 0.3290 | 0.9187 | 0.9154 | 0.9169 | 0.9442 |
| 0.0092 | 16.0 | 6176 | 0.3308 | 0.9213 | 0.9178 | 0.9194 | 0.9448 |
| 0.0075 | 17.0 | 6562 | 0.3501 | 0.9273 | 0.9142 | 0.9206 | 0.9445 |
| 0.0075 | 18.0 | 6948 | 0.3520 | 0.9200 | 0.9184 | 0.9190 | 0.9447 |
| 0.0062 | 19.0 | 7334 | 0.3524 | 0.9238 | 0.9183 | 0.9210 | 0.9458 |
| 0.0051 | 20.0 | 7720 | 0.3553 | 0.9189 | 0.9188 | 0.9187 | 0.9451 |
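
The reported evaluation-set results correspond to the final row (epoch 20). As a rough way to read the table, the sketch below picks out the epochs with the lowest validation loss and the highest F1 and accuracy. The values are copied from the rows above; the variable names are illustrative only.

```python
# (epoch, validation loss, F1, accuracy), copied from the training results above.
RESULTS = [
    (1, 0.2340, 0.8715, 0.9239), (2, 0.2101, 0.8962, 0.9342),
    (3, 0.2231, 0.9065, 0.9367), (4, 0.2013, 0.9070, 0.9420),
    (5, 0.2179, 0.9103, 0.9425), (6, 0.2330, 0.9114, 0.9435),
    (7, 0.2433, 0.9150, 0.9432), (8, 0.2545, 0.9078, 0.9439),
    (9, 0.2629, 0.9158, 0.9444), (10, 0.2881, 0.9178, 0.9427),
    (11, 0.2882, 0.9191, 0.9450), (12, 0.3028, 0.9177, 0.9447),
    (13, 0.3236, 0.9158, 0.9440), (14, 0.3301, 0.9199, 0.9444),
    (15, 0.3290, 0.9169, 0.9442), (16, 0.3308, 0.9194, 0.9448),
    (17, 0.3501, 0.9206, 0.9445), (18, 0.3520, 0.9190, 0.9447),
    (19, 0.3524, 0.9210, 0.9458), (20, 0.3553, 0.9187, 0.9451),
]

best_loss = min(RESULTS, key=lambda row: row[1])  # lowest validation loss
best_f1 = max(RESULTS, key=lambda row: row[2])    # highest F1
best_acc = max(RESULTS, key=lambda row: row[3])   # highest accuracy

print("lowest validation loss: epoch", best_loss[0])
print("highest F1:             epoch", best_f1[0])
print("highest accuracy:       epoch", best_acc[0])
```

Validation loss bottoms out early while training loss keeps falling, which may indicate overfitting in the later epochs; F1 and accuracy nonetheless continue to improve slightly, so which checkpoint is preferable depends on the metric that matters for the application.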

Framework versions

  • Transformers 4.21.2
  • PyTorch 1.7.1
  • Datasets 2.5.1
  • Tokenizers 0.12.1