BERiT_2000_2_layers_40_epochs

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.8375
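A minimal usage sketch (not part of the original card): assuming the checkpoint keeps the RoBERTa-style masked-language-model head of its roberta-base parent, it can be queried through the fill-mask pipeline. The repo id below is a placeholder for wherever the checkpoint is actually hosted.

```python
# Minimal fill-mask sketch. Assumptions: the checkpoint is a RoBERTa-style
# masked LM (consistent with the roberta-base parent), and the repo id
# "<user>/BERiT_2000_2_layers_40_epochs" is a placeholder, not the real one.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="<user>/BERiT_2000_2_layers_40_epochs")
print(fill_mask("The quick brown <mask> jumps over the lazy dog."))
```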

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • label_smoothing_factor: 0.2
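As a rough illustration (an assumption, not code from the original training script), these values map onto transformers.TrainingArguments as follows. The output_dir name is assumed, and the 500-step logging/evaluation cadence is inferred from the results table below.

```python
# Hedged reconstruction of the training configuration from the listed
# hyperparameters (Transformers 4.24). output_dir is an assumed name;
# the 500-step logging/eval cadence is inferred from the results table.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="BERiT_2000_2_layers_40_epochs",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                 # Adam betas from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    label_smoothing_factor=0.2,
    evaluation_strategy="steps",    # eval every 500 steps, per the table
    eval_steps=500,
    logging_steps=500,
)
```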

Training results

Training Loss Epoch Step Validation Loss
15.0851 0.19 500 8.5468
7.8971 0.39 1000 7.3376
7.3108 0.58 1500 7.1632
7.134 0.77 2000 7.0700
7.0956 0.97 2500 7.0723
7.0511 1.16 3000 6.9560
7.0313 1.36 3500 6.9492
7.0028 1.55 4000 6.9048
6.9563 1.74 4500 6.8456
6.9214 1.94 5000 6.8019
11.1596 2.13 5500 7.5882
7.5824 2.32 6000 7.1291
7.2581 2.52 6500 7.1123
7.2232 2.71 7000 7.1059
7.1734 2.9 7500 7.1120
7.1504 3.1 8000 7.0946
7.1314 3.29 8500 7.0799
7.1236 3.49 9000 7.1175
7.1275 3.68 9500 7.0905
7.1087 3.87 10000 7.0839
7.1212 4.07 10500 7.0822
7.1136 4.26 11000 7.0703
7.1025 4.45 11500 7.1035
7.0931 4.65 12000 7.0759
7.0899 4.84 12500 7.0883
7.0834 5.03 13000 7.1307
7.0761 5.23 13500 7.0642
7.0706 5.42 14000 7.0324
7.0678 5.62 14500 7.0704
7.0614 5.81 15000 7.0317
7.0569 6.0 15500 7.0421
7.057 6.2 16000 7.0250
7.0503 6.39 16500 7.0129
7.0529 6.58 17000 7.0316
7.0453 6.78 17500 7.0436
7.0218 6.97 18000 7.0064
7.0415 7.16 18500 7.0385
7.0338 7.36 19000 6.9756
7.0488 7.55 19500 7.0054
7.0347 7.75 20000 6.9946
7.0464 7.94 20500 7.0055
7.017 8.13 21000 7.0158
7.0159 8.33 21500 7.0052
7.0223 8.52 22000 6.9925
6.9989 8.71 22500 7.0307
7.0218 8.91 23000 6.9767
6.9998 9.1 23500 7.0096
7.01 9.3 24000 6.9599
6.9964 9.49 24500 6.9896
6.9906 9.68 25000 6.9903
7.0336 9.88 25500 6.9807
7.0053 10.07 26000 6.9776
6.9826 10.26 26500 6.9836
6.9897 10.46 27000 6.9886
6.9829 10.65 27500 6.9991
6.9849 10.84 28000 6.9651
6.9901 11.04 28500 6.9822
6.9852 11.23 29000 6.9921
6.9757 11.43 29500 6.9636
6.991 11.62 30000 6.9952
6.9818 11.81 30500 6.9799
6.9911 12.01 31000 6.9725
6.9423 12.2 31500 6.9540
6.9885 12.39 32000 6.9771
6.9636 12.59 32500 6.9475
6.9567 12.78 33000 6.9653
6.9749 12.97 33500 6.9711
6.9739 13.17 34000 6.9691
6.9651 13.36 34500 6.9569
6.9599 13.56 35000 6.9608
6.957 13.75 35500 6.9531
6.9539 13.94 36000 6.9704
6.958 14.14 36500 6.9478
6.9597 14.33 37000 6.9510
6.9466 14.52 37500 6.9625
6.9518 14.72 38000 6.9787
6.9509 14.91 38500 6.9391
6.9505 15.1 39000 6.9694
6.9311 15.3 39500 6.9440
6.9513 15.49 40000 6.9425
6.9268 15.69 40500 6.9223
6.9415 15.88 41000 6.9435
6.9308 16.07 41500 6.9281
6.9216 16.27 42000 6.9415
6.9265 16.46 42500 6.9164
6.9023 16.65 43000 6.9237
6.9407 16.85 43500 6.9100
6.9211 17.04 44000 6.9295
6.9147 17.23 44500 6.9131
6.9224 17.43 45000 6.9188
6.9215 17.62 45500 6.9077
6.915 17.82 46000 6.9371
6.906 18.01 46500 6.8932
6.91 18.2 47000 6.9100
6.8999 18.4 47500 6.9251
6.9113 18.59 48000 6.9078
6.9197 18.78 48500 6.9099
6.8985 18.98 49000 6.9074
6.9009 19.17 49500 6.8971
6.8937 19.36 50000 6.8982
6.9094 19.56 50500 6.9077
6.9069 19.75 51000 6.9006
6.8991 19.95 51500 6.8912
6.8924 20.14 52000 6.8881
6.899 20.33 52500 6.8899
6.9028 20.53 53000 6.8938
6.8997 20.72 53500 6.8822
6.8943 20.91 54000 6.9005
6.8804 21.11 54500 6.9048
6.8848 21.3 55000 6.9062
6.9072 21.49 55500 6.9104
6.8783 21.69 56000 6.9069
6.8879 21.88 56500 6.8938
6.8922 22.08 57000 6.8797
6.8892 22.27 57500 6.9168
6.8863 22.46 58000 6.8820
6.8822 22.66 58500 6.9130
6.8752 22.85 59000 6.8973
6.8823 23.04 59500 6.8933
6.8813 23.24 60000 6.8919
6.8787 23.43 60500 6.8855
6.8886 23.63 61000 6.8956
6.8744 23.82 61500 6.9092
6.8799 24.01 62000 6.8944
6.879 24.21 62500 6.8850
6.8797 24.4 63000 6.8782
6.8724 24.59 63500 6.8691
6.8803 24.79 64000 6.8965
6.8899 24.98 64500 6.8986
6.8873 25.17 65000 6.9034
6.8777 25.37 65500 6.8658
6.8784 25.56 66000 6.8803
6.8791 25.76 66500 6.8727
6.8736 25.95 67000 6.8832
6.8865 26.14 67500 6.8811
6.8668 26.34 68000 6.8817
6.8709 26.53 68500 6.8945
6.8755 26.72 69000 6.8777
6.8635 26.92 69500 6.8747
6.8752 27.11 70000 6.8875
6.8729 27.3 70500 6.8696
6.8728 27.5 71000 6.8659
6.8692 27.69 71500 6.8856
6.868 27.89 72000 6.8689
6.8668 28.08 72500 6.8877
6.8576 28.27 73000 6.8783
6.8633 28.47 73500 6.8828
6.8737 28.66 74000 6.8717
6.8702 28.85 74500 6.8485
6.8785 29.05 75000 6.8771
6.8818 29.24 75500 6.8815
6.8647 29.43 76000 6.8877
6.8574 29.63 76500 6.8920
6.8474 29.82 77000 6.8936
6.8558 30.02 77500 6.8768
6.8645 30.21 78000 6.8921
6.8786 30.4 78500 6.8604
6.8693 30.6 79000 6.8603
6.855 30.79 79500 6.8559
6.8429 30.98 80000 6.8746
6.8688 31.18 80500 6.8774
6.8735 31.37 81000 6.8643
6.8541 31.56 81500 6.8767
6.8695 31.76 82000 6.8804
6.8607 31.95 82500 6.8674
6.8538 32.15 83000 6.8572
6.8472 32.34 83500 6.8683
6.8763 32.53 84000 6.8758
6.8405 32.73 84500 6.8764
6.8658 32.92 85000 6.8614
6.8834 33.11 85500 6.8641
6.8554 33.31 86000 6.8787
6.8738 33.5 86500 6.8747
6.848 33.69 87000 6.8699
6.8621 33.89 87500 6.8654
6.8543 34.08 88000 6.8639
6.8606 34.28 88500 6.8852
6.8666 34.47 89000 6.8840
6.8717 34.66 89500 6.8773
6.854 34.86 90000 6.8671
6.8526 35.05 90500 6.8762
6.8592 35.24 91000 6.8644
6.8641 35.44 91500 6.8599
6.8655 35.63 92000 6.8622
6.8557 35.82 92500 6.8671
6.8546 36.02 93000 6.8573
6.853 36.21 93500 6.8542
6.8597 36.41 94000 6.8518
6.8576 36.6 94500 6.8700
6.8549 36.79 95000 6.8628
6.8576 36.99 95500 6.8695
6.8505 37.18 96000 6.8870
6.8564 37.37 96500 6.8898
6.8627 37.57 97000 6.8619
6.8502 37.76 97500 6.8696
6.8548 37.96 98000 6.8663
6.8512 38.15 98500 6.8683
6.8484 38.34 99000 6.8605
6.8581 38.54 99500 6.8749
6.8525 38.73 100000 6.8849
6.8375 38.92 100500 6.8712
6.8423 39.12 101000 6.8905
6.8559 39.31 101500 6.8574
6.8441 39.5 102000 6.8722
6.8467 39.7 102500 6.8550
6.8389 39.89 103000 6.8375
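For rough intuition about the final number (a back-of-the-envelope note, not from the original card): exponentiating the evaluation loss gives a perplexity-style figure, though the label_smoothing_factor of 0.2 inflates the reported loss relative to pure cross-entropy, so the result likely overstates the true perplexity.

```python
import math

# exp(eval_loss) as a perplexity-style summary of the final checkpoint.
# Because training used label_smoothing_factor=0.2, the reported loss is
# not pure cross-entropy, so this likely overstates the true perplexity.
eval_loss = 6.8375
print(math.exp(eval_loss))  # ~932
```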

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.12.1+cu113
  • Datasets 2.7.0
  • Tokenizers 0.13.2
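For reproduction against the pinned versions above, a small environment check (the snippet itself is illustrative, not from the card):

```python
# Verify the local environment matches the versions listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.24.0", transformers.__version__
assert torch.__version__.startswith("1.12.1"), torch.__version__
assert datasets.__version__ == "2.7.0", datasets.__version__
assert tokenizers.__version__ == "0.13.2", tokenizers.__version__
print("Environment matches the pinned framework versions.")
```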