pixel-small-wlr

This model is a fine-tuned version of an unspecified base model on the wikipedia + bookcorpus dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8048

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 64
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 512
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • training_steps: 500000
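
The derived values in the list above (total batch sizes, warmup length) follow from the per-device settings, and can be checked with a short sketch. The schedule function is an assumption: it models linear warmup followed by cosine decay to zero, which is a common reading of `lr_scheduler_type: cosine` with a warmup ratio, and is not confirmed by this card. The names `lr_at` and `warmup_steps` are illustrative only.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 64   # per device
EVAL_BATCH_SIZE = 8     # per device
NUM_DEVICES = 8
WARMUP_RATIO = 0.05
TRAINING_STEPS = 500_000

# Effective batch sizes across all devices.
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES

# Number of warmup steps implied by the warmup ratio
# (rounded here; the exact rounding used in training is an assumption).
warmup_steps = round(WARMUP_RATIO * TRAINING_STEPS)


def lr_at(step: int) -> float:
    """Learning rate at a given step.

    Assumed shape: linear warmup from 0 to LEARNING_RATE,
    then cosine decay from LEARNING_RATE to 0.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    progress = (step - warmup_steps) / (TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total train batch size:", total_train_batch_size)
    print("total eval batch size:", total_eval_batch_size)
    print("warmup steps:", warmup_steps)
    for s in (0, warmup_steps, TRAINING_STEPS // 2, TRAINING_STEPS):
        print(f"step {s:>6}: lr = {lr_at(s):.6g}")
```

Under these assumptions the learning rate peaks at 0.0003 once warmup ends and falls to zero at the final training step.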

Training results

Training Loss Epoch Step Validation Loss
0.7216 0.03 1000 0.9141
0.7114 0.05 2000 0.9156
0.711 0.08 3000 0.9157
0.7102 0.1 4000 0.8821
0.709 0.13 5000 0.8774
0.7083 0.15 6000 0.8945
0.6516 0.18 7000 0.8945
0.6042 0.2 8000 0.8856
0.5732 0.23 9000 0.8907
0.5506 0.25 10000 0.8922
0.5385 0.28 11000 0.8904
0.5276 0.31 12000 0.8865
0.517 0.33 13000 0.8837
0.5077 0.36 14000 0.8863
0.498 0.38 15000 0.8770
0.4897 0.41 16000 0.8794
0.4791 0.43 17000 0.8796
0.4698 0.46 18000 0.8754
0.4592 0.48 19000 0.8825
0.4489 0.51 20000 0.8787
0.439 0.54 21000 0.8742
0.4292 0.56 22000 0.8849
0.4212 0.59 23000 0.8842
0.4142 0.61 24000 0.8812
0.4076 0.64 25000 0.8659
0.4017 0.66 26000 0.8744
0.3958 0.69 27000 0.8822
0.3907 0.71 28000 0.8762
0.3863 0.74 29000 0.8758
0.382 0.76 30000 0.8755
0.378 0.79 31000 0.8781
0.3748 0.82 32000 0.8815
0.3716 0.84 33000 0.8689
0.3689 0.87 34000 0.8759
0.3665 0.89 35000 0.8690
0.364 0.92 36000 0.8696
0.3614 0.94 37000 0.8684
0.3592 0.97 38000 0.8598
0.3571 0.99 39000 0.8572
0.3555 1.02 40000 0.8637
0.3535 1.04 41000 0.8638
0.3518 1.07 42000 0.8665
0.3502 1.1 43000 0.8559
0.3488 1.12 44000 0.8600
0.3469 1.15 45000 0.8528
0.3459 1.17 46000 0.8598
0.3444 1.2 47000 0.8607
0.3428 1.22 48000 0.8650
0.342 1.25 49000 0.8640
0.3408 1.27 50000 0.8549
0.3398 1.3 51000 0.8630
0.3387 1.33 52000 0.8541
0.3373 1.35 53000 0.8588
0.3368 1.38 54000 0.8639
0.3357 1.4 55000 0.8546
0.335 1.43 56000 0.8535
0.334 1.45 57000 0.8511
0.333 1.48 58000 0.8525
0.3322 1.5 59000 0.8536
0.3314 1.53 60000 0.8422
0.3307 1.55 61000 0.8627
0.3298 1.58 62000 0.8435
0.3292 1.61 63000 0.8569
0.3287 1.63 64000 0.8517
0.3278 1.66 65000 0.8488
0.3269 1.68 66000 0.8470
0.3262 1.71 67000 0.8487
0.3257 1.73 68000 0.8430
0.325 1.76 69000 0.8382
0.3248 1.78 70000 0.8480
0.3238 1.81 71000 0.8432
0.3235 1.83 72000 0.8572
0.3227 1.86 73000 0.8466
0.3224 1.89 74000 0.8524
0.3217 1.91 75000 0.8451
0.321 1.94 76000 0.8453
0.3207 1.96 77000 0.8389
0.3202 1.99 78000 0.8391
0.3195 2.01 79000 0.8535
0.3194 2.04 80000 0.8578
0.3194 2.06 81000 0.8517
0.3185 2.09 82000 0.8537
0.3183 2.12 83000 0.8353
0.3177 2.14 84000 0.8457
0.3174 2.17 85000 0.8515
0.3165 2.19 86000 0.8357
0.3164 2.22 87000 0.8555
0.3159 2.24 88000 0.8426
0.3159 2.27 89000 0.8529
0.3152 2.29 90000 0.8297
0.3149 2.32 91000 0.8462
0.3144 2.34 92000 0.8435
0.3141 2.37 93000 0.8363
0.3139 2.4 94000 0.8435
0.3139 2.42 95000 0.8472
0.3131 2.45 96000 0.8396
0.3125 2.47 97000 0.8420
0.3128 2.5 98000 0.8410
0.3121 2.52 99000 0.8418
0.3118 2.55 100000 0.8400
0.3112 2.57 101000 0.8347
0.3112 2.6 102000 0.8289
0.3112 2.63 103000 0.8456
0.3107 2.65 104000 0.8414
0.3101 2.68 105000 0.8327
0.3107 2.7 106000 0.8374
0.3103 2.73 107000 0.8471
0.3095 2.75 108000 0.8452
0.3094 2.78 109000 0.8513
0.3094 2.8 110000 0.8348
0.3089 2.83 111000 0.8334
0.3089 2.85 112000 0.8438
0.3088 2.88 113000 0.8328
0.3085 2.91 114000 0.8317
0.3097 2.93 115000 0.8462
0.3082 2.96 116000 0.8436
0.3077 2.98 117000 0.8436
0.3086 3.01 118000 0.8483
0.3072 3.03 119000 0.8355
0.3066 3.06 120000 0.8281
0.3072 3.08 121000 0.8393
0.3063 3.11 122000 0.8436
0.3061 3.13 123000 0.8346
0.3059 3.16 124000 0.8408
0.3062 3.19 125000 0.8384
0.307 3.21 126000 0.8374
0.3056 3.24 127000 0.8240
0.3049 3.26 128000 0.8263
0.3068 3.29 129000 0.8301
0.3053 3.31 130000 0.8350
0.3048 3.34 131000 0.8295
0.3051 3.36 132000 0.8297
0.3045 3.39 133000 0.8295
0.3043 3.42 134000 0.8245
0.3037 3.44 135000 0.8189
0.3042 3.47 136000 0.8286
0.3038 3.49 137000 0.8326
0.3033 3.52 138000 0.8184
0.3035 3.54 139000 0.8136
0.3027 3.57 140000 0.8287
0.303 3.59 141000 0.8184
0.3027 3.62 142000 0.8444
0.3024 3.64 143000 0.8402
0.3027 3.67 144000 0.8280
0.3029 3.7 145000 0.8255
0.3023 3.72 146000 0.8287
0.3024 3.75 147000 0.8176
0.302 3.77 148000 0.8372
0.3019 3.8 149000 0.8221
0.3016 3.82 150000 0.8251
0.3014 3.85 151000 0.8370
0.3012 3.87 152000 0.8285
0.3012 3.9 153000 0.8453
0.3007 3.92 154000 0.8195
0.3009 3.95 155000 0.8309
0.3007 3.98 156000 0.8357
0.3003 4.0 157000 0.8225
0.3014 4.03 158000 0.8343
0.3005 4.05 159000 0.8267
0.2994 4.08 160000 0.8258
0.2996 4.1 161000 0.8267
0.301 4.13 162000 0.8216
0.2987 4.15 163000 0.8304
0.2989 4.18 164000 0.8385
0.2995 4.21 165000 0.8305
0.2998 4.23 166000 0.8391
0.2991 4.26 167000 0.8364
0.2994 4.28 168000 0.8259
0.2977 4.31 169000 0.8347
0.2989 4.33 170000 0.8346
0.2997 4.36 171000 0.8418
0.2975 4.38 172000 0.8321
0.2988 4.41 173000 0.8193
0.2979 4.43 174000 0.8213
0.2973 4.46 175000 0.8220
0.2967 4.49 176000 0.8286
0.2969 4.51 177000 0.8219
0.2966 4.54 178000 0.8279
0.2966 4.56 179000 0.8254
0.2968 4.59 180000 0.8309
0.2962 4.61 181000 0.8313
0.2968 4.64 182000 0.8232
0.2967 4.66 183000 0.8215
0.2958 4.69 184000 0.8171
0.2958 4.71 185000 0.8280
0.2958 4.74 186000 0.8222
0.2958 4.77 187000 0.8303
0.2965 4.79 188000 0.8213
0.2958 4.82 189000 0.8167
0.297 4.84 190000 0.8272
0.2959 4.87 191000 0.8258
0.295 4.89 192000 0.8217
0.295 4.92 193000 0.8130
0.2968 4.94 194000 0.8097
0.2947 4.97 195000 0.8070
0.2941 5.0 196000 0.8227
0.294 5.02 197000 0.8133
0.2947 5.05 198000 0.8142
0.2941 5.07 199000 0.8159
0.294 5.1 200000 0.8274
0.2941 5.12 201000 0.8195
0.2938 5.15 202000 0.8285
0.2934 5.17 203000 0.8159
0.2932 5.2 204000 0.8073
0.2946 5.22 205000 0.8255
0.2939 5.25 206000 0.8250
0.2933 5.28 207000 0.8215
0.2927 5.3 208000 0.8153
0.2931 5.33 209000 0.8284
0.2928 5.35 210000 0.8204
0.2923 5.38 211000 0.8265
0.2925 5.4 212000 0.8269
0.2926 5.43 213000 0.8337
0.292 5.45 214000 0.8255
0.292 5.48 215000 0.8224
0.2915 5.5 216000 0.8217
0.2916 5.53 217000 0.8251
0.291 5.56 218000 0.8244
0.2918 5.58 219000 0.8229
0.2911 5.61 220000 0.8245
0.2911 5.63 221000 0.8201
0.2913 5.66 222000 0.8082
0.2912 5.68 223000 0.8194
0.2908 5.71 224000 0.8260
0.291 5.73 225000 0.8226
0.2908 5.76 226000 0.8231
0.2903 5.79 227000 0.8101
0.2917 5.81 228000 0.8148
0.2915 5.84 229000 0.8212
0.2901 5.86 230000 0.8126
0.2898 5.89 231000 0.8182
0.29 5.91 232000 0.8150
0.2905 5.94 233000 0.8126
0.2894 5.96 234000 0.8208
0.2894 5.99 235000 0.8262
0.2899 6.01 236000 0.8133
0.2891 6.04 237000 0.8039
0.2887 6.07 238000 0.8182
0.2889 6.09 239000 0.8066
0.2889 6.12 240000 0.8129
0.2899 6.14 241000 0.8204
0.2894 6.17 242000 0.8142
0.2893 6.19 243000 0.8167
0.2883 6.22 244000 0.8152
0.2882 6.24 245000 0.8129
0.2883 6.27 246000 0.8146
0.2886 6.29 247000 0.8157
0.2886 6.32 248000 0.8172
0.2885 6.35 249000 0.8210
0.2886 6.37 250000 0.8213
0.2877 6.4 251000 0.8104
0.2872 6.42 252000 0.8114
0.2871 6.45 253000 0.8148
0.2875 6.47 254000 0.8127
0.287 6.5 255000 0.8201
0.2869 6.52 256000 0.8101
0.2868 6.55 257000 0.8142
0.2869 6.58 258000 0.8158
0.2868 6.6 259000 0.8125
0.2865 6.63 260000 0.8167
0.2871 6.65 261000 0.8194
0.2863 6.68 262000 0.8059
0.2864 6.7 263000 0.8195
0.2863 6.73 264000 0.8099
0.2868 6.75 265000 0.8127
0.2863 6.78 266000 0.8069
0.2854 6.8 267000 0.8033
0.2855 6.83 268000 0.8097
0.2864 6.86 269000 0.8096
0.2865 6.88 270000 0.8194
0.2852 6.91 271000 0.8104
0.2852 6.93 272000 0.8214
0.2848 6.96 273000 0.8105
0.2857 6.98 274000 0.8124
0.2849 7.01 275000 0.8164
0.2848 7.03 276000 0.8183
0.2847 7.06 277000 0.8188
0.2846 7.08 278000 0.8136
0.2847 7.11 279000 0.8129
0.2845 7.14 280000 0.8166
0.2836 7.16 281000 0.8175
0.2839 7.19 282000 0.8130
0.284 7.21 283000 0.8058
0.2839 7.24 284000 0.8161
0.2842 7.26 285000 0.8232
0.2835 7.29 286000 0.8186
0.2837 7.31 287000 0.8180
0.2835 7.34 288000 0.8165
0.2835 7.37 289000 0.8122
0.2832 7.39 290000 0.8192
0.2829 7.42 291000 0.8085
0.2827 7.44 292000 0.8086
0.2829 7.47 293000 0.8102
0.2829 7.49 294000 0.8082
0.2828 7.52 295000 0.8098
0.2828 7.54 296000 0.8034
0.2831 7.57 297000 0.8072
0.2825 7.59 298000 0.8063
0.282 7.62 299000 0.8125
0.2823 7.65 300000 0.8154
0.2818 7.67 301000 0.8139
0.2818 7.7 302000 0.8098
0.2826 7.72 303000 0.8181
0.2825 7.75 304000 0.8146
0.2813 7.77 305000 0.8216
0.2814 7.8 306000 0.8134
0.2808 7.82 307000 0.8111
0.2808 7.85 308000 0.8111
0.2811 7.88 309000 0.8077
0.2812 7.9 310000 0.8111
0.281 7.93 311000 0.8070
0.2807 7.95 312000 0.8041
0.2811 7.98 313000 0.8100
0.2821 8.0 314000 0.8284
0.2808 8.03 315000 0.8073
0.2805 8.05 316000 0.8141
0.2801 8.08 317000 0.8067
0.28 8.1 318000 0.8123
0.2802 8.13 319000 0.8078
0.2799 8.16 320000 0.8211
0.28 8.18 321000 0.8135
0.2796 8.21 322000 0.8164
0.2793 8.23 323000 0.8119
0.2791 8.26 324000 0.8065
0.2793 8.28 325000 0.8142
0.2794 8.31 326000 0.8038
0.2792 8.33 327000 0.8117
0.2789 8.36 328000 0.8118
0.2793 8.38 329000 0.8092
0.279 8.41 330000 0.8081
0.2792 8.44 331000 0.8179
0.2788 8.46 332000 0.8141
0.2785 8.49 333000 0.8112
0.2786 8.51 334000 0.8080
0.2788 8.54 335000 0.8106
0.279 8.56 336000 0.8106
0.2781 8.59 337000 0.8100
0.278 8.61 338000 0.8252
0.2777 8.64 339000 0.8137
0.2778 8.67 340000 0.8187
0.2773 8.69 341000 0.8103
0.2779 8.72 342000 0.8094
0.2777 8.74 343000 0.8024
0.277 8.77 344000 0.8033
0.2771 8.79 345000 0.8085
0.2773 8.82 346000 0.8130
0.2775 8.84 347000 0.8052
0.2769 8.87 348000 0.8048
0.2769 8.89 349000 0.8069
0.2774 8.92 350000 0.8126
0.2766 8.95 351000 0.8036
0.2765 8.97 352000 0.8100
0.2762 9.0 353000 0.8091
0.2765 9.02 354000 0.8081
0.2763 9.05 355000 0.8072
0.2763 9.07 356000 0.8050
0.2763 9.1 357000 0.8132
0.2758 9.12 358000 0.8092
0.2758 9.15 359000 0.8033
0.2757 9.17 360000 0.8122
0.2757 9.2 361000 0.8061
0.2754 9.23 362000 0.8106
0.2755 9.25 363000 0.8048
0.2753 9.28 364000 0.8104
0.2753 9.3 365000 0.8095
0.2753 9.33 366000 0.8097
0.2752 9.35 367000 0.8090
0.2749 9.38 368000 0.8059
0.2749 9.4 369000 0.8114
0.2747 9.43 370000 0.8089
0.2745 9.46 371000 0.8080
0.2745 9.48 372000 0.8102
0.2747 9.51 373000 0.8059
0.2742 9.53 374000 0.8085
0.2742 9.56 375000 0.8031
0.274 9.58 376000 0.8067
0.274 9.61 377000 0.8057
0.274 9.63 378000 0.8031
0.2738 9.66 379000 0.8067
0.2737 9.68 380000 0.8090
0.2736 9.71 381000 0.8044
0.2739 9.74 382000 0.8078
0.2729 9.76 383000 0.8075
0.2735 9.79 384000 0.8107
0.2729 9.81 385000 0.8120
0.2731 9.84 386000 0.8059
0.2727 9.86 387000 0.8082
0.2726 9.89 388000 0.8090
0.2727 9.91 389000 0.8020
0.273 9.94 390000 0.8115
0.2727 9.96 391000 0.8077
0.2726 9.99 392000 0.8175
0.2722 10.02 393000 0.8073
0.2725 10.04 394000 0.8089
0.2721 10.07 395000 0.8181
0.2722 10.09 396000 0.8067
0.2721 10.12 397000 0.8155
0.2718 10.14 398000 0.8150
0.272 10.17 399000 0.8131
0.2721 10.19 400000 0.8092
0.2715 10.22 401000 0.8083
0.2717 10.25 402000 0.8100
0.2715 10.27 403000 0.8108
0.2715 10.3 404000 0.8090
0.2716 10.32 405000 0.8160
0.2712 10.35 406000 0.8142
0.2712 10.37 407000 0.8071
0.2712 10.4 408000 0.8115
0.2709 10.42 409000 0.8093
0.271 10.45 410000 0.8109
0.2712 10.47 411000 0.8162
0.2709 10.5 412000 0.8158
0.2706 10.53 413000 0.8103
0.2709 10.55 414000 0.8069
0.2706 10.58 415000 0.8130
0.2706 10.6 416000 0.8126
0.2704 10.63 417000 0.8181
0.2704 10.65 418000 0.8100
0.2702 10.68 419000 0.8089
0.2702 10.7 420000 0.8133
0.2699 10.73 421000 0.8155
0.2701 10.75 422000 0.8139
0.2701 10.78 423000 0.8133
0.2701 10.81 424000 0.8100
0.2696 10.83 425000 0.8077
0.2696 10.86 426000 0.8097
0.2698 10.88 427000 0.8036
0.2698 10.91 428000 0.8067
0.2699 10.93 429000 0.8131
0.2695 10.96 430000 0.8059
0.2695 10.98 431000 0.8142
0.2693 11.01 432000 0.8080
0.2695 11.04 433000 0.8101
0.2692 11.06 434000 0.8111
0.2693 11.09 435000 0.8064
0.2689 11.11 436000 0.8066
0.2688 11.14 437000 0.8145
0.2691 11.16 438000 0.8088
0.2689 11.19 439000 0.8115
0.2688 11.21 440000 0.8066
0.2689 11.24 441000 0.8038
0.2687 11.26 442000 0.8066
0.2688 11.29 443000 0.8125
0.2686 11.32 444000 0.8055
0.2686 11.34 445000 0.8065
0.2685 11.37 446000 0.8134
0.2684 11.39 447000 0.8068
0.2683 11.42 448000 0.8086
0.2684 11.44 449000 0.8025
0.2682 11.47 450000 0.8073
0.2682 11.49 451000 0.8042
0.2683 11.52 452000 0.8097
0.2678 11.54 453000 0.8062
0.2678 11.57 454000 0.8084
0.2681 11.6 455000 0.8135
0.2678 11.62 456000 0.8098
0.2681 11.65 457000 0.8079
0.2679 11.67 458000 0.8052
0.268 11.7 459000 0.8038
0.268 11.72 460000 0.8100
0.2677 11.75 461000 0.8057
0.2676 11.77 462000 0.8142
0.2679 11.8 463000 0.8076
0.2676 11.83 464000 0.8087
0.2677 11.85 465000 0.8066
0.2673 11.88 466000 0.8059
0.2676 11.9 467000 0.8067
0.2675 11.93 468000 0.8043
0.2675 11.95 469000 0.8103
0.2673 11.98 470000 0.8092
0.2676 12.0 471000 0.8069
0.2673 12.03 472000 0.8062
0.2673 12.05 473000 0.8025
0.2672 12.08 474000 0.8044
0.2671 12.11 475000 0.8068
0.2672 12.13 476000 0.8039
0.2673 12.16 477000 0.8078
0.2671 12.18 478000 0.8061
0.2673 12.21 479000 0.8022
0.267 12.23 480000 0.8065
0.2672 12.26 481000 0.8035
0.267 12.28 482000 0.8039
0.2669 12.31 483000 0.8074
0.267 12.33 484000 0.8040
0.267 12.36 485000 0.8028
0.2668 12.39 486000 0.8055
0.2669 12.41 487000 0.8062
0.2669 12.44 488000 0.8053
0.267 12.46 489000 0.8089
0.267 12.49 490000 0.8081
0.267 12.51 491000 0.8053
0.2668 12.54 492000 0.8053
0.2671 12.56 493000 0.8097
0.267 12.59 494000 0.8088
0.2669 12.62 495000 0.8081
0.2667 12.64 496000 0.8047
0.2667 12.67 497000 0.8043
0.2669 12.69 498000 0.8051
0.2669 12.72 499000 0.8085
0.2666 12.74 500000 0.8054
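
Each row of the log above has four whitespace-separated fields: training loss, epoch, step, and validation loss. A minimal sketch for reading such rows follows. It uses four rows copied from the table as sample input, and `LogRow` and `parse_row` are hypothetical helper names rather than part of any library. The steps-per-epoch figure is only an estimate, because the epoch column is rounded to two decimals.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float


def parse_row(line: str) -> LogRow:
    """Split one log line into its four numeric fields."""
    train, epoch, step, val = line.split()
    return LogRow(float(train), float(epoch), int(step), float(val))


# Four rows copied verbatim from the table above (not the full log).
SAMPLE = """\
0.7216 0.03 1000 0.9141
0.3152 2.29 90000 0.8297
0.2727 9.91 389000 0.8020
0.2666 12.74 500000 0.8054
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

# Row with the lowest validation loss among the sampled rows only.
best = min(rows, key=lambda r: r.validation_loss)

# Rough steps-per-epoch estimate from the final row.
last = rows[-1]
steps_per_epoch = last.step / last.epoch

if __name__ == "__main__":
    print("lowest validation loss in sample:", best)
    print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
```

Applying the same parsing to the full log would show where validation loss is lowest across the whole run, rather than only among these sampled rows.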

Framework versions

  • Transformers 4.17.0
  • PyTorch 1.11.0
  • Datasets 2.5.0
  • Tokenizers 0.12.1