gpt_train_2_768

This model is a fine-tuned version of openai-community/gpt2 on the gokuls/wiki_book_corpus_raw_dataset_tiny dataset. It achieves the following results on the evaluation set:

  • Loss: 7.4883
  • Accuracy: 0.1039
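
The reported loss is token-level cross-entropy, so it corresponds to a perplexity of exp(7.4883) ≈ 1790; the checkpoint is still far from a converged language model. As a loading sketch only (the repository id gokulsrinivasagan/gpt_train_2_768 is taken from this page; the prompt and generation settings are illustrative):

```python
# Minimal sketch: load this checkpoint from the Hub and sample from it.
from transformers import pipeline

generator = pipeline("text-generation", model="gokulsrinivasagan/gpt_train_2_768")
out = generator("The history of", max_new_tokens=30, do_sample=True)
print(out[0]["generated_text"])
```

Given the evaluation numbers above, sampled text is unlikely to be coherent; the snippet only demonstrates that the checkpoint loads like any other GPT-2 fine-tune.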

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
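
Pending that description, the dataset named above can be inspected directly. A minimal sketch with the datasets library (the "train" split name is an assumption; check the dataset page for the actual splits):

```python
# Minimal sketch: peek at the dataset this card names.
from datasets import load_dataset

ds = load_dataset("gokuls/wiki_book_corpus_raw_dataset_tiny", split="train")
print(ds)      # features and row count
print(ds[0])   # one raw example
```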

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 10
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
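
These settings map onto transformers TrainingArguments roughly as sketched below. This is a reconstruction, not the original launch command: output_dir is a placeholder, the batch sizes are assumed to be per device (the run was multi-GPU, so the effective batch is larger), and the Adam betas/epsilon and linear schedule listed above are the Trainer defaults.

```python
# Reconstruction of the listed hyperparameters; not the original command.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt_train_2_768",     # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=64,   # assumed per-device
    per_device_eval_batch_size=64,
    seed=10,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                        # "Native AMP" mixed precision
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```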

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 10.9688 | 0.0001 | 1 | 10.9688 | 0.0000 |
| 10.9609 | 0.0002 | 2 | 10.9688 | 0.0000 |
| 10.9609 | 0.0003 | 3 | 10.9688 | 0.0000 |
| 10.9609 | 0.0004 | 4 | 10.9688 | 0.0000 |
| 10.9609 | 0.0005 | 5 | 10.9688 | 0.0000 |
| 10.9688 | 0.0006 | 6 | 10.9688 | 0.0000 |
| 10.9609 | 0.0007 | 7 | 10.9688 | 0.0000 |
| 10.9609 | 0.0008 | 8 | 10.9688 | 0.0000 |
| 10.9688 | 0.0009 | 9 | 10.9688 | 0.0000 |
| 10.9531 | 0.0010 | 10 | 10.9688 | 0.0000 |
| 10.9688 | 0.0011 | 11 | 10.9688 | 0.0000 |
| 10.9688 | 0.0012 | 12 | 10.9688 | 0.0000 |
| 10.9531 | 0.0013 | 13 | 10.9688 | 0.0000 |
| 10.9609 | 0.0014 | 14 | 10.9688 | 0.0000 |
| 10.9688 | 0.0015 | 15 | 10.9688 | 0.0000 |
| 10.9766 | 0.0015 | 16 | 10.9688 | 0.0000 |
| 10.9688 | 0.0016 | 17 | 10.9688 | 0.0000 |
| 10.9609 | 0.0017 | 18 | 10.8828 | 0.0007 |
| 10.8906 | 0.0018 | 19 | 10.8047 | 0.0051 |
| 10.8359 | 0.0019 | 20 | 10.7188 | 0.0112 |
| 10.75 | 0.0020 | 21 | 10.6484 | 0.0175 |
| 10.6719 | 0.0021 | 22 | 10.5781 | 0.0280 |
| 10.6172 | 0.0022 | 23 | 10.5 | 0.0392 |
| 10.5391 | 0.0023 | 24 | 10.4375 | 0.0447 |
| 10.5078 | 0.0024 | 25 | 10.3828 | 0.0478 |
| 10.4609 | 0.0025 | 26 | 10.3125 | 0.0499 |
| 10.3906 | 0.0026 | 27 | 10.2656 | 0.0511 |
| 10.3281 | 0.0027 | 28 | 10.2109 | 0.0521 |
| 10.2656 | 0.0028 | 29 | 10.1641 | 0.0531 |
| 10.25 | 0.0029 | 30 | 10.1172 | 0.0537 |
| 10.2031 | 0.0030 | 31 | 10.0703 | 0.0544 |
| 10.1641 | 0.0031 | 32 | 10.0312 | 0.0552 |
| 10.125 | 0.0032 | 33 | 9.9922 | 0.0558 |
| 10.0859 | 0.0033 | 34 | 9.9609 | 0.0562 |
| 10.0391 | 0.0034 | 35 | 9.9219 | 0.0566 |
| 10.0156 | 0.0035 | 36 | 9.8906 | 0.0568 |
| 9.9609 | 0.0036 | 37 | 9.8594 | 0.0567 |
| 9.9141 | 0.0037 | 38 | 9.8359 | 0.0566 |
| 9.875 | 0.0038 | 39 | 9.8047 | 0.0568 |
| 9.8672 | 0.0039 | 40 | 9.7812 | 0.0569 |
| 9.8438 | 0.0040 | 41 | 9.7578 | 0.0568 |
| 9.7969 | 0.0041 | 42 | 9.7344 | 0.0565 |
| 9.8203 | 0.0042 | 43 | 9.7109 | 0.0564 |
| 9.7891 | 0.0043 | 44 | 9.6875 | 0.0564 |
| 9.7031 | 0.0044 | 45 | 9.6719 | 0.0566 |
| 9.7344 | 0.0045 | 46 | 9.6484 | 0.0569 |
| 9.7266 | 0.0046 | 47 | 9.6328 | 0.0573 |
| 9.7031 | 0.0046 | 48 | 9.6172 | 0.0579 |
| 9.7109 | 0.0047 | 49 | 9.6016 | 0.0585 |
| 9.6406 | 0.0048 | 50 | 9.5781 | 0.0591 |
| 9.6797 | 0.0049 | 51 | 9.5625 | 0.0597 |
| 9.6328 | 0.0050 | 52 | 9.5469 | 0.0605 |
| 9.6172 | 0.0051 | 53 | 9.5312 | 0.0612 |
| 9.6172 | 0.0052 | 54 | 9.5234 | 0.0615 |
| 9.5703 | 0.0053 | 55 | 9.5078 | 0.0617 |
| 9.5781 | 0.0054 | 56 | 9.4922 | 0.0618 |
| 9.5938 | 0.0055 | 57 | 9.4766 | 0.0620 |
| 9.5391 | 0.0056 | 58 | 9.4688 | 0.0621 |
| 9.4922 | 0.0057 | 59 | 9.4531 | 0.0620 |
| 9.4688 | 0.0058 | 60 | 9.4375 | 0.0620 |
| 9.4922 | 0.0059 | 61 | 9.4297 | 0.0620 |
| 9.4609 | 0.0060 | 62 | 9.4141 | 0.0620 |
| 9.4297 | 0.0061 | 63 | 9.4062 | 0.0620 |
| 9.4844 | 0.0062 | 64 | 9.3906 | 0.0620 |
| 9.4531 | 0.0063 | 65 | 9.3828 | 0.0622 |
| 9.4375 | 0.0064 | 66 | 9.3672 | 0.0625 |
| 9.4375 | 0.0065 | 67 | 9.3594 | 0.0628 |
| 9.3984 | 0.0066 | 68 | 9.3438 | 0.0630 |
| 9.4062 | 0.0067 | 69 | 9.3359 | 0.0632 |
| 9.3984 | 0.0068 | 70 | 9.3203 | 0.0633 |
| 9.4375 | 0.0069 | 71 | 9.3125 | 0.0633 |
| 9.3828 | 0.0070 | 72 | 9.3047 | 0.0634 |
| 9.3594 | 0.0071 | 73 | 9.2891 | 0.0634 |
| 9.3438 | 0.0072 | 74 | 9.2812 | 0.0634 |
| 9.3672 | 0.0073 | 75 | 9.2734 | 0.0634 |
| 9.3125 | 0.0074 | 76 | 9.2578 | 0.0634 |
| 9.3047 | 0.0075 | 77 | 9.25 | 0.0633 |
| 9.2969 | 0.0076 | 78 | 9.2422 | 0.0632 |
| 9.2891 | 0.0077 | 79 | 9.2266 | 0.0631 |
| 9.2812 | 0.0077 | 80 | 9.2188 | 0.0631 |
| 9.2656 | 0.0078 | 81 | 9.2109 | 0.0632 |
| 9.2422 | 0.0079 | 82 | 9.2031 | 0.0633 |
| 9.2656 | 0.0080 | 83 | 9.1875 | 0.0635 |
| 9.25 | 0.0081 | 84 | 9.1797 | 0.0637 |
| 9.2344 | 0.0082 | 85 | 9.1719 | 0.0639 |
| 9.2266 | 0.0083 | 86 | 9.1562 | 0.0640 |
| 9.25 | 0.0084 | 87 | 9.1484 | 0.0641 |
| 9.1406 | 0.0085 | 88 | 9.1406 | 0.0641 |
| 9.1562 | 0.0086 | 89 | 9.1328 | 0.0642 |
| 9.2031 | 0.0087 | 90 | 9.1172 | 0.0641 |
| 9.1406 | 0.0088 | 91 | 9.1094 | 0.0642 |
| 9.1406 | 0.0089 | 92 | 9.1016 | 0.0643 |
| 9.1406 | 0.0090 | 93 | 9.0938 | 0.0644 |
| 9.1328 | 0.0091 | 94 | 9.0781 | 0.0644 |
| 9.125 | 0.0092 | 95 | 9.0703 | 0.0645 |
| 9.1016 | 0.0093 | 96 | 9.0625 | 0.0646 |
| 9.125 | 0.0094 | 97 | 9.0547 | 0.0648 |
| 9.0625 | 0.0095 | 98 | 9.0391 | 0.0652 |
| 9.0859 | 0.0096 | 99 | 9.0312 | 0.0655 |
| 9.0547 | 0.0097 | 100 | 9.0234 | 0.0657 |
| 9.0547 | 0.0098 | 101 | 9.0156 | 0.0658 |
| 9.0625 | 0.0099 | 102 | 9.0078 | 0.0659 |
| 9.0547 | 0.0100 | 103 | 8.9922 | 0.0661 |
| 9.0156 | 0.0101 | 104 | 8.9844 | 0.0662 |
| 9.0391 | 0.0102 | 105 | 8.9766 | 0.0664 |
| 9.0234 | 0.0103 | 106 | 8.9688 | 0.0664 |
| 9.0234 | 0.0104 | 107 | 8.9609 | 0.0664 |
| 8.9766 | 0.0105 | 108 | 8.9453 | 0.0664 |
| 8.9922 | 0.0106 | 109 | 8.9375 | 0.0665 |
| 8.9453 | 0.0107 | 110 | 8.9297 | 0.0665 |
| 8.9609 | 0.0108 | 111 | 8.9219 | 0.0664 |
| 8.9766 | 0.0108 | 112 | 8.9141 | 0.0664 |
| 8.9844 | 0.0109 | 113 | 8.8984 | 0.0666 |
| 8.9453 | 0.0110 | 114 | 8.8906 | 0.0669 |
| 8.9688 | 0.0111 | 115 | 8.8828 | 0.0673 |
| 8.9766 | 0.0112 | 116 | 8.875 | 0.0677 |
| 8.9297 | 0.0113 | 117 | 8.8672 | 0.0682 |
| 8.9297 | 0.0114 | 118 | 8.8594 | 0.0689 |
| 8.8672 | 0.0115 | 119 | 8.8516 | 0.0694 |
| 8.8906 | 0.0116 | 120 | 8.8359 | 0.0700 |
| 8.8984 | 0.0117 | 121 | 8.8281 | 0.0703 |
| 8.8984 | 0.0118 | 122 | 8.8203 | 0.0704 |
| 8.8828 | 0.0119 | 123 | 8.8125 | 0.0706 |
| 8.8594 | 0.0120 | 124 | 8.8047 | 0.0707 |
| 8.8281 | 0.0121 | 125 | 8.7969 | 0.0708 |
| 8.8359 | 0.0122 | 126 | 8.7812 | 0.0710 |
| 8.8359 | 0.0123 | 127 | 8.7734 | 0.0711 |
| 8.8281 | 0.0124 | 128 | 8.7656 | 0.0710 |
| 8.8438 | 0.0125 | 129 | 8.7578 | 0.0707 |
| 8.7578 | 0.0126 | 130 | 8.75 | 0.0702 |
| 8.7812 | 0.0127 | 131 | 8.7422 | 0.0698 |
| 8.7734 | 0.0128 | 132 | 8.7344 | 0.0697 |
| 8.7812 | 0.0129 | 133 | 8.7266 | 0.0701 |
| 8.7891 | 0.0130 | 134 | 8.7188 | 0.0707 |
| 8.7656 | 0.0131 | 135 | 8.7031 | 0.0713 |
| 8.7891 | 0.0132 | 136 | 8.6953 | 0.0719 |
| 8.7188 | 0.0133 | 137 | 8.6875 | 0.0726 |
| 8.7266 | 0.0134 | 138 | 8.6797 | 0.0733 |
| 8.75 | 0.0135 | 139 | 8.6719 | 0.0737 |
| 8.7188 | 0.0136 | 140 | 8.6641 | 0.0740 |
| 8.7344 | 0.0137 | 141 | 8.6562 | 0.0742 |
| 8.6641 | 0.0138 | 142 | 8.6484 | 0.0742 |
| 8.7031 | 0.0139 | 143 | 8.6406 | 0.0741 |
| 8.6797 | 0.0139 | 144 | 8.6328 | 0.0741 |
| 8.6797 | 0.0140 | 145 | 8.6172 | 0.0739 |
| 8.6719 | 0.0141 | 146 | 8.6094 | 0.0736 |
| 8.6641 | 0.0142 | 147 | 8.6016 | 0.0736 |
| 8.6484 | 0.0143 | 148 | 8.5938 | 0.0737 |
| 8.6172 | 0.0144 | 149 | 8.5859 | 0.0741 |
| 8.6719 | 0.0145 | 150 | 8.5781 | 0.0746 |
| 8.6406 | 0.0146 | 151 | 8.5703 | 0.0750 |
| 8.6172 | 0.0147 | 152 | 8.5625 | 0.0754 |
| 8.6094 | 0.0148 | 153 | 8.5547 | 0.0756 |
| 8.6016 | 0.0149 | 154 | 8.5469 | 0.0756 |
| 8.5625 | 0.0150 | 155 | 8.5391 | 0.0755 |
| 8.5312 | 0.0151 | 156 | 8.5312 | 0.0756 |
| 8.5703 | 0.0152 | 157 | 8.5234 | 0.0756 |
| 8.6172 | 0.0153 | 158 | 8.5156 | 0.0757 |
| 8.5781 | 0.0154 | 159 | 8.5078 | 0.0757 |
| 8.6016 | 0.0155 | 160 | 8.5 | 0.0759 |
| 8.5547 | 0.0156 | 161 | 8.4922 | 0.0762 |
| 8.5547 | 0.0157 | 162 | 8.4844 | 0.0766 |
| 8.5312 | 0.0158 | 163 | 8.4766 | 0.0767 |
| 8.5 | 0.0159 | 164 | 8.4688 | 0.0767 |
| 8.5312 | 0.0160 | 165 | 8.4609 | 0.0766 |
| 8.5312 | 0.0161 | 166 | 8.4531 | 0.0766 |
| 8.4531 | 0.0162 | 167 | 8.4453 | 0.0767 |
| 8.4766 | 0.0163 | 168 | 8.4375 | 0.0768 |
| 8.4766 | 0.0164 | 169 | 8.4297 | 0.0770 |
| 8.4688 | 0.0165 | 170 | 8.4219 | 0.0772 |
| 8.4922 | 0.0166 | 171 | 8.4141 | 0.0775 |
| 8.4375 | 0.0167 | 172 | 8.4141 | 0.0777 |
| 8.4609 | 0.0168 | 173 | 8.4062 | 0.0777 |
| 8.4141 | 0.0169 | 174 | 8.3984 | 0.0777 |
| 8.4531 | 0.0170 | 175 | 8.3906 | 0.0778 |
| 8.3984 | 0.0170 | 176 | 8.3828 | 0.0778 |
| 8.4141 | 0.0171 | 177 | 8.375 | 0.0779 |
| 8.4453 | 0.0172 | 178 | 8.3672 | 0.0781 |
| 8.4219 | 0.0173 | 179 | 8.3594 | 0.0783 |
| 8.4219 | 0.0174 | 180 | 8.3516 | 0.0785 |
| 8.4062 | 0.0175 | 181 | 8.3438 | 0.0785 |
| 8.3984 | 0.0176 | 182 | 8.3359 | 0.0787 |
| 8.3828 | 0.0177 | 183 | 8.3281 | 0.0790 |
| 8.375 | 0.0178 | 184 | 8.3203 | 0.0792 |
| 8.3594 | 0.0179 | 185 | 8.3125 | 0.0795 |
| 8.375 | 0.0180 | 186 | 8.3125 | 0.0797 |
| 8.3125 | 0.0181 | 187 | 8.3047 | 0.0796 |
| 8.3438 | 0.0182 | 188 | 8.2969 | 0.0796 |
| 8.3281 | 0.0183 | 189 | 8.2891 | 0.0795 |
| 8.3359 | 0.0184 | 190 | 8.2812 | 0.0795 |
| 8.3047 | 0.0185 | 191 | 8.2734 | 0.0798 |
| 8.3359 | 0.0186 | 192 | 8.2656 | 0.0800 |
| 8.3047 | 0.0187 | 193 | 8.2578 | 0.0803 |
| 8.2969 | 0.0188 | 194 | 8.2578 | 0.0805 |
| 8.3203 | 0.0189 | 195 | 8.25 | 0.0807 |
| 8.2734 | 0.0190 | 196 | 8.2422 | 0.0809 |
| 8.25 | 0.0191 | 197 | 8.2344 | 0.0809 |
| 8.2734 | 0.0192 | 198 | 8.2266 | 0.0810 |
| 8.2109 | 0.0193 | 199 | 8.2188 | 0.0809 |
| 8.25 | 0.0194 | 200 | 8.2109 | 0.0809 |
| 8.2734 | 0.0195 | 201 | 8.2031 | 0.0810 |
| 8.2188 | 0.0196 | 202 | 8.2031 | 0.0812 |
| 8.2578 | 0.0197 | 203 | 8.1953 | 0.0816 |
| 8.2344 | 0.0198 | 204 | 8.1875 | 0.0819 |
| 8.2969 | 0.0199 | 205 | 8.1797 | 0.0823 |
| 8.2812 | 0.0200 | 206 | 8.1719 | 0.0825 |
| 8.2578 | 0.0201 | 207 | 8.1641 | 0.0824 |
| 8.2031 | 0.0201 | 208 | 8.1641 | 0.0824 |
| 8.1953 | 0.0202 | 209 | 8.1562 | 0.0822 |
| 8.2344 | 0.0203 | 210 | 8.1484 | 0.0821 |
| 8.1484 | 0.0204 | 211 | 8.1406 | 0.0822 |
| 8.2188 | 0.0205 | 212 | 8.1328 | 0.0824 |
| 8.1406 | 0.0206 | 213 | 8.1328 | 0.0826 |
| 8.1641 | 0.0207 | 214 | 8.125 | 0.0829 |
| 8.1328 | 0.0208 | 215 | 8.1172 | 0.0831 |
| 8.1875 | 0.0209 | 216 | 8.1094 | 0.0833 |
| 8.1719 | 0.0210 | 217 | 8.1016 | 0.0835 |
| 8.125 | 0.0211 | 218 | 8.1016 | 0.0835 |
| 8.1172 | 0.0212 | 219 | 8.0938 | 0.0835 |
| 8.1172 | 0.0213 | 220 | 8.0859 | 0.0834 |
| 8.1562 | 0.0214 | 221 | 8.0781 | 0.0835 |
| 8.0781 | 0.0215 | 222 | 8.0781 | 0.0838 |
| 8.1094 | 0.0216 | 223 | 8.0703 | 0.0840 |
| 8.0938 | 0.0217 | 224 | 8.0625 | 0.0843 |
| 8.0938 | 0.0218 | 225 | 8.0547 | 0.0846 |
| 8.1016 | 0.0219 | 226 | 8.0469 | 0.0847 |
| 8.1094 | 0.0220 | 227 | 8.0469 | 0.0846 |
| 8.1016 | 0.0221 | 228 | 8.0391 | 0.0844 |
| 8.0859 | 0.0222 | 229 | 8.0312 | 0.0844 |
| 8.0859 | 0.0223 | 230 | 8.0312 | 0.0845 |
| 8.1094 | 0.0224 | 231 | 8.0234 | 0.0849 |
| 8.1016 | 0.0225 | 232 | 8.0156 | 0.0853 |
| 8.0859 | 0.0226 | 233 | 8.0078 | 0.0856 |
| 8.0859 | 0.0227 | 234 | 8.0078 | 0.0857 |
| 8.0781 | 0.0228 | 235 | 8.0 | 0.0857 |
| 8.0234 | 0.0229 | 236 | 7.9922 | 0.0856 |
| 8.0391 | 0.0230 | 237 | 7.9883 | 0.0855 |
| 8.0078 | 0.0231 | 238 | 7.9844 | 0.0855 |
| 8.0078 | 0.0232 | 239 | 7.9766 | 0.0857 |
| 7.9883 | 0.0232 | 240 | 7.9727 | 0.0862 |
| 7.9805 | 0.0233 | 241 | 7.9648 | 0.0865 |
| 8.0234 | 0.0234 | 242 | 7.9609 | 0.0868 |
| 7.9961 | 0.0235 | 243 | 7.9570 | 0.0870 |
| 8.0156 | 0.0236 | 244 | 7.9492 | 0.0870 |
| 7.9766 | 0.0237 | 245 | 7.9453 | 0.0869 |
| 7.9297 | 0.0238 | 246 | 7.9414 | 0.0866 |
| 7.9336 | 0.0239 | 247 | 7.9375 | 0.0865 |
| 7.9219 | 0.0240 | 248 | 7.9297 | 0.0866 |
| 7.957 | 0.0241 | 249 | 7.9258 | 0.0869 |
| 7.9453 | 0.0242 | 250 | 7.9180 | 0.0874 |
| 7.9805 | 0.0243 | 251 | 7.9141 | 0.0879 |
| 7.9531 | 0.0244 | 252 | 7.9102 | 0.0883 |
| 7.9102 | 0.0245 | 253 | 7.9062 | 0.0885 |
| 7.9844 | 0.0246 | 254 | 7.8984 | 0.0886 |
| 7.9414 | 0.0247 | 255 | 7.8945 | 0.0885 |
| 7.9453 | 0.0248 | 256 | 7.8906 | 0.0883 |
| 7.9219 | 0.0249 | 257 | 7.8867 | 0.0883 |
| 7.9141 | 0.0250 | 258 | 7.8828 | 0.0885 |
| 7.9258 | 0.0251 | 259 | 7.875 | 0.0889 |
| 7.957 | 0.0252 | 260 | 7.8711 | 0.0893 |
| 7.8984 | 0.0253 | 261 | 7.8672 | 0.0896 |
| 7.8945 | 0.0254 | 262 | 7.8633 | 0.0898 |
| 7.9141 | 0.0255 | 263 | 7.8594 | 0.0899 |
| 7.9453 | 0.0256 | 264 | 7.8555 | 0.0899 |
| 7.8672 | 0.0257 | 265 | 7.8477 | 0.0900 |
| 7.9375 | 0.0258 | 266 | 7.8438 | 0.0902 |
| 7.9219 | 0.0259 | 267 | 7.8398 | 0.0905 |
| 7.8555 | 0.0260 | 268 | 7.8359 | 0.0907 |
| 7.8984 | 0.0261 | 269 | 7.8320 | 0.0908 |
| 7.8906 | 0.0262 | 270 | 7.8281 | 0.0909 |
| 7.8711 | 0.0263 | 271 | 7.8242 | 0.0910 |
| 7.8633 | 0.0263 | 272 | 7.8203 | 0.0909 |
| 7.8633 | 0.0264 | 273 | 7.8164 | 0.0909 |
| 7.8789 | 0.0265 | 274 | 7.8125 | 0.0909 |
| 7.8438 | 0.0266 | 275 | 7.8086 | 0.0910 |
| 7.8789 | 0.0267 | 276 | 7.8047 | 0.0911 |
| 7.8516 | 0.0268 | 277 | 7.8008 | 0.0912 |
| 7.8711 | 0.0269 | 278 | 7.7969 | 0.0913 |
| 7.8008 | 0.0270 | 279 | 7.7930 | 0.0916 |
| 7.8477 | 0.0271 | 280 | 7.7891 | 0.0918 |
| 7.8086 | 0.0272 | 281 | 7.7852 | 0.0919 |
| 7.8398 | 0.0273 | 282 | 7.7812 | 0.0920 |
| 7.8008 | 0.0274 | 283 | 7.7773 | 0.0922 |
| 7.8281 | 0.0275 | 284 | 7.7734 | 0.0922 |
| 7.7852 | 0.0276 | 285 | 7.7695 | 0.0926 |
| 7.793 | 0.0277 | 286 | 7.7656 | 0.0929 |
| 7.8086 | 0.0278 | 287 | 7.7617 | 0.0931 |
| 7.7812 | 0.0279 | 288 | 7.7578 | 0.0931 |
| 7.793 | 0.0280 | 289 | 7.7539 | 0.0931 |
| 7.7539 | 0.0281 | 290 | 7.75 | 0.0931 |
| 7.75 | 0.0282 | 291 | 7.7461 | 0.0930 |
| 7.8164 | 0.0283 | 292 | 7.7422 | 0.0930 |
| 7.7539 | 0.0284 | 293 | 7.7422 | 0.0931 |
| 7.8086 | 0.0285 | 294 | 7.7383 | 0.0932 |
| 7.793 | 0.0286 | 295 | 7.7344 | 0.0936 |
| 7.7695 | 0.0287 | 296 | 7.7305 | 0.0937 |
| 7.75 | 0.0288 | 297 | 7.7266 | 0.0938 |
| 7.7891 | 0.0289 | 298 | 7.7227 | 0.0938 |
| 7.7773 | 0.0290 | 299 | 7.7188 | 0.0936 |
| 7.7227 | 0.0291 | 300 | 7.7148 | 0.0935 |
| 7.7109 | 0.0292 | 301 | 7.7148 | 0.0937 |
| 7.7148 | 0.0293 | 302 | 7.7109 | 0.0939 |
| 7.7812 | 0.0294 | 303 | 7.7070 | 0.0940 |
| 7.7109 | 0.0294 | 304 | 7.7031 | 0.0941 |
| 7.7539 | 0.0295 | 305 | 7.6992 | 0.0942 |
| 7.7734 | 0.0296 | 306 | 7.6992 | 0.0943 |
| 7.6914 | 0.0297 | 307 | 7.6953 | 0.0943 |
| 7.6445 | 0.0298 | 308 | 7.6914 | 0.0944 |
| 7.6953 | 0.0299 | 309 | 7.6875 | 0.0945 |
| 7.75 | 0.0300 | 310 | 7.6836 | 0.0946 |
| 7.7539 | 0.0301 | 311 | 7.6836 | 0.0949 |
| 7.6953 | 0.0302 | 312 | 7.6797 | 0.0951 |
| 7.7188 | 0.0303 | 313 | 7.6758 | 0.0951 |
| 7.6914 | 0.0304 | 314 | 7.6719 | 0.0953 |
| 7.7344 | 0.0305 | 315 | 7.6719 | 0.0954 |
| 7.7383 | 0.0306 | 316 | 7.6680 | 0.0953 |
| 7.6875 | 0.0307 | 317 | 7.6641 | 0.0950 |
| 7.6914 | 0.0308 | 318 | 7.6602 | 0.0947 |
| 7.6758 | 0.0309 | 319 | 7.6602 | 0.0945 |
| 7.6836 | 0.0310 | 320 | 7.6562 | 0.0947 |
| 7.6914 | 0.0311 | 321 | 7.6523 | 0.0950 |
| 7.6719 | 0.0312 | 322 | 7.6523 | 0.0954 |
| 7.6914 | 0.0313 | 323 | 7.6484 | 0.0958 |
| 7.6094 | 0.0314 | 324 | 7.6445 | 0.0961 |
| 7.7148 | 0.0315 | 325 | 7.6406 | 0.0962 |
| 7.6641 | 0.0316 | 326 | 7.6406 | 0.0961 |
| 7.6602 | 0.0317 | 327 | 7.6367 | 0.0961 |
| 7.7031 | 0.0318 | 328 | 7.6328 | 0.0963 |
| 7.6953 | 0.0319 | 329 | 7.6328 | 0.0966 |
| 7.6445 | 0.0320 | 330 | 7.6289 | 0.0968 |
| 7.6445 | 0.0321 | 331 | 7.625 | 0.0969 |
| 7.6445 | 0.0322 | 332 | 7.625 | 0.0969 |
| 7.668 | 0.0323 | 333 | 7.6211 | 0.0968 |
| 7.6523 | 0.0324 | 334 | 7.6172 | 0.0967 |
| 7.6602 | 0.0325 | 335 | 7.6172 | 0.0968 |
| 7.6328 | 0.0325 | 336 | 7.6133 | 0.0972 |
| 7.6523 | 0.0326 | 337 | 7.6094 | 0.0976 |
| 7.6133 | 0.0327 | 338 | 7.6094 | 0.0981 |
| 7.6367 | 0.0328 | 339 | 7.6055 | 0.0984 |
| 7.6641 | 0.0329 | 340 | 7.6016 | 0.0985 |
| 7.6367 | 0.0330 | 341 | 7.6016 | 0.0985 |
| 7.6133 | 0.0331 | 342 | 7.5977 | 0.0985 |
| 7.6016 | 0.0332 | 343 | 7.5977 | 0.0984 |
| 7.668 | 0.0333 | 344 | 7.5938 | 0.0984 |
| 7.6172 | 0.0334 | 345 | 7.5898 | 0.0984 |
| 7.6016 | 0.0335 | 346 | 7.5898 | 0.0985 |
| 7.6328 | 0.0336 | 347 | 7.5859 | 0.0985 |
| 7.668 | 0.0337 | 348 | 7.5820 | 0.0986 |
| 7.6719 | 0.0338 | 349 | 7.5820 | 0.0987 |
| 7.6602 | 0.0339 | 350 | 7.5781 | 0.0989 |
| 7.6641 | 0.0340 | 351 | 7.5742 | 0.0992 |
| 7.6445 | 0.0341 | 352 | 7.5742 | 0.0994 |
| 7.5781 | 0.0342 | 353 | 7.5703 | 0.0995 |
| 7.6523 | 0.0343 | 354 | 7.5703 | 0.0996 |
| 7.6562 | 0.0344 | 355 | 7.5664 | 0.0996 |
| 7.5977 | 0.0345 | 356 | 7.5664 | 0.0998 |
| 7.5977 | 0.0346 | 357 | 7.5625 | 0.0998 |
| 7.5508 | 0.0347 | 358 | 7.5625 | 0.0997 |
| 7.6172 | 0.0348 | 359 | 7.5586 | 0.0997 |
| 7.5469 | 0.0349 | 360 | 7.5547 | 0.0997 |
| 7.6172 | 0.0350 | 361 | 7.5547 | 0.0997 |
| 7.625 | 0.0351 | 362 | 7.5508 | 0.0998 |
| 7.6289 | 0.0352 | 363 | 7.5508 | 0.0999 |
| 7.5234 | 0.0353 | 364 | 7.5469 | 0.1002 |
| 7.5703 | 0.0354 | 365 | 7.5430 | 0.1006 |
| 7.5859 | 0.0355 | 366 | 7.5430 | 0.1010 |
| 7.5469 | 0.0356 | 367 | 7.5391 | 0.1014 |
| 7.5508 | 0.0356 | 368 | 7.5391 | 0.1016 |
| 7.6172 | 0.0357 | 369 | 7.5352 | 0.1017 |
| 7.6172 | 0.0358 | 370 | 7.5352 | 0.1017 |
| 7.5352 | 0.0359 | 371 | 7.5312 | 0.1018 |
| 7.5859 | 0.0360 | 372 | 7.5312 | 0.1018 |
| 7.5586 | 0.0361 | 373 | 7.5273 | 0.1017 |
| 7.6406 | 0.0362 | 374 | 7.5273 | 0.1017 |
| 7.5273 | 0.0363 | 375 | 7.5234 | 0.1018 |
| 7.5312 | 0.0364 | 376 | 7.5195 | 0.1020 |
| 7.5898 | 0.0365 | 377 | 7.5195 | 0.1023 |
| 7.5898 | 0.0366 | 378 | 7.5156 | 0.1027 |
| 7.543 | 0.0367 | 379 | 7.5156 | 0.1029 |
| 7.5156 | 0.0368 | 380 | 7.5117 | 0.1030 |
| 7.5664 | 0.0369 | 381 | 7.5117 | 0.1031 |
| 7.5625 | 0.0370 | 382 | 7.5078 | 0.1031 |
| 7.5312 | 0.0371 | 383 | 7.5078 | 0.1032 |
| 7.625 | 0.0372 | 384 | 7.5078 | 0.1032 |
| 7.5898 | 0.0373 | 385 | 7.5039 | 0.1034 |
| 7.5625 | 0.0374 | 386 | 7.5 | 0.1035 |
| 7.5664 | 0.0375 | 387 | 7.5 | 0.1037 |
| 7.4609 | 0.0376 | 388 | 7.4961 | 0.1039 |
| 7.5469 | 0.0377 | 389 | 7.4961 | 0.1040 |
| 7.5742 | 0.0378 | 390 | 7.4922 | 0.1040 |
| 7.4375 | 0.0379 | 391 | 7.4922 | 0.1040 |
| 7.4961 | 0.0380 | 392 | 7.4883 | 0.1039 |
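
The Accuracy column is, presumably, token-level next-token accuracy as computed in the Hugging Face causal-LM examples: the card does not name the metric, so this is an assumption. In that scheme, predictions are shifted one position against the labels before comparison. A minimal sketch:

```python
# Sketch of token-level next-token accuracy for a causal LM (assumed metric).
# logits: (batch, seq_len, vocab); labels: (batch, seq_len) token ids.
import torch

def next_token_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    preds = logits.argmax(dim=-1)[:, :-1]  # model's guess for each next token
    targets = labels[:, 1:]                # the tokens that actually follow
    return (preds == targets).float().mean().item()
```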

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.0a0+32f93b1
  • Datasets 2.20.0
  • Tokenizers 0.19.1
