|
2024-06-11 19:24:05,913 - INFO: Calling run.. |
|
2024-06-11 19:24:05,913 - INFO: Problem Type: text_causal_classification_modeling |
|
2024-06-11 19:24:05,913 - INFO: Global random seed: 849738 |
|
2024-06-11 19:24:05,913 - INFO: Preparing the data... |
|
2024-06-11 19:24:05,913 - INFO: Setting up automatic validation split... |
|
2024-06-11 19:24:05,955 - INFO: Preparing train and validation data |
|
2024-06-11 19:24:05,955 - INFO: Loading train dataset... |
|
2024-06-11 19:24:10,161 - INFO: Loading validation dataset... |
|
2024-06-11 19:24:10,266 - INFO: Number of observations in train dataset: 9600 |
|
2024-06-11 19:24:10,267 - INFO: Number of observations in validation dataset: 2400 |
|
2024-06-11 19:24:10,464 - WARNING: EOS token id not matching between config and tokenizer. Overwriting with tokenizer id. |
|
2024-06-11 19:24:10,465 - WARNING: PAD token id not matching between config and tokenizer. Overwriting with tokenizer id. |
|
2024-06-11 19:24:10,469 - INFO: Using bfloat16 for backbone |
|
2024-06-11 19:24:10,469 - INFO: Loading tiiuae/falcon-rw-1b. This may take a while. |
|
2024-06-11 19:24:22,250 - INFO: Loaded tiiuae/falcon-rw-1b. |
|
2024-06-11 19:24:22,253 - WARNING: EOS token id not matching between generation config and tokenizer. Overwriting with tokenizer id. |
|
2024-06-11 19:24:22,253 - WARNING: PAD token id not matching between generation config and tokenizer. Overwriting with tokenizer id. |
|
2024-06-11 19:24:22,253 - INFO: Lora module names: ['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h'] |
|
2024-06-11 19:24:22,374 - INFO: Enough space available for saving model weights.Required space: 2591.11MB, Available space: 995995.53MB. |
|
2024-06-11 19:24:22,379 - INFO: Optimizer AdamW has been provided with parameters {'weight_decay': 0.0, 'eps': 1e-08, 'betas': (0.8999999762, 0.9990000129), 'lr': 0.0001} |
|
2024-06-11 19:24:22,520 - WARNING: No order set for keys: ['answer_column_label', 'num_classes']. |
|
2024-06-11 19:24:22,535 - WARNING: No order set for keys: ['answer_column_label', 'num_classes']. |
|
2024-06-11 19:24:23,373 - INFO: started process: 0, can_track: True, tracking_mode: TrackingMode.DURING_EPOCH |
|
2024-06-11 19:24:23,374 - INFO: Training Epoch: 1 / 1 |
|
2024-06-11 19:24:23,374 - INFO: train loss: 0%| | 0/600 [00:00<?, ?it/s] |
|
2024-06-11 19:24:23,592 - INFO: Evaluation step: 600 |
|
2024-06-11 19:24:26,265 - INFO: train loss: 38.50: 1%|1 | 6/600 [00:02<04:46, 2.08it/s] |
|
2024-06-11 19:24:28,665 - INFO: train loss: 37.30: 2%|2 | 12/600 [00:05<04:15, 2.31it/s] |
|
2024-06-11 19:24:31,130 - INFO: train loss: 16.17: 3%|3 | 18/600 [00:07<04:06, 2.36it/s] |
|
2024-06-11 19:24:33,555 - INFO: train loss: 5.66: 4%|4 | 24/600 [00:10<03:59, 2.41it/s] |
|
2024-06-11 19:24:36,045 - INFO: train loss: 3.87: 5%|5 | 30/600 [00:12<03:56, 2.41it/s] |
|
2024-06-11 19:24:38,714 - INFO: train loss: 2.41: 6%|6 | 36/600 [00:15<03:59, 2.35it/s] |
|
2024-06-11 19:24:41,306 - INFO: train loss: 2.09: 7%|7 | 42/600 [00:17<03:58, 2.34it/s] |
|
2024-06-11 19:24:43,832 - INFO: train loss: 2.23: 8%|8 | 48/600 [00:20<03:54, 2.35it/s] |
|
2024-06-11 19:24:46,208 - INFO: train loss: 2.14: 9%|9 | 54/600 [00:22<03:47, 2.40it/s] |
|
2024-06-11 19:24:48,667 - INFO: train loss: 2.26: 10%|# | 60/600 [00:25<03:43, 2.41it/s] |
|
2024-06-11 19:24:51,185 - INFO: train loss: 1.91: 11%|#1 | 66/600 [00:27<03:42, 2.40it/s] |
|
2024-06-11 19:24:53,545 - INFO: train loss: 1.73: 12%|#2 | 72/600 [00:30<03:35, 2.44it/s] |
|
2024-06-11 19:24:55,938 - INFO: train loss: 1.85: 13%|#3 | 78/600 [00:32<03:31, 2.46it/s] |
|
2024-06-11 19:24:58,438 - INFO: train loss: 1.98: 14%|#4 | 84/600 [00:35<03:31, 2.44it/s] |
|
2024-06-11 19:25:00,935 - INFO: train loss: 1.92: 15%|#5 | 90/600 [00:37<03:29, 2.43it/s] |
|
2024-06-11 19:25:03,532 - INFO: train loss: 1.83: 16%|#6 | 96/600 [00:40<03:30, 2.39it/s] |
|
2024-06-11 19:25:06,109 - INFO: train loss: 1.60: 17%|#7 | 102/600 [00:42<03:29, 2.37it/s] |
|
2024-06-11 19:25:08,584 - INFO: train loss: 1.85: 18%|#8 | 108/600 [00:45<03:25, 2.39it/s] |
|
2024-06-11 19:25:11,083 - INFO: train loss: 2.02: 19%|#9 | 114/600 [00:47<03:23, 2.39it/s] |
|
2024-06-11 19:25:13,366 - INFO: train loss: 1.99: 20%|## | 120/600 [00:49<03:15, 2.46it/s] |
|
2024-06-11 19:25:15,909 - INFO: train loss: 1.88: 21%|##1 | 126/600 [00:52<03:15, 2.43it/s] |
|
2024-06-11 19:25:18,314 - INFO: train loss: 1.95: 22%|##2 | 132/600 [00:54<03:11, 2.45it/s] |
|
2024-06-11 19:25:20,805 - INFO: train loss: 1.82: 23%|##3 | 138/600 [00:57<03:09, 2.44it/s] |
|
2024-06-11 19:25:23,180 - INFO: train loss: 1.80: 24%|##4 | 144/600 [00:59<03:05, 2.46it/s] |
|
2024-06-11 19:25:25,788 - INFO: train loss: 1.89: 25%|##5 | 150/600 [01:02<03:06, 2.41it/s] |
|
2024-06-11 19:25:28,422 - INFO: train loss: 2.07: 26%|##6 | 156/600 [01:05<03:07, 2.37it/s] |
|
2024-06-11 19:25:30,947 - INFO: train loss: 2.02: 27%|##7 | 162/600 [01:07<03:04, 2.37it/s] |
|
2024-06-11 19:25:33,560 - INFO: train loss: 1.64: 28%|##8 | 168/600 [01:10<03:03, 2.35it/s] |
|
2024-06-11 19:25:35,986 - INFO: train loss: 1.88: 29%|##9 | 174/600 [01:12<02:58, 2.38it/s] |
|
2024-06-11 19:25:38,424 - INFO: train loss: 2.17: 30%|### | 180/600 [01:15<02:54, 2.41it/s] |
|
2024-06-11 19:25:40,841 - INFO: train loss: 1.92: 31%|###1 | 186/600 [01:17<02:50, 2.43it/s] |
|
2024-06-11 19:25:43,479 - INFO: train loss: 1.89: 32%|###2 | 192/600 [01:20<02:51, 2.38it/s] |
|
2024-06-11 19:25:46,108 - INFO: train loss: 1.71: 33%|###3 | 198/600 [01:22<02:51, 2.35it/s] |
|
2024-06-11 19:25:48,485 - INFO: train loss: 1.76: 34%|###4 | 204/600 [01:25<02:45, 2.40it/s] |
|
2024-06-11 19:25:50,947 - INFO: train loss: 1.93: 35%|###5 | 210/600 [01:27<02:41, 2.41it/s] |
|
2024-06-11 19:25:53,448 - INFO: train loss: 1.66: 36%|###6 | 216/600 [01:30<02:39, 2.41it/s] |
|
2024-06-11 19:25:55,958 - INFO: train loss: 1.70: 37%|###7 | 222/600 [01:32<02:37, 2.40it/s] |
|
2024-06-11 19:25:58,603 - INFO: train loss: 1.68: 38%|###8 | 228/600 [01:35<02:37, 2.36it/s] |
|
2024-06-11 19:26:01,255 - INFO: train loss: 1.67: 39%|###9 | 234/600 [01:37<02:37, 2.33it/s] |
|
2024-06-11 19:26:03,885 - INFO: train loss: 1.69: 40%|#### | 240/600 [01:40<02:35, 2.32it/s] |
|
2024-06-11 19:26:06,564 - INFO: train loss: 1.78: 41%|####1 | 246/600 [01:43<02:34, 2.29it/s] |
|
2024-06-11 19:26:09,095 - INFO: train loss: 1.72: 42%|####2 | 252/600 [01:45<02:30, 2.32it/s] |
|
2024-06-11 19:26:11,385 - INFO: train loss: 1.62: 43%|####3 | 258/600 [01:48<02:22, 2.40it/s] |
|
2024-06-11 19:26:13,870 - INFO: train loss: 1.52: 44%|####4 | 264/600 [01:50<02:19, 2.40it/s] |
|
2024-06-11 19:26:16,372 - INFO: train loss: 1.69: 45%|####5 | 270/600 [01:52<02:17, 2.40it/s] |
|
2024-06-11 19:26:18,692 - INFO: train loss: 1.82: 46%|####6 | 276/600 [01:55<02:12, 2.45it/s] |
|
2024-06-11 19:26:21,147 - INFO: train loss: 1.96: 47%|####6 | 282/600 [01:57<02:09, 2.45it/s] |
|
2024-06-11 19:26:23,681 - INFO: train loss: 1.83: 48%|####8 | 288/600 [02:00<02:08, 2.43it/s] |
|
2024-06-11 19:26:34,896 - INFO: train loss: 1.63: 49%|####9 | 294/600 [02:11<04:19, 1.18it/s] |
|
2024-06-11 19:26:37,499 - INFO: train loss: 1.47: 50%|##### | 300/600 [02:14<03:37, 1.38it/s] |
|
2024-06-11 19:26:40,069 - INFO: train loss: 1.69: 51%|#####1 | 306/600 [02:16<03:06, 1.57it/s] |
|
2024-06-11 19:26:42,587 - INFO: train loss: 1.60: 52%|#####2 | 312/600 [02:19<02:44, 1.75it/s] |
|
2024-06-11 19:26:44,978 - INFO: train loss: 1.44: 53%|#####3 | 318/600 [02:21<02:26, 1.93it/s] |
|
2024-06-11 19:26:47,347 - INFO: train loss: 1.38: 54%|#####4 | 324/600 [02:23<02:13, 2.08it/s] |
|
2024-06-11 19:26:49,772 - INFO: train loss: 1.22: 55%|#####5 | 330/600 [02:26<02:03, 2.18it/s] |
|
2024-06-11 19:26:52,193 - INFO: train loss: 1.34: 56%|#####6 | 336/600 [02:28<01:56, 2.26it/s] |
|
2024-06-11 19:26:54,866 - INFO: train loss: 1.70: 57%|#####6 | 342/600 [02:31<01:54, 2.26it/s] |
|
2024-06-11 19:26:57,352 - INFO: train loss: 1.63: 58%|#####8 | 348/600 [02:33<01:49, 2.30it/s] |
|
2024-06-11 19:26:59,884 - INFO: train loss: 1.51: 59%|#####8 | 354/600 [02:36<01:45, 2.32it/s] |
|
2024-06-11 19:27:02,254 - INFO: train loss: 1.32: 60%|###### | 360/600 [02:38<01:40, 2.38it/s] |
|
2024-06-11 19:27:04,630 - INFO: train loss: 1.43: 61%|######1 | 366/600 [02:41<01:36, 2.42it/s] |
|
2024-06-11 19:27:06,991 - INFO: train loss: 1.48: 62%|######2 | 372/600 [02:43<01:32, 2.46it/s] |
|
2024-06-11 19:27:09,474 - INFO: train loss: 1.45: 63%|######3 | 378/600 [02:46<01:30, 2.44it/s] |
|
2024-06-11 19:27:11,952 - INFO: train loss: 1.31: 64%|######4 | 384/600 [02:48<01:28, 2.44it/s] |
|
2024-06-11 19:27:14,439 - INFO: train loss: 1.36: 65%|######5 | 390/600 [02:51<01:26, 2.43it/s] |
|
2024-06-11 19:27:16,955 - INFO: train loss: 1.27: 66%|######6 | 396/600 [02:53<01:24, 2.42it/s] |
|
2024-06-11 19:27:19,389 - INFO: train loss: 1.36: 67%|######7 | 402/600 [02:56<01:21, 2.43it/s] |
|
2024-06-11 19:27:21,801 - INFO: train loss: 1.63: 68%|######8 | 408/600 [02:58<01:18, 2.45it/s] |
|
2024-06-11 19:27:24,365 - INFO: train loss: 1.45: 69%|######9 | 414/600 [03:00<01:17, 2.41it/s] |
|
2024-06-11 19:27:26,983 - INFO: train loss: 1.14: 70%|####### | 420/600 [03:03<01:15, 2.38it/s] |
|
2024-06-11 19:27:29,629 - INFO: train loss: 1.08: 71%|#######1 | 426/600 [03:06<01:14, 2.34it/s] |
|
2024-06-11 19:27:31,995 - INFO: train loss: 1.22: 72%|#######2 | 432/600 [03:08<01:10, 2.40it/s] |
|
2024-06-11 19:27:34,489 - INFO: train loss: 1.17: 73%|#######3 | 438/600 [03:11<01:07, 2.40it/s] |
|
2024-06-11 19:27:36,917 - INFO: train loss: 1.21: 74%|#######4 | 444/600 [03:13<01:04, 2.42it/s] |
|
2024-06-11 19:27:39,322 - INFO: train loss: 1.28: 75%|#######5 | 450/600 [03:15<01:01, 2.44it/s] |
|
2024-06-11 19:27:41,904 - INFO: train loss: 1.20: 76%|#######6 | 456/600 [03:18<00:59, 2.41it/s] |
|
2024-06-11 19:27:44,618 - INFO: train loss: 1.17: 77%|#######7 | 462/600 [03:21<00:58, 2.34it/s] |
|
2024-06-11 19:27:47,008 - INFO: train loss: 1.16: 78%|#######8 | 468/600 [03:23<00:55, 2.39it/s] |
|
2024-06-11 19:27:49,405 - INFO: train loss: 1.16: 79%|#######9 | 474/600 [03:26<00:51, 2.42it/s] |
|
2024-06-11 19:27:51,760 - INFO: train loss: 1.09: 80%|######## | 480/600 [03:28<00:48, 2.46it/s] |
|
2024-06-11 19:27:54,239 - INFO: train loss: 1.19: 81%|########1 | 486/600 [03:30<00:46, 2.45it/s] |
|
2024-06-11 19:27:56,731 - INFO: train loss: 1.25: 82%|########2 | 492/600 [03:33<00:44, 2.44it/s] |
|
2024-06-11 19:27:59,481 - INFO: train loss: 1.17: 83%|########2 | 498/600 [03:36<00:43, 2.35it/s] |
|
2024-06-11 19:28:02,025 - INFO: train loss: 1.21: 84%|########4 | 504/600 [03:38<00:40, 2.35it/s] |
|
2024-06-11 19:28:04,499 - INFO: train loss: 1.11: 85%|########5 | 510/600 [03:41<00:37, 2.38it/s] |
|
2024-06-11 19:28:07,055 - INFO: train loss: 1.09: 86%|########6 | 516/600 [03:43<00:35, 2.37it/s] |
|
2024-06-11 19:28:09,588 - INFO: train loss: 0.99: 87%|########7 | 522/600 [03:46<00:32, 2.37it/s] |
|
2024-06-11 19:28:12,009 - INFO: train loss: 1.08: 88%|########8 | 528/600 [03:48<00:30, 2.40it/s] |
|
2024-06-11 19:28:14,559 - INFO: train loss: 1.10: 89%|########9 | 534/600 [03:51<00:27, 2.39it/s] |
|
2024-06-11 19:28:17,044 - INFO: train loss: 1.03: 90%|######### | 540/600 [03:53<00:25, 2.39it/s] |
|
2024-06-11 19:28:19,569 - INFO: train loss: 1.06: 91%|#########1| 546/600 [03:56<00:22, 2.39it/s] |
|
2024-06-11 19:28:22,110 - INFO: train loss: 1.00: 92%|#########2| 552/600 [03:58<00:20, 2.38it/s] |
|
2024-06-11 19:28:24,605 - INFO: train loss: 0.95: 93%|#########3| 558/600 [04:01<00:17, 2.39it/s] |
|
2024-06-11 19:28:26,921 - INFO: train loss: 1.02: 94%|#########3| 564/600 [04:03<00:14, 2.45it/s] |
|
2024-06-11 19:28:29,323 - INFO: train loss: 1.06: 95%|#########5| 570/600 [04:05<00:12, 2.46it/s] |
|
2024-06-11 19:28:31,750 - INFO: train loss: 1.11: 96%|#########6| 576/600 [04:08<00:09, 2.46it/s] |
|
2024-06-11 19:28:34,275 - INFO: train loss: 1.13: 97%|#########7| 582/600 [04:10<00:07, 2.44it/s] |
|
2024-06-11 19:28:36,803 - INFO: train loss: 1.15: 98%|#########8| 588/600 [04:13<00:04, 2.42it/s] |
|
2024-06-11 19:28:39,449 - INFO: train loss: 1.01: 99%|#########9| 594/600 [04:16<00:02, 2.37it/s] |
|
2024-06-11 19:28:42,270 - INFO: train loss: 0.92: 100%|##########| 600/600 [04:18<00:00, 2.29it/s] |
|
2024-06-11 19:28:42,270 - INFO: train loss: 0.92: 100%|##########| 600/600 [04:18<00:00, 2.32it/s] |
|
2024-06-11 19:28:42,270 - INFO: Saving last model checkpoint to /app/output |
|
2024-06-11 19:28:42,270 - INFO: Saving checkpoint.. |
|
2024-06-11 19:28:44,868 - INFO: Starting validation inference |
|
2024-06-11 19:28:44,868 - INFO: validation progress: 0%| | 0/150 [00:00<?, ?it/s] |
|
2024-06-11 19:28:45,811 - INFO: validation progress: 5%|4 | 7/150 [00:00<00:19, 7.42it/s] |
|
2024-06-11 19:28:46,542 - INFO: validation progress: 9%|9 | 14/150 [00:01<00:15, 8.55it/s] |
|
2024-06-11 19:28:47,310 - INFO: validation progress: 14%|#4 | 21/150 [00:02<00:14, 8.80it/s] |
|
2024-06-11 19:28:48,099 - INFO: validation progress: 19%|#8 | 28/150 [00:03<00:13, 8.83it/s] |
|
2024-06-11 19:28:48,870 - INFO: validation progress: 23%|##3 | 35/150 [00:04<00:12, 8.92it/s] |
|
2024-06-11 19:28:49,672 - INFO: validation progress: 28%|##8 | 42/150 [00:04<00:12, 8.85it/s] |
|
2024-06-11 19:28:50,424 - INFO: validation progress: 33%|###2 | 49/150 [00:05<00:11, 9.00it/s] |
|
2024-06-11 19:28:51,218 - INFO: validation progress: 37%|###7 | 56/150 [00:06<00:10, 8.94it/s] |
|
2024-06-11 19:28:51,997 - INFO: validation progress: 42%|####2 | 63/150 [00:07<00:09, 8.95it/s] |
|
2024-06-11 19:28:52,814 - INFO: validation progress: 47%|####6 | 70/150 [00:07<00:09, 8.83it/s] |
|
2024-06-11 19:28:53,629 - INFO: validation progress: 51%|#####1 | 77/150 [00:08<00:08, 8.76it/s] |
|
2024-06-11 19:28:54,398 - INFO: validation progress: 56%|#####6 | 84/150 [00:09<00:07, 8.86it/s] |
|
2024-06-11 19:28:55,216 - INFO: validation progress: 61%|###### | 91/150 [00:10<00:06, 8.76it/s] |
|
2024-06-11 19:28:55,965 - INFO: validation progress: 65%|######5 | 98/150 [00:11<00:05, 8.93it/s] |
|
2024-06-11 19:28:56,706 - INFO: validation progress: 70%|####### | 105/150 [00:11<00:04, 9.08it/s] |
|
2024-06-11 19:28:57,509 - INFO: validation progress: 75%|#######4 | 112/150 [00:12<00:04, 8.97it/s] |
|
2024-06-11 19:28:58,294 - INFO: validation progress: 79%|#######9 | 119/150 [00:13<00:03, 8.95it/s] |
|
2024-06-11 19:28:59,089 - INFO: validation progress: 84%|########4 | 126/150 [00:14<00:02, 8.91it/s] |
|
2024-06-11 19:28:59,901 - INFO: validation progress: 89%|########8 | 133/150 [00:15<00:01, 8.82it/s] |
|
2024-06-11 19:29:00,664 - INFO: validation progress: 93%|#########3| 140/150 [00:15<00:01, 8.92it/s] |
|
2024-06-11 19:29:01,481 - INFO: validation progress: 98%|#########8| 147/150 [00:16<00:00, 8.81it/s] |
|
2024-06-11 19:29:01,816 - INFO: validation progress: 100%|##########| 150/150 [00:16<00:00, 8.85it/s] |
|
2024-06-11 19:29:01,876 - INFO: Validation AUC: 0.54705 |
|
2024-06-11 19:29:01,876 - INFO: Mean validation loss: 0.99504 |
|
2024-06-11 19:29:02,796 - WARNING: No order set for keys: ['answer_column_label', 'num_classes']. |
|
|