2024-07-17 06:14:19,185 - INFO: Calling run..
2024-07-17 06:14:19,186 - INFO: Environment configuration: ConfigNLPCausalClassificationEnvironment(gpus=['0'], mixed_precision=False, compile_model=False, use_deepspeed=False, deepspeed_reduce_bucket_size=10000000.0, deepspeed_stage3_prefetch_bucket_size=10000000.0, deepspeed_stage3_param_persistence_threshold=10000000.0, deepspeed_offload_optimizer=False, deepspeed_stage3_max_live_parameters=10000000.0, deepspeed_stage3_max_reuse_distance=10000000.0, find_unused_parameters=False, trust_remote_code=False, huggingface_branch='main', number_of_workers=8, seed=-1, _seed=0, _distributed=False, _distributed_inference=True, _local_rank=0, _world_size=1, _curr_step=0, _curr_val_step=0, _rank=0, _device='cuda', _cpu_comm=None, _model_card_template='text_causal_classification_model_card_template.md', _summary_card_template='text_causal_classification_experiment_summary_card_template.md')
2024-07-17 06:14:19,186 - INFO: cfg.environment._distributed set to False
2024-07-17 06:14:19,186 - INFO: Problem Type: text_causal_classification_modeling
2024-07-17 06:14:19,186 - INFO: Global random seed: 419783
2024-07-17 06:14:19,186 - INFO: Preparing the data...
2024-07-17 06:14:19,186 - INFO: Setting up automatic validation split...
2024-07-17 06:14:19,192 - INFO: The dataframe has following columns: Index(['Description', 'category', 'sub_category', 'label'], dtype='object')
2024-07-17 06:14:19,195 - INFO: Preparing train and validation data, dataset config to be used: ConfigNLPCausalClassificationDataset(dataset_class=<class 'llm_studio.src.datasets.text_causal_classification_ds.CustomDataset'>, personalize=False, chatbot_name='OI_AI', chatbot_author='openinnovation.ai', train_dataframe='/app/train_df.csv', validation_strategy='automatic', validation_dataframe='/app/validation_df.csv', validation_size=0.0099999998, data_sample=1.0, data_sample_choice=('Train', 'Validation'), system_column='None', prompt_column=(), answer_column='category', parent_id_column='None', text_system_start='', text_prompt_start='', text_answer_separator='', limit_chained_samples=False, add_eos_token_to_system=False, add_eos_token_to_prompt=False, add_eos_token_to_answer=False, mask_prompt_labels=True, _allowed_file_extensions=('csv', 'pq', 'parquet'), num_classes=2)
2024-07-17 06:14:19,195 - INFO: Loading train dataset...
2024-07-17 06:14:19,195 - INFO: Columns found: Index(['Description', 'category', 'sub_category', 'label'], dtype='object')
2024-07-17 06:14:20,210 - INFO: Loading validation dataset...
2024-07-17 06:14:20,791 - INFO: Number of observations in train dataset: 494
2024-07-17 06:14:20,791 - INFO: Number of observations in validation dataset: 5
2024-07-17 06:14:21,246 - WARNING: PAD token id not matching between config and tokenizer. Overwriting with tokenizer id.
2024-07-17 06:14:21,246 - INFO: Setting pretraining_tp of model config to 1.
2024-07-17 06:14:21,251 - INFO: Using int4 for backbone
2024-07-17 06:14:21,251 - INFO: Loading TinyLlama/TinyLlama_v1.1. This may take a while.
2024-07-17 06:14:35,909 - INFO: Loaded TinyLlama/TinyLlama_v1.1.
2024-07-17 06:14:35,916 - INFO: Lora module names: ['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj']
2024-07-17 06:14:36,191 - INFO: Enough space available for saving model weights. Required space: 1003.87MB, Available space: 993953.90MB.
2024-07-17 06:14:36,200 - INFO: Optimizer AdamW has been provided with parameters {'weight_decay': 0.0, 'eps': 1e-08, 'betas': (0.8999999762, 0.9990000129), 'lr': 0.0001}
2024-07-17 06:14:36,637 - INFO: started process: 0, can_track: True, tracking_mode: TrackingMode.AFTER_EPOCH
2024-07-17 06:14:36,638 - INFO: Training Epoch: 1 / 1
2024-07-17 06:14:36,638 - INFO: train loss: 0%| | 0/247 [00:00<?, ?it/s]
2024-07-17 06:14:36,787 - INFO: Evaluation step: 247
2024-07-17 06:14:40,478 - INFO: train loss: 0.69: 1%| | 2/247 [00:03<07:50, 1.92s/it]
2024-07-17 06:14:44,085 - INFO: train loss: 0.69: 2%|1 | 4/247 [00:07<07:29, 1.85s/it]
2024-07-17 06:14:47,702 - INFO: train loss: 0.69: 2%|2 | 6/247 [00:11<07:21, 1.83s/it]
2024-07-17 06:14:51,319 - INFO: train loss: 0.69: 3%|3 | 8/247 [00:14<07:15, 1.82s/it]
2024-07-17 06:14:54,943 - INFO: train loss: 0.69: 4%|4 | 10/247 [00:18<07:11, 1.82s/it]
2024-07-17 06:14:58,568 - INFO: train loss: 0.69: 5%|4 | 12/247 [00:21<07:06, 1.82s/it]
2024-07-17 06:15:02,198 - INFO: train loss: 0.69: 6%|5 | 14/247 [00:25<07:03, 1.82s/it]
2024-07-17 06:15:05,836 - INFO: train loss: 0.69: 6%|6 | 16/247 [00:29<06:59, 1.82s/it]
2024-07-17 06:15:09,477 - INFO: train loss: 0.69: 7%|7 | 18/247 [00:32<06:56, 1.82s/it]
2024-07-17 06:15:13,118 - INFO: train loss: 0.69: 8%|8 | 20/247 [00:36<06:52, 1.82s/it]
2024-07-17 06:15:16,760 - INFO: train loss: 0.69: 9%|8 | 22/247 [00:40<06:49, 1.82s/it]
2024-07-17 06:15:20,405 - INFO: train loss: 0.69: 10%|9 | 24/247 [00:43<06:45, 1.82s/it]
2024-07-17 06:15:24,053 - INFO: train loss: 0.69: 11%|# | 26/247 [00:47<06:42, 1.82s/it]
2024-07-17 06:15:27,704 - INFO: train loss: 0.69: 11%|#1 | 28/247 [00:51<06:39, 1.82s/it]
2024-07-17 06:15:31,359 - INFO: train loss: 0.69: 12%|#2 | 30/247 [00:54<06:35, 1.82s/it]
2024-07-17 06:15:35,012 - INFO: train loss: 0.69: 13%|#2 | 32/247 [00:58<06:32, 1.82s/it]
2024-07-17 06:15:38,664 - INFO: train loss: 0.69: 14%|#3 | 34/247 [01:02<06:28, 1.83s/it]
2024-07-17 06:15:42,324 - INFO: train loss: 0.69: 15%|#4 | 36/247 [01:05<06:25, 1.83s/it]
2024-07-17 06:15:45,991 - INFO: train loss: 0.69: 15%|#5 | 38/247 [01:09<06:22, 1.83s/it]
2024-07-17 06:15:49,653 - INFO: train loss: 0.69: 16%|#6 | 40/247 [01:13<06:18, 1.83s/it]
2024-07-17 06:15:53,319 - INFO: train loss: 0.69: 17%|#7 | 42/247 [01:16<06:15, 1.83s/it]
2024-07-17 06:15:56,988 - INFO: train loss: 0.69: 18%|#7 | 44/247 [01:20<06:11, 1.83s/it]
2024-07-17 06:16:00,655 - INFO: train loss: 0.69: 19%|#8 | 46/247 [01:24<06:08, 1.83s/it]
2024-07-17 06:16:04,324 - INFO: train loss: 0.69: 19%|#9 | 48/247 [01:27<06:04, 1.83s/it]
2024-07-17 06:16:07,995 - INFO: train loss: 0.69: 20%|## | 50/247 [01:31<06:01, 1.83s/it]
2024-07-17 06:16:11,667 - INFO: train loss: 0.69: 21%|##1 | 52/247 [01:35<05:57, 1.83s/it]
2024-07-17 06:16:15,344 - INFO: train loss: 0.69: 22%|##1 | 54/247 [01:38<05:54, 1.84s/it]
2024-07-17 06:16:19,020 - INFO: train loss: 0.69: 23%|##2 | 56/247 [01:42<05:50, 1.84s/it]
2024-07-17 06:16:22,696 - INFO: train loss: 0.69: 23%|##3 | 58/247 [01:46<05:47, 1.84s/it]
2024-07-17 06:16:26,371 - INFO: train loss: 0.69: 24%|##4 | 60/247 [01:49<05:43, 1.84s/it]
2024-07-17 06:16:30,047 - INFO: train loss: 0.69: 25%|##5 | 62/247 [01:53<05:39, 1.84s/it]
2024-07-17 06:16:33,723 - INFO: train loss: 0.69: 26%|##5 | 64/247 [01:57<05:36, 1.84s/it]
2024-07-17 06:16:37,401 - INFO: train loss: 0.69: 27%|##6 | 66/247 [02:00<05:32, 1.84s/it]
2024-07-17 06:16:41,082 - INFO: train loss: 0.69: 28%|##7 | 68/247 [02:04<05:29, 1.84s/it]
2024-07-17 06:16:44,762 - INFO: train loss: 0.69: 28%|##8 | 70/247 [02:08<05:25, 1.84s/it]
2024-07-17 06:16:48,448 - INFO: train loss: 0.69: 29%|##9 | 72/247 [02:11<05:22, 1.84s/it]
2024-07-17 06:16:52,124 - INFO: train loss: 0.69: 30%|##9 | 74/247 [02:15<05:18, 1.84s/it]
2024-07-17 06:16:55,803 - INFO: train loss: 0.69: 31%|### | 76/247 [02:19<05:14, 1.84s/it]
2024-07-17 06:16:59,486 - INFO: train loss: 0.69: 32%|###1 | 78/247 [02:22<05:10, 1.84s/it]
2024-07-17 06:17:03,165 - INFO: train loss: 0.69: 32%|###2 | 80/247 [02:26<05:07, 1.84s/it]
2024-07-17 06:17:06,841 - INFO: train loss: 0.69: 33%|###3 | 82/247 [02:30<05:03, 1.84s/it]
2024-07-17 06:17:10,530 - INFO: train loss: 0.69: 34%|###4 | 84/247 [02:33<05:00, 1.84s/it]
2024-07-17 06:17:14,215 - INFO: train loss: 0.69: 35%|###4 | 86/247 [02:37<04:56, 1.84s/it]
2024-07-17 06:17:17,898 - INFO: train loss: 0.69: 36%|###5 | 88/247 [02:41<04:52, 1.84s/it]
2024-07-17 06:17:21,582 - INFO: train loss: 0.69: 36%|###6 | 90/247 [02:44<04:49, 1.84s/it]
2024-07-17 06:17:25,270 - INFO: train loss: 0.69: 37%|###7 | 92/247 [02:48<04:45, 1.84s/it]
2024-07-17 06:17:28,955 - INFO: train loss: 0.69: 38%|###8 | 94/247 [02:52<04:41, 1.84s/it]
2024-07-17 06:17:32,638 - INFO: train loss: 0.69: 39%|###8 | 96/247 [02:55<04:38, 1.84s/it]
2024-07-17 06:17:36,324 - INFO: train loss: 0.69: 40%|###9 | 98/247 [02:59<04:34, 1.84s/it]
2024-07-17 06:17:40,011 - INFO: train loss: 0.69: 40%|#### | 100/247 [03:03<04:30, 1.84s/it]
2024-07-17 06:17:43,698 - INFO: train loss: 0.69: 41%|####1 | 102/247 [03:07<04:27, 1.84s/it]
2024-07-17 06:17:47,382 - INFO: train loss: 0.69: 42%|####2 | 104/247 [03:10<04:23, 1.84s/it]
2024-07-17 06:17:51,066 - INFO: train loss: 0.69: 43%|####2 | 106/247 [03:14<04:19, 1.84s/it]
2024-07-17 06:17:54,758 - INFO: train loss: 0.69: 44%|####3 | 108/247 [03:18<04:16, 1.84s/it]
2024-07-17 06:17:58,443 - INFO: train loss: 0.69: 45%|####4 | 110/247 [03:21<04:12, 1.84s/it]
2024-07-17 06:18:02,129 - INFO: train loss: 0.69: 45%|####5 | 112/247 [03:25<04:08, 1.84s/it]
2024-07-17 06:18:05,813 - INFO: train loss: 0.69: 46%|####6 | 114/247 [03:29<04:05, 1.84s/it]
2024-07-17 06:18:09,498 - INFO: train loss: 0.69: 47%|####6 | 116/247 [03:32<04:01, 1.84s/it]
2024-07-17 06:18:13,182 - INFO: train loss: 0.69: 48%|####7 | 118/247 [03:36<03:57, 1.84s/it]
2024-07-17 06:18:16,869 - INFO: train loss: 0.69: 49%|####8 | 120/247 [03:40<03:54, 1.84s/it]
2024-07-17 06:18:20,560 - INFO: train loss: 0.69: 49%|####9 | 122/247 [03:43<03:50, 1.84s/it]
2024-07-17 06:18:24,248 - INFO: train loss: 0.69: 50%|##### | 124/247 [03:47<03:46, 1.84s/it]
2024-07-17 06:18:27,937 - INFO: train loss: 0.69: 51%|#####1 | 126/247 [03:51<03:43, 1.84s/it]
2024-07-17 06:18:31,621 - INFO: train loss: 0.69: 52%|#####1 | 128/247 [03:54<03:39, 1.84s/it]
2024-07-17 06:18:35,310 - INFO: train loss: 0.69: 53%|#####2 | 130/247 [03:58<03:35, 1.84s/it]
2024-07-17 06:18:38,999 - INFO: train loss: 0.69: 53%|#####3 | 132/247 [04:02<03:32, 1.84s/it]
2024-07-17 06:18:42,685 - INFO: train loss: 0.69: 54%|#####4 | 134/247 [04:06<03:28, 1.84s/it]
2024-07-17 06:18:46,373 - INFO: train loss: 0.69: 55%|#####5 | 136/247 [04:09<03:24, 1.84s/it]
2024-07-17 06:18:50,063 - INFO: train loss: 0.69: 56%|#####5 | 138/247 [04:13<03:21, 1.84s/it]
2024-07-17 06:18:53,753 - INFO: train loss: 0.69: 57%|#####6 | 140/247 [04:17<03:17, 1.84s/it]
2024-07-17 06:18:57,439 - INFO: train loss: 0.69: 57%|#####7 | 142/247 [04:20<03:13, 1.84s/it]
2024-07-17 06:19:01,127 - INFO: train loss: 0.69: 58%|#####8 | 144/247 [04:24<03:09, 1.84s/it]
2024-07-17 06:19:04,818 - INFO: train loss: 0.69: 59%|#####9 | 146/247 [04:28<03:06, 1.84s/it]
2024-07-17 06:19:08,506 - INFO: train loss: 0.69: 60%|#####9 | 148/247 [04:31<03:02, 1.84s/it]
2024-07-17 06:19:12,194 - INFO: train loss: 0.69: 61%|###### | 150/247 [04:35<02:58, 1.84s/it]
2024-07-17 06:19:15,881 - INFO: train loss: 0.69: 62%|######1 | 152/247 [04:39<02:55, 1.84s/it]
2024-07-17 06:19:19,570 - INFO: train loss: 0.69: 62%|######2 | 154/247 [04:42<02:51, 1.84s/it]
2024-07-17 06:19:23,257 - INFO: train loss: 0.69: 63%|######3 | 156/247 [04:46<02:47, 1.84s/it]
2024-07-17 06:19:26,952 - INFO: train loss: 0.69: 64%|######3 | 158/247 [04:50<02:44, 1.84s/it]
2024-07-17 06:19:30,641 - INFO: train loss: 0.69: 65%|######4 | 160/247 [04:54<02:40, 1.84s/it]
2024-07-17 06:19:34,331 - INFO: train loss: 0.69: 66%|######5 | 162/247 [04:57<02:36, 1.84s/it]
2024-07-17 06:19:38,020 - INFO: train loss: 0.69: 66%|######6 | 164/247 [05:01<02:33, 1.84s/it]
2024-07-17 06:19:41,710 - INFO: train loss: 0.69: 67%|######7 | 166/247 [05:05<02:29, 1.84s/it]
2024-07-17 06:19:45,399 - INFO: train loss: 0.69: 68%|######8 | 168/247 [05:08<02:25, 1.84s/it]
2024-07-17 06:19:49,092 - INFO: train loss: 0.69: 69%|######8 | 170/247 [05:12<02:22, 1.85s/it]
2024-07-17 06:19:52,778 - INFO: train loss: 0.69: 70%|######9 | 172/247 [05:16<02:18, 1.84s/it]
2024-07-17 06:19:56,471 - INFO: train loss: 0.69: 70%|####### | 174/247 [05:19<02:14, 1.85s/it]
2024-07-17 06:20:00,161 - INFO: train loss: 0.69: 71%|#######1 | 176/247 [05:23<02:11, 1.85s/it]
2024-07-17 06:20:03,851 - INFO: train loss: 0.69: 72%|#######2 | 178/247 [05:27<02:07, 1.85s/it]
2024-07-17 06:20:07,541 - INFO: train loss: 0.69: 73%|#######2 | 180/247 [05:30<02:03, 1.85s/it]
2024-07-17 06:20:11,231 - INFO: train loss: 0.69: 74%|#######3 | 182/247 [05:34<01:59, 1.85s/it]
2024-07-17 06:20:14,920 - INFO: train loss: 0.69: 74%|#######4 | 184/247 [05:38<01:56, 1.84s/it]
2024-07-17 06:20:18,607 - INFO: train loss: 0.69: 75%|#######5 | 186/247 [05:41<01:52, 1.84s/it]
2024-07-17 06:20:22,292 - INFO: train loss: 0.69: 76%|#######6 | 188/247 [05:45<01:48, 1.84s/it]
2024-07-17 06:20:25,982 - INFO: train loss: 0.69: 77%|#######6 | 190/247 [05:49<01:45, 1.84s/it]
2024-07-17 06:20:29,678 - INFO: train loss: 0.69: 78%|#######7 | 192/247 [05:53<01:41, 1.85s/it]
2024-07-17 06:20:33,371 - INFO: train loss: 0.69: 79%|#######8 | 194/247 [05:56<01:37, 1.85s/it]
2024-07-17 06:20:37,064 - INFO: train loss: 0.69: 79%|#######9 | 196/247 [06:00<01:34, 1.85s/it]
2024-07-17 06:20:40,756 - INFO: train loss: 0.69: 80%|######## | 198/247 [06:04<01:30, 1.85s/it]
2024-07-17 06:20:44,444 - INFO: train loss: 0.69: 81%|######## | 200/247 [06:07<01:26, 1.85s/it]
2024-07-17 06:20:48,128 - INFO: train loss: 0.69: 82%|########1 | 202/247 [06:11<01:22, 1.84s/it]
2024-07-17 06:20:51,813 - INFO: train loss: 0.69: 83%|########2 | 204/247 [06:15<01:19, 1.84s/it]
2024-07-17 06:20:55,505 - INFO: train loss: 0.69: 83%|########3 | 206/247 [06:18<01:15, 1.84s/it]
2024-07-17 06:20:59,200 - INFO: train loss: 0.69: 84%|########4 | 208/247 [06:22<01:11, 1.85s/it]
2024-07-17 06:21:02,890 - INFO: train loss: 0.69: 85%|########5 | 210/247 [06:26<01:08, 1.85s/it]
2024-07-17 06:21:06,580 - INFO: train loss: 0.69: 86%|########5 | 212/247 [06:29<01:04, 1.85s/it]
2024-07-17 06:21:10,264 - INFO: train loss: 0.69: 87%|########6 | 214/247 [06:33<01:00, 1.84s/it]
2024-07-17 06:21:13,970 - INFO: train loss: 0.69: 87%|########7 | 216/247 [06:37<00:57, 1.85s/it]
2024-07-17 06:21:17,646 - INFO: train loss: 0.69: 88%|########8 | 218/247 [06:41<00:53, 1.84s/it]
2024-07-17 06:21:21,336 - INFO: train loss: 0.69: 89%|########9 | 220/247 [06:44<00:49, 1.84s/it]
2024-07-17 06:21:25,029 - INFO: train loss: 0.69: 90%|########9 | 222/247 [06:48<00:46, 1.85s/it]
2024-07-17 06:21:28,717 - INFO: train loss: 0.69: 91%|######### | 224/247 [06:52<00:42, 1.84s/it]
2024-07-17 06:21:32,402 - INFO: train loss: 0.69: 91%|#########1| 226/247 [06:55<00:38, 1.84s/it]
2024-07-17 06:21:36,093 - INFO: train loss: 0.69: 92%|#########2| 228/247 [06:59<00:35, 1.84s/it]
2024-07-17 06:21:39,782 - INFO: train loss: 0.69: 93%|#########3| 230/247 [07:03<00:31, 1.84s/it]
2024-07-17 06:21:43,472 - INFO: train loss: 0.69: 94%|#########3| 232/247 [07:06<00:27, 1.84s/it]
2024-07-17 06:21:47,159 - INFO: train loss: 0.69: 95%|#########4| 234/247 [07:10<00:23, 1.84s/it]
2024-07-17 06:21:50,847 - INFO: train loss: 0.69: 96%|#########5| 236/247 [07:14<00:20, 1.84s/it]
2024-07-17 06:21:54,541 - INFO: train loss: 0.69: 96%|#########6| 238/247 [07:17<00:16, 1.84s/it]
2024-07-17 06:21:58,231 - INFO: train loss: 0.69: 97%|#########7| 240/247 [07:21<00:12, 1.85s/it]
2024-07-17 06:22:01,922 - INFO: train loss: 0.69: 98%|#########7| 242/247 [07:25<00:09, 1.85s/it]
2024-07-17 06:22:05,605 - INFO: train loss: 0.69: 99%|#########8| 244/247 [07:28<00:05, 1.84s/it]
2024-07-17 06:22:09,294 - INFO: train loss: 0.69: 100%|#########9| 246/247 [07:32<00:01, 1.84s/it]
2024-07-17 06:22:11,136 - INFO: train loss: 0.69: 100%|##########| 247/247 [07:34<00:00, 1.84s/it]
2024-07-17 06:22:11,136 - INFO: Saving last model checkpoint to /app/output
2024-07-17 06:22:11,136 - INFO: Saving checkpoint..
2024-07-17 06:22:12,661 - INFO: Starting validation inference
2024-07-17 06:22:12,662 - INFO: validation progress: 0%| | 0/3 [00:00<?, ?it/s]
2024-07-17 06:22:13,336 - INFO: validation progress: 33%|###3 | 1/3 [00:00<00:01, 1.48it/s]
2024-07-17 06:22:13,896 - INFO: validation progress: 67%|######6 | 2/3 [00:01<00:00, 1.65it/s]
2024-07-17 06:22:14,207 - INFO: validation progress: 100%|##########| 3/3 [00:01<00:00, 2.12it/s]
2024-07-17 06:22:14,209 - INFO: validation progress: 100%|##########| 3/3 [00:01<00:00, 1.94it/s]
2024-07-17 06:22:14,247 - INFO: Validation Perplexity: 0.69315
2024-07-17 06:22:14,247 - INFO: Mean validation loss: 0.69315
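
Note on the logged values: the training loss stays at 0.69 for all 247 steps and the mean validation loss is 0.69315, which equals ln 2 to five decimal places. With num_classes=2, ln 2 is the cross-entropy of a classifier that assigns probability 0.5 to each class, so these numbers are consistent with chance-level predictions. The log alone does not establish the cause. The sketch below only checks the arithmetic; the variable names are illustrative and are not part of the tool that produced the log.

```python
import math

NUM_CLASSES = 2            # from num_classes=2 in the dataset config logged above
LOGGED_VAL_LOSS = 0.69315  # "Mean validation loss" as logged

# Cross-entropy of a uniform prediction over K classes is ln(K).
chance_loss = math.log(NUM_CLASSES)

# Compare at the log's five-decimal precision.
matches_chance = math.isclose(chance_loss, LOGGED_VAL_LOSS, abs_tol=5e-6)

# Under the conventional definition, perplexity = exp(cross-entropy) and is >= 1.
# The log reports the same number for "Validation Perplexity" and for the mean loss.
conventional_perplexity = math.exp(LOGGED_VAL_LOSS)

print(f"chance-level loss ln({NUM_CLASSES}) = {chance_loss:.5f}")
print(f"logged loss matches chance level: {matches_chance}")
print(f"exp(logged loss) = {conventional_perplexity:.4f}")
```

If the flat loss is unexpected, one thing worth checking (an assumption, not something the log confirms) is the input configuration: the dataset config shows prompt_column=() and answer_column='category', while the dataframe also contains a 'label' column.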