0%| | 0/5000 [00:00> The following columns in the training set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
0%| | 25/5000 [12:57<38:59:22, 28.21s/it]
1%| | 50/5000 [24:31<38:04:45, 27.69s/it]
2%|▏ | 75/5000 [36:10<38:12:36, 27.93s/it]
2%|▏ | 100/5000 [47:46<37:40:02, 27.67s/it]
2%|▎ | 125/5000 [59:17<37:59:12, 28.05s/it]
3%|▎ | 150/5000 [1:10:46<37:32:21, 27.86s/it]
3%|▎ | 163/5000 [1:15:43<20:26:07, 15.21s/it]
Reading metadata...: 23919it [00:01, 14960.32it/s]
4%|▎ | 175/5000 [1:22:25<38:42:56, 28.89s/it]
4%|▍ | 200/5000 [1:34:15<39:11:58, 29.40s/it]
4%|▍ | 224/5000 [1:45:23<36:46:48, 27.72s/it]
5%|▍ | 249/5000 [1:57:07<37:05:48, 28.11s/it]
5%|▌ | 274/5000 [2:08:53<36:42:42, 27.97s/it]
6%|▌ | 299/5000 [2:20:36<36:26:30, 27.91s/it]
6%|▋ | 325/5000 [2:32:04<23:59:08, 18.47s/it]
7%|▋ | 327/5000 [2:32:23<17:43:56, 13.66s/it]
Reading metadata...: 23979it [00:02, 8389.85it/s]
7%|▋ | 350/5000 [2:45:20<38:18:16, 29.66s/it]
7%|▋ | 374/5000 [2:56:51<36:29:03, 28.39s/it]
8%|▊ | 400/5000 [3:09:27<37:45:36, 29.55s/it]
8%|▊ | 424/5000 [3:21:03<36:39:46, 28.84s/it]
9%|▉ | 449/5000 [3:33:05<36:30:32, 28.88s/it]
9%|▉ | 474/5000 [3:44:57<36:23:20, 28.94s/it]
10%|▉ | 490/5000 [3:51:29<19:35:34, 15.64s/it]
Reading metadata...: 23848it [00:02, 13351.04it/s]
10%|█ | 500/5000 [3:57:20<35:45:29, 28.61s/it]
10%|█ | 525/5000 [4:09:00<34:54:55, 28.09s/it]
11%|█ | 550/5000 [4:20:30<34:27:46, 27.88s/it]
11%|█▏ | 574/5000 [4:31:36<34:02:43, 27.69s/it]
12%|█▏ | 600/5000 [4:43:33<34:06:44, 27.91s/it]
12%|█▎ | 625/5000 [4:55:08<33:43:52, 27.76s/it]
13%|█▎ | 650/5000 [5:06:34<31:39:01, 26.19s/it]
13%|█▎ | 654/5000 [5:07:15<16:27:10, 13.63s/it]
Reading metadata...: 10438it [00:00, 28332.17it/s]
14%|█▎ | 675/5000 [5:18:11<33:22:09, 27.78s/it]
14%|█▍ | 699/5000 [5:29:08<32:58:14, 27.60s/it]
14%|█▍ | 725/5000 [5:41:16<32:59:03, 27.78s/it]
15%|█▍ | 749/5000 [5:52:28<32:45:54, 27.75s/it]
16%|█▌ | 775/5000 [6:04:31<32:25:51, 27.63s/it]
16%|█▌ | 800/5000 [6:16:05<31:52:14, 27.32s/it]
16%|█▋ | 817/5000 [6:22:52<17:40:15, 15.21s/it]
Reading metadata...: 10438it [00:00, 26121.88it/s]
16%|█▋ | 824/5000 [6:27:10<33:50:30, 29.17s/it]
17%|█▋ | 849/5000 [6:38:45<32:12:00, 27.93s/it]
18%|█▊ | 875/5000 [6:50:44<31:43:20, 27.68s/it]
18%|█▊ | 900/5000 [7:02:26<31:53:22, 28.00s/it]
18%|█▊ | 925/5000 [7:14:00<31:27:54, 27.80s/it]
19%|█▉ | 950/5000 [7:25:24<29:51:57, 26.55s/it]
20%|█▉ | 975/5000 [7:36:55<31:02:52, 27.77s/it]
20%|█▉ | 981/5000 [7:38:26<15:15:07, 13.66s/it]
Reading metadata...: 23329it [00:00, 27403.88it/s]
20%|█▉ | 999/5000 [7:48:10<31:41:54, 28.52s/it]
20%|██ | 1000/5000 [7:48:38<31:29:41, 28.35s/it]
[INFO|trainer.py:3138] 2023-05-07 18:22:49,172 >> ***** Running Evaluation *****
[INFO|trainer.py:3142] 2023-05-07 18:22:49,172 >> Num examples: Unknown
[INFO|trainer.py:3143] 2023-05-07 18:22:49,172 >> Batch size = 64
[INFO|trainer_utils.py:693] 2023-05-07 18:23:04,305 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
{'eval_loss': 0.43405279517173767, 'eval_wer': 54.25600000000001, 'eval_runtime': 2248.2056, 'eval_samples_per_second': 4.644, 'eval_steps_per_second': 0.073, 'epoch': 6.0}
20%|██ | 1000/5000 [8:26:06<31:29:41, 28.35s/it]
[INFO|trainer.py:2877] 2023-05-07 19:00:17,386 >> Saving model checkpoint to ./checkpoint-1000
[INFO|configuration_utils.py:458] 2023-05-07 19:00:17,393 >> Configuration saved in ./checkpoint-1000/config.json
[INFO|configuration_utils.py:364] 2023-05-07 19:00:17,398 >> Configuration saved in ./checkpoint-1000/generation_config.json
[INFO|modeling_utils.py:1855] 2023-05-07 19:00:20,753 >> Model weights saved in ./checkpoint-1000/pytorch_model.bin
[INFO|feature_extraction_utils.py:369] 2023-05-07 19:00:20,758 >> Feature extractor saved in ./checkpoint-1000/preprocessor_config.json
[INFO|feature_extraction_utils.py:369] 2023-05-07 19:00:30,115 >> Feature extractor saved in ./preprocessor_config.json
Adding files tracked by Git LFS: ['wandb/run-20230506_113337-ysywp688/run-ysywp688.wandb', 'wandb/run-20230507_103405-9zf5xxpu/run-9zf5xxpu.wandb']. This may take a bit of time if the files are large.
05/07/2023 19:00:40 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20230506_113337-ysywp688/run-ysywp688.wandb', 'wandb/run-20230507_103405-9zf5xxpu/run-9zf5xxpu.wandb']. This may take a bit of time if the files are large.
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
20%|██ | 1024/5000 [8:37:52<30:54:36, 27.99s/it]
21%|██ | 1049/5000 [8:49:36<30:47:50, 28.06s/it]
21%|██▏ | 1074/5000 [9:01:12<30:10:29, 27.67s/it]
22%|██▏ | 1099/5000 [9:12:51<29:28:49, 27.21s/it]
22%|██▎ | 1125/5000 [9:24:55<30:10:55, 28.04s/it]
23%|██▎ | 1144/5000 [9:32:41<16:23:31, 15.30s/it]
Reading metadata...: 23270it [00:01, 13633.81it/s]
23%|██▎ | 1150/5000 [9:36:28<31:22:23, 29.34s/it]
24%|██▎ | 1175/5000 [9:48:12<29:43:46, 27.98s/it]
24%|██▍ | 1199/5000 [9:59:22<29:47:08, 28.21s/it]
24%|██▍ | 1224/5000 [10:11:01<29:19:05, 27.95s/it]
25%|██▌ | 1250/5000 [10:23:19<28:54:41, 27.75s/it]
26%|██▌ | 1275/5000 [10:35:05<28:48:22, 27.84s/it]
26%|██▌ | 1299/5000 [10:46:17<28:38:13, 27.86s/it]
26%|██▌ | 1308/5000 [10:49:12<14:02:21, 13.69s/it]
Reading metadata...: 10438it [00:00, 26165.49it/s]
26%|██▋ | 1324/5000 [10:57:48<28:34:52, 27.99s/it]
27%|██▋ | 1349/5000 [11:09:24<28:12:24, 27.81s/it]
27%|██▋ | 1374/5000 [11:21:05<27:52:56, 27.68s/it]
28%|██▊ | 1399/5000 [11:32:41<28:02:08, 28.03s/it]
28%|██▊ | 1425/5000 [11:44:46<27:43:38, 27.92s/it]
29%|██▉ | 1450/5000 [11:56:27<27:40:17, 28.06s/it]
29%|██▉ | 1471/5000 [12:05:07<15:02:59, 15.35s/it]
Reading metadata...: 10438it [00:00, 26428.08it/s]
29%|██▉ | 1474/5000 [12:07:33<32:09:36, 32.83s/it]
30%|██▉ | 1499/5000 [12:19:20<27:21:26, 28.13s/it]
30%|███ | 1525/5000 [12:31:31<27:52:56, 28.89s/it]
31%|███ | 1549/5000 [12:42:45<27:06:36, 28.28s/it]
31%|███▏ | 1574/5000 [12:54:23<26:59:56, 28.37s/it]
32%|███▏ | 1599/5000 [13:06:01<26:09:28, 27.69s/it]
32%|███▏ | 1624/5000 [13:17:39<26:33:00, 28.31s/it]
33%|███▎ | 1635/5000 [13:21:27<12:44:04, 13.62s/it]
Reading metadata...: 10438it [00:00, 27803.36it/s]
33%|███▎ | 1650/5000 [13:29:38<25:39:58, 27.58s/it]
34%|███▎ | 1675/5000 [13:41:06<25:19:32, 27.42s/it]
34%|███▍ | 1700/5000 [13:52:38<25:50:17, 28.19s/it]
34%|███▍ | 1725/5000 [14:04:25<26:15:06, 28.86s/it]
35%|███▍ | 1749/5000 [14:15:36<25:00:19, 27.69s/it]
36%|███▌ | 1775/5000 [14:27:51<24:48:36, 27.69s/it]
Reading metadata...: 28043it [00:01, 15924.27it/s]
Reading metadata...: 10438it [00:00, 12100.29it/s]
36%|███▌ | 1800/5000 [14:39:33<32:23:23, 36.44s/it]
36%|███▋ | 1825/5000 [14:51:16<24:28:15, 27.75s/it]
37%|███▋ | 1850/5000 [15:02:48<24:08:51, 27.60s/it]
37%|███▋ | 1874/5000 [15:14:04<25:09:04, 28.97s/it]
38%|███▊ | 1900/5000 [15:26:16<24:03:23, 27.94s/it]
38%|███▊ | 1924/5000 [15:37:35<24:05:39, 28.20s/it]
39%|███▉ | 1950/5000 [15:49:42<23:58:06, 28.29s/it]
39%|███▉ | 1962/5000 [15:54:00<11:31:58, 13.67s/it]
Reading metadata...: 10438it [00:00, 22408.00it/s]
40%|███▉ | 1975/5000 [16:01:23<23:43:18, 28.23s/it]
40%|████ | 2000/5000 [16:12:55<22:45:51, 27.32s/it]
[INFO|trainer.py:3138] 2023-05-08 02:47:05,817 >> ***** Running Evaluation *****
[INFO|trainer.py:3142] 2023-05-08 02:47:05,817 >> Num examples: Unknown
[INFO|trainer.py:3143] 2023-05-08 02:47:05,817 >> Batch size = 64
{'loss': 0.0042, 'learning_rate': 6.671111111111112e-06, 'epoch': 12.01}
Reading metadata...: 10440it [00:00, 30685.17it/s]
[INFO|trainer_utils.py:693] 2023-05-08 02:47:16,424 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
{'eval_loss': 0.5698409676551819, 'eval_wer': 55.900000000000006, 'eval_runtime': 2152.7247, 'eval_samples_per_second': 4.85, 'eval_steps_per_second': 0.076, 'epoch': 12.01}
40%|████ | 2000/5000 [16:48:47<22:45:51, 27.32s/it]
[INFO|trainer.py:2877] 2023-05-08 03:22:58,551 >> Saving model checkpoint to ./checkpoint-2000
[INFO|configuration_utils.py:458] 2023-05-08 03:22:58,556 >> Configuration saved in ./checkpoint-2000/config.json
[INFO|configuration_utils.py:364] 2023-05-08 03:22:58,560 >> Configuration saved in ./checkpoint-2000/generation_config.json
[INFO|modeling_utils.py:1855] 2023-05-08 03:23:01,997 >> Model weights saved in ./checkpoint-2000/pytorch_model.bin
[INFO|feature_extraction_utils.py:369] 2023-05-08 03:23:02,003 >> Feature extractor saved in ./checkpoint-2000/preprocessor_config.json
[INFO|feature_extraction_utils.py:369] 2023-05-08 03:23:12,574 >> Feature extractor saved in ./preprocessor_config.json
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
40%|████ | 2024/5000 [17:01:13<23:31:21, 28.45s/it]
41%|████ | 2049/5000 [17:13:06<23:57:15, 29.22s/it]
41%|████▏ | 2074/5000 [17:24:50<22:41:09, 27.91s/it]
42%|████▏ | 2099/5000 [17:36:39<22:30:05, 27.92s/it]
42%|████▎ | 2125/5000 [17:47:57<12:30:01, 15.65s/it]
{'loss': 0.0032, 'learning_rate': 6.393333333333334e-06, 'epoch': 12.03}
Reading metadata...: 23650it [00:01, 18542.50it/s]
43%|████▎ | 2149/5000 [18:00:53<22:00:05, 27.78s/it]
43%|████▎ | 2174/5000 [18:12:39<22:18:15, 28.41s/it]
44%|████▍ | 2199/5000 [18:24:28<22:05:44, 28.40s/it]
44%|████▍ | 2224/5000 [18:36:09<21:36:49, 28.03s/it]
45%|████▍ | 2249/5000 [18:47:42<20:47:07, 27.20s/it]
45%|████▌ | 2274/5000 [18:59:24<21:26:31, 28.32s/it]
46%|████▌ | 2289/5000 [19:05:06<10:17:47, 13.67s/it]
Reading metadata...: 23030it [00:01, 14373.66it/s]
46%|████▌ | 2299/5000 [19:11:04<21:25:22, 28.55s/it]
46%|████▋ | 2324/5000 [19:22:58<20:25:42, 27.48s/it]
47%|████▋ | 2349/5000 [19:34:43<20:34:03, 27.93s/it]
48%|████▊ | 2375/5000 [19:47:01<20:35:00, 28.23s/it]
48%|████▊ | 2399/5000 [19:58:11<20:04:34, 27.79s/it]
48%|████▊ | 2424/5000 [20:09:46<19:53:06, 27.79s/it]
49%|████▉ | 2449/5000 [20:21:05<16:53:12, 23.83s/it]
49%|████▉ | 2452/5000 [20:21:37<10:49:43, 15.30s/it]
Reading metadata...: 10438it [00:00, 20922.81it/s]
49%|████▉ | 2474/5000 [20:32:51<19:38:28, 27.99s/it]
50%|████▉ | 2499/5000 [20:44:26<19:28:55, 28.04s/it]
50%|█████ | 2524/5000 [20:56:12<19:11:49, 27.91s/it]
51%|█████ | 2549/5000 [21:07:48<18:50:45, 27.68s/it]
51%|█████▏ | 2574/5000 [21:19:27<19:00:41, 28.21s/it]
52%|█████▏ | 2600/5000 [21:31:27<18:47:00, 28.18s/it]
Reading metadata...: 28043it [00:00, 37305.62it/s]
Reading metadata...: 10438it [00:00, 30532.73it/s]
52%|█████▏ | 2624/5000 [21:42:30<18:57:49, 28.73s/it]
53%|█████▎ | 2649/5000 [21:54:03<18:04:16, 27.67s/it]
53%|█████▎ | 2674/5000 [22:05:42<17:55:47, 27.75s/it]
54%|█████▍ | 2699/5000 [22:17:12<17:36:47, 27.56s/it]
55%|█████▍ | 2725/5000 [22:29:20<17:34:08, 27.80s/it]
55%|█████▍ | 2749/5000 [22:40:23<17:10:47, 27.48s/it]
55%|█████▌ | 2774/5000 [22:51:54<16:57:03, 27.41s/it]
Reading metadata...: 28043it [00:00, 37576.14it/s]
Reading metadata...: 10438it [00:00, 27193.36it/s]
56%|█████▌ | 2799/5000 [23:03:28<17:07:04, 28.00s/it]
56%|█████▋ | 2825/5000 [23:15:28<16:46:36, 27.77s/it]
57%|█████▋ | 2849/5000 [23:26:34<16:24:17, 27.46s/it]
57%|█████▋ | 2874/5000 [23:38:14<16:29:40, 27.93s/it]
58%|█████▊ | 2899/5000 [23:49:46<16:09:24, 27.68s/it]
58%|█████▊ | 2924/5000 [24:01:17<15:52:41, 27.53s/it]
59%|█████▉ | 2943/5000 [24:08:47<7:45:50, 13.59s/it]
Reading metadata...: 10438it [00:00, 21471.13it/s]
59%|█████▉ | 2949/5000 [24:12:45<17:03:19, 29.94s/it]
59%|█████▉ | 2974/5000 [24:24:18<15:46:57, 28.04s/it]
60%|██████ | 3000/5000 [24:36:21<15:21:38, 27.65s/it]
[INFO|trainer.py:3138] 2023-05-08 11:10:32,150 >> ***** Running Evaluation *****
[INFO|trainer.py:3142] 2023-05-08 11:10:32,150 >> Num examples: Unknown
[INFO|trainer.py:3143] 2023-05-08 11:10:32,150 >> Batch size = 64
Reading metadata...: 0it [00:00, ?it/s]
[INFO|trainer_utils.py:693] 2023-05-08 11:10:42,418 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
60%|██████ | 3000/5000 [25:12:09<15:21:38, 27.65s/it]
[INFO|trainer.py:2877] 2023-05-08 11:46:20,051 >> Saving model checkpoint to ./checkpoint-3000
[INFO|configuration_utils.py:458] 2023-05-08 11:46:20,057 >> Configuration saved in ./checkpoint-3000/config.json
[INFO|configuration_utils.py:364] 2023-05-08 11:46:20,060 >> Configuration saved in ./checkpoint-3000/generation_config.json
{'eval_loss': 0.6270926594734192, 'eval_wer': 56.897333333333336, 'eval_runtime': 2147.8916, 'eval_samples_per_second': 4.861, 'eval_steps_per_second': 0.076, 'epoch': 18.01}
[INFO|modeling_utils.py:1855] 2023-05-08 11:46:22,975 >> Model weights saved in ./checkpoint-3000/pytorch_model.bin
[INFO|feature_extraction_utils.py:369] 2023-05-08 11:46:22,981 >> Feature extractor saved in ./checkpoint-3000/preprocessor_config.json