05/07/2023 10:33:39 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 2distributed training: True, 16-bits training: True 05/07/2023 10:33:39 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments( _n_gpu=2, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_backend=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, do_eval=True, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=1000, evaluation_strategy=steps, fp16=True, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, generation_config=None, generation_max_length=225, generation_num_beams=None, gradient_accumulation_steps=2, gradient_checkpointing=True, greater_is_better=False, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=input_length, load_best_model_at_end=True, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=./runs/May07_10-33-38_crimv3mgpu025, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=25, logging_strategy=steps, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=5000, metric_for_best_model=wer, mp_parameters=, no_cuda=False, num_train_epochs=3.0, optim=adamw_hf, optim_args=None, output_dir=./, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=32, 
per_device_train_batch_size=32, predict_with_generate=True, prediction_loss_only=False, push_to_hub=True, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['wandb'], resume_from_checkpoint=None, run_name=./, save_on_each_node=False, save_safetensors=False, save_steps=1000, save_strategy=steps, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, sortish_sampler=False, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=500, weight_decay=0.0, xpu_backend=None, ) 
[INFO|configuration_utils.py:669] 2023-05-07 10:33:51,873 >> loading configuration file config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/config.json [INFO|configuration_utils.py:725] 2023-05-07 10:33:51,887 >> Model config WhisperConfig { "_name_or_path": "openai/whisper-small", "activation_dropout": 0.0, "activation_function": "gelu", "apply_spec_augment": false, "architectures": [ "WhisperForConditionalGeneration" ], "attention_dropout": 0.0, "begin_suppress_tokens": [ 220, 
50257 ], "bos_token_id": 50257, "classifier_proj_size": 256, "d_model": 768, "decoder_attention_heads": 12, "decoder_ffn_dim": 3072, "decoder_layerdrop": 0.0, "decoder_layers": 12, "decoder_start_token_id": 50258, "dropout": 0.0, "encoder_attention_heads": 12, "encoder_ffn_dim": 3072, "encoder_layerdrop": 0.0, "encoder_layers": 12, "eos_token_id": 50257, "forced_decoder_ids": [ [ 1, 50259 ], [ 2, 50359 ], [ 3, 50363 ] ], "init_std": 0.02, "is_encoder_decoder": true, "mask_feature_length": 10, "mask_feature_min_masks": 0, "mask_feature_prob": 0.0, "mask_time_length": 10, "mask_time_min_masks": 2, "mask_time_prob": 0.05, "max_length": 448, "max_source_positions": 1500, "max_target_positions": 448, "model_type": "whisper", "num_hidden_layers": 12, "num_mel_bins": 80, "pad_token_id": 50257, "scale_embedding": false, "suppress_tokens": [ 1, 2, 7, 8, 9, 10, 14, 25, 26, 27, 28, 29, 31, 58, 59, 60, 61, 62, 63, 90, 91, 92, 93, 359, 503, 522, 542, 873, 893, 902, 918, 922, 931, 1350, 1853, 1982, 2460, 2627, 3246, 3253, 3268, 3536, 3846, 3961, 4183, 4667, 6585, 6647, 7273, 9061, 9383, 10428, 10929, 11938, 12033, 12331, 12562, 13793, 14157, 14635, 15265, 15618, 16553, 16604, 18362, 18956, 20075, 21675, 22520, 26130, 26161, 26435, 28279, 29464, 31650, 32302, 32470, 36865, 42863, 47425, 49870, 50254, 50258, 50360, 50361, 50362 ], "torch_dtype": "float32", "transformers_version": "4.29.0.dev0", "use_cache": true, "use_weighted_layer_sum": false, "vocab_size": 51865 } [INFO|feature_extraction_utils.py:469] 2023-05-07 10:33:52,076 >> loading configuration file preprocessor_config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/preprocessor_config.json [INFO|feature_extraction_utils.py:511] 2023-05-07 10:33:52,082 >> Feature extractor WhisperFeatureExtractor { "chunk_length": 30, "feature_extractor_type": "WhisperFeatureExtractor", "feature_size": 80, "hop_length": 160, "n_fft": 400, 
"n_samples": 480000, "nb_max_frames": 3000, "padding_side": "right", "padding_value": 0.0, "processor_class": "WhisperProcessor", "return_attention_mask": false, "sampling_rate": 16000 } [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file vocab.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/vocab.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file tokenizer.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/tokenizer.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file merges.txt from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/merges.txt [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file normalizer.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/normalizer.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file added_tokens.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/added_tokens.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file special_tokens_map.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/special_tokens_map.json [INFO|tokenization_utils_base.py:1810] 2023-05-07 10:33:52,291 >> loading file tokenizer_config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/tokenizer_config.json [INFO|modeling_utils.py:2542] 
2023-05-07 10:33:52,385 >> loading weights file pytorch_model.bin from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/pytorch_model.bin [INFO|configuration_utils.py:577] 2023-05-07 10:33:52,963 >> Generate config GenerationConfig { "_from_model_config": true, "begin_suppress_tokens": [ 220, 50257 ], "bos_token_id": 50257, "decoder_start_token_id": 50258, "eos_token_id": 50257, "max_length": 448, "pad_token_id": 50257, "transformers_version": "4.29.0.dev0", "use_cache": false } [INFO|modeling_utils.py:3211] 2023-05-07 10:33:55,474 >> All model checkpoint weights were used when initializing WhisperForConditionalGeneration. [INFO|modeling_utils.py:3219] 2023-05-07 10:33:55,474 >> All the weights of WhisperForConditionalGeneration were initialized from the model checkpoint at openai/whisper-small. If your task is similar to the task the model of the checkpoint was trained on, you can already use WhisperForConditionalGeneration for predictions without further training. 
[INFO|configuration_utils.py:539] 2023-05-07 10:33:55,680 >> loading configuration file generation_config.json from cache at /home/local/QCRI/dizham/.cache/huggingface/hub/models--openai--whisper-small/snapshots/f6744499d1eba717bcf4d6be735e3d386ffb60ad/generation_config.json [INFO|configuration_utils.py:577] 2023-05-07 10:33:55,681 >> Generate config GenerationConfig { "begin_suppress_tokens": [ 220, 50257 ], "bos_token_id": 50257, "decoder_start_token_id": 50258, "eos_token_id": 50257, "forced_decoder_ids": [ [ 1, null ], [ 2, 50359 ] ], "is_multilingual": true, "lang_to_id": { "<|af|>": 50327, "<|am|>": 50334, "<|ar|>": 50272, "<|as|>": 50350, "<|az|>": 50304, "<|ba|>": 50355, "<|be|>": 50330, "<|bg|>": 50292, "<|bn|>": 50302, "<|bo|>": 50347, "<|br|>": 50309, "<|bs|>": 50315, "<|ca|>": 50270, "<|cs|>": 50283, "<|cy|>": 50297, "<|da|>": 50285, "<|de|>": 50261, "<|el|>": 50281, "<|en|>": 50259, "<|es|>": 50262, "<|et|>": 50307, "<|eu|>": 50310, "<|fa|>": 50300, "<|fi|>": 50277, "<|fo|>": 50338, "<|fr|>": 50265, "<|gl|>": 50319, "<|gu|>": 50333, "<|haw|>": 50352, "<|ha|>": 50354, "<|he|>": 50279, "<|hi|>": 50276, "<|hr|>": 50291, "<|ht|>": 50339, "<|hu|>": 50286, "<|hy|>": 50312, "<|id|>": 50275, "<|is|>": 50311, "<|it|>": 50274, "<|ja|>": 50266, "<|jw|>": 50356, "<|ka|>": 50329, "<|kk|>": 50316, "<|km|>": 50323, "<|kn|>": 50306, "<|ko|>": 50264, "<|la|>": 50294, "<|lb|>": 50345, "<|ln|>": 50353, "<|lo|>": 50336, "<|lt|>": 50293, "<|lv|>": 50301, "<|mg|>": 50349, "<|mi|>": 50295, "<|mk|>": 50308, "<|ml|>": 50296, "<|mn|>": 50314, "<|mr|>": 50320, "<|ms|>": 50282, "<|mt|>": 50343, "<|my|>": 50346, "<|ne|>": 50313, "<|nl|>": 50271, "<|nn|>": 50342, "<|no|>": 50288, "<|oc|>": 50328, "<|pa|>": 50321, "<|pl|>": 50269, "<|ps|>": 50340, "<|pt|>": 50267, "<|ro|>": 50284, "<|ru|>": 50263, "<|sa|>": 50344, "<|sd|>": 50332, "<|si|>": 50322, "<|sk|>": 50298, "<|sl|>": 50305, "<|sn|>": 50324, "<|so|>": 50326, "<|sq|>": 50317, "<|sr|>": 50303, "<|su|>": 50357, "<|sv|>": 50273, 
"<|sw|>": 50318, "<|ta|>": 50287, "<|te|>": 50299, "<|tg|>": 50331, "<|th|>": 50289, "<|tk|>": 50341, "<|tl|>": 50348, "<|tr|>": 50268, "<|tt|>": 50351, "<|uk|>": 50280, "<|ur|>": 50290, "<|uz|>": 50337, "<|vi|>": 50278, "<|yi|>": 50335, "<|yo|>": 50325, "<|zh|>": 50260 }, "max_initial_timestamp_index": 1, "max_length": 448, "no_timestamps_token_id": 50363, "pad_token_id": 50257, "return_timestamps": false, "suppress_tokens": [ 1, 2, 7, 8, 9, 10, 14, 25, 26, 27, 28, 29, 31, 58, 59, 60, 61, 62, 63, 90, 91, 92, 93, 359, 503, 522, 542, 873, 893, 902, 918, 922, 931, 1350, 1853, 1982, 2460, 2627, 3246, 3253, 3268, 3536, 3846, 3961, 4183, 4667, 6585, 6647, 7273, 9061, 9383, 10428, 10929, 11938, 12033, 12331, 12562, 13793, 14157, 14635, 15265, 15618, 16553, 16604, 18362, 18956, 20075, 21675, 22520, 26130, 26161, 26435, 28279, 29464, 31650, 32302, 32470, 36865, 42863, 47425, 49870, 50254, 50258, 50358, 50359, 50360, 50361, 50362 ], "task_to_id": { "transcribe": 50359, "translate": 50358 }, "transformers_version": "4.29.0.dev0" } [INFO|feature_extraction_utils.py:369] 2023-05-07 10:33:56,907 >> Feature extractor saved in ./preprocessor_config.json [INFO|tokenization_utils_base.py:2181] 2023-05-07 10:33:56,915 >> tokenizer config file saved in ./tokenizer_config.json [INFO|tokenization_utils_base.py:2188] 2023-05-07 10:33:56,922 >> Special tokens file saved in ./special_tokens_map.json [INFO|configuration_utils.py:458] 2023-05-07 10:33:57,075 >> Configuration saved in ./config.json [INFO|image_processing_utils.py:307] 2023-05-07 10:33:57,075 >> loading configuration file ./preprocessor_config.json [INFO|feature_extraction_utils.py:467] 2023-05-07 10:33:57,084 >> loading configuration file ./preprocessor_config.json [INFO|feature_extraction_utils.py:511] 2023-05-07 10:33:57,085 >> Feature extractor WhisperFeatureExtractor { "chunk_length": 30, "feature_extractor_type": "WhisperFeatureExtractor", "feature_size": 80, "hop_length": 160, "n_fft": 400, "n_samples": 480000, 
"nb_max_frames": 3000, "padding_side": "right", "padding_value": 0.0, "processor_class": "WhisperProcessor", "return_attention_mask": false, "sampling_rate": 16000 } [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file vocab.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file tokenizer.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file merges.txt [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file normalizer.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:1808] 2023-05-07 10:33:57,086 >> loading file tokenizer_config.json [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|startoftranscript|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|en|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|zh|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|de|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|es|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ru|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ko|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|fr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ja|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|pt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|tr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|pl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 
10:33:57,149 >> Adding <|ca|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|nl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|ar|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|sv|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|it|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|id|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|hi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|fi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,149 >> Adding <|vi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|he|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|uk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|el|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ms|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|cs|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ro|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|da|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|hu|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ta|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|no|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|th|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ur|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|hr|> to the vocabulary 
[INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|bg|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|lt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|la|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|ml|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|cy|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|te|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|fa|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|lv|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|bn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|az|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|kn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|et|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|br|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|eu|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|is|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|hy|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding 
<|ne|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|bs|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|kk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sq|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|sw|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|gl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,150 >> Adding <|mr|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|pa|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|si|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|km|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|sn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|yo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|so|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|af|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|oc|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ka|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|be|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tg|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|sd|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|gu|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|am|> to the vocabulary [INFO|tokenization_utils.py:426] 
2023-05-07 10:33:57,151 >> Adding <|yi|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|lo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|uz|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|fo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ht|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ps|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tk|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|nn|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|mt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|sa|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|lb|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|my|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|bo|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tl|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|mg|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|as|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|tt|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|haw|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ln|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ha|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|ba|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|jw|> to the vocabulary 
[INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|su|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|translate|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|transcribe|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|startoflm|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,151 >> Adding <|startofprev|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,152 >> Adding <|nocaptions|> to the vocabulary [INFO|tokenization_utils.py:426] 2023-05-07 10:33:57,152 >> Adding <|notimestamps|> to the vocabulary 05/07/2023 10:34:00 - WARNING - huggingface_hub.repository - /home/local/QCRI/dizham/kanari/whisper/whisper-small-ar/./ is already a clone of https://huggingface.co/danielizham/whisper-small-ar. Make sure you pull the latest changes with `repo.git_pull()`. [INFO|trainer.py:565] 2023-05-07 10:34:02,856 >> max_steps is given, it will override any value given in num_train_epochs [INFO|trainer.py:622] 2023-05-07 10:34:02,856 >> Using cuda_amp half precision backend /home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/transformers/optimization.py:407: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. 
Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning warnings.warn( [INFO|trainer.py:1771] 2023-05-07 10:34:02,869 >> ***** Running training ***** [INFO|trainer.py:1772] 2023-05-07 10:34:02,869 >> Num examples = 640,000 [INFO|trainer.py:1773] 2023-05-07 10:34:02,869 >> Num Epochs = 9,223,372,036,854,775,807 [INFO|trainer.py:1774] 2023-05-07 10:34:02,870 >> Instantaneous batch size per device = 32 [INFO|trainer.py:1775] 2023-05-07 10:34:02,870 >> Total train batch size (w. parallel, distributed & accumulation) = 128 [INFO|trainer.py:1776] 2023-05-07 10:34:02,870 >> Gradient Accumulation steps = 2 [INFO|trainer.py:1777] 2023-05-07 10:34:02,870 >> Total optimization steps = 5,000 [INFO|trainer.py:1778] 2023-05-07 10:34:02,871 >> Number of trainable parameters = 241,734,912 [INFO|integrations.py:720] 2023-05-07 10:34:02,872 >> Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true" wandb: Currently logged in as: danielizham. Use `wandb login --relogin` to force relogin wandb: Tracking run with wandb version 0.15.2 wandb: Run data is saved locally in /home/local/QCRI/dizham/kanari/whisper/whisper-small-ar/wandb/run-20230507_103405-9zf5xxpu wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run fast-feather-2 wandb: ⭐️ View project at https://wandb.ai/danielizham/huggingface wandb: 🚀 View run at https://wandb.ai/danielizham/huggingface/runs/9zf5xxpu 0%| | 0/5000 [00:00> The following columns in the training set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
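Note on the "Running training" summary above: the reported totals can be cross-checked against the logged arguments. The sketch below is an illustration and was not part of the original run. It assumes that the total train batch size is the per-device batch size times the number of GPUs times the gradient accumulation steps, and that the example count reported for this run is that total times `max_steps`. The variable names are chosen for illustration only. The epoch count printed in the log equals 2**63 - 1, which is the value of `sys.maxsize` on a 64-bit Python build; this suggests (as an assumption, not something the log states) that the training dataset has no known length, for example because it is streamed.

```python
# Values copied from the logged Seq2SeqTrainingArguments and the
# "Running training" summary; variable names are illustrative.
per_device_train_batch_size = 32
n_gpu = 2
gradient_accumulation_steps = 2
max_steps = 5000

# Assumed relation: one optimization step consumes this many examples.
total_train_batch_size = (
    per_device_train_batch_size * n_gpu * gradient_accumulation_steps
)

# Assumed relation: examples consumed over the whole run.
examples_seen = total_train_batch_size * max_steps

# Epoch count as printed in the log, compared with 2**63 - 1.
logged_num_epochs = 9_223_372_036_854_775_807
epochs_is_int64_max = logged_num_epochs == 2**63 - 1

print(total_train_batch_size)  # 128
print(examples_seen)           # 640000
print(epochs_is_int64_max)     # True
```

Both computed totals agree with the logged "Total train batch size ... = 128" and "Num examples = 640,000", which is consistent with the assumed relations but does not prove how the Trainer derives them.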
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. warnings.warn('Was asked to gather along dimension 0, but all ' 0%| | 1/5000 [01:47<148:38:46, 107.05s/it] 0%| | 2/5000 [02:14<83:36:35, 60.22s/it] 0%| | 3/5000 [02:41<62:42:18, 45.17s/it] 0%| | 4/5000 [03:10<53:25:35, 38.50s/it] 0%| | 5/5000 [03:40<49:28:00, 35.65s/it] 0%| | 6/5000 [04:08<45:33:39, 32.84s/it] 0%| | 7/5000 [04:35<42:56:31, 30.96s/it] 0%| | 8/5000 [05:02<41:32:10, 29.95s/it] 0%| | 9/5000 [05:30<40:27:17, 29.18s/it] 0%| | 10/5000 [05:59<40:15:31, 29.04s/it] 0%| | 11/5000 [06:25<39:17:47, 28.36s/it] 0%| | 12/5000 [06:54<39:16:28, 28.35s/it] 0%| | 13/5000 [07:22<39:09:18, 28.27s/it] 0%| | 14/5000 [07:50<39:02:25, 28.19s/it] 0%| | 15/5000 [08:17<38:33:57, 27.85s/it] 0%| | 16/5000 [08:45<38:35:55, 27.88s/it] 0%| | 17/5000 [09:13<38:32:36, 27.85s/it] 0%| | 18/5000 [09:40<38:19:57, 27.70s/it] 0%| | 19/5000 [10:10<39:17:48, 28.40s/it] 0%| | 20/5000 [10:37<38:47:34, 28.04s/it] 0%| | 21/5000 [11:04<38:25:49, 27.79s/it] 0%| | 22/5000 [11:33<38:42:41, 28.00s/it] 0%| | 23/5000 [12:01<38:38:51, 27.95s/it] 0%| | 24/5000 [12:28<38:28:42, 27.84s/it] 0%| | 25/5000 [12:57<38:59:22, 28.21s/it] 0%| | 25/5000 [12:57<38:59:22, 28.21s/it] 1%| | 26/5000 [13:24<38:30:52, 27.88s/it] 1%| | 27/5000 [13:52<38:13:36, 27.67s/it] 1%| | 28/5000 [14:20<38:19:33, 27.75s/it] 1%| | 29/5000 [14:47<38:05:23, 27.58s/it] 1%| | 30/5000 [15:14<38:00:17, 27.53s/it] 1%| | 31/5000 [15:42<38:12:38, 27.68s/it] 1%| | 32/5000 [16:09<37:58:26, 27.52s/it] 1%| | 33/5000 [16:36<37:47:47, 27.39s/it] 1%| | 34/5000 [17:07<38:52:40, 28.18s/it] 1%| | 35/5000 [17:33<38:19:00, 27.78s/it] 1%| | 36/5000 [18:01<38:02:35, 27.59s/it] 1%| | 37/5000 [18:30<38:40:53, 28.06s/it] 1%| | 38/5000 [18:57<38:17:23, 27.78s/it] 1%| | 39/5000 [19:24<38:09:39, 27.69s/it] 1%| | 40/5000 
[19:54<38:59:07, 28.30s/it] 1%| | 41/5000 [20:21<38:36:11, 28.02s/it] 1%| | 42/5000 [20:50<38:40:08, 28.08s/it] 1%| | 43/5000 [21:17<38:33:12, 28.00s/it] 1%| | 44/5000 [21:44<38:03:57, 27.65s/it] 1%| | 45/5000 [22:13<38:24:59, 27.91s/it] 1%| | 46/5000 [22:40<38:11:21, 27.75s/it] 1%| | 47/5000 [23:10<38:55:38, 28.29s/it] 1%| | 48/5000 [23:37<38:24:02, 27.92s/it] 1%| | 49/5000 [24:03<37:46:25, 27.47s/it] 1%| | 50/5000 [24:31<38:04:45, 27.69s/it] 1%| | 50/5000 [24:31<38:04:45, 27.69s/it] 1%| | 51/5000 [24:59<38:02:43, 27.68s/it] 1%| | 52/5000 [25:26<37:51:48, 27.55s/it] 1%| | 53/5000 [25:55<38:16:50, 27.86s/it] 1%| | 54/5000 [26:22<38:10:08, 27.78s/it] 1%| | 55/5000 [26:50<37:59:21, 27.66s/it] 1%| | 56/5000 [27:18<38:09:00, 27.78s/it] 1%| | 57/5000 [27:45<37:51:24, 27.57s/it] 1%| | 58/5000 [28:12<37:39:30, 27.43s/it] 1%| | 59/5000 [28:42<38:33:05, 28.09s/it] 1%| | 60/5000 [29:09<38:09:46, 27.81s/it] 1%| | 61/5000 [29:36<38:01:21, 27.71s/it] 1%| | 62/5000 [30:05<38:26:25, 28.02s/it] 1%|▏ | 63/5000 [30:32<38:07:36, 27.80s/it] 1%|▏ | 64/5000 [30:59<37:48:53, 27.58s/it] 1%|▏ | 65/5000 [31:29<38:48:09, 28.31s/it] 1%|▏ | 66/5000 [31:57<38:25:18, 28.03s/it] 1%|▏ | 67/5000 [32:26<38:52:41, 28.37s/it] 1%|▏ | 68/5000 [32:54<38:41:09, 28.24s/it] 1%|▏ | 69/5000 [33:21<38:15:10, 27.93s/it] 1%|▏ | 70/5000 [33:49<38:07:26, 27.84s/it] 1%|▏ | 71/5000 [34:16<37:59:11, 27.74s/it] 1%|▏ | 72/5000 [34:47<39:02:11, 28.52s/it] 1%|▏ | 73/5000 [35:15<38:48:31, 28.36s/it] 1%|▏ | 74/5000 [35:42<38:24:24, 28.07s/it] 2%|▏ | 75/5000 [36:10<38:12:36, 27.93s/it] 2%|▏ | 75/5000 [36:10<38:12:36, 27.93s/it] 2%|▏ | 76/5000 [36:42<40:02:06, 29.27s/it] 2%|▏ | 77/5000 [37:09<39:13:52, 28.69s/it] 2%|▏ | 78/5000 [37:37<38:50:12, 28.41s/it] 2%|▏ | 79/5000 [38:06<39:05:12, 28.59s/it] 2%|▏ | 80/5000 [38:33<38:20:59, 28.06s/it] 2%|▏ | 81/5000 [39:01<38:30:04, 28.18s/it] 2%|▏ | 82/5000 [39:29<38:05:29, 27.88s/it] 2%|▏ | 83/5000 [39:56<37:52:05, 27.73s/it] 2%|▏ | 84/5000 [40:23<37:34:58, 27.52s/it] 2%|▏ | 85/5000 
[40:51<37:40:57, 27.60s/it] 2%|▏ | 86/5000 [41:18<37:44:37, 27.65s/it] 2%|▏ | 87/5000 [41:46<37:36:43, 27.56s/it] 2%|▏ | 88/5000 [42:14<37:44:37, 27.66s/it] 2%|▏ | 89/5000 [42:42<37:51:35, 27.75s/it] 2%|▏ | 90/5000 [43:09<37:41:54, 27.64s/it] 2%|▏ | 91/5000 [43:37<37:50:15, 27.75s/it] 2%|▏ | 92/5000 [44:05<38:00:27, 27.88s/it] 2%|▏ | 93/5000 [44:32<37:38:09, 27.61s/it] 2%|▏ | 94/5000 [45:00<37:35:56, 27.59s/it] 2%|▏ | 95/5000 [45:27<37:35:42, 27.59s/it] 2%|▏ | 96/5000 [45:55<37:24:43, 27.46s/it] 2%|▏ | 97/5000 [46:23<37:39:21, 27.65s/it] 2%|▏ | 98/5000 [46:50<37:38:40, 27.65s/it] 2%|▏ | 99/5000 [47:18<37:32:18, 27.57s/it] 2%|▏ | 100/5000 [47:46<37:40:02, 27.67s/it] 2%|▏ | 100/5000 [47:46<37:40:02, 27.67s/it] 2%|▏ | 101/5000 [48:13<37:44:47, 27.74s/it] 2%|▏ | 102/5000 [48:41<37:29:30, 27.56s/it] 2%|▏ | 103/5000 [49:09<37:45:09, 27.75s/it] 2%|▏ | 104/5000 [49:36<37:34:21, 27.63s/it] 2%|▏ | 105/5000 [50:03<37:25:22, 27.52s/it] 2%|▏ | 106/5000 [50:31<37:25:13, 27.53s/it] 2%|▏ | 107/5000 [50:58<37:16:28, 27.42s/it] 2%|▏ | 108/5000 [51:25<37:13:39, 27.40s/it] 2%|▏ | 109/5000 [51:53<37:18:48, 27.46s/it] 2%|▏ | 110/5000 [52:20<37:12:04, 27.39s/it] 2%|▏ | 111/5000 [52:48<37:11:35, 27.39s/it] 2%|▏ | 112/5000 [53:15<37:19:40, 27.49s/it] 2%|▏ | 113/5000 [53:43<37:14:35, 27.44s/it] 2%|▏ | 114/5000 [54:10<37:06:26, 27.34s/it] 2%|▏ | 115/5000 [54:37<37:12:39, 27.42s/it] 2%|▏ | 116/5000 [55:06<37:30:46, 27.65s/it] 2%|▏ | 117/5000 [55:33<37:33:08, 27.69s/it] 2%|▏ | 118/5000 [56:01<37:25:22, 27.60s/it] 2%|▏ | 119/5000 [56:29<37:37:17, 27.75s/it] 2%|▏ | 120/5000 [56:56<37:25:52, 27.61s/it] 2%|▏ | 121/5000 [57:24<37:19:37, 27.54s/it] 2%|▏ | 122/5000 [57:53<38:07:24, 28.14s/it] 2%|▏ | 123/5000 [58:20<37:47:02, 27.89s/it] 2%|▏ | 124/5000 [58:48<37:31:04, 27.70s/it] 2%|▎ | 125/5000 [59:17<37:59:12, 28.05s/it] 2%|▎ | 125/5000 [59:17<37:59:12, 28.05s/it] 3%|▎ | 126/5000 [59:43<37:15:16, 27.52s/it] 3%|▎ | 127/5000 [1:00:08<36:26:56, 26.93s/it] 3%|▎ | 128/5000 [1:00:37<36:56:46, 27.30s/it] 
3%|▎ | 129/5000 [1:01:03<36:45:18, 27.16s/it] 3%|▎ | 130/5000 [1:01:30<36:26:39, 26.94s/it] 3%|▎ | 131/5000 [1:01:57<36:38:18, 27.09s/it] 3%|▎ | 132/5000 [1:02:25<37:00:00, 27.36s/it] 3%|▎ | 133/5000 [1:02:52<36:50:32, 27.25s/it] 3%|▎ | 134/5000 [1:03:20<36:51:47, 27.27s/it] 3%|▎ | 135/5000 [1:03:50<38:00:00, 28.12s/it] 3%|▎ | 136/5000 [1:04:17<37:37:28, 27.85s/it] 3%|▎ | 137/5000 [1:04:44<37:24:28, 27.69s/it] 3%|▎ | 138/5000 [1:05:13<37:57:48, 28.11s/it] 3%|▎ | 139/5000 [1:05:41<37:35:32, 27.84s/it] 3%|▎ | 140/5000 [1:06:08<37:17:24, 27.62s/it] 3%|▎ | 141/5000 [1:06:35<37:22:03, 27.69s/it] 3%|▎ | 142/5000 [1:07:03<37:12:38, 27.57s/it] 3%|▎ | 143/5000 [1:07:30<37:00:17, 27.43s/it] 3%|▎ | 144/5000 [1:07:58<37:20:26, 27.68s/it] 3%|▎ | 145/5000 [1:08:26<37:18:01, 27.66s/it] 3%|▎ | 146/5000 [1:08:53<37:10:30, 27.57s/it] 3%|▎ | 147/5000 [1:09:24<38:26:43, 28.52s/it] 3%|▎ | 148/5000 [1:09:51<37:53:31, 28.11s/it] 3%|▎ | 149/5000 [1:10:18<37:34:47, 27.89s/it] 3%|▎ | 150/5000 [1:10:46<37:32:21, 27.86s/it] 3%|▎ | 150/5000 [1:10:46<37:32:21, 27.86s/it] 3%|▎ | 151/5000 [1:11:14<37:20:59, 27.73s/it] 3%|▎ | 152/5000 [1:11:41<37:11:01, 27.61s/it] 3%|▎ | 153/5000 [1:12:11<38:05:01, 28.29s/it] 3%|▎ | 154/5000 [1:12:38<37:41:38, 28.00s/it] 3%|▎ | 155/5000 [1:13:05<37:14:21, 27.67s/it] 3%|▎ | 156/5000 [1:13:34<37:40:15, 28.00s/it] 3%|▎ | 157/5000 [1:14:01<37:22:01, 27.78s/it] 3%|▎ | 158/5000 [1:14:28<37:08:30, 27.61s/it] 3%|▎ | 159/5000 [1:14:56<37:18:52, 27.75s/it] 3%|▎ | 160/5000 [1:15:10<31:43:49, 23.60s/it] 3%|▎ | 161/5000 [1:15:21<26:35:31, 19.78s/it] 3%|▎ | 162/5000 [1:15:32<23:01:15, 17.13s/it] 3%|▎ | 163/5000 [1:15:43<20:26:07, 15.21s/it]{'loss': 0.9879, 'learning_rate': 4.6000000000000004e-07, 'epoch': 0.01} {'loss': 0.8962, 'learning_rate': 9.600000000000001e-07, 'epoch': 0.01} {'loss': 0.6006, 'learning_rate': 1.46e-06, 'epoch': 0.01} {'loss': 0.4218, 'learning_rate': 1.9600000000000003e-06, 'epoch': 0.02} {'loss': 0.4419, 'learning_rate': 2.46e-06, 'epoch': 0.03} {'loss': 
0.4007, 'learning_rate': 2.96e-06, 'epoch': 0.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.54it/s] Reading metadata...: 15060it [00:00, 40107.89it/s] Reading metadata...: 23919it [00:01, 14960.32it/s] Reading metadata...: 28043it [00:01, 18033.45it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:01, 1.15s/it] Reading metadata...: 10438it [00:01, 8535.33it/s] 3%|▎ | 164/5000 [1:17:12<50:13:58, 37.39s/it] 3%|▎ | 165/5000 [1:17:40<46:30:54, 34.63s/it] 3%|▎ | 166/5000 [1:18:08<43:58:06, 32.74s/it] 3%|▎ | 167/5000 [1:18:37<42:17:46, 31.51s/it] 3%|▎ | 168/5000 [1:19:04<40:32:14, 30.20s/it] 3%|▎ | 169/5000 [1:19:33<39:51:03, 29.70s/it] 3%|▎ | 170/5000 [1:20:02<39:30:40, 29.45s/it] 3%|▎ | 171/5000 [1:20:31<39:30:47, 29.46s/it] 3%|▎ | 172/5000 [1:20:59<39:02:18, 29.11s/it] 3%|▎ | 173/5000 [1:21:27<38:19:55, 28.59s/it] 3%|▎ | 174/5000 [1:21:55<38:12:08, 28.50s/it] 4%|▎ | 175/5000 [1:22:25<38:42:56, 28.89s/it] 4%|▎ | 175/5000 [1:22:25<38:42:56, 28.89s/it] 4%|▎ | 176/5000 [1:22:52<37:55:34, 28.30s/it] 4%|▎ | 177/5000 [1:23:20<37:58:12, 28.34s/it] 4%|▎ | 178/5000 [1:23:48<37:52:23, 28.28s/it] 4%|▎ | 179/5000 [1:24:16<37:46:36, 28.21s/it] 4%|▎ | 180/5000 [1:24:44<37:33:35, 28.05s/it] 4%|▎ | 181/5000 [1:25:11<37:14:38, 27.82s/it] 4%|▎ | 182/5000 [1:25:44<39:19:43, 29.39s/it] 4%|▎ | 183/5000 [1:26:13<38:47:39, 28.99s/it] 4%|▎ | 184/5000 [1:26:40<38:00:33, 28.41s/it] 4%|▎ | 185/5000 [1:27:08<38:09:17, 28.53s/it] 4%|▎ | 186/5000 [1:27:37<38:02:08, 28.44s/it] 4%|▎ | 187/5000 [1:28:04<37:29:01, 28.04s/it] 4%|▍ | 188/5000 [1:28:33<37:47:02, 28.27s/it] 4%|▍ | 189/5000 [1:29:00<37:39:52, 28.18s/it] 4%|▍ | 190/5000 [1:29:28<37:13:28, 27.86s/it] 4%|▍ | 191/5000 [1:29:56<37:29:50, 28.07s/it] 4%|▍ | 192/5000 [1:30:26<38:20:14, 28.71s/it] 4%|▍ | 193/5000 [1:30:54<37:43:33, 28.25s/it] 4%|▍ | 194/5000 [1:31:23<38:01:21, 28.48s/it] 4%|▍ | 195/5000 [1:31:51<38:09:37, 28.59s/it] 4%|▍ | 196/5000 [1:32:19<37:38:58, 28.21s/it] 4%|▍ | 197/5000 
[1:32:48<37:55:29, 28.43s/it] 4%|▍ | 198/5000 [1:33:15<37:25:58, 28.06s/it] 4%|▍ | 199/5000 [1:33:41<36:50:43, 27.63s/it] 4%|▍ | 200/5000 [1:34:15<39:11:58, 29.40s/it] 4%|▍ | 200/5000 [1:34:15<39:11:58, 29.40s/it] 4%|▍ | 201/5000 [1:34:44<38:50:49, 29.14s/it] 4%|▍ | 202/5000 [1:35:11<38:06:09, 28.59s/it] 4%|▍ | 203/5000 [1:35:39<37:59:39, 28.51s/it] 4%|▍ | 204/5000 [1:36:07<37:39:47, 28.27s/it] 4%|▍ | 205/5000 [1:36:34<37:18:41, 28.01s/it] 4%|▍ | 206/5000 [1:37:03<37:22:26, 28.07s/it] 4%|▍ | 207/5000 [1:37:31<37:39:51, 28.29s/it] 4%|▍ | 208/5000 [1:37:58<36:56:51, 27.76s/it] 4%|▍ | 209/5000 [1:38:26<37:06:54, 27.89s/it] 4%|▍ | 210/5000 [1:38:54<36:58:07, 27.78s/it] 4%|▍ | 211/5000 [1:39:21<36:51:24, 27.71s/it] 4%|▍ | 212/5000 [1:39:48<36:25:39, 27.39s/it] 4%|▍ | 213/5000 [1:40:14<36:06:47, 27.16s/it] 4%|▍ | 214/5000 [1:40:43<36:31:19, 27.47s/it] 4%|▍ | 215/5000 [1:41:10<36:30:44, 27.47s/it] 4%|▍ | 216/5000 [1:41:39<37:02:43, 27.88s/it] 4%|▍ | 217/5000 [1:42:08<37:27:08, 28.19s/it] 4%|▍ | 218/5000 [1:42:35<37:12:10, 28.01s/it] 4%|▍ | 219/5000 [1:43:04<37:27:18, 28.20s/it] 4%|▍ | 220/5000 [1:43:32<37:32:49, 28.28s/it] 4%|▍ | 221/5000 [1:44:00<37:14:15, 28.05s/it] 4%|▍ | 222/5000 [1:44:29<37:34:23, 28.31s/it] 4%|▍ | 223/5000 [1:44:56<37:10:20, 28.01s/it] 4%|▍ | 224/5000 [1:45:23<36:46:48, 27.72s/it] 4%|▍ | 225/5000 [1:45:53<37:34:07, 28.32s/it] 4%|▍ | 225/5000 [1:45:53<37:34:07, 28.32s/it] 5%|▍ | 226/5000 [1:46:20<37:01:08, 27.92s/it] 5%|▍ | 227/5000 [1:46:47<36:45:13, 27.72s/it] 5%|▍ | 228/5000 [1:47:16<37:12:24, 28.07s/it] 5%|▍ | 229/5000 [1:47:45<37:36:57, 28.38s/it] 5%|▍ | 230/5000 [1:48:12<37:03:13, 27.97s/it] 5%|▍ | 231/5000 [1:48:40<37:09:03, 28.04s/it] 5%|▍ | 232/5000 [1:49:09<37:18:54, 28.17s/it] 5%|▍ | 233/5000 [1:49:36<37:00:35, 27.95s/it] 5%|▍ | 234/5000 [1:50:05<37:20:39, 28.21s/it] 5%|▍ | 235/5000 [1:50:33<37:08:04, 28.06s/it] 5%|▍ | 236/5000 [1:51:00<36:46:12, 27.79s/it] 5%|▍ | 237/5000 [1:51:28<37:02:06, 27.99s/it] 5%|▍ | 238/5000 [1:51:57<37:14:26, 
28.15s/it] 5%|▍ | 239/5000 [1:52:26<37:23:32, 28.27s/it] 5%|▍ | 240/5000 [1:52:53<36:58:16, 27.96s/it] 5%|▍ | 241/5000 [1:53:22<37:17:08, 28.21s/it] 5%|▍ | 242/5000 [1:53:50<37:23:16, 28.29s/it] 5%|▍ | 243/5000 [1:54:18<37:04:50, 28.06s/it] 5%|▍ | 244/5000 [1:54:46<37:20:31, 28.27s/it] 5%|▍ | 245/5000 [1:55:15<37:34:31, 28.45s/it] 5%|▍ | 246/5000 [1:55:43<37:08:49, 28.13s/it] 5%|▍ | 247/5000 [1:56:11<37:22:29, 28.31s/it] 5%|▍ | 248/5000 [1:56:40<37:33:11, 28.45s/it] 5%|▍ | 249/5000 [1:57:07<37:05:48, 28.11s/it] 5%|▌ | 250/5000 [1:57:37<37:49:09, 28.66s/it] 5%|▌ | 250/5000 [1:57:37<37:49:09, 28.66s/it] 5%|▌ | 251/5000 [1:58:06<37:46:16, 28.63s/it] 5%|▌ | 252/5000 [1:58:33<37:13:53, 28.23s/it] 5%|▌ | 253/5000 [1:59:01<37:12:43, 28.22s/it] 5%|▌ | 254/5000 [1:59:30<37:21:10, 28.33s/it] 5%|▌ | 255/5000 [1:59:57<36:56:09, 28.02s/it] 5%|▌ | 256/5000 [2:00:26<37:17:02, 28.29s/it] 5%|▌ | 257/5000 [2:00:56<37:42:29, 28.62s/it] 5%|▌ | 258/5000 [2:01:22<36:40:47, 27.85s/it] 5%|▌ | 259/5000 [2:01:50<36:50:29, 27.97s/it] 5%|▌ | 260/5000 [2:02:19<37:11:52, 28.25s/it] 5%|▌ | 261/5000 [2:02:46<36:45:24, 27.92s/it] 5%|▌ | 262/5000 [2:03:15<37:08:28, 28.22s/it] 5%|▌ | 263/5000 [2:03:43<36:55:19, 28.06s/it] 5%|▌ | 264/5000 [2:04:11<36:57:37, 28.09s/it] 5%|▌ | 265/5000 [2:04:38<36:37:24, 27.84s/it] 5%|▌ | 266/5000 [2:05:09<38:02:09, 28.92s/it] 5%|▌ | 267/5000 [2:05:38<37:58:42, 28.89s/it] 5%|▌ | 268/5000 [2:06:06<37:22:13, 28.43s/it] 5%|▌ | 269/5000 [2:06:33<36:50:00, 28.03s/it] 5%|▌ | 270/5000 [2:07:02<37:30:20, 28.55s/it] 5%|▌ | 271/5000 [2:07:29<36:42:57, 27.95s/it] 5%|▌ | 272/5000 [2:07:56<36:14:44, 27.60s/it] 5%|▌ | 273/5000 [2:08:25<37:00:39, 28.19s/it] 5%|▌ | 274/5000 [2:08:53<36:42:42, 27.97s/it] 6%|▌ | 275/5000 [2:09:20<36:22:53, 27.72s/it] 6%|▌ | 275/5000 [2:09:20<36:22:53, 27.72s/it] 6%|▌ | 276/5000 [2:09:50<37:23:24, 28.49s/it] 6%|▌ | 277/5000 [2:10:18<36:54:46, 28.14s/it] 6%|▌ | 278/5000 [2:10:46<37:06:12, 28.29s/it] 6%|▌ | 279/5000 [2:11:15<37:16:37, 28.43s/it] 6%|▌ | 
280/5000 [2:11:43<37:06:44, 28.31s/it] 6%|▌ | 281/5000 [2:12:10<36:43:23, 28.02s/it] 6%|▌ | 282/5000 [2:12:39<36:56:56, 28.19s/it] 6%|▌ | 283/5000 [2:13:07<37:01:41, 28.26s/it] 6%|▌ | 284/5000 [2:13:33<36:08:17, 27.59s/it] 6%|▌ | 285/5000 [2:14:01<36:14:21, 27.67s/it] 6%|▌ | 286/5000 [2:14:30<36:32:41, 27.91s/it] 6%|▌ | 287/5000 [2:14:57<36:20:12, 27.76s/it] 6%|▌ | 288/5000 [2:15:28<37:40:15, 28.78s/it] 6%|▌ | 289/5000 [2:15:57<37:40:20, 28.79s/it] 6%|▌ | 290/5000 [2:16:24<37:05:20, 28.35s/it] 6%|▌ | 291/5000 [2:16:52<36:35:38, 27.98s/it] 6%|▌ | 292/5000 [2:17:21<37:08:29, 28.40s/it] 6%|▌ | 293/5000 [2:17:48<36:35:56, 27.99s/it] 6%|▌ | 294/5000 [2:18:15<36:15:07, 27.73s/it] 6%|▌ | 295/5000 [2:18:45<37:01:27, 28.33s/it] 6%|▌ | 296/5000 [2:19:12<36:38:26, 28.04s/it] 6%|▌ | 297/5000 [2:19:39<36:10:13, 27.69s/it] 6%|▌ | 298/5000 [2:20:09<37:13:08, 28.50s/it] 6%|▌ | 299/5000 [2:20:36<36:26:30, 27.91s/it] 6%|▌ | 300/5000 [2:21:03<36:04:12, 27.63s/it] 6%|▌ | 300/5000 [2:21:03<36:04:12, 27.63s/it] 6%|▌ | 301/5000 [2:21:32<36:29:21, 27.96s/it] 6%|▌ | 302/5000 [2:22:01<36:54:28, 28.28s/it] 6%|▌ | 303/5000 [2:22:28<36:31:45, 28.00s/it] 6%|▌ | 304/5000 [2:22:57<36:51:38, 28.26s/it] 6%|▌ | 305/5000 [2:23:24<36:23:09, 27.90s/it] 6%|▌ | 306/5000 [2:23:51<36:07:02, 27.70s/it] 6%|▌ | 307/5000 [2:24:20<36:39:50, 28.12s/it] 6%|▌ | 308/5000 [2:24:49<36:59:57, 28.39s/it] 6%|▌ | 309/5000 [2:25:17<36:34:23, 28.07s/it] 6%|▌ | 310/5000 [2:25:44<36:21:42, 27.91s/it] 6%|▌ | 311/5000 [2:26:12<36:30:16, 28.03s/it] 6%|▌ | 312/5000 [2:26:40<36:08:38, 27.76s/it] 6%|▋ | 313/5000 [2:27:10<37:11:11, 28.56s/it] 6%|▋ | 314/5000 [2:27:37<36:34:51, 28.10s/it] 6%|▋ | 315/5000 [2:28:04<36:11:05, 27.80s/it] 6%|▋ | 316/5000 [2:28:33<36:35:48, 28.13s/it] 6%|▋ | 317/5000 [2:29:00<36:12:26, 27.83s/it] 6%|▋ | 318/5000 [2:29:27<35:56:52, 27.64s/it] 6%|▋ | 319/5000 [2:29:56<36:27:25, 28.04s/it] 6%|▋ | 320/5000 [2:30:24<36:15:39, 27.89s/it] 6%|▋ | 321/5000 [2:30:51<35:57:36, 27.67s/it] 6%|▋ | 322/5000 
[2:31:20<36:38:12, 28.19s/it] 6%|▋ | 323/5000 [2:31:43<34:18:17, 26.41s/it] 6%|▋ | 324/5000 [2:31:54<28:14:37, 21.74s/it] 6%|▋ | 325/5000 [2:32:04<23:59:08, 18.47s/it] 6%|▋ | 325/5000 [2:32:04<23:59:08, 18.47s/it] 7%|▋ | 326/5000 [2:32:15<21:01:36, 16.20s/it] 7%|▋ | 327/5000 [2:32:23<17:43:56, 13.66s/it]{'loss': 0.3592, 'learning_rate': 3.46e-06, 'epoch': 1.0} {'loss': 0.3448, 'learning_rate': 3.96e-06, 'epoch': 1.01} {'loss': 0.3673, 'learning_rate': 4.4600000000000005e-06, 'epoch': 1.01} {'loss': 0.273, 'learning_rate': 4.960000000000001e-06, 'epoch': 1.02} {'loss': 0.3088, 'learning_rate': 5.460000000000001e-06, 'epoch': 1.02} {'loss': 0.302, 'learning_rate': 5.9600000000000005e-06, 'epoch': 1.03} {'loss': 0.2583, 'learning_rate': 6.460000000000001e-06, 'epoch': 1.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:01, 1.09s/it] Reading metadata...: 15098it [00:01, 17551.49it/s] Reading metadata...: 23979it [00:02, 8389.85it/s]  Reading metadata...: 28043it [00:02, 9574.58it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:03, 3.94s/it] Reading metadata...: 10438it [00:04, 2601.71it/s] 7%|▋ | 328/5000 [2:34:24<59:27:28, 45.82s/it] 7%|▋ | 329/5000 [2:34:54<53:18:07, 41.08s/it] 7%|▋ | 330/5000 [2:35:21<47:58:08, 36.98s/it] 7%|▋ | 331/5000 [2:35:51<45:08:09, 34.80s/it] 7%|▋ | 332/5000 [2:36:21<43:09:15, 33.28s/it] 7%|▋ | 333/5000 [2:36:51<41:49:05, 32.26s/it] 7%|▋ | 334/5000 [2:37:18<39:53:29, 30.78s/it] 7%|▋ | 335/5000 [2:37:49<39:56:07, 30.82s/it] 7%|▋ | 336/5000 [2:38:19<39:47:55, 30.72s/it] 7%|▋ | 337/5000 [2:38:49<39:19:14, 30.36s/it] 7%|▋ | 338/5000 [2:39:16<38:05:12, 29.41s/it] 7%|▋ | 339/5000 [2:39:46<38:06:08, 29.43s/it] 7%|▋ | 340/5000 [2:40:23<41:17:39, 31.90s/it] 7%|▋ | 341/5000 [2:40:53<40:16:40, 31.12s/it] 7%|▋ | 342/5000 [2:41:20<38:45:32, 29.96s/it] 7%|▋ | 343/5000 [2:41:49<38:35:57, 29.84s/it] 7%|▋ | 344/5000 [2:42:19<38:26:12, 29.72s/it] 7%|▋ | 345/5000 [2:42:46<37:37:18, 29.10s/it] 7%|▋ | 346/5000 
[2:43:25<41:28:39, 32.08s/it] 7%|▋ | 347/5000 [2:43:53<39:32:15, 30.59s/it] 7%|▋ | 348/5000 [2:44:20<38:24:11, 29.72s/it] 7%|▋ | 349/5000 [2:44:50<38:33:23, 29.84s/it] 7%|▋ | 350/5000 [2:45:20<38:18:16, 29.66s/it] 7%|▋ | 350/5000 [2:45:20<38:18:16, 29.66s/it] 7%|▋ | 351/5000 [2:45:47<37:25:43, 28.98s/it] 7%|▋ | 352/5000 [2:46:24<40:39:26, 31.49s/it] 7%|▋ | 353/5000 [2:46:52<39:00:37, 30.22s/it] 7%|▋ | 354/5000 [2:47:19<37:50:13, 29.32s/it] 7%|▋ | 355/5000 [2:47:50<38:39:55, 29.97s/it] 7%|▋ | 356/5000 [2:48:18<37:44:03, 29.25s/it] 7%|▋ | 357/5000 [2:48:45<36:53:44, 28.61s/it] 7%|▋ | 358/5000 [2:49:16<37:57:20, 29.44s/it] 7%|▋ | 359/5000 [2:49:44<37:07:42, 28.80s/it] 7%|▋ | 360/5000 [2:50:11<36:34:52, 28.38s/it] 7%|▋ | 361/5000 [2:50:41<37:16:38, 28.93s/it] 7%|▋ | 362/5000 [2:51:08<36:25:25, 28.27s/it] 7%|▋ | 363/5000 [2:51:35<36:03:10, 27.99s/it] 7%|▋ | 364/5000 [2:52:08<37:57:49, 29.48s/it] 7%|▋ | 365/5000 [2:52:36<37:09:38, 28.86s/it] 7%|▋ | 366/5000 [2:53:03<36:26:23, 28.31s/it] 7%|▋ | 367/5000 [2:53:33<37:20:10, 29.01s/it] 7%|▋ | 368/5000 [2:54:00<36:21:56, 28.26s/it] 7%|▋ | 369/5000 [2:54:29<36:32:11, 28.40s/it] 7%|▋ | 370/5000 [2:54:58<36:49:06, 28.63s/it] 7%|▋ | 371/5000 [2:55:25<36:14:47, 28.19s/it] 7%|▋ | 372/5000 [2:55:56<37:19:36, 29.04s/it] 7%|▋ | 373/5000 [2:56:23<36:34:38, 28.46s/it] 7%|▋ | 374/5000 [2:56:51<36:29:03, 28.39s/it] 8%|▊ | 375/5000 [2:57:26<38:44:36, 30.16s/it] 8%|▊ | 375/5000 [2:57:26<38:44:36, 30.16s/it] 8%|▊ | 376/5000 [2:57:53<37:34:56, 29.26s/it] 8%|▊ | 377/5000 [2:58:23<37:52:08, 29.49s/it] 8%|▊ | 378/5000 [2:58:53<37:59:17, 29.59s/it] 8%|▊ | 379/5000 [2:59:20<37:09:54, 28.95s/it] 8%|▊ | 380/5000 [2:59:52<38:14:13, 29.80s/it] 8%|▊ | 381/5000 [3:00:19<37:18:02, 29.07s/it] 8%|▊ | 382/5000 [3:00:46<36:33:53, 28.50s/it] 8%|▊ | 383/5000 [3:01:19<38:02:05, 29.66s/it] 8%|▊ | 384/5000 [3:01:46<37:05:06, 28.92s/it] 8%|▊ | 385/5000 [3:02:13<36:29:52, 28.47s/it] 8%|▊ | 386/5000 [3:02:44<37:20:51, 29.14s/it] 8%|▊ | 387/5000 [3:03:11<36:28:43, 
28.47s/it] 8%|▊ | 388/5000 [3:03:37<35:26:14, 27.66s/it] 8%|▊ | 389/5000 [3:04:08<36:38:24, 28.61s/it] 8%|▊ | 390/5000 [3:04:35<36:09:31, 28.24s/it] 8%|▊ | 391/5000 [3:05:02<35:46:46, 27.95s/it] 8%|▊ | 392/5000 [3:05:32<36:24:56, 28.45s/it] 8%|▊ | 393/5000 [3:05:59<35:54:14, 28.06s/it] 8%|▊ | 394/5000 [3:06:29<36:30:25, 28.53s/it] 8%|▊ | 395/5000 [3:07:02<38:23:21, 30.01s/it] 8%|▊ | 396/5000 [3:07:29<37:17:38, 29.16s/it] 8%|▊ | 397/5000 [3:07:59<37:24:48, 29.26s/it] 8%|▊ | 398/5000 [3:08:26<36:38:14, 28.66s/it] 8%|▊ | 399/5000 [3:08:58<37:48:56, 29.59s/it] 8%|▊ | 400/5000 [3:09:27<37:45:36, 29.55s/it] 8%|▊ | 400/5000 [3:09:27<37:45:36, 29.55s/it] 8%|▊ | 401/5000 [3:09:54<36:51:22, 28.85s/it] 8%|▊ | 402/5000 [3:10:24<37:14:45, 29.16s/it] 8%|▊ | 403/5000 [3:10:55<37:40:44, 29.51s/it] 8%|▊ | 404/5000 [3:11:22<36:45:44, 28.80s/it] 8%|▊ | 405/5000 [3:11:51<37:01:17, 29.00s/it] 8%|▊ | 406/5000 [3:12:26<39:06:48, 30.65s/it] 8%|▊ | 407/5000 [3:12:53<37:47:58, 29.63s/it] 8%|▊ | 408/5000 [3:13:22<37:42:58, 29.57s/it] 8%|▊ | 409/5000 [3:13:52<37:40:01, 29.54s/it] 8%|▊ | 410/5000 [3:14:19<36:46:57, 28.85s/it] 8%|▊ | 411/5000 [3:14:52<38:10:48, 29.95s/it] 8%|▊ | 412/5000 [3:15:20<37:36:03, 29.50s/it] 8%|▊ | 413/5000 [3:15:50<37:35:51, 29.51s/it] 8%|▊ | 414/5000 [3:16:16<36:32:40, 28.69s/it] 8%|▊ | 415/5000 [3:16:46<36:43:34, 28.84s/it] 8%|▊ | 416/5000 [3:17:15<37:01:20, 29.08s/it] 8%|▊ | 417/5000 [3:17:42<36:02:01, 28.30s/it] 8%|▊ | 418/5000 [3:18:11<36:23:04, 28.59s/it] 8%|▊ | 419/5000 [3:18:40<36:35:37, 28.76s/it] 8%|▊ | 420/5000 [3:19:07<35:55:13, 28.23s/it] 8%|▊ | 421/5000 [3:19:37<36:31:35, 28.72s/it] 8%|▊ | 422/5000 [3:20:06<36:44:24, 28.89s/it] 8%|▊ | 423/5000 [3:20:34<36:09:13, 28.44s/it] 8%|▊ | 424/5000 [3:21:03<36:39:46, 28.84s/it] 8%|▊ | 425/5000 [3:21:33<36:52:00, 29.01s/it] 8%|▊ | 425/5000 [3:21:33<36:52:00, 29.01s/it] 9%|▊ | 426/5000 [3:22:00<36:05:51, 28.41s/it] 9%|▊ | 427/5000 [3:22:30<36:43:01, 28.90s/it] 9%|▊ | 428/5000 [3:22:59<36:49:08, 28.99s/it] 9%|▊ | 
429/5000 [3:23:26<36:07:18, 28.45s/it] 9%|▊ | 430/5000 [3:23:58<37:29:11, 29.53s/it] 9%|▊ | 431/5000 [3:24:26<36:36:32, 28.84s/it] 9%|▊ | 432/5000 [3:24:53<35:57:34, 28.34s/it] 9%|▊ | 433/5000 [3:25:25<37:21:31, 29.45s/it] 9%|▊ | 434/5000 [3:25:52<36:34:29, 28.84s/it] 9%|▊ | 435/5000 [3:26:19<35:56:08, 28.34s/it] 9%|▊ | 436/5000 [3:26:50<36:42:42, 28.96s/it] 9%|▊ | 437/5000 [3:27:20<37:05:57, 29.27s/it] 9%|▉ | 438/5000 [3:27:47<36:12:53, 28.58s/it] 9%|▉ | 439/5000 [3:28:17<36:41:46, 28.96s/it] 9%|▉ | 440/5000 [3:28:46<36:55:46, 29.15s/it] 9%|▉ | 441/5000 [3:29:13<36:09:48, 28.56s/it] 9%|▉ | 442/5000 [3:29:43<36:28:07, 28.80s/it] 9%|▉ | 443/5000 [3:30:12<36:43:21, 29.01s/it] 9%|▉ | 444/5000 [3:30:42<36:55:59, 29.18s/it] 9%|▉ | 445/5000 [3:31:09<36:02:45, 28.49s/it] 9%|▉ | 446/5000 [3:31:40<37:05:25, 29.32s/it] 9%|▉ | 447/5000 [3:32:07<36:13:45, 28.65s/it] 9%|▉ | 448/5000 [3:32:34<35:38:07, 28.18s/it] 9%|▉ | 449/5000 [3:33:05<36:30:32, 28.88s/it] 9%|▉ | 450/5000 [3:33:32<35:54:20, 28.41s/it] 9%|▉ | 450/5000 [3:33:32<35:54:20, 28.41s/it] 9%|▉ | 451/5000 [3:33:59<35:22:36, 28.00s/it] 9%|▉ | 452/5000 [3:34:31<36:52:54, 29.19s/it] 9%|▉ | 453/5000 [3:34:58<36:06:32, 28.59s/it] 9%|▉ | 454/5000 [3:35:25<35:30:53, 28.12s/it] 9%|▉ | 455/5000 [3:35:57<37:00:03, 29.31s/it] 9%|▉ | 456/5000 [3:36:24<36:06:07, 28.60s/it] 9%|▉ | 457/5000 [3:36:51<35:17:38, 27.97s/it] 9%|▉ | 458/5000 [3:37:17<34:37:21, 27.44s/it] 9%|▉ | 459/5000 [3:37:49<36:31:38, 28.96s/it] 9%|▉ | 460/5000 [3:38:17<35:49:57, 28.41s/it] 9%|▉ | 461/5000 [3:38:44<35:16:34, 27.98s/it] 9%|▉ | 462/5000 [3:39:14<36:11:48, 28.71s/it] 9%|▉ | 463/5000 [3:39:41<35:42:23, 28.33s/it] 9%|▉ | 464/5000 [3:40:08<35:02:57, 27.82s/it] 9%|▉ | 465/5000 [3:40:40<36:28:00, 28.95s/it] 9%|▉ | 466/5000 [3:41:07<35:48:57, 28.44s/it] 9%|▉ | 467/5000 [3:41:34<35:21:06, 28.08s/it] 9%|▉ | 468/5000 [3:42:06<36:39:29, 29.12s/it] 9%|▉ | 469/5000 [3:42:33<36:00:39, 28.61s/it] 9%|▉ | 470/5000 [3:43:00<35:25:08, 28.15s/it] 9%|▉ | 471/5000 
[3:43:32<36:57:59, 29.38s/it] 9%|▉ | 472/5000 [3:43:59<36:04:29, 28.68s/it] 9%|▉ | 473/5000 [3:44:27<35:39:16, 28.35s/it] 9%|▉ | 474/5000 [3:44:57<36:23:20, 28.94s/it] 10%|▉ | 475/5000 [3:45:25<35:46:58, 28.47s/it] 10%|▉ | 475/5000 [3:45:25<35:46:58, 28.47s/it] 10%|▉ | 476/5000 [3:45:52<35:21:38, 28.14s/it] 10%|▉ | 477/5000 [3:46:28<38:13:45, 30.43s/it] 10%|▉ | 478/5000 [3:46:55<37:06:11, 29.54s/it] 10%|▉ | 479/5000 [3:47:21<35:39:21, 28.39s/it] 10%|▉ | 480/5000 [3:47:49<35:32:51, 28.31s/it] 10%|▉ | 481/5000 [3:48:16<34:57:24, 27.85s/it] 10%|▉ | 482/5000 [3:48:43<34:46:50, 27.71s/it] 10%|▉ | 483/5000 [3:49:16<36:38:31, 29.20s/it] 10%|▉ | 484/5000 [3:49:43<35:54:50, 28.63s/it] 10%|▉ | 485/5000 [3:50:10<35:19:14, 28.16s/it] 10%|▉ | 486/5000 [3:50:43<36:49:28, 29.37s/it] 10%|▉ | 487/5000 [3:50:57<31:01:29, 24.75s/it] 10%|▉ | 488/5000 [3:51:07<25:45:30, 20.55s/it] 10%|▉ | 489/5000 [3:51:18<22:05:59, 17.64s/it] 10%|▉ | 490/5000 [3:51:29<19:35:34, 15.64s/it]{'loss': 0.2545, 'learning_rate': 6.96e-06, 'epoch': 2.0} {'loss': 0.2599, 'learning_rate': 7.4600000000000006e-06, 'epoch': 2.01} {'loss': 0.2464, 'learning_rate': 7.960000000000002e-06, 'epoch': 2.01} {'loss': 0.1981, 'learning_rate': 8.46e-06, 'epoch': 2.02} {'loss': 0.2316, 'learning_rate': 8.96e-06, 'epoch': 2.02} {'loss': 0.2077, 'learning_rate': 9.460000000000001e-06, 'epoch': 2.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:02, 2.22s/it] Reading metadata...: 15016it [00:02, 9088.72it/s] Reading metadata...: 23848it [00:02, 13351.04it/s] Reading metadata...: 28043it [00:02, 10627.78it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:01, 1.89s/it] Reading metadata...: 10438it [00:01, 5308.23it/s] 10%|▉ | 491/5000 [3:53:06<49:59:47, 39.92s/it] 10%|▉ | 492/5000 [3:53:34<45:28:45, 36.32s/it] 10%|▉ | 493/5000 [3:54:01<42:16:20, 33.77s/it] 10%|▉ | 494/5000 [3:54:30<40:29:47, 32.35s/it] 10%|▉ | 495/5000 [3:54:58<38:41:25, 30.92s/it] 10%|▉ | 496/5000 [3:55:26<37:40:40, 
30.12s/it] 10%|▉ | 497/5000 [3:55:57<37:50:39, 30.26s/it] 10%|▉ | 498/5000 [3:56:25<36:55:39, 29.53s/it] 10%|▉ | 499/5000 [3:56:53<36:39:25, 29.32s/it] 10%|█ | 500/5000 [3:57:20<35:45:29, 28.61s/it] 10%|█ | 500/5000 [3:57:20<35:45:29, 28.61s/it] 10%|█ | 501/5000 [3:57:50<36:10:29, 28.95s/it] 10%|█ | 502/5000 [3:58:17<35:22:32, 28.31s/it] 10%|█ | 503/5000 [3:58:44<34:59:39, 28.01s/it] 10%|█ | 504/5000 [3:59:13<35:18:43, 28.27s/it] 10%|█ | 505/5000 [3:59:41<35:07:06, 28.13s/it] 10%|█ | 506/5000 [4:00:09<35:10:42, 28.18s/it] 10%|█ | 507/5000 [4:00:37<35:09:58, 28.18s/it] 10%|█ | 508/5000 [4:01:05<34:52:29, 27.95s/it] 10%|█ | 509/5000 [4:01:33<34:57:53, 28.03s/it] 10%|█ | 510/5000 [4:02:01<34:58:48, 28.05s/it] 10%|█ | 511/5000 [4:02:28<34:36:24, 27.75s/it] 10%|█ | 512/5000 [4:02:57<34:56:20, 28.03s/it] 10%|█ | 513/5000 [4:03:25<34:47:39, 27.92s/it] 10%|█ | 514/5000 [4:03:52<34:30:48, 27.70s/it] 10%|█ | 515/5000 [4:04:19<34:25:26, 27.63s/it] 10%|█ | 516/5000 [4:04:47<34:17:59, 27.54s/it] 10%|█ | 517/5000 [4:05:14<34:12:29, 27.47s/it] 10%|█ | 518/5000 [4:05:41<34:10:29, 27.45s/it] 10%|█ | 519/5000 [4:06:09<34:24:54, 27.65s/it] 10%|█ | 520/5000 [4:06:36<34:07:13, 27.42s/it] 10%|█ | 521/5000 [4:07:08<35:46:29, 28.75s/it] 10%|█ | 522/5000 [4:07:36<35:18:23, 28.38s/it] 10%|█ | 523/5000 [4:08:03<34:47:17, 27.97s/it] 10%|█ | 524/5000 [4:08:32<35:23:31, 28.47s/it] 10%|█ | 525/5000 [4:09:00<34:54:55, 28.09s/it] 10%|█ | 525/5000 [4:09:00<34:54:55, 28.09s/it] 11%|█ | 526/5000 [4:09:25<33:59:55, 27.36s/it] 11%|█ | 527/5000 [4:09:51<33:27:38, 26.93s/it] 11%|█ | 528/5000 [4:10:19<33:45:53, 27.18s/it] 11%|█ | 529/5000 [4:10:46<33:44:51, 27.17s/it] 11%|█ | 530/5000 [4:11:15<34:14:15, 27.57s/it] 11%|█ | 531/5000 [4:11:42<34:10:56, 27.54s/it] 11%|█ | 532/5000 [4:12:09<34:01:11, 27.41s/it] 11%|█ | 533/5000 [4:12:38<34:34:56, 27.87s/it] 11%|█ | 534/5000 [4:13:06<34:35:00, 27.88s/it] 11%|█ | 535/5000 [4:13:33<34:18:03, 27.66s/it] 11%|█ | 536/5000 [4:14:01<34:30:46, 27.83s/it] 11%|█ | 
537/5000 [4:14:29<34:28:24, 27.81s/it] 11%|█ | 538/5000 [4:14:57<34:24:07, 27.76s/it] 11%|█ | 539/5000 [4:15:24<34:11:36, 27.59s/it] 11%|█ | 540/5000 [4:15:51<34:07:40, 27.55s/it] 11%|█ | 541/5000 [4:16:20<34:33:30, 27.90s/it] 11%|█ | 542/5000 [4:16:47<34:20:33, 27.73s/it] 11%|█ | 543/5000 [4:17:15<34:24:08, 27.79s/it] 11%|█ | 544/5000 [4:17:43<34:25:59, 27.82s/it] 11%|█ | 545/5000 [4:18:10<34:09:36, 27.60s/it] 11%|█ | 546/5000 [4:18:38<34:19:04, 27.74s/it] 11%|█ | 547/5000 [4:19:06<34:21:15, 27.77s/it] 11%|█ | 548/5000 [4:19:34<34:13:57, 27.68s/it] 11%|█ | 549/5000 [4:20:03<34:45:28, 28.11s/it] 11%|█ | 550/5000 [4:20:30<34:27:46, 27.88s/it] 11%|█ | 550/5000 [4:20:30<34:27:46, 27.88s/it] 11%|█ | 551/5000 [4:20:57<34:11:41, 27.67s/it] 11%|█ | 552/5000 [4:21:26<34:33:00, 27.96s/it] 11%|█ | 553/5000 [4:21:53<34:17:24, 27.76s/it] 11%|█ | 554/5000 [4:22:21<34:08:34, 27.65s/it] 11%|█ | 555/5000 [4:22:49<34:26:50, 27.90s/it] 11%|█ | 556/5000 [4:23:17<34:28:07, 27.92s/it] 11%|█ | 557/5000 [4:23:44<34:10:44, 27.69s/it] 11%|█ | 558/5000 [4:24:12<34:00:19, 27.56s/it] 11%|█ | 559/5000 [4:24:40<34:23:16, 27.88s/it] 11%|█ | 560/5000 [4:25:08<34:11:49, 27.73s/it] 11%|█ | 561/5000 [4:25:36<34:17:53, 27.82s/it] 11%|█ | 562/5000 [4:26:04<34:35:00, 28.05s/it] 11%|█▏ | 563/5000 [4:26:31<33:58:17, 27.56s/it] 11%|█▏ | 564/5000 [4:26:58<34:00:51, 27.60s/it] 11%|█▏ | 565/5000 [4:27:26<34:07:16, 27.70s/it] 11%|█▏ | 566/5000 [4:27:54<34:20:15, 27.88s/it] 11%|█▏ | 567/5000 [4:28:22<34:03:53, 27.66s/it] 11%|█▏ | 568/5000 [4:28:50<34:09:43, 27.75s/it] 11%|█▏ | 569/5000 [4:29:17<34:10:34, 27.77s/it] 11%|█▏ | 570/5000 [4:29:45<33:56:18, 27.58s/it] 11%|█▏ | 571/5000 [4:30:13<34:10:09, 27.77s/it] 11%|█▏ | 572/5000 [4:30:41<34:11:42, 27.80s/it] 11%|█▏ | 573/5000 [4:31:08<33:58:26, 27.63s/it] 11%|█▏ | 574/5000 [4:31:36<34:02:43, 27.69s/it] 12%|█▏ | 575/5000 [4:32:04<34:05:04, 27.73s/it] 12%|█▏ | 575/5000 [4:32:04<34:05:04, 27.73s/it] 12%|█▏ | 576/5000 [4:32:31<33:53:20, 27.58s/it] 12%|█▏ | 577/5000 
[4:32:58<33:43:16, 27.45s/it] 12%|█▏ | 578/5000 [4:33:25<33:25:33, 27.21s/it] 12%|█▏ | 579/5000 [4:33:52<33:22:58, 27.18s/it] 12%|█▏ | 580/5000 [4:34:19<33:30:08, 27.29s/it] 12%|█▏ | 581/5000 [4:34:47<33:40:18, 27.43s/it] 12%|█▏ | 582/5000 [4:35:14<33:32:07, 27.33s/it] 12%|█▏ | 583/5000 [4:35:42<33:55:35, 27.65s/it] 12%|█▏ | 584/5000 [4:36:10<33:48:17, 27.56s/it] 12%|█▏ | 585/5000 [4:36:37<33:41:31, 27.47s/it] 12%|█▏ | 586/5000 [4:37:05<33:56:36, 27.68s/it] 12%|█▏ | 587/5000 [4:37:33<33:58:42, 27.72s/it] 12%|█▏ | 588/5000 [4:38:00<33:51:26, 27.63s/it] 12%|█▏ | 589/5000 [4:38:29<34:04:58, 27.82s/it] 12%|█▏ | 590/5000 [4:38:56<34:03:34, 27.80s/it] 12%|█▏ | 591/5000 [4:39:24<33:57:32, 27.73s/it] 12%|█▏ | 592/5000 [4:39:51<33:42:24, 27.53s/it] 12%|█▏ | 593/5000 [4:40:19<33:58:36, 27.76s/it] 12%|█▏ | 594/5000 [4:40:47<34:02:31, 27.81s/it] 12%|█▏ | 595/5000 [4:41:15<33:51:44, 27.67s/it] 12%|█▏ | 596/5000 [4:41:42<33:39:13, 27.51s/it] 12%|█▏ | 597/5000 [4:42:10<33:58:56, 27.78s/it] 12%|█▏ | 598/5000 [4:42:37<33:42:46, 27.57s/it] 12%|█▏ | 599/5000 [4:43:04<33:30:34, 27.41s/it] 12%|█▏ | 600/5000 [4:43:33<34:06:44, 27.91s/it] 12%|█▏ | 600/5000 [4:43:33<34:06:44, 27.91s/it] 12%|█▏ | 601/5000 [4:44:01<33:53:07, 27.73s/it] 12%|█▏ | 602/5000 [4:44:28<33:41:18, 27.58s/it] 12%|█▏ | 603/5000 [4:44:59<35:04:30, 28.72s/it] 12%|█▏ | 604/5000 [4:45:27<34:38:09, 28.36s/it] 12%|█▏ | 605/5000 [4:45:55<34:34:59, 28.33s/it] 12%|█▏ | 606/5000 [4:46:23<34:20:01, 28.13s/it] 12%|█▏ | 607/5000 [4:46:51<34:20:45, 28.15s/it] 12%|█▏ | 608/5000 [4:47:18<34:01:48, 27.89s/it] 12%|█▏ | 609/5000 [4:47:46<34:00:57, 27.89s/it] 12%|█▏ | 610/5000 [4:48:14<34:06:23, 27.97s/it] 12%|█▏ | 611/5000 [4:48:41<33:49:16, 27.74s/it] 12%|█▏ | 612/5000 [4:49:10<34:06:46, 27.99s/it] 12%|█▏ | 613/5000 [4:49:38<34:00:04, 27.90s/it] 12%|█▏ | 614/5000 [4:50:05<33:43:00, 27.67s/it] 12%|█▏ | 615/5000 [4:50:32<33:37:13, 27.60s/it] 12%|█▏ | 616/5000 [4:51:00<33:39:36, 27.64s/it] 12%|█▏ | 617/5000 [4:51:27<33:30:54, 27.53s/it] 
12%|█▏ | 618/5000 [4:51:55<33:23:37, 27.43s/it] 12%|█▏ | 619/5000 [4:52:23<33:51:14, 27.82s/it] 12%|█▏ | 620/5000 [4:52:50<33:32:09, 27.56s/it] 12%|█▏ | 621/5000 [4:53:17<33:20:34, 27.41s/it] 12%|█▏ | 622/5000 [4:53:45<33:35:22, 27.62s/it] 12%|█▏ | 623/5000 [4:54:13<33:30:25, 27.56s/it] 12%|█▏ | 624/5000 [4:54:40<33:11:34, 27.31s/it] 12%|█▎ | 625/5000 [4:55:08<33:43:52, 27.76s/it] 12%|█▎ | 625/5000 [4:55:08<33:43:52, 27.76s/it] 13%|█▎ | 626/5000 [4:55:36<33:33:46, 27.62s/it] 13%|█▎ | 627/5000 [4:56:03<33:23:47, 27.49s/it] 13%|█▎ | 628/5000 [4:56:32<33:50:09, 27.86s/it] 13%|█▎ | 629/5000 [4:57:00<33:54:49, 27.93s/it] 13%|█▎ | 630/5000 [4:57:27<33:38:47, 27.72s/it] 13%|█▎ | 631/5000 [4:57:55<33:41:28, 27.76s/it] 13%|█▎ | 632/5000 [4:58:22<33:38:57, 27.73s/it] 13%|█▎ | 633/5000 [4:58:49<33:19:08, 27.47s/it] 13%|█▎ | 634/5000 [4:59:17<33:30:40, 27.63s/it] 13%|█▎ | 635/5000 [4:59:44<33:21:27, 27.51s/it] 13%|█▎ | 636/5000 [5:00:12<33:16:18, 27.45s/it] 13%|█▎ | 637/5000 [5:00:40<33:31:57, 27.67s/it] 13%|█▎ | 638/5000 [5:01:08<33:39:37, 27.78s/it] 13%|█▎ | 639/5000 [5:01:35<33:26:38, 27.61s/it] 13%|█▎ | 640/5000 [5:02:04<34:00:57, 28.09s/it] 13%|█▎ | 641/5000 [5:02:32<33:44:56, 27.87s/it] 13%|█▎ | 642/5000 [5:02:59<33:29:59, 27.67s/it] 13%|█▎ | 643/5000 [5:03:28<33:47:23, 27.92s/it] 13%|█▎ | 644/5000 [5:03:54<33:15:04, 27.48s/it] 13%|█▎ | 645/5000 [5:04:21<33:01:41, 27.30s/it] 13%|█▎ | 646/5000 [5:04:48<32:48:05, 27.12s/it] 13%|█▎ | 647/5000 [5:05:16<33:09:03, 27.42s/it] 13%|█▎ | 648/5000 [5:05:43<32:59:56, 27.30s/it] 13%|█▎ | 649/5000 [5:06:12<33:44:52, 27.92s/it] 13%|█▎ | 650/5000 [5:06:34<31:39:01, 26.19s/it] 13%|█▎ | 650/5000 [5:06:34<31:39:01, 26.19s/it] 13%|█▎ | 651/5000 [5:06:45<26:03:56, 21.58s/it] 13%|█▎ | 652/5000 [5:06:56<22:15:49, 18.43s/it] 13%|█▎ | 653/5000 [5:07:07<19:32:01, 16.18s/it] 13%|█▎ | 654/5000 [5:07:15<16:27:10, 13.63s/it]{'loss': 0.1836, 'learning_rate': 9.960000000000001e-06, 'epoch': 3.0} {'loss': 0.2001, 'learning_rate': 9.94888888888889e-06, 
'epoch': 3.01} {'loss': 0.1875, 'learning_rate': 9.893333333333334e-06, 'epoch': 3.01} {'loss': 0.15, 'learning_rate': 9.837777777777778e-06, 'epoch': 3.02} {'loss': 0.1534, 'learning_rate': 9.782222222222222e-06, 'epoch': 3.02} {'loss': 0.1565, 'learning_rate': 9.726666666666668e-06, 'epoch': 3.03} {'loss': 0.1266, 'learning_rate': 9.671111111111112e-06, 'epoch': 3.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.40it/s] Reading metadata...: 13887it [00:00, 35502.94it/s] Reading metadata...: 22056it [00:00, 29691.65it/s] Reading metadata...: 28043it [00:00, 31651.18it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.34it/s] Reading metadata...: 10438it [00:00, 28332.17it/s] 13%|█▎ | 655/5000 [5:08:52<46:55:10, 38.87s/it] 13%|█▎ | 656/5000 [5:09:21<43:01:04, 35.65s/it] 13%|█▎ | 657/5000 [5:09:48<39:53:00, 33.06s/it] 13%|█▎ | 658/5000 [5:10:16<38:12:56, 31.69s/it] 13%|█▎ | 659/5000 [5:10:45<37:11:39, 30.85s/it] 13%|█▎ | 660/5000 [5:11:13<36:08:48, 29.98s/it] 13%|█▎ | 661/5000 [5:11:40<35:07:56, 29.15s/it] 13%|█▎ | 662/5000 [5:12:08<34:35:15, 28.70s/it] 13%|█▎ | 663/5000 [5:12:36<34:21:00, 28.51s/it] 13%|█▎ | 664/5000 [5:13:04<34:13:30, 28.42s/it] 13%|█▎ | 665/5000 [5:13:31<33:46:38, 28.05s/it] 13%|█▎ | 666/5000 [5:14:00<34:02:21, 28.27s/it] 13%|█▎ | 667/5000 [5:14:28<33:49:20, 28.10s/it] 13%|█▎ | 668/5000 [5:14:56<34:02:05, 28.28s/it] 13%|█▎ | 669/5000 [5:15:24<33:41:22, 28.00s/it] 13%|█▎ | 670/5000 [5:15:52<33:34:18, 27.91s/it] 13%|█▎ | 671/5000 [5:16:20<33:35:14, 27.93s/it] 13%|█▎ | 672/5000 [5:16:47<33:18:26, 27.70s/it] 13%|█▎ | 673/5000 [5:17:17<34:06:49, 28.38s/it] 13%|█▎ | 674/5000 [5:17:44<33:41:46, 28.04s/it] 14%|█▎ | 675/5000 [5:18:11<33:22:09, 27.78s/it] 14%|█▎ | 675/5000 [5:18:11<33:22:09, 27.78s/it] 14%|█▎ | 676/5000 [5:18:39<33:28:46, 27.87s/it] 14%|█▎ | 677/5000 [5:19:06<33:03:48, 27.53s/it] 14%|█▎ | 678/5000 [5:19:32<32:31:54, 27.10s/it] 14%|█▎ | 679/5000 [5:20:00<32:46:53, 27.31s/it] 14%|█▎ | 
680/5000 [5:20:26<32:28:07, 27.06s/it] 14%|█▎ | 681/5000 [5:20:53<32:23:40, 27.00s/it] 14%|█▎ | 682/5000 [5:21:21<32:45:28, 27.31s/it] 14%|█▎ | 683/5000 [5:21:48<32:40:09, 27.24s/it] 14%|█▎ | 684/5000 [5:22:16<32:41:46, 27.27s/it] 14%|█▎ | 685/5000 [5:22:44<33:11:01, 27.69s/it] 14%|█▎ | 686/5000 [5:23:12<33:07:35, 27.64s/it] 14%|█▎ | 687/5000 [5:23:39<32:59:00, 27.53s/it] 14%|█▍ | 688/5000 [5:24:08<33:19:27, 27.82s/it] 14%|█▍ | 689/5000 [5:24:34<32:52:08, 27.45s/it] 14%|█▍ | 690/5000 [5:25:02<32:50:17, 27.43s/it] 14%|█▍ | 691/5000 [5:25:29<32:47:05, 27.39s/it] 14%|█▍ | 692/5000 [5:25:55<32:30:49, 27.17s/it] 14%|█▍ | 693/5000 [5:26:22<32:24:10, 27.08s/it] 14%|█▍ | 694/5000 [5:26:50<32:34:11, 27.23s/it] 14%|█▍ | 695/5000 [5:27:17<32:33:06, 27.22s/it] 14%|█▍ | 696/5000 [5:27:45<32:48:17, 27.44s/it] 14%|█▍ | 697/5000 [5:28:13<33:02:45, 27.65s/it] 14%|█▍ | 698/5000 [5:28:40<32:49:27, 27.47s/it] 14%|█▍ | 699/5000 [5:29:08<32:58:14, 27.60s/it] 14%|█▍ | 700/5000 [5:29:36<32:52:54, 27.53s/it] 14%|█▍ | 700/5000 [5:29:36<32:52:54, 27.53s/it] 14%|█▍ | 701/5000 [5:30:03<32:59:06, 27.62s/it] 14%|█▍ | 702/5000 [5:30:31<33:01:56, 27.67s/it] 14%|█▍ | 703/5000 [5:30:58<32:42:15, 27.40s/it] 14%|█▍ | 704/5000 [5:31:27<33:14:49, 27.86s/it] 14%|█▍ | 705/5000 [5:31:55<33:10:53, 27.81s/it] 14%|█▍ | 706/5000 [5:32:22<33:01:40, 27.69s/it] 14%|█▍ | 707/5000 [5:32:51<33:27:50, 28.06s/it] 14%|█▍ | 708/5000 [5:33:18<33:11:38, 27.84s/it] 14%|█▍ | 709/5000 [5:33:45<32:58:58, 27.67s/it] 14%|█▍ | 710/5000 [5:34:14<33:20:25, 27.98s/it] 14%|█▍ | 711/5000 [5:34:41<33:03:46, 27.75s/it] 14%|█▍ | 712/5000 [5:35:09<32:58:54, 27.69s/it] 14%|█▍ | 713/5000 [5:35:38<33:29:44, 28.13s/it] 14%|█▍ | 714/5000 [5:36:05<33:12:27, 27.89s/it] 14%|█▍ | 715/5000 [5:36:33<33:01:30, 27.75s/it] 14%|█▍ | 716/5000 [5:37:05<34:30:24, 29.00s/it] 14%|█▍ | 717/5000 [5:37:32<33:45:23, 28.37s/it] 14%|█▍ | 718/5000 [5:37:59<33:22:11, 28.06s/it] 14%|█▍ | 719/5000 [5:38:29<34:12:47, 28.77s/it] 14%|█▍ | 720/5000 [5:38:57<33:39:21, 
28.31s/it] 14%|█▍ | 721/5000 [5:39:25<33:32:35, 28.22s/it] 14%|█▍ | 722/5000 [5:39:53<33:33:52, 28.25s/it] 14%|█▍ | 723/5000 [5:40:20<33:12:43, 27.96s/it] 14%|█▍ | 724/5000 [5:40:48<33:17:00, 28.02s/it] 14%|█▍ | 725/5000 [5:41:16<32:59:03, 27.78s/it] 14%|█▍ | 725/5000 [5:41:16<32:59:03, 27.78s/it] 15%|█▍ | 726/5000 [5:41:43<32:57:34, 27.76s/it] 15%|█▍ | 727/5000 [5:42:11<32:46:32, 27.61s/it] 15%|█▍ | 728/5000 [5:42:38<32:38:55, 27.51s/it] 15%|█▍ | 729/5000 [5:43:06<32:51:03, 27.69s/it] 15%|█▍ | 730/5000 [5:43:34<33:03:15, 27.87s/it] 15%|█▍ | 731/5000 [5:44:02<32:55:06, 27.76s/it] 15%|█▍ | 732/5000 [5:44:30<33:04:30, 27.90s/it] 15%|█▍ | 733/5000 [5:44:58<33:12:34, 28.02s/it] 15%|█▍ | 734/5000 [5:45:26<32:58:22, 27.83s/it] 15%|█▍ | 735/5000 [5:45:54<33:00:14, 27.86s/it] 15%|█▍ | 736/5000 [5:46:21<32:58:52, 27.85s/it] 15%|█▍ | 737/5000 [5:46:49<32:46:49, 27.68s/it] 15%|█▍ | 738/5000 [5:47:16<32:45:39, 27.67s/it] 15%|█▍ | 739/5000 [5:47:45<32:58:14, 27.86s/it] 15%|█▍ | 740/5000 [5:48:12<32:53:57, 27.80s/it] 15%|█▍ | 741/5000 [5:48:40<32:41:30, 27.63s/it] 15%|█▍ | 742/5000 [5:49:13<34:39:06, 29.30s/it] 15%|█▍ | 743/5000 [5:49:42<34:44:37, 29.38s/it] 15%|█▍ | 744/5000 [5:50:09<33:50:56, 28.63s/it] 15%|█▍ | 745/5000 [5:50:38<33:44:17, 28.54s/it] 15%|█▍ | 746/5000 [5:51:06<33:31:41, 28.37s/it] 15%|█▍ | 747/5000 [5:51:32<32:53:41, 27.84s/it] 15%|█▍ | 748/5000 [5:52:00<33:01:45, 27.96s/it] 15%|█▍ | 749/5000 [5:52:28<32:45:54, 27.75s/it] 15%|█▌ | 750/5000 [5:52:55<32:38:30, 27.65s/it] 15%|█▌ | 750/5000 [5:52:55<32:38:30, 27.65s/it] 15%|█▌ | 751/5000 [5:53:23<32:35:57, 27.62s/it] 15%|█▌ | 752/5000 [5:53:51<32:57:59, 27.94s/it] 15%|█▌ | 753/5000 [5:54:19<32:43:21, 27.74s/it] 15%|█▌ | 754/5000 [5:54:47<32:54:30, 27.90s/it] 15%|█▌ | 755/5000 [5:55:15<33:06:21, 28.08s/it] 15%|█▌ | 756/5000 [5:55:43<32:51:02, 27.87s/it] 15%|█▌ | 757/5000 [5:56:11<33:07:55, 28.11s/it] 15%|█▌ | 758/5000 [5:56:39<32:52:04, 27.89s/it] 15%|█▌ | 759/5000 [5:57:06<32:39:43, 27.73s/it] 15%|█▌ | 760/5000 
[5:57:35<33:04:22, 28.08s/it] 15%|█▌ | 761/5000 [5:58:02<32:47:53, 27.85s/it] 15%|█▌ | 762/5000 [5:58:30<32:35:13, 27.68s/it] 15%|█▌ | 763/5000 [5:58:57<32:38:01, 27.73s/it] 15%|█▌ | 764/5000 [5:59:26<32:49:59, 27.90s/it] 15%|█▌ | 765/5000 [5:59:53<32:36:02, 27.71s/it] 15%|█▌ | 766/5000 [6:00:21<32:48:05, 27.89s/it] 15%|█▌ | 767/5000 [6:00:49<32:45:08, 27.85s/it] 15%|█▌ | 768/5000 [6:01:17<32:36:36, 27.74s/it] 15%|█▌ | 769/5000 [6:01:45<32:40:26, 27.80s/it] 15%|█▌ | 770/5000 [6:02:13<32:48:40, 27.92s/it] 15%|█▌ | 771/5000 [6:02:41<32:46:30, 27.90s/it] 15%|█▌ | 772/5000 [6:03:08<32:28:53, 27.66s/it] 15%|█▌ | 773/5000 [6:03:36<32:52:36, 28.00s/it] 15%|█▌ | 774/5000 [6:04:04<32:33:05, 27.73s/it] 16%|█▌ | 775/5000 [6:04:31<32:25:51, 27.63s/it] 16%|█▌ | 775/5000 [6:04:31<32:25:51, 27.63s/it] 16%|█▌ | 776/5000 [6:04:58<32:20:50, 27.57s/it] 16%|█▌ | 777/5000 [6:05:25<32:09:51, 27.42s/it] 16%|█▌ | 778/5000 [6:05:53<32:08:01, 27.40s/it] 16%|█▌ | 779/5000 [6:06:26<34:05:05, 29.07s/it] 16%|█▌ | 780/5000 [6:06:53<33:26:32, 28.53s/it] 16%|█▌ | 781/5000 [6:07:21<33:04:20, 28.22s/it] 16%|█▌ | 782/5000 [6:07:49<33:15:12, 28.38s/it] 16%|█▌ | 783/5000 [6:08:16<32:46:58, 27.99s/it] 16%|█▌ | 784/5000 [6:08:44<32:28:10, 27.73s/it] 16%|█▌ | 785/5000 [6:09:11<32:19:03, 27.60s/it] 16%|█▌ | 786/5000 [6:09:40<32:53:14, 28.10s/it] 16%|█▌ | 787/5000 [6:10:07<32:31:51, 27.80s/it] 16%|█▌ | 788/5000 [6:10:34<32:07:03, 27.45s/it] 16%|█▌ | 789/5000 [6:11:02<32:30:34, 27.79s/it] 16%|█▌ | 790/5000 [6:11:30<32:23:16, 27.70s/it] 16%|█▌ | 791/5000 [6:11:57<32:15:07, 27.59s/it] 16%|█▌ | 792/5000 [6:12:26<32:45:02, 28.02s/it] 16%|█▌ | 793/5000 [6:12:54<32:33:00, 27.85s/it] 16%|█▌ | 794/5000 [6:13:20<32:04:35, 27.45s/it] 16%|█▌ | 795/5000 [6:13:48<32:20:21, 27.69s/it] 16%|█▌ | 796/5000 [6:14:16<32:08:30, 27.52s/it] 16%|█▌ | 797/5000 [6:14:42<31:54:08, 27.33s/it] 16%|█▌ | 798/5000 [6:15:11<32:25:26, 27.78s/it] 16%|█▌ | 799/5000 [6:15:38<32:07:25, 27.53s/it] 16%|█▌ | 800/5000 [6:16:05<31:52:14, 27.32s/it] 
16%|█▌ | 800/5000 [6:16:05<31:52:14, 27.32s/it] 16%|█▌ | 801/5000 [6:16:35<32:43:39, 28.06s/it] 16%|█▌ | 802/5000 [6:17:01<32:07:25, 27.55s/it] 16%|█▌ | 803/5000 [6:17:29<32:04:13, 27.51s/it] 16%|█▌ | 804/5000 [6:17:58<32:39:16, 28.02s/it] 16%|█▌ | 805/5000 [6:18:25<32:30:30, 27.90s/it] 16%|█▌ | 806/5000 [6:18:53<32:15:14, 27.69s/it] 16%|█▌ | 807/5000 [6:19:20<32:07:52, 27.59s/it] 16%|█▌ | 808/5000 [6:19:47<31:59:21, 27.47s/it] 16%|█▌ | 809/5000 [6:20:14<31:49:52, 27.34s/it] 16%|█▌ | 810/5000 [6:20:42<32:03:25, 27.54s/it] 16%|█▌ | 811/5000 [6:21:10<31:57:46, 27.47s/it] 16%|█▌ | 812/5000 [6:21:37<31:55:38, 27.44s/it] 16%|█▋ | 813/5000 [6:22:05<32:09:12, 27.65s/it] 16%|█▋ | 814/5000 [6:22:19<27:24:16, 23.57s/it] 16%|█▋ | 815/5000 [6:22:30<22:57:38, 19.75s/it] 16%|█▋ | 816/5000 [6:22:41<19:50:24, 17.07s/it] 16%|█▋ | 817/5000 [6:22:52<17:40:15, 15.21s/it]{'loss': 0.1249, 'learning_rate': 9.615555555555558e-06, 'epoch': 4.0} {'loss': 0.1291, 'learning_rate': 9.56e-06, 'epoch': 4.01} {'loss': 0.1197, 'learning_rate': 9.504444444444446e-06, 'epoch': 4.01} {'loss': 0.0921, 'learning_rate': 9.44888888888889e-06, 'epoch': 4.02} {'loss': 0.1023, 'learning_rate': 9.393333333333334e-06, 'epoch': 4.02} {'loss': 0.0962, 'learning_rate': 9.33777777777778e-06, 'epoch': 4.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.99it/s] Reading metadata...: 14292it [00:00, 31622.25it/s] Reading metadata...: 22699it [00:01, 14675.60it/s] Reading metadata...: 28043it [00:01, 17791.47it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.06it/s] Reading metadata...: 10438it [00:00, 26121.88it/s] 16%|█▋ | 818/5000 [6:24:21<43:33:27, 37.50s/it] 16%|█▋ | 819/5000 [6:24:49<40:08:37, 34.57s/it] 16%|█▋ | 820/5000 [6:25:18<38:07:25, 32.83s/it] 16%|█▋ | 821/5000 [6:25:46<36:30:50, 31.46s/it] 16%|█▋ | 822/5000 [6:26:13<35:08:17, 30.28s/it] 16%|█▋ | 823/5000 [6:26:42<34:24:28, 29.65s/it] 16%|█▋ | 824/5000 [6:27:10<33:50:30, 29.17s/it] 16%|█▋ | 
825/5000 [6:27:37<33:16:40, 28.69s/it] 16%|█▋ | 825/5000 [6:27:37<33:16:40, 28.69s/it] 17%|█▋ | 826/5000 [6:28:05<32:50:08, 28.32s/it] 17%|█▋ | 827/5000 [6:28:32<32:31:20, 28.06s/it] 17%|█▋ | 828/5000 [6:29:00<32:32:50, 28.08s/it] 17%|█▋ | 829/5000 [6:29:29<32:47:49, 28.31s/it] 17%|█▋ | 830/5000 [6:29:56<32:25:57, 28.00s/it] 17%|█▋ | 831/5000 [6:30:24<32:08:03, 27.75s/it] 17%|█▋ | 832/5000 [6:30:52<32:21:38, 27.95s/it] 17%|█▋ | 833/5000 [6:31:20<32:30:00, 28.08s/it] 17%|█▋ | 834/5000 [6:31:48<32:28:06, 28.06s/it] 17%|█▋ | 835/5000 [6:32:15<32:08:23, 27.78s/it] 17%|█▋ | 836/5000 [6:32:44<32:21:13, 27.97s/it] 17%|█▋ | 837/5000 [6:33:12<32:27:14, 28.06s/it] 17%|█▋ | 838/5000 [6:33:39<32:08:25, 27.80s/it] 17%|█▋ | 839/5000 [6:34:07<31:57:33, 27.65s/it] 17%|█▋ | 840/5000 [6:34:34<31:59:22, 27.68s/it] 17%|█▋ | 841/5000 [6:35:02<31:51:45, 27.58s/it] 17%|█▋ | 842/5000 [6:35:29<31:43:14, 27.46s/it] 17%|█▋ | 843/5000 [6:35:57<31:59:44, 27.71s/it] 17%|█▋ | 844/5000 [6:36:24<31:45:32, 27.51s/it] 17%|█▋ | 845/5000 [6:36:53<32:03:23, 27.77s/it] 17%|█▋ | 846/5000 [6:37:21<32:12:27, 27.91s/it] 17%|█▋ | 847/5000 [6:37:48<31:57:26, 27.70s/it] 17%|█▋ | 848/5000 [6:38:17<32:13:27, 27.94s/it] 17%|█▋ | 849/5000 [6:38:45<32:12:00, 27.93s/it] 17%|█▋ | 850/5000 [6:39:12<31:58:48, 27.74s/it] 17%|█▋ | 850/5000 [6:39:12<31:58:48, 27.74s/it] 17%|█▋ | 851/5000 [6:39:41<32:35:02, 28.27s/it] 17%|█▋ | 852/5000 [6:40:08<32:03:50, 27.83s/it] 17%|█▋ | 853/5000 [6:40:34<31:21:02, 27.22s/it] 17%|█▋ | 854/5000 [6:41:05<32:35:05, 28.29s/it] 17%|█▋ | 855/5000 [6:41:32<32:20:09, 28.08s/it] 17%|█▋ | 856/5000 [6:42:00<32:05:24, 27.88s/it] 17%|█▋ | 857/5000 [6:42:27<32:00:54, 27.82s/it] 17%|█▋ | 858/5000 [6:42:55<31:53:36, 27.72s/it] 17%|█▋ | 859/5000 [6:43:21<31:21:33, 27.26s/it] 17%|█▋ | 860/5000 [6:43:48<31:23:22, 27.30s/it] 17%|█▋ | 861/5000 [6:44:16<31:33:14, 27.44s/it] 17%|█▋ | 862/5000 [6:44:44<31:32:29, 27.44s/it] 17%|█▋ | 863/5000 [6:45:11<31:36:02, 27.50s/it] 17%|█▋ | 864/5000 [6:45:39<31:47:27, 
27.67s/it] 17%|█▋ | 865/5000 [6:46:08<32:02:32, 27.90s/it] 17%|█▋ | 866/5000 [6:46:35<31:48:54, 27.71s/it] 17%|█▋ | 867/5000 [6:47:02<31:40:50, 27.59s/it] 17%|█▋ | 868/5000 [6:47:30<31:44:43, 27.66s/it] 17%|█▋ | 869/5000 [6:47:58<31:40:10, 27.60s/it] 17%|█▋ | 870/5000 [6:48:26<31:54:41, 27.82s/it] 17%|█▋ | 871/5000 [6:48:53<31:45:34, 27.69s/it] 17%|█▋ | 872/5000 [6:49:21<31:40:09, 27.62s/it] 17%|█▋ | 873/5000 [6:49:49<31:45:19, 27.70s/it] 17%|█▋ | 874/5000 [6:50:17<31:49:11, 27.76s/it] 18%|█▊ | 875/5000 [6:50:44<31:43:20, 27.68s/it] 18%|█▊ | 875/5000 [6:50:44<31:43:20, 27.68s/it] 18%|█▊ | 876/5000 [6:51:13<32:09:20, 28.07s/it] 18%|█▊ | 877/5000 [6:51:40<31:49:58, 27.79s/it] 18%|█▊ | 878/5000 [6:52:06<31:12:07, 27.25s/it] 18%|█▊ | 879/5000 [6:52:34<31:26:38, 27.47s/it] 18%|█▊ | 880/5000 [6:53:02<31:30:05, 27.53s/it] 18%|█▊ | 881/5000 [6:53:29<31:17:53, 27.35s/it] 18%|█▊ | 882/5000 [6:53:56<31:20:37, 27.40s/it] 18%|█▊ | 883/5000 [6:54:24<31:28:55, 27.53s/it] 18%|█▊ | 884/5000 [6:54:51<31:18:44, 27.39s/it] 18%|█▊ | 885/5000 [6:55:23<32:58:05, 28.84s/it] 18%|█▊ | 886/5000 [6:55:51<32:40:29, 28.59s/it] 18%|█▊ | 887/5000 [6:56:19<32:10:38, 28.16s/it] 18%|█▊ | 888/5000 [6:56:49<33:03:36, 28.94s/it] 18%|█▊ | 889/5000 [6:57:19<33:11:19, 29.06s/it] 18%|█▊ | 890/5000 [6:57:45<32:08:10, 28.15s/it] 18%|█▊ | 891/5000 [6:58:13<32:16:37, 28.28s/it] 18%|█▊ | 892/5000 [6:58:41<32:05:31, 28.12s/it] 18%|█▊ | 893/5000 [6:59:09<32:09:33, 28.19s/it] 18%|█▊ | 894/5000 [6:59:37<31:48:05, 27.88s/it] 18%|█▊ | 895/5000 [7:00:05<32:07:41, 28.18s/it] 18%|█▊ | 896/5000 [7:00:34<32:11:07, 28.23s/it] 18%|█▊ | 897/5000 [7:01:01<31:56:09, 28.02s/it] 18%|█▊ | 898/5000 [7:01:30<32:00:27, 28.09s/it] 18%|█▊ | 899/5000 [7:01:58<32:05:52, 28.18s/it] 18%|█▊ | 900/5000 [7:02:26<31:53:22, 28.00s/it] 18%|█▊ | 900/5000 [7:02:26<31:53:22, 28.00s/it] 18%|█▊ | 901/5000 [7:02:52<31:29:30, 27.66s/it] 18%|█▊ | 902/5000 [7:03:21<31:37:07, 27.78s/it] 18%|█▊ | 903/5000 [7:03:48<31:31:29, 27.70s/it] 18%|█▊ | 904/5000 
[7:04:16<31:41:15, 27.85s/it] 18%|█▊ | 905/5000 [7:04:44<31:44:15, 27.90s/it] 18%|█▊ | 906/5000 [7:05:12<31:35:45, 27.78s/it] 18%|█▊ | 907/5000 [7:05:40<31:39:21, 27.84s/it] 18%|█▊ | 908/5000 [7:06:08<31:47:10, 27.96s/it] 18%|█▊ | 909/5000 [7:06:35<31:33:15, 27.77s/it] 18%|█▊ | 910/5000 [7:07:03<31:30:04, 27.73s/it] 18%|█▊ | 911/5000 [7:07:31<31:42:06, 27.91s/it] 18%|█▊ | 912/5000 [7:07:58<31:25:28, 27.67s/it] 18%|█▊ | 913/5000 [7:08:26<31:26:02, 27.69s/it] 18%|█▊ | 914/5000 [7:08:54<31:35:53, 27.84s/it] 18%|█▊ | 915/5000 [7:09:22<31:29:30, 27.75s/it] 18%|█▊ | 916/5000 [7:09:50<31:38:49, 27.90s/it] 18%|█▊ | 917/5000 [7:10:19<31:52:19, 28.10s/it] 18%|█▊ | 918/5000 [7:10:46<31:43:08, 27.97s/it] 18%|█▊ | 919/5000 [7:11:14<31:30:55, 27.80s/it] 18%|█▊ | 920/5000 [7:11:42<31:36:39, 27.89s/it] 18%|█▊ | 921/5000 [7:12:09<31:25:04, 27.73s/it] 18%|█▊ | 922/5000 [7:12:36<31:15:20, 27.59s/it] 18%|█▊ | 923/5000 [7:13:04<31:07:13, 27.48s/it] 18%|█▊ | 924/5000 [7:13:33<31:44:03, 28.03s/it] 18%|█▊ | 925/5000 [7:14:00<31:27:54, 27.80s/it] 18%|█▊ | 925/5000 [7:14:00<31:27:54, 27.80s/it] 19%|█▊ | 926/5000 [7:14:26<30:55:29, 27.33s/it] 19%|█▊ | 927/5000 [7:14:54<30:58:35, 27.38s/it] 19%|█▊ | 928/5000 [7:15:21<30:50:12, 27.26s/it] 19%|█▊ | 929/5000 [7:15:48<30:41:11, 27.14s/it] 19%|█▊ | 930/5000 [7:16:17<31:25:56, 27.80s/it] 19%|█▊ | 931/5000 [7:16:44<31:11:50, 27.60s/it] 19%|█▊ | 932/5000 [7:17:12<31:18:34, 27.71s/it] 19%|█▊ | 933/5000 [7:17:40<31:26:23, 27.83s/it] 19%|█▊ | 934/5000 [7:18:08<31:23:55, 27.80s/it] 19%|█▊ | 935/5000 [7:18:35<31:02:04, 27.48s/it] 19%|█▊ | 936/5000 [7:19:03<31:07:14, 27.57s/it] 19%|█▊ | 937/5000 [7:19:30<31:10:13, 27.62s/it] 19%|█▉ | 938/5000 [7:19:58<31:00:48, 27.49s/it] 19%|█▉ | 939/5000 [7:20:26<31:23:30, 27.83s/it] 19%|█▉ | 940/5000 [7:20:54<31:18:10, 27.76s/it] 19%|█▉ | 941/5000 [7:21:21<31:06:46, 27.59s/it] 19%|█▉ | 942/5000 [7:21:48<30:58:47, 27.48s/it] 19%|█▉ | 943/5000 [7:22:16<31:00:08, 27.51s/it] 19%|█▉ | 944/5000 [7:22:43<30:50:49, 27.38s/it] 
19%|█▉ | 945/5000 [7:23:10<30:45:47, 27.31s/it] 19%|█▉ | 946/5000 [7:23:39<31:26:24, 27.92s/it] 19%|█▉ | 947/5000 [7:24:06<30:59:01, 27.52s/it] 19%|█▉ | 948/5000 [7:24:31<30:12:57, 26.85s/it] 19%|█▉ | 949/5000 [7:24:59<30:28:04, 27.08s/it] 19%|█▉ | 950/5000 [7:25:24<29:51:57, 26.55s/it] 19%|█▉ | 950/5000 [7:25:24<29:51:57, 26.55s/it] 19%|█▉ | 951/5000 [7:25:51<29:49:11, 26.51s/it] 19%|█▉ | 952/5000 [7:26:20<30:39:01, 27.26s/it] 19%|█▉ | 953/5000 [7:26:47<30:39:02, 27.27s/it] 19%|█▉ | 954/5000 [7:27:14<30:39:54, 27.28s/it] 19%|█▉ | 955/5000 [7:27:43<31:09:52, 27.74s/it] 19%|█▉ | 956/5000 [7:28:10<31:05:19, 27.68s/it] 19%|█▉ | 957/5000 [7:28:38<30:52:26, 27.49s/it] 19%|█▉ | 958/5000 [7:29:05<30:49:08, 27.45s/it] 19%|█▉ | 959/5000 [7:29:33<30:53:05, 27.51s/it] 19%|█▉ | 960/5000 [7:30:00<30:47:34, 27.44s/it] 19%|█▉ | 961/5000 [7:30:28<31:11:15, 27.80s/it] 19%|█▉ | 962/5000 [7:30:56<31:08:37, 27.77s/it] 19%|█▉ | 963/5000 [7:31:24<31:04:05, 27.71s/it] 19%|█▉ | 964/5000 [7:31:51<30:49:39, 27.50s/it] 19%|█▉ | 965/5000 [7:32:18<30:54:27, 27.58s/it] 19%|█▉ | 966/5000 [7:32:46<30:45:21, 27.45s/it] 19%|█▉ | 967/5000 [7:33:14<31:04:40, 27.74s/it] 19%|█▉ | 968/5000 [7:33:40<30:26:48, 27.18s/it] 19%|█▉ | 969/5000 [7:34:07<30:28:54, 27.22s/it] 19%|█▉ | 970/5000 [7:34:36<30:57:28, 27.65s/it] 19%|█▉ | 971/5000 [7:35:03<30:50:45, 27.56s/it] 19%|█▉ | 972/5000 [7:35:31<30:50:11, 27.56s/it] 19%|█▉ | 973/5000 [7:35:59<31:02:30, 27.75s/it] 19%|█▉ | 974/5000 [7:36:27<31:11:27, 27.89s/it] 20%|█▉ | 975/5000 [7:36:55<31:02:52, 27.77s/it] 20%|█▉ | 975/5000 [7:36:55<31:02:52, 27.77s/it] 20%|█▉ | 976/5000 [7:37:23<31:13:46, 27.94s/it] 20%|█▉ | 977/5000 [7:37:45<29:17:09, 26.21s/it] 20%|█▉ | 978/5000 [7:37:56<24:09:31, 21.62s/it] 20%|█▉ | 979/5000 [7:38:07<20:32:29, 18.39s/it] 20%|█▉ | 980/5000 [7:38:18<17:59:55, 16.12s/it] 20%|█▉ | 981/5000 [7:38:26<15:15:07, 13.66s/it]{'loss': 0.0803, 'learning_rate': 9.282222222222222e-06, 'epoch': 5.0} {'loss': 0.0877, 'learning_rate': 9.226666666666668e-06, 
'epoch': 5.01} {'loss': 0.0812, 'learning_rate': 9.171111111111112e-06, 'epoch': 5.01} {'loss': 0.0639, 'learning_rate': 9.115555555555556e-06, 'epoch': 5.02} {'loss': 0.0615, 'learning_rate': 9.060000000000001e-06, 'epoch': 5.02} {'loss': 0.07, 'learning_rate': 9.004444444444445e-06, 'epoch': 5.03} {'loss': 0.053, 'learning_rate': 8.94888888888889e-06, 'epoch': 5.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.05it/s] Reading metadata...: 14689it [00:00, 33229.49it/s] Reading metadata...: 23329it [00:00, 27403.88it/s] Reading metadata...: 28043it [00:01, 28001.93it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.80it/s] Reading metadata...: 10438it [00:00, 24569.38it/s] 20%|█▉ | 982/5000 [7:40:08<44:59:17, 40.31s/it] 20%|█▉ | 983/5000 [7:40:39<41:48:40, 37.47s/it] 20%|█▉ | 984/5000 [7:41:06<38:21:27, 34.38s/it] 20%|█▉ | 985/5000 [7:41:35<36:19:09, 32.57s/it] 20%|█▉ | 986/5000 [7:42:03<34:54:32, 31.31s/it] 20%|█▉ | 987/5000 [7:42:31<33:49:17, 30.34s/it] 20%|█▉ | 988/5000 [7:42:58<32:48:35, 29.44s/it] 20%|█▉ | 989/5000 [7:43:27<32:34:34, 29.24s/it] 20%|█▉ | 990/5000 [7:43:55<32:09:24, 28.87s/it] 20%|█▉ | 991/5000 [7:44:22<31:31:59, 28.32s/it] 20%|█▉ | 992/5000 [7:44:50<31:14:19, 28.06s/it] 20%|█▉ | 993/5000 [7:45:18<31:13:43, 28.06s/it] 20%|█▉ | 994/5000 [7:45:46<31:25:23, 28.24s/it] 20%|█▉ | 995/5000 [7:46:15<31:30:11, 28.32s/it] 20%|█▉ | 996/5000 [7:46:42<31:11:18, 28.04s/it] 20%|█▉ | 997/5000 [7:47:14<32:27:39, 29.19s/it] 20%|█▉ | 998/5000 [7:47:43<32:18:30, 29.06s/it] 20%|█▉ | 999/5000 [7:48:10<31:41:54, 28.52s/it] 20%|██ | 1000/5000 [7:48:38<31:29:41, 28.35s/it] 20%|██ | 1000/5000 [7:48:38<31:29:41, 28.35s/it][INFO|trainer.py:3138] 2023-05-07 18:22:49,172 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-07 18:22:49,172 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-07 18:22:49,172 >> Batch size = 64 {'loss': 0.0519, 'learning_rate': 8.893333333333333e-06, 'epoch': 6.0} Reading 
metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:02, 2.58s/it] Reading metadata...: 10440it [00:02, 3954.23it/s]
[INFO|trainer_utils.py:693] 2023-05-07 18:23:04,305 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.
20%|██ | 1000/5000 [8:26:06<31:29:41, 28.35s/it]
[INFO|trainer.py:2877] 2023-05-07 19:00:17,386 >> Saving model checkpoint to ./checkpoint-1000
[INFO|configuration_utils.py:458] 2023-05-07 19:00:17,393 >> Configuration saved in ./checkpoint-1000/config.json
[INFO|configuration_utils.py:364] 2023-05-07 19:00:17,398 >> Configuration saved in ./checkpoint-1000/generation_config.json
[INFO|modeling_utils.py:1855] 2023-05-07 19:00:20,753 >> Model weights saved in ./checkpoint-1000/pytorch_model.bin
[INFO|feature_extraction_utils.py:369] 2023-05-07 19:00:20,758 >> Feature extractor saved in ./checkpoint-1000/preprocessor_config.json
[INFO|feature_extraction_utils.py:369] 2023-05-07 19:00:30,115 >> Feature extractor saved in ./preprocessor_config.json
Adding files tracked by Git LFS: ['wandb/run-20230506_113337-ysywp688/run-ysywp688.wandb', 'wandb/run-20230507_103405-9zf5xxpu/run-9zf5xxpu.wandb']. This may take a bit of time if the files are large.
{'eval_loss': 0.43405279517173767, 'eval_wer': 54.25600000000001, 'eval_runtime': 2248.2056, 'eval_samples_per_second': 4.644, 'eval_steps_per_second': 0.073, 'epoch': 6.0}
05/07/2023 19:00:40 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20230506_113337-ysywp688/run-ysywp688.wandb', 'wandb/run-20230507_103405-9zf5xxpu/run-9zf5xxpu.wandb']. This may take a bit of time if the files are large.
/home/local/QCRI/dizham/miniconda3/envs/whisper/lib/python3.9/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. warnings.warn('Was asked to gather along dimension 0, but all ' 20%|██ | 1001/5000 [8:27:07<791:34:47, 712.60s/it] 20%|██ | 1002/5000 [8:27:35<563:11:28, 507.13s/it] 20%|██ | 1003/5000 [8:28:04<403:46:05, 363.66s/it] 20%|██ | 1004/5000 [8:28:32<291:54:08, 262.98s/it] 20%|██ | 1005/5000 [8:28:59<213:24:18, 192.30s/it] 20%|██ | 1006/5000 [8:29:28<158:59:55, 143.31s/it] 20%|██ | 1007/5000 [8:29:56<120:19:03, 108.48s/it] 20%|██ | 1008/5000 [8:30:23<93:12:59, 84.06s/it] 20%|██ | 1009/5000 [8:30:52<74:58:34, 67.63s/it] 20%|██ | 1010/5000 [8:31:19<61:38:34, 55.62s/it] 20%|██ | 1011/5000 [8:31:46<51:49:30, 46.77s/it] 20%|██ | 1012/5000 [8:32:20<47:34:07, 42.94s/it] 20%|██ | 1013/5000 [8:32:46<42:11:11, 38.09s/it] 20%|██ | 1014/5000 [8:33:14<38:34:08, 34.83s/it] 20%|██ | 1015/5000 [8:33:43<36:51:23, 33.30s/it] 20%|██ | 1016/5000 [8:34:11<34:51:19, 31.50s/it] 20%|██ | 1017/5000 [8:34:38<33:26:52, 30.23s/it] 20%|██ | 1018/5000 [8:35:07<32:57:16, 29.79s/it] 20%|██ | 1019/5000 [8:35:34<32:09:41, 29.08s/it] 20%|██ | 1020/5000 [8:36:01<31:28:13, 28.47s/it] 20%|██ | 1021/5000 [8:36:29<31:22:00, 28.38s/it] 20%|██ | 1022/5000 [8:36:56<30:53:33, 27.96s/it] 20%|██ | 1023/5000 [8:37:25<31:00:06, 28.06s/it] 20%|██ | 1024/5000 [8:37:52<30:54:36, 27.99s/it] 20%|██ | 1025/5000 [8:38:20<30:36:36, 27.72s/it] 20%|██ | 1025/5000 [8:38:20<30:36:36, 27.72s/it] 21%|██ | 1026/5000 [8:38:52<32:15:09, 29.22s/it] 21%|██ | 1027/5000 [8:39:20<31:39:01, 28.68s/it] 21%|██ | 1028/5000 [8:39:49<31:48:02, 28.82s/it] 21%|██ | 1029/5000 [8:40:17<31:30:25, 28.56s/it] 21%|██ | 1030/5000 [8:40:44<31:02:35, 28.15s/it] 21%|██ | 1031/5000 [8:41:12<30:58:22, 28.09s/it] 21%|██ | 1032/5000 [8:41:40<31:03:55, 28.18s/it] 21%|██ | 1033/5000 [8:42:08<30:48:01, 27.95s/it] 21%|██ | 
1034/5000 [8:42:37<31:18:07, 28.41s/it] 21%|██ | 1035/5000 [8:43:04<30:48:47, 27.98s/it] 21%|██ | 1036/5000 [8:43:31<30:34:22, 27.77s/it] 21%|██ | 1037/5000 [8:44:01<31:08:55, 28.30s/it] 21%|██ | 1038/5000 [8:44:28<30:52:59, 28.06s/it] 21%|██ | 1039/5000 [8:44:55<30:22:36, 27.61s/it] 21%|██ | 1040/5000 [8:45:24<30:47:47, 28.00s/it] 21%|██ | 1041/5000 [8:45:51<30:35:37, 27.82s/it] 21%|██ | 1042/5000 [8:46:19<30:27:20, 27.70s/it] 21%|██ | 1043/5000 [8:46:48<30:51:34, 28.08s/it] 21%|██ | 1044/5000 [8:47:15<30:37:28, 27.87s/it] 21%|██ | 1045/5000 [8:47:43<30:27:59, 27.73s/it] 21%|██ | 1046/5000 [8:48:13<31:16:19, 28.47s/it] 21%|██ | 1047/5000 [8:48:40<30:57:09, 28.19s/it] 21%|██ | 1048/5000 [8:49:08<30:57:21, 28.20s/it] 21%|██ | 1049/5000 [8:49:36<30:47:50, 28.06s/it] 21%|██ | 1050/5000 [8:50:03<30:27:46, 27.76s/it] 21%|██ | 1050/5000 [8:50:03<30:27:46, 27.76s/it] 21%|██ | 1051/5000 [8:50:31<30:35:30, 27.89s/it] 21%|██ | 1052/5000 [8:50:59<30:23:13, 27.71s/it] 21%|██ | 1053/5000 [8:51:26<30:20:08, 27.67s/it] 21%|██ | 1054/5000 [8:51:55<30:40:01, 27.98s/it] 21%|██ | 1055/5000 [8:52:22<30:26:27, 27.78s/it] 21%|██ | 1056/5000 [8:52:51<30:35:40, 27.93s/it] 21%|██ | 1057/5000 [8:53:19<30:49:56, 28.15s/it] 21%|██ | 1058/5000 [8:53:47<30:32:42, 27.90s/it] 21%|██ | 1059/5000 [8:54:15<30:49:41, 28.16s/it] 21%|██ | 1060/5000 [8:54:44<30:49:17, 28.16s/it] 21%|██ | 1061/5000 [8:55:11<30:35:19, 27.96s/it] 21%|██ | 1062/5000 [8:55:39<30:31:43, 27.91s/it] 21%|██▏ | 1063/5000 [8:56:07<30:35:45, 27.98s/it] 21%|██▏ | 1064/5000 [8:56:34<30:18:02, 27.71s/it] 21%|██▏ | 1065/5000 [8:57:02<30:22:09, 27.78s/it] 21%|██▏ | 1066/5000 [8:57:30<30:19:09, 27.75s/it] 21%|██▏ | 1067/5000 [8:57:58<30:32:30, 27.96s/it] 21%|██▏ | 1068/5000 [8:58:26<30:22:01, 27.80s/it] 21%|██▏ | 1069/5000 [8:58:53<30:23:43, 27.84s/it] 21%|██▏ | 1070/5000 [8:59:22<30:30:16, 27.94s/it] 21%|██▏ | 1071/5000 [8:59:49<30:15:17, 27.72s/it] 21%|██▏ | 1072/5000 [9:00:17<30:25:31, 27.88s/it] 21%|██▏ | 1073/5000 [9:00:45<30:29:45, 
27.96s/it] 21%|██▏ | 1074/5000 [9:01:12<30:10:29, 27.67s/it] 22%|██▏ | 1075/5000 [9:01:40<30:19:39, 27.82s/it] 22%|██▏ | 1075/5000 [9:01:40<30:19:39, 27.82s/it] 22%|██▏ | 1076/5000 [9:02:09<30:31:41, 28.01s/it] 22%|██▏ | 1077/5000 [9:02:36<30:21:10, 27.85s/it] 22%|██▏ | 1078/5000 [9:03:10<32:09:28, 29.52s/it] 22%|██▏ | 1079/5000 [9:03:38<31:35:39, 29.01s/it] 22%|██▏ | 1080/5000 [9:04:05<31:02:48, 28.51s/it] 22%|██▏ | 1081/5000 [9:04:34<31:04:38, 28.55s/it] 22%|██▏ | 1082/5000 [9:05:01<30:49:32, 28.32s/it] 22%|██▏ | 1083/5000 [9:05:29<30:27:31, 27.99s/it] 22%|██▏ | 1084/5000 [9:05:57<30:35:11, 28.12s/it] 22%|██▏ | 1085/5000 [9:06:24<30:16:06, 27.83s/it] 22%|██▏ | 1086/5000 [9:06:51<30:05:13, 27.67s/it] 22%|██▏ | 1087/5000 [9:07:19<30:12:16, 27.79s/it] 22%|██▏ | 1088/5000 [9:07:47<29:59:56, 27.61s/it] 22%|██▏ | 1089/5000 [9:08:14<29:53:36, 27.52s/it] 22%|██▏ | 1090/5000 [9:08:42<30:12:33, 27.81s/it] 22%|██▏ | 1091/5000 [9:09:11<30:22:11, 27.97s/it] 22%|██▏ | 1092/5000 [9:09:38<30:04:13, 27.70s/it] 22%|██▏ | 1093/5000 [9:10:07<30:23:16, 28.00s/it] 22%|██▏ | 1094/5000 [9:10:35<30:28:50, 28.09s/it] 22%|██▏ | 1095/5000 [9:11:02<30:13:00, 27.86s/it] 22%|██▏ | 1096/5000 [9:11:30<30:06:49, 27.77s/it] 22%|██▏ | 1097/5000 [9:11:57<29:52:09, 27.55s/it] 22%|██▏ | 1098/5000 [9:12:24<29:39:33, 27.36s/it] 22%|██▏ | 1099/5000 [9:12:51<29:28:49, 27.21s/it] 22%|██▏ | 1100/5000 [9:13:19<29:59:10, 27.68s/it] 22%|██▏ | 1100/5000 [9:13:19<29:59:10, 27.68s/it] 22%|██▏ | 1101/5000 [9:13:47<29:51:50, 27.57s/it] 22%|██▏ | 1102/5000 [9:14:14<29:47:06, 27.51s/it] 22%|██▏ | 1103/5000 [9:14:44<30:29:52, 28.17s/it] 22%|██▏ | 1104/5000 [9:15:11<30:10:21, 27.88s/it] 22%|██▏ | 1105/5000 [9:15:38<29:59:50, 27.73s/it] 22%|██▏ | 1106/5000 [9:16:07<30:17:25, 28.00s/it] 22%|██▏ | 1107/5000 [9:16:34<29:53:47, 27.65s/it] 22%|██▏ | 1108/5000 [9:17:01<29:46:04, 27.53s/it] 22%|██▏ | 1109/5000 [9:17:30<30:16:16, 28.01s/it] 22%|██▏ | 1110/5000 [9:17:56<29:32:50, 27.34s/it] 22%|██▏ | 1111/5000 [9:18:22<29:10:58, 
27.01s/it] 22%|██▏ | 1112/5000 [9:18:49<29:11:25, 27.03s/it] 22%|██▏ | 1113/5000 [9:19:21<30:44:52, 28.48s/it] 22%|██▏ | 1114/5000 [9:19:48<30:21:43, 28.13s/it] 22%|██▏ | 1115/5000 [9:20:16<30:03:12, 27.85s/it] 22%|██▏ | 1116/5000 [9:20:44<30:16:39, 28.06s/it] 22%|██▏ | 1117/5000 [9:21:11<29:59:08, 27.80s/it] 22%|██▏ | 1118/5000 [9:21:39<29:47:14, 27.62s/it] 22%|██▏ | 1119/5000 [9:22:08<30:25:13, 28.22s/it] 22%|██▏ | 1120/5000 [9:22:35<30:04:44, 27.91s/it] 22%|██▏ | 1121/5000 [9:23:03<29:55:42, 27.78s/it] 22%|██▏ | 1122/5000 [9:23:31<30:02:09, 27.88s/it] 22%|██▏ | 1123/5000 [9:23:59<29:54:51, 27.78s/it] 22%|██▏ | 1124/5000 [9:24:26<29:46:03, 27.65s/it] 22%|██▎ | 1125/5000 [9:24:55<30:10:55, 28.04s/it] 22%|██▎ | 1125/5000 [9:24:55<30:10:55, 28.04s/it] 23%|██▎ | 1126/5000 [9:25:22<29:51:53, 27.75s/it] 23%|██▎ | 1127/5000 [9:25:49<29:45:21, 27.66s/it] 23%|██▎ | 1128/5000 [9:26:20<30:44:37, 28.58s/it] 23%|██▎ | 1129/5000 [9:26:47<30:10:15, 28.06s/it] 23%|██▎ | 1130/5000 [9:27:14<29:55:46, 27.84s/it] 23%|██▎ | 1131/5000 [9:27:44<30:27:45, 28.34s/it] 23%|██▎ | 1132/5000 [9:28:11<30:09:16, 28.07s/it] 23%|██▎ | 1133/5000 [9:28:39<29:54:37, 27.85s/it] 23%|██▎ | 1134/5000 [9:29:07<30:12:17, 28.13s/it] 23%|██▎ | 1135/5000 [9:29:35<29:55:13, 27.87s/it] 23%|██▎ | 1136/5000 [9:30:02<29:43:08, 27.69s/it] 23%|██▎ | 1137/5000 [9:30:31<30:11:03, 28.13s/it] 23%|██▎ | 1138/5000 [9:30:58<29:51:31, 27.83s/it] 23%|██▎ | 1139/5000 [9:31:25<29:39:25, 27.65s/it] 23%|██▎ | 1140/5000 [9:31:55<30:11:51, 28.16s/it] 23%|██▎ | 1141/5000 [9:32:08<25:33:16, 23.84s/it] 23%|██▎ | 1142/5000 [9:32:19<21:24:24, 19.98s/it] 23%|██▎ | 1143/5000 [9:32:30<18:27:07, 17.22s/it] 23%|██▎ | 1144/5000 [9:32:41<16:23:31, 15.30s/it]{'loss': 0.0512, 'learning_rate': 8.83777777777778e-06, 'epoch': 6.01} {'loss': 0.048, 'learning_rate': 8.782222222222223e-06, 'epoch': 6.01} {'loss': 0.034, 'learning_rate': 8.726666666666667e-06, 'epoch': 6.02} {'loss': 0.0415, 'learning_rate': 8.671111111111113e-06, 'epoch': 6.02} 
{'loss': 0.0402, 'learning_rate': 8.615555555555555e-06, 'epoch': 6.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.05it/s] Reading metadata...: 14652it [00:01, 19129.68it/s] Reading metadata...: 23270it [00:01, 13633.81it/s] Reading metadata...: 28043it [00:01, 14523.88it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.19it/s] Reading metadata...: 10438it [00:00, 11491.98it/s] 23%|██▎ | 1145/5000 [9:34:08<39:27:24, 36.85s/it] 23%|██▎ | 1146/5000 [9:34:36<36:36:12, 34.19s/it] 23%|██▎ | 1147/5000 [9:35:04<34:35:24, 32.32s/it] 23%|██▎ | 1148/5000 [9:35:33<33:24:02, 31.22s/it] 23%|██▎ | 1149/5000 [9:36:00<32:12:01, 30.10s/it] 23%|██▎ | 1150/5000 [9:36:28<31:22:23, 29.34s/it] 23%|██▎ | 1150/5000 [9:36:28<31:22:23, 29.34s/it] 23%|██▎ | 1151/5000 [9:36:56<30:57:29, 28.96s/it] 23%|██▎ | 1152/5000 [9:37:25<31:06:31, 29.10s/it] 23%|██▎ | 1153/5000 [9:37:54<30:52:06, 28.89s/it] 23%|██▎ | 1154/5000 [9:38:21<30:24:52, 28.47s/it] 23%|██▎ | 1155/5000 [9:38:49<30:20:14, 28.40s/it] 23%|██▎ | 1156/5000 [9:39:18<30:20:51, 28.42s/it] 23%|██▎ | 1157/5000 [9:39:45<30:01:50, 28.13s/it] 23%|██▎ | 1158/5000 [9:40:17<31:17:11, 29.32s/it] 23%|██▎ | 1159/5000 [9:40:46<31:04:33, 29.13s/it] 23%|██▎ | 1160/5000 [9:41:14<30:47:06, 28.86s/it] 23%|██▎ | 1161/5000 [9:41:42<30:31:23, 28.62s/it] 23%|██▎ | 1162/5000 [9:42:09<30:00:08, 28.14s/it] 23%|██▎ | 1163/5000 [9:42:38<30:07:34, 28.27s/it] 23%|██▎ | 1164/5000 [9:43:07<30:22:38, 28.51s/it] 23%|██▎ | 1165/5000 [9:43:34<29:54:41, 28.08s/it] 23%|██▎ | 1166/5000 [9:44:03<29:59:17, 28.16s/it] 23%|██▎ | 1167/5000 [9:44:30<29:46:29, 27.96s/it] 23%|██▎ | 1168/5000 [9:44:57<29:33:03, 27.76s/it] 23%|██▎ | 1169/5000 [9:45:26<29:42:04, 27.91s/it] 23%|██▎ | 1170/5000 [9:45:54<29:43:32, 27.94s/it] 23%|██▎ | 1171/5000 [9:46:21<29:24:42, 27.65s/it] 23%|██▎ | 1172/5000 [9:46:48<29:28:19, 27.72s/it] 23%|██▎ | 1173/5000 [9:47:16<29:26:16, 27.69s/it] 23%|██▎ | 1174/5000 [9:47:44<29:21:08, 27.62s/it] 24%|██▎ 
| 1175/5000 [9:48:12<29:43:46, 27.98s/it] 24%|██▎ | 1175/5000 [9:48:12<29:43:46, 27.98s/it] 24%|██▎ | 1176/5000 [9:48:40<29:35:28, 27.86s/it] 24%|██▎ | 1177/5000 [9:49:07<29:24:07, 27.69s/it] 24%|██▎ | 1178/5000 [9:49:37<30:00:47, 28.27s/it] 24%|██▎ | 1179/5000 [9:50:04<29:40:06, 27.95s/it] 24%|██▎ | 1180/5000 [9:50:31<29:28:19, 27.77s/it] 24%|██▎ | 1181/5000 [9:51:00<29:45:25, 28.05s/it] 24%|██▎ | 1182/5000 [9:51:27<29:27:17, 27.77s/it] 24%|██▎ | 1183/5000 [9:51:55<29:18:55, 27.65s/it] 24%|██▎ | 1184/5000 [9:52:22<29:17:14, 27.63s/it] 24%|██▎ | 1185/5000 [9:52:51<29:35:04, 27.92s/it] 24%|██▎ | 1186/5000 [9:53:18<29:12:33, 27.57s/it] 24%|██▎ | 1187/5000 [9:53:46<29:28:17, 27.83s/it] 24%|██▍ | 1188/5000 [9:54:14<29:28:42, 27.84s/it] 24%|██▍ | 1189/5000 [9:54:41<29:19:26, 27.70s/it] 24%|██▍ | 1190/5000 [9:55:09<29:21:17, 27.74s/it] 24%|██▍ | 1191/5000 [9:55:36<29:16:01, 27.66s/it] 24%|██▍ | 1192/5000 [9:56:05<29:27:45, 27.85s/it] 24%|██▍ | 1193/5000 [9:56:32<29:21:57, 27.77s/it] 24%|██▍ | 1194/5000 [9:57:00<29:12:55, 27.63s/it] 24%|██▍ | 1195/5000 [9:57:28<29:26:52, 27.86s/it] 24%|██▍ | 1196/5000 [9:57:55<29:17:25, 27.72s/it] 24%|██▍ | 1197/5000 [9:58:28<30:39:41, 29.02s/it] 24%|██▍ | 1198/5000 [9:58:55<30:08:15, 28.54s/it] 24%|██▍ | 1199/5000 [9:59:22<29:47:08, 28.21s/it] 24%|██▍ | 1200/5000 [9:59:51<29:47:38, 28.23s/it] 24%|██▍ | 1200/5000 [9:59:51<29:47:38, 28.23s/it] 24%|██▍ | 1201/5000 [10:00:19<29:50:16, 28.27s/it] 24%|██▍ | 1202/5000 [10:00:46<29:28:02, 27.93s/it] 24%|██▍ | 1203/5000 [10:01:16<30:07:10, 28.56s/it] 24%|██▍ | 1204/5000 [10:01:44<29:44:04, 28.20s/it] 24%|██▍ | 1205/5000 [10:02:10<29:11:59, 27.70s/it] 24%|██▍ | 1206/5000 [10:02:39<29:31:12, 28.01s/it] 24%|██▍ | 1207/5000 [10:03:06<29:21:01, 27.86s/it] 24%|██▍ | 1208/5000 [10:03:34<29:12:16, 27.73s/it] 24%|██▍ | 1209/5000 [10:04:02<29:14:38, 27.77s/it] 24%|██▍ | 1210/5000 [10:04:29<29:15:48, 27.80s/it] 24%|██▍ | 1211/5000 [10:04:57<29:09:36, 27.71s/it] 24%|██▍ | 1212/5000 [10:05:25<29:14:42, 
27.79s/it] 24%|██▍ | 1213/5000 [10:05:54<29:29:44, 28.04s/it] 24%|██▍ | 1214/5000 [10:06:21<29:14:43, 27.81s/it] 24%|██▍ | 1215/5000 [10:06:49<29:27:37, 28.02s/it] 24%|██▍ | 1216/5000 [10:07:17<29:20:27, 27.91s/it] 24%|██▍ | 1217/5000 [10:07:44<29:05:54, 27.69s/it] 24%|██▍ | 1218/5000 [10:08:13<29:28:06, 28.05s/it] 24%|██▍ | 1219/5000 [10:08:42<29:37:11, 28.20s/it] 24%|██▍ | 1220/5000 [10:09:10<29:38:02, 28.22s/it] 24%|██▍ | 1221/5000 [10:09:37<29:23:29, 28.00s/it] 24%|██▍ | 1222/5000 [10:10:05<29:17:45, 27.92s/it] 24%|██▍ | 1223/5000 [10:10:34<29:29:06, 28.10s/it] 24%|██▍ | 1224/5000 [10:11:01<29:19:05, 27.95s/it] 24%|██▍ | 1225/5000 [10:11:33<30:38:14, 29.22s/it] 24%|██▍ | 1225/5000 [10:11:33<30:38:14, 29.22s/it] 25%|██▍ | 1226/5000 [10:12:06<31:41:12, 30.23s/it] 25%|██▍ | 1227/5000 [10:12:33<30:48:50, 29.40s/it] 25%|██▍ | 1228/5000 [10:13:02<30:26:55, 29.06s/it] 25%|██▍ | 1229/5000 [10:13:29<30:01:58, 28.67s/it] 25%|██▍ | 1230/5000 [10:13:57<29:40:08, 28.33s/it] 25%|██▍ | 1231/5000 [10:14:26<29:43:02, 28.38s/it] 25%|██▍ | 1232/5000 [10:14:54<29:44:59, 28.42s/it] 25%|██▍ | 1233/5000 [10:15:22<29:28:20, 28.17s/it] 25%|██▍ | 1234/5000 [10:15:50<29:33:45, 28.26s/it] 25%|██▍ | 1235/5000 [10:16:18<29:25:23, 28.13s/it] 25%|██▍ | 1236/5000 [10:16:45<29:09:52, 27.89s/it] 25%|██▍ | 1237/5000 [10:17:14<29:28:07, 28.19s/it] 25%|██▍ | 1238/5000 [10:17:43<29:34:26, 28.30s/it] 25%|██▍ | 1239/5000 [10:18:10<29:20:32, 28.09s/it] 25%|██▍ | 1240/5000 [10:18:38<29:15:22, 28.01s/it] 25%|██▍ | 1241/5000 [10:19:07<29:26:21, 28.19s/it] 25%|██▍ | 1242/5000 [10:19:34<29:11:01, 27.96s/it] 25%|██▍ | 1243/5000 [10:20:02<29:14:14, 28.02s/it] 25%|██▍ | 1244/5000 [10:20:32<29:41:58, 28.47s/it] 25%|██▍ | 1245/5000 [10:21:00<29:39:22, 28.43s/it] 25%|██▍ | 1246/5000 [10:21:28<29:18:40, 28.11s/it] 25%|██▍ | 1247/5000 [10:21:56<29:20:01, 28.14s/it] 25%|██▍ | 1248/5000 [10:22:24<29:23:37, 28.20s/it] 25%|██▍ | 1249/5000 [10:22:52<29:08:23, 27.97s/it] 25%|██▌ | 1250/5000 [10:23:19<28:54:41, 27.75s/it] 
25%|██▌ | 1250/5000 [10:23:19<28:54:41, 27.75s/it] 25%|██▌ | 1251/5000 [10:23:47<29:01:29, 27.87s/it] 25%|██▌ | 1252/5000 [10:24:14<28:50:15, 27.70s/it] 25%|██▌ | 1253/5000 [10:24:41<28:39:31, 27.53s/it] 25%|██▌ | 1254/5000 [10:25:10<29:06:11, 27.97s/it] 25%|██▌ | 1255/5000 [10:25:38<28:52:51, 27.76s/it] 25%|██▌ | 1256/5000 [10:26:05<28:41:32, 27.59s/it] 25%|██▌ | 1257/5000 [10:26:34<29:16:45, 28.16s/it] 25%|██▌ | 1258/5000 [10:27:02<29:00:11, 27.90s/it] 25%|██▌ | 1259/5000 [10:27:34<30:29:19, 29.34s/it] 25%|██▌ | 1260/5000 [10:28:03<30:23:14, 29.25s/it] 25%|██▌ | 1261/5000 [10:28:32<30:06:56, 29.00s/it] 25%|██▌ | 1262/5000 [10:28:59<29:33:12, 28.46s/it] 25%|██▌ | 1263/5000 [10:29:28<29:37:23, 28.54s/it] 25%|██▌ | 1264/5000 [10:29:56<29:31:45, 28.45s/it] 25%|██▌ | 1265/5000 [10:30:23<29:09:47, 28.11s/it] 25%|██▌ | 1266/5000 [10:30:53<29:31:51, 28.47s/it] 25%|██▌ | 1267/5000 [10:31:20<29:17:45, 28.25s/it] 25%|██▌ | 1268/5000 [10:31:48<29:01:45, 28.00s/it] 25%|██▌ | 1269/5000 [10:32:19<29:57:39, 28.91s/it] 25%|██▌ | 1270/5000 [10:32:47<29:37:03, 28.59s/it] 25%|██▌ | 1271/5000 [10:33:14<29:16:45, 28.27s/it] 25%|██▌ | 1272/5000 [10:33:41<28:57:17, 27.96s/it] 25%|██▌ | 1273/5000 [10:34:11<29:25:01, 28.41s/it] 25%|██▌ | 1274/5000 [10:34:38<29:04:30, 28.09s/it] 26%|██▌ | 1275/5000 [10:35:05<28:48:22, 27.84s/it] 26%|██▌ | 1275/5000 [10:35:05<28:48:22, 27.84s/it] 26%|██▌ | 1276/5000 [10:35:35<29:21:57, 28.39s/it] 26%|██▌ | 1277/5000 [10:36:02<29:00:15, 28.05s/it] 26%|██▌ | 1278/5000 [10:36:29<28:42:16, 27.76s/it] 26%|██▌ | 1279/5000 [10:37:01<29:54:01, 28.93s/it] 26%|██▌ | 1280/5000 [10:37:28<29:21:26, 28.41s/it] 26%|██▌ | 1281/5000 [10:37:56<28:59:06, 28.06s/it] 26%|██▌ | 1282/5000 [10:38:24<28:58:21, 28.05s/it] 26%|██▌ | 1283/5000 [10:38:52<29:02:29, 28.13s/it] 26%|██▌ | 1284/5000 [10:39:19<28:46:09, 27.87s/it] 26%|██▌ | 1285/5000 [10:39:46<28:25:42, 27.55s/it] 26%|██▌ | 1286/5000 [10:40:14<28:31:23, 27.65s/it] 26%|██▌ | 1287/5000 [10:40:41<28:24:06, 27.54s/it] 26%|██▌ | 
1288/5000 [10:41:11<28:59:54, 28.12s/it] 26%|██▌ | 1289/5000 [10:41:38<28:53:49, 28.03s/it] 26%|██▌ | 1290/5000 [10:42:06<28:38:34, 27.79s/it] 26%|██▌ | 1291/5000 [10:42:34<28:48:10, 27.96s/it] 26%|██▌ | 1292/5000 [10:43:02<28:48:21, 27.97s/it] 26%|██▌ | 1293/5000 [10:43:29<28:36:59, 27.79s/it] 26%|██▌ | 1294/5000 [10:43:58<28:51:53, 28.04s/it] 26%|██▌ | 1295/5000 [10:44:25<28:38:48, 27.84s/it] 26%|██▌ | 1296/5000 [10:44:53<28:29:34, 27.69s/it] 26%|██▌ | 1297/5000 [10:45:22<28:57:55, 28.16s/it] 26%|██▌ | 1298/5000 [10:45:49<28:43:09, 27.93s/it] 26%|██▌ | 1299/5000 [10:46:17<28:38:13, 27.86s/it] 26%|██▌ | 1300/5000 [10:46:45<28:46:40, 28.00s/it] 26%|██▌ | 1300/5000 [10:46:45<28:46:40, 28.00s/it] 26%|██▌ | 1301/5000 [10:47:12<28:29:08, 27.72s/it] 26%|██▌ | 1302/5000 [10:47:40<28:18:07, 27.55s/it] 26%|██▌ | 1303/5000 [10:48:09<28:49:19, 28.07s/it] 26%|██▌ | 1304/5000 [10:48:31<27:00:51, 26.31s/it] 26%|██▌ | 1305/5000 [10:48:42<22:15:13, 21.68s/it] 26%|██▌ | 1306/5000 [10:48:53<18:52:30, 18.39s/it] 26%|██▌ | 1307/5000 [10:49:04<16:33:11, 16.14s/it] 26%|██▌ | 1308/5000 [10:49:12<14:02:21, 13.69s/it]{'loss': 0.0321, 'learning_rate': 8.560000000000001e-06, 'epoch': 7.0} {'loss': 0.0312, 'learning_rate': 8.504444444444445e-06, 'epoch': 7.01} {'loss': 0.0309, 'learning_rate': 8.448888888888889e-06, 'epoch': 7.01} {'loss': 0.0235, 'learning_rate': 8.393333333333335e-06, 'epoch': 7.02} {'loss': 0.0225, 'learning_rate': 8.337777777777777e-06, 'epoch': 7.02} {'loss': 0.0255, 'learning_rate': 8.282222222222223e-06, 'epoch': 7.03} {'loss': 0.0221, 'learning_rate': 8.226666666666667e-06, 'epoch': 7.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.02it/s] Reading metadata...: 15160it [00:01, 19331.91it/s] Reading metadata...: 24077it [00:01, 19421.87it/s] Reading metadata...: 28043it [00:01, 17982.36it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.02it/s] Reading metadata...: 10438it [00:00, 26165.49it/s] 26%|██▌ | 
1309/5000 [10:50:51<40:30:13, 39.51s/it] 26%|██▌ | 1310/5000 [10:51:19<36:51:28, 35.96s/it] 26%|██▌ | 1311/5000 [10:51:46<34:15:18, 33.43s/it] 26%|██▌ | 1312/5000 [10:52:15<32:48:56, 32.03s/it] 26%|██▋ | 1313/5000 [10:52:44<31:42:44, 30.96s/it] 26%|██▋ | 1314/5000 [10:53:12<30:44:16, 30.02s/it] 26%|██▋ | 1315/5000 [10:53:38<29:47:00, 29.10s/it] 26%|██▋ | 1316/5000 [10:54:07<29:30:34, 28.84s/it] 26%|██▋ | 1317/5000 [10:54:35<29:23:11, 28.72s/it] 26%|██▋ | 1318/5000 [10:55:02<28:47:46, 28.15s/it] 26%|██▋ | 1319/5000 [10:55:29<28:34:03, 27.94s/it] 26%|██▋ | 1320/5000 [10:55:57<28:19:25, 27.71s/it] 26%|██▋ | 1321/5000 [10:56:24<28:13:39, 27.62s/it] 26%|██▋ | 1322/5000 [10:56:52<28:27:31, 27.86s/it] 26%|██▋ | 1323/5000 [10:57:20<28:20:03, 27.74s/it] 26%|██▋ | 1324/5000 [10:57:48<28:34:52, 27.99s/it] 26%|██▋ | 1325/5000 [10:58:18<29:02:53, 28.46s/it] 26%|██▋ | 1325/5000 [10:58:18<29:02:53, 28.46s/it] 27%|██▋ | 1326/5000 [10:58:45<28:42:25, 28.13s/it] 27%|██▋ | 1327/5000 [10:59:14<28:55:53, 28.36s/it] 27%|██▋ | 1328/5000 [10:59:42<28:36:24, 28.05s/it] 27%|██▋ | 1329/5000 [11:00:09<28:20:46, 27.80s/it] 27%|██▋ | 1330/5000 [11:00:37<28:19:44, 27.79s/it] 27%|██▋ | 1331/5000 [11:01:04<28:12:05, 27.67s/it] 27%|██▋ | 1332/5000 [11:01:31<27:58:19, 27.45s/it] 27%|██▋ | 1333/5000 [11:02:00<28:22:16, 27.85s/it] 27%|██▋ | 1334/5000 [11:02:27<28:12:17, 27.70s/it] 27%|██▋ | 1335/5000 [11:02:54<28:02:03, 27.54s/it] 27%|██▋ | 1336/5000 [11:03:23<28:19:08, 27.82s/it] 27%|██▋ | 1337/5000 [11:03:50<28:14:55, 27.76s/it] 27%|██▋ | 1338/5000 [11:04:18<28:08:07, 27.66s/it] 27%|██▋ | 1339/5000 [11:04:46<28:27:51, 27.99s/it] 27%|██▋ | 1340/5000 [11:05:14<28:17:34, 27.83s/it] 27%|██▋ | 1341/5000 [11:05:41<28:04:00, 27.61s/it] 27%|██▋ | 1342/5000 [11:06:11<28:41:41, 28.24s/it] 27%|██▋ | 1343/5000 [11:06:37<28:07:48, 27.69s/it] 27%|██▋ | 1344/5000 [11:07:04<27:55:34, 27.50s/it] 27%|██▋ | 1345/5000 [11:07:33<28:16:45, 27.85s/it] 27%|██▋ | 1346/5000 [11:08:00<28:10:19, 27.76s/it] 27%|██▋ | 1347/5000 
[11:08:28<27:58:30, 27.57s/it] 27%|██▋ | 1348/5000 [11:08:57<28:30:17, 28.10s/it] 27%|██▋ | 1349/5000 [11:09:24<28:12:24, 27.81s/it] 27%|██▋ | 1350/5000 [11:09:55<29:06:48, 28.71s/it] 27%|██▋ | 1350/5000 [11:09:55<29:06:48, 28.71s/it] 27%|██▋ | 1351/5000 [11:10:24<29:08:35, 28.75s/it] 27%|██▋ | 1352/5000 [11:10:51<28:39:20, 28.28s/it] 27%|██▋ | 1353/5000 [11:11:24<29:59:24, 29.60s/it] 27%|██▋ | 1354/5000 [11:11:51<29:19:51, 28.96s/it] 27%|██▋ | 1355/5000 [11:12:18<28:48:42, 28.46s/it] 27%|██▋ | 1356/5000 [11:12:46<28:43:07, 28.37s/it] 27%|██▋ | 1357/5000 [11:13:14<28:23:55, 28.06s/it] 27%|██▋ | 1358/5000 [11:13:41<28:15:04, 27.93s/it] 27%|██▋ | 1359/5000 [11:14:09<28:17:31, 27.97s/it] 27%|██▋ | 1360/5000 [11:14:37<28:05:50, 27.79s/it] 27%|██▋ | 1361/5000 [11:15:05<28:16:14, 27.97s/it] 27%|██▋ | 1362/5000 [11:15:33<28:04:12, 27.78s/it] 27%|██▋ | 1363/5000 [11:16:00<27:57:20, 27.67s/it] 27%|██▋ | 1364/5000 [11:16:29<28:24:49, 28.13s/it] 27%|██▋ | 1365/5000 [11:16:56<28:03:22, 27.79s/it] 27%|██▋ | 1366/5000 [11:17:23<27:53:22, 27.63s/it] 27%|██▋ | 1367/5000 [11:17:52<28:03:10, 27.80s/it] 27%|██▋ | 1368/5000 [11:18:18<27:42:37, 27.47s/it] 27%|██▋ | 1369/5000 [11:18:45<27:30:31, 27.27s/it] 27%|██▋ | 1370/5000 [11:19:15<28:13:04, 27.98s/it] 27%|██▋ | 1371/5000 [11:19:42<27:57:43, 27.74s/it] 27%|██▋ | 1372/5000 [11:20:08<27:32:56, 27.34s/it] 27%|██▋ | 1373/5000 [11:20:38<28:06:40, 27.90s/it] 27%|██▋ | 1374/5000 [11:21:05<27:52:56, 27.68s/it] 28%|██▊ | 1375/5000 [11:21:33<28:04:21, 27.88s/it] 28%|██▊ | 1375/5000 [11:21:33<28:04:21, 27.88s/it] 28%|██▊ | 1376/5000 [11:22:01<28:08:49, 27.96s/it] 28%|██▊ | 1377/5000 [11:22:29<27:59:07, 27.81s/it] 28%|██▊ | 1378/5000 [11:22:57<28:08:09, 27.96s/it] 28%|██▊ | 1379/5000 [11:23:24<27:54:55, 27.75s/it] 28%|██▊ | 1380/5000 [11:23:52<27:58:23, 27.82s/it] 28%|██▊ | 1381/5000 [11:24:20<28:04:00, 27.92s/it] 28%|██▊ | 1382/5000 [11:24:47<27:48:29, 27.67s/it] 28%|██▊ | 1383/5000 [11:25:15<27:44:58, 27.62s/it] 28%|██▊ | 1384/5000 
[11:25:42<27:26:25, 27.32s/it] 28%|██▊ | 1385/5000 [11:26:09<27:23:13, 27.27s/it] 28%|██▊ | 1386/5000 [11:26:37<27:44:26, 27.63s/it] 28%|██▊ | 1387/5000 [11:27:05<27:50:53, 27.75s/it] 28%|██▊ | 1388/5000 [11:27:32<27:35:04, 27.49s/it] 28%|██▊ | 1389/5000 [11:28:00<27:46:04, 27.68s/it] 28%|██▊ | 1390/5000 [11:28:29<27:59:32, 27.91s/it] 28%|██▊ | 1391/5000 [11:28:56<27:47:18, 27.72s/it] 28%|██▊ | 1392/5000 [11:29:24<27:54:20, 27.84s/it] 28%|██▊ | 1393/5000 [11:29:52<27:55:22, 27.87s/it] 28%|██▊ | 1394/5000 [11:30:21<28:13:02, 28.17s/it] 28%|██▊ | 1395/5000 [11:30:48<27:51:33, 27.82s/it] 28%|██▊ | 1396/5000 [11:31:16<27:58:08, 27.94s/it] 28%|██▊ | 1397/5000 [11:31:45<28:12:19, 28.18s/it] 28%|██▊ | 1398/5000 [11:32:12<27:57:18, 27.94s/it] 28%|██▊ | 1399/5000 [11:32:41<28:02:08, 28.03s/it] 28%|██▊ | 1400/5000 [11:33:09<28:05:10, 28.09s/it] 28%|██▊ | 1400/5000 [11:33:09<28:05:10, 28.09s/it] 28%|██▊ | 1401/5000 [11:33:36<27:53:26, 27.90s/it] 28%|██▊ | 1402/5000 [11:34:05<28:02:49, 28.06s/it] 28%|██▊ | 1403/5000 [11:34:33<28:06:52, 28.14s/it] 28%|██▊ | 1404/5000 [11:35:01<27:55:53, 27.96s/it] 28%|██▊ | 1405/5000 [11:35:29<28:01:31, 28.06s/it] 28%|██▊ | 1406/5000 [11:35:57<28:04:24, 28.12s/it] 28%|██▊ | 1407/5000 [11:36:24<27:51:24, 27.91s/it] 28%|██▊ | 1408/5000 [11:36:52<27:49:08, 27.88s/it] 28%|██▊ | 1409/5000 [11:37:21<28:01:06, 28.09s/it] 28%|██▊ | 1410/5000 [11:37:49<27:53:02, 27.96s/it] 28%|██▊ | 1411/5000 [11:38:18<28:12:46, 28.30s/it] 28%|██▊ | 1412/5000 [11:38:45<27:53:21, 27.98s/it] 28%|██▊ | 1413/5000 [11:39:12<27:34:11, 27.67s/it] 28%|██▊ | 1414/5000 [11:39:41<28:00:46, 28.12s/it] 28%|██▊ | 1415/5000 [11:40:08<27:41:46, 27.81s/it] 28%|██▊ | 1416/5000 [11:40:35<27:26:29, 27.56s/it] 28%|██▊ | 1417/5000 [11:41:03<27:28:07, 27.60s/it] 28%|██▊ | 1418/5000 [11:41:31<27:36:52, 27.75s/it] 28%|██▊ | 1419/5000 [11:41:58<27:30:02, 27.65s/it] 28%|██▊ | 1420/5000 [11:42:27<27:40:21, 27.83s/it] 28%|██▊ | 1421/5000 [11:42:55<27:45:15, 27.92s/it] 28%|██▊ | 1422/5000 
[11:43:22<27:34:57, 27.75s/it] 28%|██▊ | 1423/5000 [11:43:50<27:44:26, 27.92s/it] 28%|██▊ | 1424/5000 [11:44:18<27:42:26, 27.89s/it] 28%|██▊ | 1425/5000 [11:44:46<27:43:38, 27.92s/it] 28%|██▊ | 1425/5000 [11:44:46<27:43:38, 27.92s/it] 29%|██▊ | 1426/5000 [11:45:14<27:33:33, 27.76s/it] 29%|██▊ | 1427/5000 [11:45:43<27:55:02, 28.13s/it] 29%|██▊ | 1428/5000 [11:46:10<27:38:56, 27.87s/it] 29%|██▊ | 1429/5000 [11:46:37<27:22:36, 27.60s/it] 29%|██▊ | 1430/5000 [11:47:05<27:38:14, 27.87s/it] 29%|██▊ | 1431/5000 [11:47:32<27:24:41, 27.65s/it] 29%|██▊ | 1432/5000 [11:48:00<27:15:12, 27.50s/it] 29%|██▊ | 1433/5000 [11:48:32<28:37:24, 28.89s/it] 29%|██▊ | 1434/5000 [11:48:59<28:09:46, 28.43s/it] 29%|██▊ | 1435/5000 [11:49:26<27:50:08, 28.11s/it] 29%|██▊ | 1436/5000 [11:49:55<27:58:10, 28.25s/it] 29%|██▊ | 1437/5000 [11:50:22<27:38:38, 27.93s/it] 29%|██▉ | 1438/5000 [11:50:49<27:26:56, 27.74s/it] 29%|██▉ | 1439/5000 [11:51:17<27:18:33, 27.61s/it] 29%|██▉ | 1440/5000 [11:51:47<28:03:57, 28.38s/it] 29%|██▉ | 1441/5000 [11:52:14<27:49:17, 28.14s/it] 29%|██▉ | 1442/5000 [11:52:42<27:39:29, 27.98s/it] 29%|██▉ | 1443/5000 [11:53:11<27:51:56, 28.20s/it] 29%|██▉ | 1444/5000 [11:53:38<27:39:04, 27.99s/it] 29%|██▉ | 1445/5000 [11:54:05<27:18:37, 27.66s/it] 29%|██▉ | 1446/5000 [11:54:35<28:01:38, 28.39s/it] 29%|██▉ | 1447/5000 [11:55:02<27:38:57, 28.02s/it] 29%|██▉ | 1448/5000 [11:55:30<27:21:35, 27.73s/it] 29%|██▉ | 1449/5000 [11:55:59<27:53:39, 28.28s/it] 29%|██▉ | 1450/5000 [11:56:27<27:40:17, 28.06s/it] 29%|██▉ | 1450/5000 [11:56:27<27:40:17, 28.06s/it] 29%|██▉ | 1451/5000 [11:56:54<27:26:28, 27.84s/it] 29%|██▉ | 1452/5000 [11:57:23<27:48:44, 28.22s/it] 29%|██▉ | 1453/5000 [11:57:50<27:30:27, 27.92s/it] 29%|██▉ | 1454/5000 [11:58:16<26:58:29, 27.39s/it] 29%|██▉ | 1455/5000 [11:58:46<27:28:33, 27.90s/it] 29%|██▉ | 1456/5000 [11:59:13<27:13:41, 27.66s/it] 29%|██▉ | 1457/5000 [11:59:40<27:05:51, 27.53s/it] 29%|██▉ | 1458/5000 [12:00:09<27:32:16, 27.99s/it] 29%|██▉ | 1459/5000 
[12:00:36<27:17:05, 27.74s/it] 29%|██▉ | 1460/5000 [12:01:04<27:11:50, 27.66s/it] 29%|██▉ | 1461/5000 [12:01:32<27:30:47, 27.99s/it] 29%|██▉ | 1462/5000 [12:02:00<27:19:06, 27.80s/it] 29%|██▉ | 1463/5000 [12:02:27<27:11:13, 27.67s/it] 29%|██▉ | 1464/5000 [12:02:57<27:43:10, 28.22s/it] 29%|██▉ | 1465/5000 [12:03:24<27:28:50, 27.99s/it] 29%|██▉ | 1466/5000 [12:03:51<27:17:06, 27.79s/it] 29%|██▉ | 1467/5000 [12:04:21<27:47:06, 28.31s/it] 29%|██▉ | 1468/5000 [12:04:35<23:32:33, 24.00s/it] 29%|██▉ | 1469/5000 [12:04:45<19:37:47, 20.01s/it] 29%|██▉ | 1470/5000 [12:04:56<16:57:46, 17.30s/it] 29%|██▉ | 1471/5000 [12:05:07<15:02:59, 15.35s/it]{'loss': 0.0189, 'learning_rate': 8.171111111111113e-06, 'epoch': 8.0} {'loss': 0.0201, 'learning_rate': 8.115555555555557e-06, 'epoch': 8.01} {'loss': 0.0182, 'learning_rate': 8.06e-06, 'epoch': 8.01} {'loss': 0.013, 'learning_rate': 8.004444444444445e-06, 'epoch': 8.02} {'loss': 0.0148, 'learning_rate': 7.948888888888889e-06, 'epoch': 8.02} {'loss': 0.016, 'learning_rate': 7.893333333333335e-06, 'epoch': 8.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.09it/s] Reading metadata...: 14729it [00:01, 19892.17it/s] Reading metadata...: 23393it [00:02, 12147.21it/s] Reading metadata...: 28043it [00:02, 13424.59it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.07it/s] Reading metadata...: 10438it [00:00, 26428.08it/s] 29%|██▉ | 1472/5000 [12:06:37<36:46:56, 37.53s/it] 29%|██▉ | 1473/5000 [12:07:05<34:05:45, 34.80s/it] 29%|██▉ | 1474/5000 [12:07:33<32:09:36, 32.83s/it] 30%|██▉ | 1475/5000 [12:08:01<30:35:51, 31.25s/it] 30%|██▉ | 1475/5000 [12:08:01<30:35:51, 31.25s/it] 30%|██▉ | 1476/5000 [12:08:27<29:14:42, 29.88s/it] 30%|██▉ | 1477/5000 [12:08:56<28:45:55, 29.39s/it] 30%|██▉ | 1478/5000 [12:09:24<28:32:11, 29.17s/it] 30%|██▉ | 1479/5000 [12:09:53<28:20:02, 28.97s/it] 30%|██▉ | 1480/5000 [12:10:21<28:00:31, 28.65s/it] 30%|██▉ | 1481/5000 [12:10:48<27:27:23, 28.09s/it] 30%|██▉ | 
1482/5000 [12:11:16<27:26:09, 28.08s/it] 30%|██▉ | 1483/5000 [12:11:44<27:37:15, 28.27s/it] 30%|██▉ | 1484/5000 [12:12:12<27:25:44, 28.08s/it] 30%|██▉ | 1485/5000 [12:12:40<27:27:43, 28.13s/it] 30%|██▉ | 1486/5000 [12:13:08<27:29:56, 28.17s/it] 30%|██▉ | 1487/5000 [12:13:37<27:34:55, 28.27s/it] 30%|██▉ | 1488/5000 [12:14:05<27:37:15, 28.31s/it] 30%|██▉ | 1489/5000 [12:14:33<27:21:52, 28.06s/it] 30%|██▉ | 1490/5000 [12:15:06<28:46:38, 29.52s/it] 30%|██▉ | 1491/5000 [12:15:35<28:49:28, 29.57s/it] 30%|██▉ | 1492/5000 [12:16:03<28:10:52, 28.92s/it] 30%|██▉ | 1493/5000 [12:16:31<28:03:07, 28.80s/it] 30%|██▉ | 1494/5000 [12:17:00<27:58:07, 28.72s/it] 30%|██▉ | 1495/5000 [12:17:27<27:30:58, 28.26s/it] 30%|██▉ | 1496/5000 [12:17:56<27:39:50, 28.42s/it] 30%|██▉ | 1497/5000 [12:18:24<27:28:43, 28.24s/it] 30%|██▉ | 1498/5000 [12:18:51<27:10:42, 27.94s/it] 30%|██▉ | 1499/5000 [12:19:20<27:21:26, 28.13s/it] 30%|███ | 1500/5000 [12:19:47<27:11:41, 27.97s/it] 30%|███ | 1500/5000 [12:19:47<27:11:41, 27.97s/it] 30%|███ | 1501/5000 [12:20:15<27:01:40, 27.81s/it] 30%|███ | 1502/5000 [12:20:43<27:12:26, 28.00s/it] 30%|███ | 1503/5000 [12:21:11<27:04:51, 27.88s/it] 30%|███ | 1504/5000 [12:21:38<26:58:32, 27.78s/it] 30%|███ | 1505/5000 [12:22:07<27:24:25, 28.23s/it] 30%|███ | 1506/5000 [12:22:35<27:07:09, 27.94s/it] 30%|███ | 1507/5000 [12:23:02<26:52:11, 27.69s/it] 30%|███ | 1508/5000 [12:23:30<27:08:51, 27.99s/it] 30%|███ | 1509/5000 [12:23:59<27:13:40, 28.08s/it] 30%|███ | 1510/5000 [12:24:26<27:06:58, 27.97s/it] 30%|███ | 1511/5000 [12:24:55<27:21:39, 28.23s/it] 30%|███ | 1512/5000 [12:25:24<27:23:35, 28.27s/it] 30%|███ | 1513/5000 [12:25:50<26:56:58, 27.82s/it] 30%|███ | 1514/5000 [12:26:19<27:08:41, 28.03s/it] 30%|███ | 1515/5000 [12:26:47<27:14:18, 28.14s/it] 30%|███ | 1516/5000 [12:27:14<26:55:57, 27.83s/it] 30%|███ | 1517/5000 [12:27:43<27:07:38, 28.04s/it] 30%|███ | 1518/5000 [12:28:11<27:02:39, 27.96s/it] 30%|███ | 1519/5000 [12:28:38<26:54:32, 27.83s/it] 30%|███ | 1520/5000 
[12:29:06<26:44:12, 27.66s/it] 30%|███ | 1521/5000 [12:29:33<26:39:53, 27.59s/it] 30%|███ | 1522/5000 [12:30:03<27:21:36, 28.32s/it] 30%|███ | 1523/5000 [12:30:31<27:07:24, 28.08s/it] 30%|███ | 1524/5000 [12:31:04<28:34:47, 29.60s/it] 30%|███ | 1525/5000 [12:31:31<27:52:56, 28.89s/it] 30%|███ | 1525/5000 [12:31:31<27:52:56, 28.89s/it] 31%|███ | 1526/5000 [12:31:57<27:01:16, 28.00s/it] 31%|███ | 1527/5000 [12:32:25<27:06:31, 28.10s/it] 31%|███ | 1528/5000 [12:32:53<26:54:47, 27.91s/it] 31%|███ | 1529/5000 [12:33:20<26:42:00, 27.69s/it] 31%|███ | 1530/5000 [12:33:49<26:59:04, 28.00s/it] 31%|███ | 1531/5000 [12:34:16<26:44:41, 27.75s/it] 31%|███ | 1532/5000 [12:34:43<26:36:35, 27.62s/it] 31%|███ | 1533/5000 [12:35:12<27:00:56, 28.05s/it] 31%|███ | 1534/5000 [12:35:39<26:46:28, 27.81s/it] 31%|███ | 1535/5000 [12:36:06<26:34:27, 27.61s/it] 31%|███ | 1536/5000 [12:36:35<26:47:15, 27.84s/it] 31%|███ | 1537/5000 [12:37:03<26:48:11, 27.86s/it] 31%|███ | 1538/5000 [12:37:30<26:33:13, 27.61s/it] 31%|███ | 1539/5000 [12:37:58<26:36:55, 27.68s/it] 31%|███ | 1540/5000 [12:38:26<26:50:03, 27.92s/it] 31%|███ | 1541/5000 [12:38:54<26:41:16, 27.78s/it] 31%|███ | 1542/5000 [12:39:27<28:19:05, 29.48s/it] 31%|███ | 1543/5000 [12:39:55<27:53:20, 29.04s/it] 31%|███ | 1544/5000 [12:40:22<27:22:52, 28.52s/it] 31%|███ | 1545/5000 [12:40:52<27:50:04, 29.00s/it] 31%|███ | 1546/5000 [12:41:21<27:36:28, 28.77s/it] 31%|███ | 1547/5000 [12:41:49<27:32:42, 28.72s/it] 31%|███ | 1548/5000 [12:42:17<27:12:32, 28.38s/it] 31%|███ | 1549/5000 [12:42:45<27:06:36, 28.28s/it] 31%|███ | 1550/5000 [12:43:13<27:03:45, 28.24s/it] 31%|███ | 1550/5000 [12:43:13<27:03:45, 28.24s/it] 31%|███ | 1551/5000 [12:43:40<26:46:01, 27.94s/it] 31%|███ | 1552/5000 [12:44:09<26:51:27, 28.04s/it] 31%|███ | 1553/5000 [12:44:36<26:46:02, 27.96s/it] 31%|███ | 1554/5000 [12:45:04<26:33:26, 27.74s/it] 31%|███ | 1555/5000 [12:45:31<26:30:35, 27.70s/it] 31%|███ | 1556/5000 [12:46:00<26:48:06, 28.02s/it] 31%|███ | 1557/5000 
[12:46:27<26:37:19, 27.84s/it] 31%|███ | 1558/5000 [12:46:56<26:44:03, 27.96s/it] 31%|███ | 1559/5000 [12:47:24<26:44:24, 27.98s/it] 31%|███ | 1560/5000 [12:47:51<26:32:17, 27.77s/it] 31%|███ | 1561/5000 [12:48:19<26:35:55, 27.84s/it] 31%|███ | 1562/5000 [12:48:47<26:45:43, 28.02s/it] 31%|███▏ | 1563/5000 [12:49:15<26:36:00, 27.86s/it] 31%|███▏ | 1564/5000 [12:49:43<26:47:23, 28.07s/it] 31%|███▏ | 1565/5000 [12:50:11<26:30:26, 27.78s/it] 31%|███▏ | 1566/5000 [12:50:37<26:06:30, 27.37s/it] 31%|███▏ | 1567/5000 [12:51:05<26:15:14, 27.53s/it] 31%|███▏ | 1568/5000 [12:51:33<26:32:55, 27.85s/it] 31%|███▏ | 1569/5000 [12:52:01<26:22:45, 27.68s/it] 31%|███▏ | 1570/5000 [12:52:29<26:29:00, 27.80s/it] 31%|███▏ | 1571/5000 [12:52:57<26:42:31, 28.04s/it] 31%|███▏ | 1572/5000 [12:53:26<26:44:39, 28.09s/it] 31%|███▏ | 1573/5000 [12:53:53<26:33:34, 27.90s/it] 31%|███▏ | 1574/5000 [12:54:23<26:59:56, 28.37s/it] 32%|███▏ | 1575/5000 [12:54:51<26:59:30, 28.37s/it] 32%|███▏ | 1575/5000 [12:54:51<26:59:30, 28.37s/it] 32%|███▏ | 1576/5000 [12:55:18<26:38:35, 28.01s/it] 32%|███▏ | 1577/5000 [12:55:45<26:17:00, 27.64s/it] 32%|███▏ | 1578/5000 [12:56:13<26:30:11, 27.88s/it] 32%|███▏ | 1579/5000 [12:56:40<26:16:59, 27.66s/it] 32%|███▏ | 1580/5000 [12:57:07<26:06:31, 27.48s/it] 32%|███▏ | 1581/5000 [12:57:37<26:43:35, 28.14s/it] 32%|███▏ | 1582/5000 [12:58:04<26:24:12, 27.81s/it] 32%|███▏ | 1583/5000 [12:58:31<26:08:29, 27.54s/it] 32%|███▏ | 1584/5000 [12:59:00<26:33:22, 27.99s/it] 32%|███▏ | 1585/5000 [12:59:27<26:14:21, 27.66s/it] 32%|███▏ | 1586/5000 [12:59:55<26:17:01, 27.72s/it] 32%|███▏ | 1587/5000 [13:00:27<27:34:23, 29.08s/it] 32%|███▏ | 1588/5000 [13:00:55<27:15:15, 28.76s/it] 32%|███▏ | 1589/5000 [13:01:23<26:51:00, 28.34s/it] 32%|███▏ | 1590/5000 [13:01:51<26:58:44, 28.48s/it] 32%|███▏ | 1591/5000 [13:02:19<26:52:04, 28.37s/it] 32%|███▏ | 1592/5000 [13:02:47<26:29:16, 27.98s/it] 32%|███▏ | 1593/5000 [13:03:15<26:30:34, 28.01s/it] 32%|███▏ | 1594/5000 [13:03:43<26:33:39, 
28.07s/it] 32%|███▏ | 1595/5000 [13:04:10<26:21:22, 27.87s/it] 32%|███▏ | 1596/5000 [13:04:38<26:27:24, 27.98s/it] 32%|███▏ | 1597/5000 [13:05:06<26:24:36, 27.94s/it] 32%|███▏ | 1598/5000 [13:05:34<26:16:14, 27.80s/it] 32%|███▏ | 1599/5000 [13:06:01<26:09:28, 27.69s/it] 32%|███▏ | 1600/5000 [13:06:30<26:27:03, 28.01s/it] 32%|███▏ | 1600/5000 [13:06:30<26:27:03, 28.01s/it] 32%|███▏ | 1601/5000 [13:06:56<26:00:03, 27.54s/it] 32%|███▏ | 1602/5000 [13:07:24<25:56:00, 27.48s/it] 32%|███▏ | 1603/5000 [13:07:53<26:30:08, 28.09s/it] 32%|███▏ | 1604/5000 [13:08:21<26:18:15, 27.88s/it] 32%|███▏ | 1605/5000 [13:08:47<25:57:29, 27.53s/it] 32%|███▏ | 1606/5000 [13:09:17<26:27:58, 28.07s/it] 32%|███▏ | 1607/5000 [13:09:44<26:12:10, 27.80s/it] 32%|███▏ | 1608/5000 [13:10:11<26:03:09, 27.65s/it] 32%|███▏ | 1609/5000 [13:10:39<26:07:40, 27.74s/it] 32%|███▏ | 1610/5000 [13:11:07<26:05:11, 27.70s/it] 32%|███▏ | 1611/5000 [13:11:34<25:53:08, 27.50s/it] 32%|███▏ | 1612/5000 [13:12:02<26:08:36, 27.78s/it] 32%|███▏ | 1613/5000 [13:12:30<26:17:08, 27.94s/it] 32%|███▏ | 1614/5000 [13:12:58<26:06:32, 27.76s/it] 32%|███▏ | 1615/5000 [13:13:26<26:15:27, 27.93s/it] 32%|███▏ | 1616/5000 [13:13:54<26:07:22, 27.79s/it] 32%|███▏ | 1617/5000 [13:14:21<25:58:13, 27.64s/it] 32%|███▏ | 1618/5000 [13:14:49<26:12:17, 27.89s/it] 32%|███▏ | 1619/5000 [13:15:18<26:19:24, 28.03s/it] 32%|███▏ | 1620/5000 [13:15:45<26:06:14, 27.80s/it] 32%|███▏ | 1621/5000 [13:16:15<26:36:44, 28.35s/it] 32%|███▏ | 1622/5000 [13:16:42<26:20:53, 28.08s/it] 32%|███▏ | 1623/5000 [13:17:09<26:08:41, 27.87s/it] 32%|███▏ | 1624/5000 [13:17:39<26:33:00, 28.31s/it] 32%|███▎ | 1625/5000 [13:18:05<26:04:53, 27.82s/it] 32%|███▎ | 1625/5000 [13:18:05<26:04:53, 27.82s/it] 33%|███▎ | 1626/5000 [13:18:32<25:39:43, 27.38s/it] 33%|███▎ | 1627/5000 [13:19:00<25:57:00, 27.70s/it] 33%|███▎ | 1628/5000 [13:19:28<26:02:38, 27.80s/it] 33%|███▎ | 1629/5000 [13:19:56<25:55:25, 27.68s/it] 33%|███▎ | 1630/5000 [13:20:25<26:13:16, 28.01s/it] 33%|███▎ | 
1631/5000 [13:20:47<24:36:03, 26.29s/it] 33%|███▎ | 1632/5000 [13:20:58<20:14:27, 21.64s/it] 33%|███▎ | 1633/5000 [13:21:09<17:14:14, 18.43s/it] 33%|███▎ | 1634/5000 [13:21:19<15:06:12, 16.15s/it] 33%|███▎ | 1635/5000 [13:21:27<12:44:04, 13.62s/it]{'loss': 0.0138, 'learning_rate': 7.837777777777779e-06, 'epoch': 9.0} {'loss': 0.0132, 'learning_rate': 7.782222222222223e-06, 'epoch': 9.01} {'loss': 0.0128, 'learning_rate': 7.726666666666667e-06, 'epoch': 9.01} {'loss': 0.0096, 'learning_rate': 7.67111111111111e-06, 'epoch': 9.02} {'loss': 0.0085, 'learning_rate': 7.6155555555555564e-06, 'epoch': 9.02} {'loss': 0.0093, 'learning_rate': 7.5600000000000005e-06, 'epoch': 9.03} {'loss': 0.0108, 'learning_rate': 7.504444444444445e-06, 'epoch': 9.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.09it/s] Reading metadata...: 14297it [00:01, 19335.87it/s] Reading metadata...: 22707it [00:01, 13626.53it/s] Reading metadata...: 28043it [00:01, 14883.39it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.31it/s] Reading metadata...: 10438it [00:00, 27803.36it/s] 33%|███▎ | 1636/5000 [13:23:09<37:25:31, 40.05s/it] 33%|███▎ | 1637/5000 [13:23:37<34:06:50, 36.52s/it] 33%|███▎ | 1638/5000 [13:24:04<31:28:36, 33.71s/it] 33%|███▎ | 1639/5000 [13:24:33<29:57:34, 32.09s/it] 33%|███▎ | 1640/5000 [13:25:01<28:52:57, 30.95s/it] 33%|███▎ | 1641/5000 [13:25:29<28:05:03, 30.10s/it] 33%|███▎ | 1642/5000 [13:25:56<27:21:45, 29.33s/it] 33%|███▎ | 1643/5000 [13:26:24<26:57:20, 28.91s/it] 33%|███▎ | 1644/5000 [13:26:53<26:45:01, 28.70s/it] 33%|███▎ | 1645/5000 [13:27:20<26:29:16, 28.42s/it] 33%|███▎ | 1646/5000 [13:27:48<26:09:44, 28.08s/it] 33%|███▎ | 1647/5000 [13:28:16<26:17:17, 28.22s/it] 33%|███▎ | 1648/5000 [13:28:44<26:07:09, 28.05s/it] 33%|███▎ | 1649/5000 [13:29:11<25:50:31, 27.76s/it] 33%|███▎ | 1650/5000 [13:29:38<25:39:58, 27.58s/it] 33%|███▎ | 1650/5000 [13:29:38<25:39:58, 27.58s/it] 33%|███▎ | 1651/5000 [13:30:05<25:34:33, 
27.49s/it] 33%|███▎ | 1652/5000 [13:30:33<25:44:02, 27.67s/it] 33%|███▎ | 1653/5000 [13:31:01<25:33:44, 27.49s/it] 33%|███▎ | 1654/5000 [13:31:30<25:57:38, 27.93s/it] 33%|███▎ | 1655/5000 [13:31:57<25:45:05, 27.71s/it] 33%|███▎ | 1656/5000 [13:32:24<25:32:51, 27.50s/it] 33%|███▎ | 1657/5000 [13:32:52<25:47:19, 27.77s/it] 33%|███▎ | 1658/5000 [13:33:20<25:52:06, 27.87s/it] 33%|███▎ | 1659/5000 [13:33:47<25:39:52, 27.65s/it] 33%|███▎ | 1660/5000 [13:34:16<25:53:40, 27.91s/it] 33%|███▎ | 1661/5000 [13:34:43<25:42:34, 27.72s/it] 33%|███▎ | 1662/5000 [13:35:10<25:32:21, 27.54s/it] 33%|███▎ | 1663/5000 [13:35:38<25:30:18, 27.52s/it] 33%|███▎ | 1664/5000 [13:36:05<25:22:15, 27.38s/it] 33%|███▎ | 1665/5000 [13:36:31<25:09:22, 27.16s/it] 33%|███▎ | 1666/5000 [13:37:01<25:44:47, 27.80s/it] 33%|███▎ | 1667/5000 [13:37:28<25:32:31, 27.59s/it] 33%|███▎ | 1668/5000 [13:37:54<25:08:43, 27.17s/it] 33%|███▎ | 1669/5000 [13:38:23<25:31:37, 27.59s/it] 33%|███▎ | 1670/5000 [13:38:50<25:21:33, 27.42s/it] 33%|███▎ | 1671/5000 [13:39:16<25:09:14, 27.20s/it] 33%|███▎ | 1672/5000 [13:39:45<25:38:28, 27.74s/it] 33%|███▎ | 1673/5000 [13:40:12<25:15:14, 27.33s/it] 33%|███▎ | 1674/5000 [13:40:38<25:03:16, 27.12s/it] 34%|███▎ | 1675/5000 [13:41:06<25:19:32, 27.42s/it] 34%|███▎ | 1675/5000 [13:41:06<25:19:32, 27.42s/it] 34%|███▎ | 1676/5000 [13:41:33<25:12:57, 27.31s/it] 34%|███▎ | 1677/5000 [13:42:02<25:31:21, 27.65s/it] 34%|███▎ | 1678/5000 [13:42:30<25:30:24, 27.64s/it] 34%|███▎ | 1679/5000 [13:42:56<25:13:08, 27.34s/it] 34%|███▎ | 1680/5000 [13:43:25<25:45:10, 27.92s/it] 34%|███▎ | 1681/5000 [13:43:53<25:37:04, 27.79s/it] 34%|███▎ | 1682/5000 [13:44:21<25:43:44, 27.92s/it] 34%|███▎ | 1683/5000 [13:44:49<25:40:16, 27.86s/it] 34%|███▎ | 1684/5000 [13:45:16<25:27:47, 27.64s/it] 34%|███▎ | 1685/5000 [13:45:44<25:32:00, 27.73s/it] 34%|███▎ | 1686/5000 [13:46:12<25:29:49, 27.70s/it] 34%|███▎ | 1687/5000 [13:46:38<25:15:57, 27.45s/it] 34%|███▍ | 1688/5000 [13:47:07<25:26:03, 27.65s/it] 34%|███▍ | 
1689/5000 [13:47:33<25:11:11, 27.38s/it] 34%|███▍ | 1690/5000 [13:48:01<25:09:34, 27.36s/it] 34%|███▍ | 1691/5000 [13:48:29<25:24:40, 27.65s/it] 34%|███▍ | 1692/5000 [13:48:56<25:16:12, 27.50s/it] 34%|███▍ | 1693/5000 [13:49:23<25:01:04, 27.23s/it] 34%|███▍ | 1694/5000 [13:49:51<25:22:41, 27.64s/it] 34%|███▍ | 1695/5000 [13:50:18<25:14:46, 27.50s/it] 34%|███▍ | 1696/5000 [13:50:46<25:10:07, 27.42s/it] 34%|███▍ | 1697/5000 [13:51:14<25:29:33, 27.78s/it] 34%|███▍ | 1698/5000 [13:51:41<25:18:17, 27.59s/it] 34%|███▍ | 1699/5000 [13:52:08<25:06:27, 27.38s/it] 34%|███▍ | 1700/5000 [13:52:38<25:50:17, 28.19s/it] 34%|███▍ | 1700/5000 [13:52:38<25:50:17, 28.19s/it] 34%|███▍ | 1701/5000 [13:53:04<25:13:16, 27.52s/it] 34%|███▍ | 1702/5000 [13:53:33<25:30:11, 27.84s/it] 34%|███▍ | 1703/5000 [13:54:01<25:37:22, 27.98s/it] 34%|███▍ | 1704/5000 [13:54:29<25:27:40, 27.81s/it] 34%|███▍ | 1705/5000 [13:54:57<25:42:19, 28.08s/it] 34%|███▍ | 1706/5000 [13:55:25<25:33:51, 27.94s/it] 34%|███▍ | 1707/5000 [13:55:53<25:33:44, 27.95s/it] 34%|███▍ | 1708/5000 [13:56:21<25:37:52, 28.03s/it] 34%|███▍ | 1709/5000 [13:56:48<25:20:10, 27.72s/it] 34%|███▍ | 1710/5000 [13:57:16<25:28:39, 27.88s/it] 34%|███▍ | 1711/5000 [13:57:44<25:21:54, 27.76s/it] 34%|███▍ | 1712/5000 [13:58:12<25:18:23, 27.71s/it] 34%|███▍ | 1713/5000 [13:58:40<25:30:41, 27.94s/it] 34%|███▍ | 1714/5000 [13:59:08<25:29:34, 27.93s/it] 34%|███▍ | 1715/5000 [13:59:35<25:15:18, 27.68s/it] 34%|███▍ | 1716/5000 [14:00:05<25:47:17, 28.27s/it] 34%|███▍ | 1717/5000 [14:00:34<25:57:15, 28.46s/it] 34%|███▍ | 1718/5000 [14:01:01<25:41:12, 28.18s/it] 34%|███▍ | 1719/5000 [14:01:30<25:45:47, 28.27s/it] 34%|███▍ | 1720/5000 [14:01:57<25:36:59, 28.12s/it] 34%|███▍ | 1721/5000 [14:02:30<26:45:27, 29.38s/it] 34%|███▍ | 1722/5000 [14:02:57<26:07:19, 28.69s/it] 34%|███▍ | 1723/5000 [14:03:29<27:06:41, 29.78s/it] 34%|███▍ | 1724/5000 [14:03:58<26:45:27, 29.40s/it] 34%|███▍ | 1725/5000 [14:04:25<26:15:06, 28.86s/it] 34%|███▍ | 1725/5000 
[14:04:25<26:15:06, 28.86s/it] 35%|███▍ | 1726/5000 [14:04:53<25:57:25, 28.54s/it] 35%|███▍ | 1727/5000 [14:05:21<25:53:10, 28.47s/it] 35%|███▍ | 1728/5000 [14:05:49<25:34:32, 28.14s/it] 35%|███▍ | 1729/5000 [14:06:17<25:44:27, 28.33s/it] 35%|███▍ | 1730/5000 [14:06:46<25:41:59, 28.29s/it] 35%|███▍ | 1731/5000 [14:07:13<25:24:43, 27.99s/it] 35%|███▍ | 1732/5000 [14:07:41<25:34:15, 28.17s/it] 35%|███▍ | 1733/5000 [14:08:10<25:33:06, 28.16s/it] 35%|███▍ | 1734/5000 [14:08:37<25:20:07, 27.93s/it] 35%|███▍ | 1735/5000 [14:09:05<25:20:19, 27.94s/it] 35%|███▍ | 1736/5000 [14:09:33<25:25:28, 28.04s/it] 35%|███▍ | 1737/5000 [14:10:01<25:12:55, 27.82s/it] 35%|███▍ | 1738/5000 [14:10:30<25:42:46, 28.38s/it] 35%|███▍ | 1739/5000 [14:10:58<25:27:33, 28.11s/it] 35%|███▍ | 1740/5000 [14:11:25<25:16:29, 27.91s/it] 35%|███▍ | 1741/5000 [14:11:54<25:34:43, 28.26s/it] 35%|███▍ | 1742/5000 [14:12:21<25:15:11, 27.90s/it] 35%|███▍ | 1743/5000 [14:12:48<25:03:08, 27.69s/it] 35%|███▍ | 1744/5000 [14:13:17<25:16:42, 27.95s/it] 35%|███▍ | 1745/5000 [14:13:45<25:22:19, 28.06s/it] 35%|███▍ | 1746/5000 [14:14:13<25:08:41, 27.82s/it] 35%|███▍ | 1747/5000 [14:14:41<25:19:52, 28.03s/it] 35%|███▍ | 1748/5000 [14:15:08<25:05:57, 27.79s/it] 35%|███▍ | 1749/5000 [14:15:36<25:00:19, 27.69s/it] 35%|███▌ | 1750/5000 [14:16:09<26:21:11, 29.19s/it] 35%|███▌ | 1750/5000 [14:16:09<26:21:11, 29.19s/it] 35%|███▌ | 1751/5000 [14:16:39<26:42:55, 29.60s/it] 35%|███▌ | 1752/5000 [14:17:07<26:22:37, 29.24s/it] 35%|███▌ | 1753/5000 [14:17:34<25:45:51, 28.57s/it] 35%|███▌ | 1754/5000 [14:18:03<25:46:47, 28.59s/it] 35%|███▌ | 1755/5000 [14:18:31<25:26:59, 28.23s/it] 35%|███▌ | 1756/5000 [14:18:58<25:07:45, 27.89s/it] 35%|███▌ | 1757/5000 [14:19:27<25:31:09, 28.33s/it] 35%|███▌ | 1758/5000 [14:19:54<25:15:10, 28.04s/it] 35%|███▌ | 1759/5000 [14:20:21<24:59:00, 27.75s/it] 35%|███▌ | 1760/5000 [14:20:51<25:21:43, 28.18s/it] 35%|███▌ | 1761/5000 [14:21:18<25:06:13, 27.90s/it] 35%|███▌ | 1762/5000 [14:21:45<24:53:38, 
27.68s/it] 35%|███▌ | 1763/5000 [14:22:19<26:32:47, 29.52s/it] 35%|███▌ | 1764/5000 [14:22:46<25:54:14, 28.82s/it] 35%|███▌ | 1765/5000 [14:23:13<25:23:46, 28.26s/it] 35%|███▌ | 1766/5000 [14:23:40<25:07:10, 27.96s/it] 35%|███▌ | 1767/5000 [14:24:09<25:19:26, 28.20s/it] 35%|███▌ | 1768/5000 [14:24:36<25:06:20, 27.96s/it] 35%|███▌ | 1769/5000 [14:25:03<24:50:11, 27.67s/it] 35%|███▌ | 1770/5000 [14:25:33<25:17:01, 28.18s/it] 35%|███▌ | 1771/5000 [14:26:00<24:59:39, 27.87s/it] 35%|███▌ | 1772/5000 [14:26:27<24:53:26, 27.76s/it] 35%|███▌ | 1773/5000 [14:26:56<25:13:41, 28.14s/it] 35%|███▌ | 1774/5000 [14:27:24<24:58:53, 27.88s/it] 36%|███▌ | 1775/5000 [14:27:51<24:48:36, 27.69s/it] 36%|███▌ | 1775/5000 [14:27:51<24:48:36, 27.69s/it] 36%|███▌ | 1776/5000 [14:28:19<24:56:20, 27.85s/it] 36%|███▌ | 1777/5000 [14:28:47<24:51:23, 27.76s/it] 36%|███▌ | 1778/5000 [14:29:14<24:42:01, 27.60s/it] 36%|███▌ | 1779/5000 [14:29:42<24:53:03, 27.81s/it] 36%|███▌ | 1780/5000 [14:30:10<24:48:58, 27.74s/it] 36%|███▌ | 1781/5000 [14:30:37<24:40:13, 27.59s/it] 36%|███▌ | 1782/5000 [14:31:07<25:12:30, 28.20s/it] 36%|███▌ | 1783/5000 [14:31:34<25:00:34, 27.99s/it] 36%|███▌ | 1784/5000 [14:32:02<24:50:43, 27.81s/it] 36%|███▌ | 1785/5000 [14:32:31<25:21:21, 28.39s/it] 36%|███▌ | 1786/5000 [14:32:59<25:05:46, 28.11s/it] 36%|███▌ | 1787/5000 [14:33:26<24:48:35, 27.80s/it] 36%|███▌ | 1788/5000 [14:33:54<24:49:08, 27.82s/it] 36%|███▌ | 1789/5000 [14:34:21<24:41:28, 27.68s/it] 36%|███▌ | 1790/5000 [14:34:48<24:32:55, 27.53s/it] 36%|███▌ | 1791/5000 [14:35:17<24:59:21, 28.03s/it] 36%|███▌ | 1792/5000 [14:35:45<24:48:49, 27.85s/it] 36%|███▌ | 1793/5000 [14:36:12<24:30:17, 27.51s/it] 36%|███▌ | 1794/5000 [14:36:41<24:54:35, 27.97s/it] 36%|███▌ | 1795/5000 [14:36:54<21:05:37, 23.69s/it] 36%|███▌ | 1796/5000 [14:37:05<17:37:54, 19.81s/it] 36%|███▌ | 1797/5000 [14:37:16<15:14:25, 17.13s/it] 36%|███▌ | 1798/5000 [14:37:27<13:32:16, 15.22s/it]{'loss': 0.0086, 'learning_rate': 7.44888888888889e-06, 'epoch': 
10.0} {'loss': 0.0085, 'learning_rate': 7.393333333333333e-06, 'epoch': 10.01} {'loss': 0.0087, 'learning_rate': 7.337777777777778e-06, 'epoch': 10.01} {'loss': 0.006, 'learning_rate': 7.282222222222222e-06, 'epoch': 10.02} {'loss': 0.0064, 'learning_rate': 7.226666666666667e-06, 'epoch': 10.02} {'loss': 0.0072, 'learning_rate': 7.171111111111112e-06, 'epoch': 10.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.19it/s] Reading metadata...: 15185it [00:00, 22097.72it/s] Reading metadata...: 24117it [00:01, 15316.75it/s] Reading metadata...: 28043it [00:01, 15924.27it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.26it/s] Reading metadata...: 10438it [00:00, 12100.29it/s] 36%|███▌ | 1799/5000 [14:39:04<35:31:12, 39.95s/it] 36%|███▌ | 1800/5000 [14:39:33<32:23:23, 36.44s/it] 36%|███▌ | 1800/5000 [14:39:33<32:23:23, 36.44s/it] 36%|███▌ | 1801/5000 [14:40:01<30:07:58, 33.91s/it] 36%|███▌ | 1802/5000 [14:40:29<28:37:19, 32.22s/it] 36%|███▌ | 1803/5000 [14:40:56<27:13:59, 30.67s/it] 36%|███▌ | 1804/5000 [14:41:25<26:41:19, 30.06s/it] 36%|███▌ | 1805/5000 [14:41:52<26:05:36, 29.40s/it] 36%|███▌ | 1806/5000 [14:42:20<25:31:20, 28.77s/it] 36%|███▌ | 1807/5000 [14:42:48<25:20:18, 28.57s/it] 36%|███▌ | 1808/5000 [14:43:15<24:55:49, 28.12s/it] 36%|███▌ | 1809/5000 [14:43:43<24:54:39, 28.10s/it] 36%|███▌ | 1810/5000 [14:44:16<26:11:14, 29.55s/it] 36%|███▌ | 1811/5000 [14:44:43<25:37:31, 28.93s/it] 36%|███▌ | 1812/5000 [14:45:12<25:37:11, 28.93s/it] 36%|███▋ | 1813/5000 [14:45:41<25:31:39, 28.84s/it] 36%|███▋ | 1814/5000 [14:46:09<25:20:29, 28.63s/it] 36%|███▋ | 1815/5000 [14:46:37<25:07:30, 28.40s/it] 36%|███▋ | 1816/5000 [14:47:05<24:53:54, 28.15s/it] 36%|███▋ | 1817/5000 [14:47:32<24:45:59, 28.01s/it] 36%|███▋ | 1818/5000 [14:48:01<24:51:11, 28.12s/it] 36%|███▋ | 1819/5000 [14:48:28<24:40:39, 27.93s/it] 36%|███▋ | 1820/5000 [14:48:57<24:54:24, 28.20s/it] 36%|███▋ | 1821/5000 [14:49:25<24:52:56, 28.18s/it] 36%|███▋ | 
1822/5000 [14:49:52<24:35:39, 27.86s/it] 36%|███▋ | 1823/5000 [14:50:20<24:37:45, 27.91s/it] 36%|███▋ | 1824/5000 [14:50:48<24:39:31, 27.95s/it] 36%|███▋ | 1825/5000 [14:51:15<24:28:15, 27.75s/it] 36%|███▋ | 1825/5000 [14:51:16<24:28:15, 27.75s/it] 37%|███▋ | 1826/5000 [14:51:44<24:45:45, 28.09s/it] 37%|███▋ | 1827/5000 [14:52:13<24:48:43, 28.15s/it] 37%|███▋ | 1828/5000 [14:52:40<24:32:10, 27.85s/it] 37%|███▋ | 1829/5000 [14:53:09<24:45:40, 28.11s/it] 37%|███▋ | 1830/5000 [14:53:35<24:20:58, 27.65s/it] 37%|███▋ | 1831/5000 [14:54:02<24:12:02, 27.49s/it] 37%|███▋ | 1832/5000 [14:54:31<24:38:40, 28.01s/it] 37%|███▋ | 1833/5000 [14:54:58<24:14:57, 27.56s/it] 37%|███▋ | 1834/5000 [14:55:25<24:03:56, 27.36s/it] 37%|███▋ | 1835/5000 [14:55:53<24:10:50, 27.50s/it] 37%|███▋ | 1836/5000 [14:56:20<24:14:11, 27.58s/it] 37%|███▋ | 1837/5000 [14:56:48<24:09:05, 27.49s/it] 37%|███▋ | 1838/5000 [14:57:16<24:17:08, 27.65s/it] 37%|███▋ | 1839/5000 [14:57:44<24:25:02, 27.81s/it] 37%|███▋ | 1840/5000 [14:58:11<24:18:52, 27.70s/it] 37%|███▋ | 1841/5000 [14:58:39<24:19:25, 27.72s/it] 37%|███▋ | 1842/5000 [14:59:07<24:26:37, 27.86s/it] 37%|███▋ | 1843/5000 [14:59:34<24:02:28, 27.41s/it] 37%|███▋ | 1844/5000 [15:00:02<24:08:51, 27.54s/it] 37%|███▋ | 1845/5000 [15:00:29<24:12:56, 27.63s/it] 37%|███▋ | 1846/5000 [15:00:59<24:39:11, 28.14s/it] 37%|███▋ | 1847/5000 [15:01:25<24:13:52, 27.67s/it] 37%|███▋ | 1848/5000 [15:01:52<24:03:48, 27.48s/it] 37%|███▋ | 1849/5000 [15:02:21<24:26:37, 27.93s/it] 37%|███▋ | 1850/5000 [15:02:48<24:08:51, 27.60s/it] 37%|███▋ | 1850/5000 [15:02:48<24:08:51, 27.60s/it] 37%|███▋ | 1851/5000 [15:03:16<24:18:32, 27.79s/it] 37%|███▋ | 1852/5000 [15:03:44<24:17:52, 27.79s/it] 37%|███▋ | 1853/5000 [15:04:11<23:58:03, 27.42s/it] 37%|███▋ | 1854/5000 [15:04:42<25:01:11, 28.63s/it] 37%|███▋ | 1855/5000 [15:05:10<24:48:46, 28.40s/it] 37%|███▋ | 1856/5000 [15:05:37<24:30:29, 28.06s/it] 37%|███▋ | 1857/5000 [15:06:07<24:49:58, 28.44s/it] 37%|███▋ | 1858/5000 
[15:06:34<24:31:44, 28.10s/it] 37%|███▋ | 1859/5000 [15:07:01<24:12:28, 27.75s/it] 37%|███▋ | 1860/5000 [15:07:30<24:31:36, 28.12s/it] 37%|███▋ | 1861/5000 [15:07:57<24:20:04, 27.91s/it] 37%|███▋ | 1862/5000 [15:08:25<24:10:41, 27.74s/it] 37%|███▋ | 1863/5000 [15:08:53<24:25:45, 28.04s/it] 37%|███▋ | 1864/5000 [15:09:21<24:25:20, 28.04s/it] 37%|███▋ | 1865/5000 [15:09:49<24:11:48, 27.79s/it] 37%|███▋ | 1866/5000 [15:10:16<24:07:50, 27.72s/it] 37%|███▋ | 1867/5000 [15:10:44<24:08:44, 27.74s/it] 37%|███▋ | 1868/5000 [15:11:11<23:50:26, 27.40s/it] 37%|███▋ | 1869/5000 [15:11:40<24:19:09, 27.96s/it] 37%|███▋ | 1870/5000 [15:12:07<24:05:39, 27.71s/it] 37%|███▋ | 1871/5000 [15:12:34<23:57:34, 27.57s/it] 37%|███▋ | 1872/5000 [15:13:03<24:15:18, 27.92s/it] 37%|███▋ | 1873/5000 [15:13:35<25:25:58, 29.28s/it] 37%|███▋ | 1874/5000 [15:14:04<25:09:04, 28.97s/it] 38%|███▊ | 1875/5000 [15:14:31<24:36:48, 28.35s/it] 38%|███▊ | 1875/5000 [15:14:31<24:36:48, 28.35s/it] 38%|███▊ | 1876/5000 [15:14:59<24:32:58, 28.29s/it] 38%|███▊ | 1877/5000 [15:15:27<24:31:43, 28.28s/it] 38%|███▊ | 1878/5000 [15:15:54<24:18:21, 28.03s/it] 38%|███▊ | 1879/5000 [15:16:27<25:23:08, 29.28s/it] 38%|███▊ | 1880/5000 [15:16:55<25:07:16, 28.99s/it] 38%|███▊ | 1881/5000 [15:17:22<24:42:22, 28.52s/it] 38%|███▊ | 1882/5000 [15:17:51<24:43:18, 28.54s/it] 38%|███▊ | 1883/5000 [15:18:19<24:43:19, 28.55s/it] 38%|███▊ | 1884/5000 [15:18:47<24:27:07, 28.25s/it] 38%|███▊ | 1885/5000 [15:19:15<24:29:34, 28.31s/it] 38%|███▊ | 1886/5000 [15:19:44<24:30:25, 28.33s/it] 38%|███▊ | 1887/5000 [15:20:11<24:14:05, 28.03s/it] 38%|███▊ | 1888/5000 [15:20:39<24:18:14, 28.12s/it] 38%|███▊ | 1889/5000 [15:21:07<24:14:50, 28.06s/it] 38%|███▊ | 1890/5000 [15:21:35<24:06:48, 27.91s/it] 38%|███▊ | 1891/5000 [15:22:04<24:17:57, 28.14s/it] 38%|███▊ | 1892/5000 [15:22:32<24:26:48, 28.32s/it] 38%|███▊ | 1893/5000 [15:23:00<24:09:43, 28.00s/it] 38%|███▊ | 1894/5000 [15:23:28<24:15:09, 28.11s/it] 38%|███▊ | 1895/5000 [15:23:57<24:22:03, 
28.25s/it] 38%|███▊ | 1896/5000 [15:24:24<24:04:27, 27.92s/it] 38%|███▊ | 1897/5000 [15:24:52<24:07:46, 27.99s/it] 38%|███▊ | 1898/5000 [15:25:20<24:12:01, 28.09s/it] 38%|███▊ | 1899/5000 [15:25:48<24:12:41, 28.11s/it] 38%|███▊ | 1900/5000 [15:26:16<24:03:23, 27.94s/it] 38%|███▊ | 1900/5000 [15:26:16<24:03:23, 27.94s/it] 38%|███▊ | 1901/5000 [15:26:44<24:03:15, 27.94s/it] 38%|███▊ | 1902/5000 [15:27:12<24:05:04, 27.99s/it] 38%|███▊ | 1903/5000 [15:27:39<23:54:36, 27.79s/it] 38%|███▊ | 1904/5000 [15:28:06<23:42:02, 27.56s/it] 38%|███▊ | 1905/5000 [15:28:38<24:41:45, 28.73s/it] 38%|███▊ | 1906/5000 [15:29:04<24:08:42, 28.09s/it] 38%|███▊ | 1907/5000 [15:29:32<23:55:44, 27.85s/it] 38%|███▊ | 1908/5000 [15:30:01<24:18:45, 28.31s/it] 38%|███▊ | 1909/5000 [15:30:28<24:02:18, 28.00s/it] 38%|███▊ | 1910/5000 [15:30:55<23:47:19, 27.72s/it] 38%|███▊ | 1911/5000 [15:31:30<25:29:01, 29.70s/it] 38%|███▊ | 1912/5000 [15:31:57<24:54:20, 29.04s/it] 38%|███▊ | 1913/5000 [15:32:26<24:46:17, 28.89s/it] 38%|███▊ | 1914/5000 [15:32:54<24:40:57, 28.79s/it] 38%|███▊ | 1915/5000 [15:33:23<24:33:13, 28.65s/it] 38%|███▊ | 1916/5000 [15:33:50<24:14:18, 28.29s/it] 38%|███▊ | 1917/5000 [15:34:18<24:12:28, 28.27s/it] 38%|███▊ | 1918/5000 [15:34:47<24:11:54, 28.27s/it] 38%|███▊ | 1919/5000 [15:35:14<23:57:00, 27.98s/it] 38%|███▊ | 1920/5000 [15:35:42<24:04:11, 28.13s/it] 38%|███▊ | 1921/5000 [15:36:11<24:07:01, 28.20s/it] 38%|███▊ | 1922/5000 [15:36:38<23:55:02, 27.97s/it] 38%|███▊ | 1923/5000 [15:37:07<24:04:12, 28.16s/it] 38%|███▊ | 1924/5000 [15:37:35<24:05:39, 28.20s/it] 38%|███▊ | 1925/5000 [15:38:02<23:49:05, 27.88s/it] 38%|███▊ | 1925/5000 [15:38:02<23:49:05, 27.88s/it] 39%|███▊ | 1926/5000 [15:38:30<23:41:10, 27.74s/it] 39%|███▊ | 1927/5000 [15:38:58<23:57:47, 28.07s/it] 39%|███▊ | 1928/5000 [15:39:26<23:43:48, 27.81s/it] 39%|███▊ | 1929/5000 [15:39:53<23:36:07, 27.67s/it] 39%|███▊ | 1930/5000 [15:40:22<23:57:46, 28.10s/it] 39%|███▊ | 1931/5000 [15:40:50<23:48:12, 27.92s/it] 39%|███▊ | 
1932/5000 [15:41:17<23:36:02, 27.69s/it] 39%|███▊ | 1933/5000 [15:41:46<23:56:01, 28.09s/it] 39%|███▊ | 1934/5000 [15:42:13<23:42:50, 27.84s/it] 39%|███▊ | 1935/5000 [15:42:40<23:24:27, 27.49s/it] 39%|███▊ | 1936/5000 [15:43:08<23:35:31, 27.72s/it] 39%|███▊ | 1937/5000 [15:43:36<23:42:08, 27.86s/it] 39%|███▉ | 1938/5000 [15:44:04<23:35:47, 27.74s/it] 39%|███▉ | 1939/5000 [15:44:32<23:42:19, 27.88s/it] 39%|███▉ | 1940/5000 [15:45:00<23:43:14, 27.91s/it] 39%|███▉ | 1941/5000 [15:45:27<23:30:02, 27.66s/it] 39%|███▉ | 1942/5000 [15:45:55<23:42:36, 27.91s/it] 39%|███▉ | 1943/5000 [15:46:24<23:48:55, 28.05s/it] 39%|███▉ | 1944/5000 [15:46:51<23:40:41, 27.89s/it] 39%|███▉ | 1945/5000 [15:47:20<23:50:14, 28.09s/it] 39%|███▉ | 1946/5000 [15:47:48<23:52:55, 28.15s/it] 39%|███▉ | 1947/5000 [15:48:15<23:31:55, 27.75s/it] 39%|███▉ | 1948/5000 [15:48:47<24:44:37, 29.19s/it] 39%|███▉ | 1949/5000 [15:49:15<24:14:04, 28.60s/it] 39%|███▉ | 1950/5000 [15:49:42<23:58:06, 28.29s/it] 39%|███▉ | 1950/5000 [15:49:42<23:58:06, 28.29s/it] 39%|███▉ | 1951/5000 [15:50:10<23:56:41, 28.27s/it] 39%|███▉ | 1952/5000 [15:50:37<23:27:30, 27.71s/it] 39%|███▉ | 1953/5000 [15:51:04<23:20:49, 27.58s/it] 39%|███▉ | 1954/5000 [15:51:33<23:33:34, 27.84s/it] 39%|███▉ | 1955/5000 [15:52:00<23:31:36, 27.81s/it] 39%|███▉ | 1956/5000 [15:52:28<23:27:02, 27.73s/it] 39%|███▉ | 1957/5000 [15:52:57<23:48:47, 28.17s/it] 39%|███▉ | 1958/5000 [15:53:19<22:17:09, 26.37s/it] 39%|███▉ | 1959/5000 [15:53:30<18:20:33, 21.71s/it] 39%|███▉ | 1960/5000 [15:53:41<15:34:09, 18.44s/it] 39%|███▉ | 1961/5000 [15:53:52<13:37:41, 16.14s/it] 39%|███▉ | 1962/5000 [15:54:00<11:31:58, 13.67s/it]{'loss': 0.0063, 'learning_rate': 7.115555555555557e-06, 'epoch': 11.0} {'loss': 0.006, 'learning_rate': 7.06e-06, 'epoch': 11.01} {'loss': 0.006, 'learning_rate': 7.004444444444445e-06, 'epoch': 11.01} {'loss': 0.0053, 'learning_rate': 6.948888888888889e-06, 'epoch': 11.02} {'loss': 0.0044, 'learning_rate': 6.893333333333334e-06, 'epoch': 
11.02} {'loss': 0.0045, 'learning_rate': 6.837777777777779e-06, 'epoch': 11.03} {'loss': 0.0051, 'learning_rate': 6.782222222222222e-06, 'epoch': 11.03} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 1.18it/s] Reading metadata...: 14228it [00:00, 20577.69it/s] Reading metadata...: 22597it [00:02, 10604.53it/s] Reading metadata...: 28043it [00:02, 12606.30it/s] Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 2.55it/s] Reading metadata...: 10438it [00:00, 22408.00it/s] 39%|███▉ | 1963/5000 [15:55:49<35:48:04, 42.44s/it] 39%|███▉ | 1964/5000 [15:56:18<32:14:09, 38.22s/it] 39%|███▉ | 1965/5000 [15:56:45<29:34:02, 35.07s/it] 39%|███▉ | 1966/5000 [15:57:13<27:49:21, 33.01s/it] 39%|███▉ | 1967/5000 [15:57:42<26:33:25, 31.52s/it] 39%|███▉ | 1968/5000 [15:58:09<25:31:00, 30.30s/it] 39%|███▉ | 1969/5000 [15:58:35<24:25:58, 29.02s/it] 39%|███▉ | 1970/5000 [15:59:02<23:58:24, 28.48s/it] 39%|███▉ | 1971/5000 [15:59:30<23:54:33, 28.42s/it] 39%|███▉ | 1972/5000 [15:59:59<23:56:40, 28.47s/it] 39%|███▉ | 1973/5000 [16:00:26<23:39:18, 28.13s/it] 39%|███▉ | 1974/5000 [16:00:55<23:40:04, 28.16s/it] 40%|███▉ | 1975/5000 [16:01:23<23:43:18, 28.23s/it] 40%|███▉ | 1975/5000 [16:01:23<23:43:18, 28.23s/it] 40%|███▉ | 1976/5000 [16:01:50<23:18:54, 27.76s/it] 40%|███▉ | 1977/5000 [16:02:15<22:48:42, 27.17s/it] 40%|███▉ | 1978/5000 [16:02:44<23:09:22, 27.59s/it] 40%|███▉ | 1979/5000 [16:03:12<23:16:07, 27.73s/it] 40%|███▉ | 1980/5000 [16:03:40<23:11:08, 27.64s/it] 40%|███▉ | 1981/5000 [16:04:10<23:51:34, 28.45s/it] 40%|███▉ | 1982/5000 [16:04:37<23:28:46, 28.01s/it] 40%|███▉ | 1983/5000 [16:05:03<22:58:47, 27.42s/it] 40%|███▉ | 1984/5000 [16:05:31<23:10:05, 27.65s/it] 40%|███▉ | 1985/5000 [16:05:59<23:17:24, 27.81s/it] 40%|███▉ | 1986/5000 [16:06:27<23:10:17, 27.68s/it] 40%|███▉ | 1987/5000 [16:06:55<23:13:45, 27.76s/it] 40%|███▉ | 1988/5000 [16:07:21<22:49:43, 27.29s/it] 40%|███▉ | 1989/5000 [16:07:47<22:30:02, 26.90s/it] 40%|███▉ | 1990/5000 
[16:08:17<23:12:12, 27.75s/it] 40%|███▉ | 1991/5000 [16:08:44<23:01:36, 27.55s/it] 40%|███▉ | 1992/5000 [16:09:11<22:56:53, 27.46s/it] 40%|███▉ | 1993/5000 [16:09:44<24:18:04, 29.09s/it] 40%|███▉ | 1994/5000 [16:10:11<23:50:07, 28.55s/it] 40%|███▉ | 1995/5000 [16:10:38<23:32:28, 28.20s/it] 40%|███▉ | 1996/5000 [16:11:07<23:34:32, 28.25s/it] 40%|███▉ | 1997/5000 [16:11:34<23:14:15, 27.86s/it] 40%|███▉ | 1998/5000 [16:12:01<22:57:58, 27.54s/it] 40%|███▉ | 1999/5000 [16:12:28<22:50:43, 27.41s/it] 40%|████ | 2000/5000 [16:12:55<22:45:51, 27.32s/it] 40%|████ | 2000/5000 [16:12:55<22:45:51, 27.32s/it][INFO|trainer.py:3138] 2023-05-08 02:47:05,817 >> ***** Running Evaluation ***** [INFO|trainer.py:3142] 2023-05-08 02:47:05,817 >> Num examples: Unknown [INFO|trainer.py:3143] 2023-05-08 02:47:05,817 >> Batch size = 64 {'loss': 0.005, 'learning_rate': 6.726666666666667e-06, 'epoch': 12.0} {'loss': 0.0042, 'learning_rate': 6.671111111111112e-06, 'epoch': 12.01} Reading metadata...: 0it [00:00, ?it/s] Reading metadata...: 1it [00:00, 3.63it/s] Reading metadata...: 10440it [00:00, 30685.17it/s] [INFO|trainer_utils.py:693] 2023-05-08 02:47:16,424 >> The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: input_length. If input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message. 
40%|████ | 2000/5000 [16:48:47<22:45:51, 27.32s/it]
[INFO|trainer.py:2877] 2023-05-08 03:22:58,551 >> Saving model checkpoint to ./checkpoint-2000
[INFO|configuration_utils.py:458] 2023-05-08 03:22:58,556 >> Configuration saved in ./checkpoint-2000/config.json
[INFO|configuration_utils.py:364] 2023-05-08 03:22:58,560 >> Configuration saved in ./checkpoint-2000/generation_config.json
[INFO|modeling_utils.py:1855] 2023-05-08 03:23:01,997 >> Model weights saved in ./checkpoint-2000/pytorch_model.bin
[INFO|feature_extraction_utils.py:369] 2023-05-08 03:23:02,003 >> Feature extractor saved in ./checkpoint-2000/preprocessor_config.json
[INFO|feature_extraction_utils.py:369] 2023-05-08 03:23:12,574 >> Feature extractor saved in ./preprocessor_config.json