============================================================
New run started at 2024-11-21.05:01:02.161680
sys.argv: "-m --config-name=roboflow_yolo_nas_m"
============================================================
The console stream is logged into /root/sg_logs/console.log
[2024-11-21 05:01:04,935][super_gradients.training.models.model_factory][WARNING] - Passing num_classes through arch_params is deprecated and will be removed in the next version. Pass num_classes explicitly to models.get
[2024-11-21 05:01:05,412][super_gradients.training.utils.checkpoint_utils][WARNING] - :warning: The pre-trained models provided by SuperGradients may have their own licenses or terms and conditions derived from the dataset used for pre-training. It is your responsibility to determine whether you have permission to use the models for your use case. The model you have requested was pre-trained on the coco dataset, published under the following terms: https://cocodataset.org/#termsofuse
[2024-11-21 05:01:05,412][super_gradients.training.utils.checkpoint_utils][INFO] - License Notification: YOLO-NAS pre-trained weights are subjected to the specific license terms and conditions detailed in https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.YOLONAS.md By downloading the pre-trained weight files you agree to comply with these terms.
[2024-11-21 05:01:05,580][super_gradients.training.utils.checkpoint_utils][INFO] - Successfully loaded pretrained weights for architecture yolo_nas_m
[2024-11-21 05:01:06,199][super_gradients.training.datasets.detection_datasets.detection_dataset][INFO] - Dataset Initialization in progress. `cache_annotations=True` causes the process to take longer due to full dataset indexing.
[2024-11-21 05:01:06,418][super_gradients.training.datasets.detection_datasets.detection_dataset][INFO] - Dataset Initialization in progress. `cache_annotations=True` causes the process to take longer due to full dataset indexing.
[2024-11-21 05:01:06,443][super_gradients.training.sg_trainer.sg_trainer][WARNING] - Train dataset size % batch_size != 0 and drop_last=False, this might result in smaller last batch.
[2024-11-21 05:01:06,668][super_gradients.training.sg_trainer.sg_trainer][INFO] - Starting a new run with `run_id=RUN_20241121_050106_668266`
[2024-11-21 05:01:06,668][super_gradients.training.sg_trainer.sg_trainer][INFO] - Checkpoints directory: /opt/conda/envs/app/lib/python3.10/checkpoints/yolo_nas_m_roboflow_final-final-c2j0n-mdjfm/3/RUN_20241121_050106_668266
The console stream is now moved to /opt/conda/envs/app/lib/python3.10/checkpoints/yolo_nas_m_roboflow_final-final-c2j0n-mdjfm/3/RUN_20241121_050106_668266/console_Nov21_05_01_06.txt
[2024-11-21 05:01:06] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9, 'decay_type': 'threshold', 'beta': 15}
/opt/conda/envs/app/lib/python3.10/site-packages/super_gradients/training/sg_trainer/sg_trainer.py:1765: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  self.scaler = GradScaler(enabled=mixed_precision_enabled)
[2024-11-21 05:01:08] INFO - sg_trainer_utils.py - TRAINING PARAMETERS:
    - Mode: OFF
    - Number of GPUs: 1 (1 available on the machine)
    - Full dataset size: 2399 (len(train_set))
    - Batch size per GPU: 12 (batch_size)
    - Batch Accumulate: 1 (batch_accumulate)
    - Total batch size: 12 (num_gpus * batch_size)
    - Effective Batch size: 12 (num_gpus * batch_size * batch_accumulate)
    - Iterations per epoch: 200 (len(train_loader))
    - Gradient updates per epoch: 200 (len(train_loader) / batch_accumulate)
    - Model: YoloNAS_M (51.13M parameters, 51.13M optimized)
    - Learning Rates and Weight Decays:
      - default: (51.13M parameters). LR: 0.0004 (51.13M parameters) WD: 0.0, (72.22K parameters), WD: 0.0001, (51.06M parameters)
[2024-11-21 05:01:08] INFO - sg_trainer.py - Started training for 100 epochs (0/99)
0%| | 0/200 [00:00