GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 8 processes
----------------------------------------------------------------------------------------------------
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
  | Name  | Type          | Params
----------------------------------------
0 | model | Float16Module | 2.1 B
----------------------------------------
2.1 B     Trainable params
0         Non-trainable params
2.1 B     Total params
8,538.206 Total estimated model params size (MB)
Epoch 1, global step 613: 'validation_loss' was not in top 5