mace-universal/pretrained/2023-12-10-mace-128-L0.log
2023-12-07 00:33:40.378 INFO: Process group initialized: True
2023-12-07 00:33:40.380 INFO: Processes: 80
2023-12-07 00:33:40.380 INFO: MACE version: 0.3.0
2023-12-07 00:33:40.381 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-07 00:33:40.381 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-07 00:33:40.382 INFO: Using statistics json file
2023-12-07 00:33:40.382 INFO: Using atomic numbers from statistics file
2023-12-07 00:33:40.382 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-07 00:33:40.382 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-07 00:33:40.383 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-07 00:34:13.262 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-07 00:34:13.265 INFO: Average number of neighbors: 61.964672446250916
2023-12-07 00:34:13.265 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-07 00:34:13.265 INFO: Building model
2023-12-07 00:34:13.265 INFO: Hidden irreps: 128x0e
2023-12-07 00:34:17.386 WARNING: Cannot find checkpoint with tag '05-128-L0_run-1' in 'checkpoints'
2023-12-07 00:34:17.390 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-07 00:34:17.396 INFO: Number of parameters: 3847696
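The reported total can be cross-checked against the per-layer weight counts in the module printout above. The following is a minimal sketch, not part of the original log. It assumes that the radial MLP (`FullyConnectedNet[10, 64, 64, 64, 512]`) has no bias terms and that the 512 `conv_tp` weights are produced by that MLP rather than stored as parameters. Under those assumptions the itemized counts fall short of the reported total, and the shortfall factors as 2 x 89 x 128 x 23, which would be consistent with one additional per-element weight tensor in each product block that the printout does not itemize (an inference, not something the log states).

```python
# Weight counts copied from the module printout in the log.
node_embedding = 11392

# Radial MLP layer widths; assumed bias-free, so weights = sum of width products.
radial_mlp = [10, 64, 64, 64, 512]
radial_mlp_weights = sum(a * b for a, b in zip(radial_mlp, radial_mlp[1:]))

# One interaction block: linear_up + radial MLP + linear + skip_tp.
# The 512 conv_tp weights are assumed to be generated by the MLP, not stored.
interaction = 16384 + radial_mlp_weights + 65536 + 1458176

# One product block: the two listed contraction weights plus the linear layer.
product_listed = 89 * 4 * 128 + 89 * 1 * 128 + 16384

# Linear readout plus the two linear layers of the non-linear readout.
readouts = 128 + 2048 + 16

listed_total = node_embedding + 2 * interaction + 2 * product_listed + readouts
reported_total = 3847696  # "Number of parameters" line in the log
unlisted = reported_total - listed_total

print(f"listed in printout: {listed_total}")
print(f"reported by MACE:   {reported_total}")
print(f"not itemized:       {unlisted}")
```

The interaction blocks dominate the budget: the two `skip_tp` tensor products alone account for 2 x 1458176 of the 3847696 parameters.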
2023-12-07 00:34:17.396 INFO: Optimizer: Adam (
Parameter Group 0
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: embedding
weight_decay: 0.0
Parameter Group 1
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: interactions_decay
weight_decay: 1e-08
Parameter Group 2
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: interactions_no_decay
weight_decay: 0.0
Parameter Group 3
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: products
weight_decay: 1e-08
Parameter Group 4
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: readouts
weight_decay: 0.0
)
2023-12-07 00:34:17.396 INFO: Using Weights and Biases for logging
2023-12-07 00:34:54.386 INFO: Using gradient clipping with tolerance=100.000
2023-12-07 00:34:54.387 INFO: Started training
2023-12-07 00:35:01.728 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-07 00:47:36.881 INFO: Epoch 0: loss=1.6009e-02, MAE_E_per_atom=187.2113 meV, MAE_F=115.8141 meV / A, MAE_stress_per_atom=0.3528 meV / A^3
2023-12-07 00:51:48.638 INFO: Epoch 1: loss=1.3584e-02, MAE_E_per_atom=122.0209 meV, MAE_F=109.8117 meV / A, MAE_stress_per_atom=0.3193 meV / A^3
2023-12-07 00:55:59.027 INFO: Epoch 2: loss=1.2124e-02, MAE_E_per_atom=89.9911 meV, MAE_F=101.5480 meV / A, MAE_stress_per_atom=0.3189 meV / A^3
2023-12-07 01:00:09.400 INFO: Epoch 3: loss=1.1230e-02, MAE_E_per_atom=77.9575 meV, MAE_F=95.0004 meV / A, MAE_stress_per_atom=0.2579 meV / A^3
2023-12-07 01:04:20.103 INFO: Epoch 4: loss=1.0757e-02, MAE_E_per_atom=69.7162 meV, MAE_F=91.1872 meV / A, MAE_stress_per_atom=0.1991 meV / A^3
2023-12-07 01:08:29.530 INFO: Epoch 5: loss=1.0242e-02, MAE_E_per_atom=64.8052 meV, MAE_F=87.1196 meV / A, MAE_stress_per_atom=0.1608 meV / A^3
2023-12-07 01:12:38.385 INFO: Epoch 6: loss=9.8151e-03, MAE_E_per_atom=61.9018 meV, MAE_F=83.9708 meV / A, MAE_stress_per_atom=0.1543 meV / A^3
2023-12-07 01:16:47.759 INFO: Epoch 7: loss=9.5291e-03, MAE_E_per_atom=58.7411 meV, MAE_F=80.6295 meV / A, MAE_stress_per_atom=0.1587 meV / A^3
2023-12-07 01:20:57.204 INFO: Epoch 8: loss=9.3087e-03, MAE_E_per_atom=55.6613 meV, MAE_F=78.6384 meV / A, MAE_stress_per_atom=0.1481 meV / A^3
2023-12-07 01:25:06.505 INFO: Epoch 9: loss=9.1403e-03, MAE_E_per_atom=53.5070 meV, MAE_F=77.0883 meV / A, MAE_stress_per_atom=0.1383 meV / A^3
2023-12-07 01:29:16.219 INFO: Epoch 10: loss=8.7853e-03, MAE_E_per_atom=52.3093 meV, MAE_F=75.5157 meV / A, MAE_stress_per_atom=0.1358 meV / A^3
2023-12-07 01:33:25.862 INFO: Epoch 11: loss=8.5486e-03, MAE_E_per_atom=50.0046 meV, MAE_F=74.3539 meV / A, MAE_stress_per_atom=0.1372 meV / A^3
2023-12-07 01:37:35.368 INFO: Epoch 12: loss=8.4192e-03, MAE_E_per_atom=49.1666 meV, MAE_F=73.6558 meV / A, MAE_stress_per_atom=0.1371 meV / A^3
2023-12-07 01:41:48.499 INFO: Epoch 13: loss=8.3057e-03, MAE_E_per_atom=48.0096 meV, MAE_F=72.8189 meV / A, MAE_stress_per_atom=0.1406 meV / A^3
2023-12-07 01:45:58.385 INFO: Epoch 14: loss=8.1411e-03, MAE_E_per_atom=46.7136 meV, MAE_F=72.1274 meV / A, MAE_stress_per_atom=0.1272 meV / A^3
2023-12-07 01:50:09.057 INFO: Epoch 15: loss=8.1027e-03, MAE_E_per_atom=45.1286 meV, MAE_F=71.8962 meV / A, MAE_stress_per_atom=0.1319 meV / A^3
2023-12-07 01:54:20.890 INFO: Epoch 16: loss=8.0104e-03, MAE_E_per_atom=43.0746 meV, MAE_F=71.3171 meV / A, MAE_stress_per_atom=0.1314 meV / A^3
2023-12-07 01:58:31.853 INFO: Epoch 17: loss=7.8737e-03, MAE_E_per_atom=42.4045 meV, MAE_F=70.6761 meV / A, MAE_stress_per_atom=0.1380 meV / A^3
2023-12-07 02:02:43.467 INFO: Epoch 18: loss=7.7324e-03, MAE_E_per_atom=41.6146 meV, MAE_F=69.5367 meV / A, MAE_stress_per_atom=0.1363 meV / A^3
2023-12-07 02:06:54.332 INFO: Epoch 19: loss=7.9112e-03, MAE_E_per_atom=42.2078 meV, MAE_F=70.9675 meV / A, MAE_stress_per_atom=0.1463 meV / A^3
2023-12-07 02:11:05.180 INFO: Epoch 20: loss=7.7206e-03, MAE_E_per_atom=40.9243 meV, MAE_F=69.1989 meV / A, MAE_stress_per_atom=0.1410 meV / A^3
2023-12-07 02:15:17.455 INFO: Epoch 21: loss=7.6370e-03, MAE_E_per_atom=40.2580 meV, MAE_F=68.5678 meV / A, MAE_stress_per_atom=0.1407 meV / A^3
2023-12-07 02:19:29.153 INFO: Epoch 22: loss=7.5433e-03, MAE_E_per_atom=39.1509 meV, MAE_F=68.0733 meV / A, MAE_stress_per_atom=0.1420 meV / A^3
2023-12-07 02:23:43.435 INFO: Epoch 23: loss=7.5021e-03, MAE_E_per_atom=38.2951 meV, MAE_F=67.6139 meV / A, MAE_stress_per_atom=0.1430 meV / A^3
2023-12-07 02:27:57.160 INFO: Epoch 24: loss=7.4443e-03, MAE_E_per_atom=38.3011 meV, MAE_F=67.2022 meV / A, MAE_stress_per_atom=0.1410 meV / A^3
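The per-epoch validation lines above follow a fixed pattern, so they can be pulled into a table for plotting or comparison across restarts. The following is a minimal sketch, not part of the original log. The function name `parse_epoch_line` and the dictionary keys are hypothetical, and the pattern only covers the line format seen in this file.

```python
import re

# Matches the "Epoch N: loss=..., MAE_E_per_atom=..., ..." lines in this log.
EPOCH_RE = re.compile(
    r"Epoch (?P<epoch>\d+): "
    r"loss=(?P<loss>[0-9.eE+-]+), "
    r"MAE_E_per_atom=(?P<mae_e>[0-9.]+) meV, "
    r"MAE_F=(?P<mae_f>[0-9.]+) meV / A, "
    r"MAE_stress_per_atom=(?P<mae_stress>[0-9.]+) meV / A\^3"
)


def parse_epoch_line(line):
    """Return the metrics of one epoch line as a dict, or None if no match."""
    match = EPOCH_RE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match.group("epoch")),
        "loss": float(match.group("loss")),
        "mae_e_per_atom_meV": float(match.group("mae_e")),
        "mae_f_meV_per_A": float(match.group("mae_f")),
        "mae_stress_per_atom_meV_per_A3": float(match.group("mae_stress")),
    }


def parse_log(lines):
    """Collect the metrics of every epoch line in an iterable of log lines."""
    parsed = (parse_epoch_line(line) for line in lines)
    return [row for row in parsed if row is not None]
```

Calling `parse_log(open(path))` on this file would yield one dictionary per logged epoch. Non-epoch lines such as the configuration dump and the reducer messages are skipped.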
2023-12-08 20:56:46.615 INFO: Process group initialized: True
2023-12-08 20:56:46.617 INFO: Processes: 80
2023-12-08 20:56:46.617 INFO: MACE version: 0.3.0
2023-12-08 20:56:46.617 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-08 20:56:46.617 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-08 20:56:46.618 INFO: Using statistics json file
2023-12-08 20:56:46.618 INFO: Using atomic numbers from statistics file
2023-12-08 20:56:46.618 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-08 20:56:46.618 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-08 20:56:46.619 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-08 20:57:19.068 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-08 20:57:19.071 INFO: Average number of neighbors: 61.964672446250916
2023-12-08 20:57:19.071 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-08 20:57:19.071 INFO: Building model
2023-12-08 20:57:19.071 INFO: Hidden irreps: 128x0e
2023-12-08 20:57:23.088 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-08 20:57:23.089 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-24.pt
2023-12-08 20:57:23.270 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
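The radial embedding in the printout above combines `BesselBasis(r_max=6.0, num_basis=10)` with `PolynomialCutoff(p=5.0, r_max=6.0)`. The following is a minimal pure-Python sketch of the functional forms commonly used for these two blocks, not part of the original log. It assumes the basis is sqrt(2/r_max) * sin(n*pi*r/r_max) / r for n = 1..num_basis and the cutoff is the polynomial envelope 1 - (p+1)(p+2)/2 x^p + p(p+2) x^(p+1) - p(p+1)/2 x^(p+2) with x = r/r_max. The function names are hypothetical and do not reproduce the MACE classes.

```python
import math

R_MAX = 6.0      # r_max from the log
NUM_BASIS = 10   # num_basis from the log
P = 5            # p from the log


def bessel_basis(r, r_max=R_MAX, num_basis=NUM_BASIS):
    """Radial Bessel basis values at distance r (r > 0), one per frequency n."""
    prefactor = math.sqrt(2.0 / r_max)
    return [
        prefactor * math.sin(n * math.pi * r / r_max) / r
        for n in range(1, num_basis + 1)
    ]


def polynomial_cutoff(r, r_max=R_MAX, p=P):
    """Smooth envelope that equals 1 at r = 0 and goes to 0 at r = r_max."""
    if r >= r_max:
        return 0.0
    x = r / r_max
    return (
        1.0
        - (p + 1) * (p + 2) / 2.0 * x**p
        + p * (p + 2) * x ** (p + 1)
        - p * (p + 1) / 2.0 * x ** (p + 2)
    )


def radial_embedding(r):
    """Bessel features multiplied by the cutoff envelope."""
    envelope = polynomial_cutoff(r)
    return [value * envelope for value in bessel_basis(r)]
```

For p = 5 the envelope reduces to 1 - 21 x^5 + 35 x^6 - 15 x^7, so edge features fade out smoothly as a neighbor approaches the 6.0 cutoff instead of dropping to zero abruptly.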
2023-12-08 20:57:23.276 INFO: Number of parameters: 3847696
2023-12-08 20:57:23.277 INFO: Optimizer: Adam (
Parameter Group 0
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: embedding
weight_decay: 0.0
Parameter Group 1
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: interactions_decay
weight_decay: 1e-08
Parameter Group 2
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: interactions_no_decay
weight_decay: 0.0
Parameter Group 3
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: products
weight_decay: 1e-08
Parameter Group 4
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.005
maximize: False
name: readouts
weight_decay: 0.0
)
2023-12-08 20:57:23.277 INFO: Using Weights and Biases for logging
2023-12-08 20:57:35.504 INFO: Using gradient clipping with tolerance=100.000
2023-12-08 20:57:35.504 INFO: Started training
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 20:57:42.793 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-08 21:10:17.137 INFO: Epoch 24: loss=7.4367e-03, MAE_E_per_atom=37.9559 meV, MAE_F=67.0446 meV / A, MAE_stress_per_atom=0.1377 meV / A^3
2023-12-08 21:14:35.375 INFO: Epoch 25: loss=7.3869e-03, MAE_E_per_atom=37.7703 meV, MAE_F=66.8778 meV / A, MAE_stress_per_atom=0.1362 meV / A^3
2023-12-08 21:18:46.071 INFO: Epoch 26: loss=7.2414e-03, MAE_E_per_atom=36.5784 meV, MAE_F=66.2884 meV / A, MAE_stress_per_atom=0.1378 meV / A^3
2023-12-08 21:22:56.455 INFO: Epoch 27: loss=7.2518e-03, MAE_E_per_atom=35.8647 meV, MAE_F=65.7828 meV / A, MAE_stress_per_atom=0.1392 meV / A^3
2023-12-08 21:27:06.894 INFO: Epoch 28: loss=7.2098e-03, MAE_E_per_atom=35.9810 meV, MAE_F=65.4098 meV / A, MAE_stress_per_atom=0.1391 meV / A^3
2023-12-08 21:31:18.351 INFO: Epoch 29: loss=7.1269e-03, MAE_E_per_atom=35.3170 meV, MAE_F=65.1186 meV / A, MAE_stress_per_atom=0.1472 meV / A^3
2023-12-08 21:35:28.873 INFO: Epoch 30: loss=7.0992e-03, MAE_E_per_atom=35.5020 meV, MAE_F=64.9542 meV / A, MAE_stress_per_atom=0.1310 meV / A^3
2023-12-08 21:39:40.412 INFO: Epoch 31: loss=7.0520e-03, MAE_E_per_atom=34.6439 meV, MAE_F=64.5942 meV / A, MAE_stress_per_atom=0.1277 meV / A^3
2023-12-08 21:43:50.450 INFO: Epoch 32: loss=7.0858e-03, MAE_E_per_atom=34.4616 meV, MAE_F=64.6209 meV / A, MAE_stress_per_atom=0.1323 meV / A^3
2023-12-08 21:48:00.938 INFO: Epoch 33: loss=7.0714e-03, MAE_E_per_atom=34.0312 meV, MAE_F=64.6531 meV / A, MAE_stress_per_atom=0.1462 meV / A^3
2023-12-08 21:52:12.885 INFO: Epoch 34: loss=7.0462e-03, MAE_E_per_atom=34.1923 meV, MAE_F=64.1193 meV / A, MAE_stress_per_atom=0.1284 meV / A^3
2023-12-08 21:56:23.681 INFO: Epoch 35: loss=7.3660e-03, MAE_E_per_atom=36.0976 meV, MAE_F=67.3376 meV / A, MAE_stress_per_atom=0.1461 meV / A^3
2023-12-08 22:00:34.548 INFO: Epoch 36: loss=1.2967e-02, MAE_E_per_atom=89.2297 meV, MAE_F=115.7714 meV / A, MAE_stress_per_atom=0.2307 meV / A^3
2023-12-08 22:04:48.957 INFO: Epoch 37: loss=9.3120e-03, MAE_E_per_atom=56.6786 meV, MAE_F=90.0621 meV / A, MAE_stress_per_atom=0.1564 meV / A^3
2023-12-08 22:09:00.742 INFO: Epoch 38: loss=7.5177e-03, MAE_E_per_atom=40.1777 meV, MAE_F=72.0057 meV / A, MAE_stress_per_atom=0.1388 meV / A^3
2023-12-08 22:13:12.329 INFO: Epoch 39: loss=7.2810e-03, MAE_E_per_atom=36.3678 meV, MAE_F=69.3740 meV / A, MAE_stress_per_atom=0.1200 meV / A^3
2023-12-08 22:17:24.012 INFO: Epoch 40: loss=7.2440e-03, MAE_E_per_atom=34.7443 meV, MAE_F=68.5586 meV / A, MAE_stress_per_atom=0.1169 meV / A^3
2023-12-08 22:21:36.101 INFO: Epoch 41: loss=7.1482e-03, MAE_E_per_atom=33.3788 meV, MAE_F=67.6282 meV / A, MAE_stress_per_atom=0.1204 meV / A^3
2023-12-08 22:25:48.207 INFO: Epoch 42: loss=7.0966e-03, MAE_E_per_atom=32.7777 meV, MAE_F=66.7299 meV / A, MAE_stress_per_atom=0.1262 meV / A^3
2023-12-08 22:30:00.430 INFO: Epoch 43: loss=7.0756e-03, MAE_E_per_atom=32.6211 meV, MAE_F=66.5010 meV / A, MAE_stress_per_atom=0.1277 meV / A^3
2023-12-08 22:34:12.347 INFO: Epoch 44: loss=7.0545e-03, MAE_E_per_atom=32.3837 meV, MAE_F=66.0624 meV / A, MAE_stress_per_atom=0.1293 meV / A^3
2023-12-08 22:38:25.598 INFO: Epoch 45: loss=7.0771e-03, MAE_E_per_atom=32.2415 meV, MAE_F=65.7500 meV / A, MAE_stress_per_atom=0.1313 meV / A^3
2023-12-08 22:42:39.179 INFO: Epoch 46: loss=7.0261e-03, MAE_E_per_atom=31.9724 meV, MAE_F=65.3239 meV / A, MAE_stress_per_atom=0.1419 meV / A^3
2023-12-08 22:46:52.333 INFO: Epoch 47: loss=7.0190e-03, MAE_E_per_atom=31.7067 meV, MAE_F=64.9530 meV / A, MAE_stress_per_atom=0.1402 meV / A^3
2023-12-08 22:51:05.700 INFO: Epoch 48: loss=6.9644e-03, MAE_E_per_atom=31.8255 meV, MAE_F=64.5449 meV / A, MAE_stress_per_atom=0.1383 meV / A^3
2023-12-09 07:43:02.676 INFO: Process group initialized: True
2023-12-09 07:43:02.678 INFO: Processes: 80
2023-12-09 07:43:02.678 INFO: MACE version: 0.3.0
2023-12-09 07:43:02.678 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-09 07:43:02.678 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-09 07:43:02.679 INFO: Using statistics json file
2023-12-09 07:43:02.679 INFO: Using atomic numbers from statistics file
2023-12-09 07:43:02.679 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-09 07:43:02.679 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-09 07:43:02.680 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-09 07:43:39.015 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-09 07:43:39.018 INFO: Average number of neighbors: 61.964672446250916
2023-12-09 07:43:39.018 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-09 07:43:39.018 INFO: Building model
2023-12-09 07:43:39.018 INFO: Hidden irreps: 128x0e
2023-12-09 07:43:43.208 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-09 07:43:43.209 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-48.pt
2023-12-09 07:43:43.412 INFO: ScaleShiftMACE(
  (node_embedding): LinearNodeEmbeddingBlock(
    (linear): Linear(89x0e -> 128x0e | 11392 weights)
  )
  (radial_embedding): RadialEmbeddingBlock(
    (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
    (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
  )
  (spherical_harmonics): SphericalHarmonics()
  (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
  (interactions): ModuleList(
    (0-1): 2 x RealAgnosticResidualInteractionBlock(
      (linear_up): Linear(128x0e -> 128x0e | 16384 weights)
      (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
      (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
      (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
      (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
      (reshape): reshape_irreps()
    )
  )
  (products): ModuleList(
    (0-1): 2 x EquivariantProductBasisBlock(
      (symmetric_contractions): SymmetricContraction(
        (contractions): ModuleList(
          (0): Contraction(
            (contractions_weighting): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (contractions_features): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (weights): ParameterList(
              (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
              (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
            )
            (graph_opt_main): GraphModule()
          )
        )
      )
      (linear): Linear(128x0e -> 128x0e | 16384 weights)
    )
  )
  (readouts): ModuleList(
    (0): LinearReadoutBlock(
      (linear): Linear(128x0e -> 1x0e | 128 weights)
    )
    (1): NonLinearReadoutBlock(
      (linear_1): Linear(128x0e -> 16x0e | 2048 weights)
      (non_linearity): Activation [x] (16x0e -> 16x0e)
      (linear_2): Linear(16x0e -> 1x0e | 16 weights)
    )
  )
  (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-09 07:43:43.418 INFO: Number of parameters: 3847696
2023-12-09 07:43:43.418 INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: embedding
    weight_decay: 0.0

Parameter Group 1
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: interactions_decay
    weight_decay: 1e-08

Parameter Group 2
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: interactions_no_decay
    weight_decay: 0.0

Parameter Group 3
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: products
    weight_decay: 1e-08

Parameter Group 4
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: readouts
    weight_decay: 0.0
)
2023-12-09 07:43:43.419 INFO: Using Weights and Biases for logging
2023-12-09 07:44:03.803 INFO: Using gradient clipping with tolerance=100.000
2023-12-09 07:44:03.803 INFO: Started training
2023-12-09 07:44:11.443 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 07:56:58.088 INFO: Epoch 48: loss=6.9678e-03, MAE_E_per_atom=31.6694 meV, MAE_F=64.4060 meV / A, MAE_stress_per_atom=0.1419 meV / A^3
2023-12-09 08:01:09.594 INFO: Epoch 49: loss=6.9284e-03, MAE_E_per_atom=31.7512 meV, MAE_F=64.0500 meV / A, MAE_stress_per_atom=0.1381 meV / A^3
2023-12-09 08:05:20.157 INFO: Epoch 50: loss=6.9682e-03, MAE_E_per_atom=31.6025 meV, MAE_F=63.8874 meV / A, MAE_stress_per_atom=0.1554 meV / A^3
2023-12-09 08:09:31.284 INFO: Epoch 51: loss=6.8730e-03, MAE_E_per_atom=31.8092 meV, MAE_F=63.5059 meV / A, MAE_stress_per_atom=0.1519 meV / A^3
2023-12-09 08:13:41.514 INFO: Epoch 52: loss=6.9161e-03, MAE_E_per_atom=31.3330 meV, MAE_F=63.6144 meV / A, MAE_stress_per_atom=0.1557 meV / A^3
2023-12-09 08:17:52.400 INFO: Epoch 53: loss=6.9078e-03, MAE_E_per_atom=30.9978 meV, MAE_F=63.1260 meV / A, MAE_stress_per_atom=0.1674 meV / A^3
2023-12-09 08:22:02.163 INFO: Epoch 54: loss=6.9265e-03, MAE_E_per_atom=30.8717 meV, MAE_F=63.0460 meV / A, MAE_stress_per_atom=0.1721 meV / A^3
2023-12-09 08:26:12.318 INFO: Epoch 55: loss=6.8521e-03, MAE_E_per_atom=31.3128 meV, MAE_F=62.8252 meV / A, MAE_stress_per_atom=0.1564 meV / A^3
2023-12-09 08:30:22.671 INFO: Epoch 56: loss=6.8046e-03, MAE_E_per_atom=30.9394 meV, MAE_F=62.8296 meV / A, MAE_stress_per_atom=0.1503 meV / A^3
2023-12-09 08:34:35.112 INFO: Epoch 57: loss=6.8585e-03, MAE_E_per_atom=30.9023 meV, MAE_F=62.4298 meV / A, MAE_stress_per_atom=0.1644 meV / A^3
2023-12-09 08:38:45.770 INFO: Epoch 58: loss=6.8404e-03, MAE_E_per_atom=30.8785 meV, MAE_F=61.8781 meV / A, MAE_stress_per_atom=0.1680 meV / A^3
2023-12-09 08:42:56.636 INFO: Epoch 59: loss=6.8774e-03, MAE_E_per_atom=30.7300 meV, MAE_F=62.2127 meV / A, MAE_stress_per_atom=0.1710 meV / A^3
2023-12-09 08:47:06.840 INFO: Epoch 60: loss=6.8214e-03, MAE_E_per_atom=30.6110 meV, MAE_F=61.8021 meV / A, MAE_stress_per_atom=0.1663 meV / A^3
2023-12-09 08:51:20.006 INFO: Epoch 61: loss=6.8252e-03, MAE_E_per_atom=30.7573 meV, MAE_F=61.6328 meV / A, MAE_stress_per_atom=0.1683 meV / A^3
2023-12-09 08:55:32.417 INFO: Epoch 62: loss=6.7830e-03, MAE_E_per_atom=30.4569 meV, MAE_F=61.2038 meV / A, MAE_stress_per_atom=0.1569 meV / A^3
2023-12-09 08:59:44.377 INFO: Epoch 63: loss=6.7615e-03, MAE_E_per_atom=30.3817 meV, MAE_F=61.2203 meV / A, MAE_stress_per_atom=0.1658 meV / A^3
2023-12-09 09:03:56.634 INFO: Epoch 64: loss=6.7596e-03, MAE_E_per_atom=30.3893 meV, MAE_F=60.6066 meV / A, MAE_stress_per_atom=0.1656 meV / A^3
2023-12-09 09:08:09.274 INFO: Epoch 65: loss=6.8102e-03, MAE_E_per_atom=30.4013 meV, MAE_F=60.9242 meV / A, MAE_stress_per_atom=0.1785 meV / A^3
2023-12-09 09:12:21.152 INFO: Epoch 66: loss=6.6463e-03, MAE_E_per_atom=31.3657 meV, MAE_F=60.3111 meV / A, MAE_stress_per_atom=0.1562 meV / A^3
2023-12-09 09:16:33.044 INFO: Epoch 67: loss=6.7256e-03, MAE_E_per_atom=30.5648 meV, MAE_F=61.2993 meV / A, MAE_stress_per_atom=0.1633 meV / A^3
2023-12-09 09:20:47.452 INFO: Epoch 68: loss=6.7803e-03, MAE_E_per_atom=31.1402 meV, MAE_F=61.6759 meV / A, MAE_stress_per_atom=0.1730 meV / A^3
2023-12-09 09:25:00.519 INFO: Epoch 69: loss=6.7895e-03, MAE_E_per_atom=30.2957 meV, MAE_F=60.7934 meV / A, MAE_stress_per_atom=0.1777 meV / A^3
2023-12-09 09:29:13.350 INFO: Epoch 70: loss=6.7142e-03, MAE_E_per_atom=30.2902 meV, MAE_F=60.0835 meV / A, MAE_stress_per_atom=0.1749 meV / A^3
2023-12-09 09:33:26.909 INFO: Epoch 71: loss=6.7158e-03, MAE_E_per_atom=30.0476 meV, MAE_F=59.8551 meV / A, MAE_stress_per_atom=0.1725 meV / A^3
2023-12-09 09:37:40.544 INFO: Epoch 72: loss=6.6525e-03, MAE_E_per_atom=30.2933 meV, MAE_F=59.5501 meV / A, MAE_stress_per_atom=0.1685 meV / A^3
2023-12-09 09:51:29.401 INFO: Process group initialized: True
2023-12-09 09:51:29.403 INFO: Processes: 80
2023-12-09 09:51:29.403 INFO: MACE version: 0.3.0
2023-12-09 09:51:29.404 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-09 09:51:29.404 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-09 09:51:29.404 INFO: Using statistics json file
2023-12-09 09:51:29.404 INFO: Using atomic numbers from statistics file
2023-12-09 09:51:29.405 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-09 09:51:29.405 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-09 09:51:29.405 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-09 09:52:01.521 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-09 09:52:01.523 INFO: Average number of neighbors: 61.964672446250916
2023-12-09 09:52:01.523 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-09 09:52:01.523 INFO: Building model
2023-12-09 09:52:01.524 INFO: Hidden irreps: 128x0e
2023-12-09 09:52:05.529 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-09 09:52:05.530 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-72.pt
2023-12-09 09:52:05.717 INFO: ScaleShiftMACE(
  (node_embedding): LinearNodeEmbeddingBlock(
    (linear): Linear(89x0e -> 128x0e | 11392 weights)
  )
  (radial_embedding): RadialEmbeddingBlock(
    (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
    (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
  )
  (spherical_harmonics): SphericalHarmonics()
  (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
  (interactions): ModuleList(
    (0-1): 2 x RealAgnosticResidualInteractionBlock(
      (linear_up): Linear(128x0e -> 128x0e | 16384 weights)
      (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
      (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
      (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
      (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
      (reshape): reshape_irreps()
    )
  )
  (products): ModuleList(
    (0-1): 2 x EquivariantProductBasisBlock(
      (symmetric_contractions): SymmetricContraction(
        (contractions): ModuleList(
          (0): Contraction(
            (contractions_weighting): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (contractions_features): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (weights): ParameterList(
              (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
              (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
            )
            (graph_opt_main): GraphModule()
          )
        )
      )
      (linear): Linear(128x0e -> 128x0e | 16384 weights)
    )
  )
  (readouts): ModuleList(
    (0): LinearReadoutBlock(
      (linear): Linear(128x0e -> 1x0e | 128 weights)
    )
    (1): NonLinearReadoutBlock(
      (linear_1): Linear(128x0e -> 16x0e | 2048 weights)
      (non_linearity): Activation [x] (16x0e -> 16x0e)
      (linear_2): Linear(16x0e -> 1x0e | 16 weights)
    )
  )
  (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-09 09:52:05.722 INFO: Number of parameters: 3847696
2023-12-09 09:52:05.722 INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: embedding
    weight_decay: 0.0
Parameter Group 1
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: interactions_decay
    weight_decay: 1e-08
Parameter Group 2
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: interactions_no_decay
    weight_decay: 0.0
Parameter Group 3
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: products
    weight_decay: 1e-08
Parameter Group 4
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.004
    maximize: False
    name: readouts
    weight_decay: 0.0
)
2023-12-09 09:52:05.722 INFO: Using Weights and Biases for logging
2023-12-09 09:52:19.817 INFO: Using gradient clipping with tolerance=100.000
2023-12-09 09:52:19.818 INFO: Started training
2023-12-09 09:52:27.185 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 10:04:50.113 INFO: Epoch 72: loss=6.6291e-03, MAE_E_per_atom=30.0677 meV, MAE_F=59.4059 meV / A, MAE_stress_per_atom=0.1674 meV / A^3
2023-12-09 10:09:18.788 INFO: Epoch 73: loss=6.7570e-03, MAE_E_per_atom=29.8798 meV, MAE_F=59.8023 meV / A, MAE_stress_per_atom=0.1761 meV / A^3
2023-12-09 10:13:29.801 INFO: Epoch 74: loss=6.6864e-03, MAE_E_per_atom=29.9506 meV, MAE_F=59.2347 meV / A, MAE_stress_per_atom=0.1718 meV / A^3
2023-12-09 10:17:41.146 INFO: Epoch 75: loss=6.6964e-03, MAE_E_per_atom=29.6975 meV, MAE_F=59.1020 meV / A, MAE_stress_per_atom=0.1681 meV / A^3
2023-12-09 10:21:51.585 INFO: Epoch 76: loss=6.6468e-03, MAE_E_per_atom=30.1357 meV, MAE_F=59.0826 meV / A, MAE_stress_per_atom=0.1641 meV / A^3
2023-12-09 10:26:01.920 INFO: Epoch 77: loss=6.6608e-03, MAE_E_per_atom=29.7528 meV, MAE_F=59.1551 meV / A, MAE_stress_per_atom=0.1721 meV / A^3
2023-12-09 10:30:13.333 INFO: Epoch 78: loss=6.6392e-03, MAE_E_per_atom=29.9661 meV, MAE_F=58.8928 meV / A, MAE_stress_per_atom=0.1598 meV / A^3
2023-12-09 10:34:24.855 INFO: Epoch 79: loss=6.6301e-03, MAE_E_per_atom=29.4907 meV, MAE_F=58.2955 meV / A, MAE_stress_per_atom=0.1732 meV / A^3
2023-12-09 10:38:36.579 INFO: Epoch 80: loss=6.6033e-03, MAE_E_per_atom=29.4659 meV, MAE_F=58.2065 meV / A, MAE_stress_per_atom=0.1627 meV / A^3
2023-12-09 10:42:47.811 INFO: Epoch 81: loss=6.6335e-03, MAE_E_per_atom=29.3739 meV, MAE_F=58.3974 meV / A, MAE_stress_per_atom=0.1664 meV / A^3
2023-12-09 10:46:59.071 INFO: Epoch 82: loss=6.6125e-03, MAE_E_per_atom=29.7664 meV, MAE_F=58.2438 meV / A, MAE_stress_per_atom=0.1676 meV / A^3
2023-12-09 10:51:09.773 INFO: Epoch 83: loss=6.6219e-03, MAE_E_per_atom=29.7177 meV, MAE_F=58.0942 meV / A, MAE_stress_per_atom=0.1578 meV / A^3
2023-12-09 10:55:20.173 INFO: Epoch 84: loss=6.6302e-03, MAE_E_per_atom=29.6119 meV, MAE_F=58.0996 meV / A, MAE_stress_per_atom=0.1561 meV / A^3
2023-12-09 10:59:34.357 INFO: Epoch 85: loss=6.5779e-03, MAE_E_per_atom=29.5253 meV, MAE_F=57.8376 meV / A, MAE_stress_per_atom=0.1619 meV / A^3
2023-12-09 11:03:45.742 INFO: Epoch 86: loss=6.6381e-03, MAE_E_per_atom=30.2969 meV, MAE_F=58.2626 meV / A, MAE_stress_per_atom=0.1652 meV / A^3
2023-12-09 11:07:56.490 INFO: Epoch 87: loss=6.5837e-03, MAE_E_per_atom=30.0660 meV, MAE_F=57.6879 meV / A, MAE_stress_per_atom=0.1609 meV / A^3
2023-12-09 11:12:08.561 INFO: Epoch 88: loss=6.5849e-03, MAE_E_per_atom=29.5614 meV, MAE_F=57.3817 meV / A, MAE_stress_per_atom=0.1684 meV / A^3
2023-12-09 11:16:20.222 INFO: Epoch 89: loss=6.5112e-03, MAE_E_per_atom=29.2924 meV, MAE_F=57.3089 meV / A, MAE_stress_per_atom=0.1676 meV / A^3
2023-12-09 11:20:33.390 INFO: Epoch 90: loss=6.5207e-03, MAE_E_per_atom=29.4966 meV, MAE_F=57.4853 meV / A, MAE_stress_per_atom=0.1571 meV / A^3
2023-12-09 11:24:46.199 INFO: Epoch 91: loss=6.4741e-03, MAE_E_per_atom=29.1693 meV, MAE_F=57.2256 meV / A, MAE_stress_per_atom=0.1569 meV / A^3
2023-12-09 11:28:59.550 INFO: Epoch 92: loss=6.4986e-03, MAE_E_per_atom=29.4430 meV, MAE_F=56.7963 meV / A, MAE_stress_per_atom=0.1637 meV / A^3
2023-12-09 11:33:12.700 INFO: Epoch 93: loss=6.5061e-03, MAE_E_per_atom=28.8391 meV, MAE_F=57.0774 meV / A, MAE_stress_per_atom=0.1514 meV / A^3
2023-12-09 11:37:25.967 INFO: Epoch 94: loss=6.4717e-03, MAE_E_per_atom=29.0896 meV, MAE_F=56.7711 meV / A, MAE_stress_per_atom=0.1503 meV / A^3
2023-12-09 11:41:38.271 INFO: Epoch 95: loss=6.5328e-03, MAE_E_per_atom=29.2869 meV, MAE_F=56.9126 meV / A, MAE_stress_per_atom=0.1653 meV / A^3
2023-12-09 11:45:51.044 INFO: Epoch 96: loss=6.5005e-03, MAE_E_per_atom=29.1685 meV, MAE_F=56.7231 meV / A, MAE_stress_per_atom=0.1515 meV / A^3
2023-12-09 11:54:39.990 INFO: Process group initialized: True
2023-12-09 11:54:39.992 INFO: Processes: 80
2023-12-09 11:54:39.993 INFO: MACE version: 0.3.0
2023-12-09 11:54:39.993 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-09 11:54:39.993 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-09 11:54:39.994 INFO: Using statistics json file
2023-12-09 11:54:39.994 INFO: Using atomic numbers from statistics file
2023-12-09 11:54:39.994 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-09 11:54:39.994 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-09 11:54:39.994 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-09 11:55:11.767 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-09 11:55:11.769 INFO: Average number of neighbors: 61.964672446250916
2023-12-09 11:55:11.770 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-09 11:55:11.770 INFO: Building model
2023-12-09 11:55:11.770 INFO: Hidden irreps: 128x0e
2023-12-09 11:55:15.842 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-09 11:55:15.844 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-96.pt
2023-12-09 11:55:16.025 INFO: ScaleShiftMACE(
  (node_embedding): LinearNodeEmbeddingBlock(
    (linear): Linear(89x0e -> 128x0e | 11392 weights)
  )
  (radial_embedding): RadialEmbeddingBlock(
    (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
    (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
  )
  (spherical_harmonics): SphericalHarmonics()
  (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
  (interactions): ModuleList(
    (0-1): 2 x RealAgnosticResidualInteractionBlock(
      (linear_up): Linear(128x0e -> 128x0e | 16384 weights)
      (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
      (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
      (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
      (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
      (reshape): reshape_irreps()
    )
  )
  (products): ModuleList(
    (0-1): 2 x EquivariantProductBasisBlock(
      (symmetric_contractions): SymmetricContraction(
        (contractions): ModuleList(
          (0): Contraction(
            (contractions_weighting): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (contractions_features): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (weights): ParameterList(
              (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
              (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
            )
            (graph_opt_main): GraphModule()
          )
        )
      )
      (linear): Linear(128x0e -> 128x0e | 16384 weights)
    )
  )
  (readouts): ModuleList(
    (0): LinearReadoutBlock(
      (linear): Linear(128x0e -> 1x0e | 128 weights)
    )
    (1): NonLinearReadoutBlock(
      (linear_1): Linear(128x0e -> 16x0e | 2048 weights)
      (non_linearity): Activation [x] (16x0e -> 16x0e)
      (linear_2): Linear(16x0e -> 1x0e | 16 weights)
    )
  )
  (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-09 11:55:16.031 INFO: Number of parameters: 3847696
2023-12-09 11:55:16.031 INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0032
    maximize: False
    name: embedding
    weight_decay: 0.0
Parameter Group 1
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0032
    maximize: False
    name: interactions_decay
    weight_decay: 1e-08
Parameter Group 2
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0032
    maximize: False
    name: interactions_no_decay
    weight_decay: 0.0
Parameter Group 3
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0032
    maximize: False
    name: products
    weight_decay: 1e-08
Parameter Group 4
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0032
    maximize: False
    name: readouts
    weight_decay: 0.0
)
2023-12-09 11:55:16.031 INFO: Using Weights and Biases for logging
2023-12-09 11:55:29.215 INFO: Using gradient clipping with tolerance=100.000
2023-12-09 11:55:29.216 INFO: Started training
2023-12-09 11:55:36.574 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 12:07:54.162 INFO: Epoch 96: loss=6.5224e-03, MAE_E_per_atom=29.1637 meV, MAE_F=56.7547 meV / A, MAE_stress_per_atom=0.1604 meV / A^3
2023-12-09 12:12:01.951 INFO: Epoch 97: loss=6.4824e-03, MAE_E_per_atom=28.9644 meV, MAE_F=56.9172 meV / A, MAE_stress_per_atom=0.1527 meV / A^3
2023-12-09 12:16:09.441 INFO: Epoch 98: loss=6.4489e-03, MAE_E_per_atom=29.1551 meV, MAE_F=56.5285 meV / A, MAE_stress_per_atom=0.1556 meV / A^3
2023-12-09 12:20:16.352 INFO: Epoch 99: loss=6.5145e-03, MAE_E_per_atom=29.2044 meV, MAE_F=56.4415 meV / A, MAE_stress_per_atom=0.1739 meV / A^3
2023-12-09 12:24:22.920 INFO: Epoch 100: loss=6.4623e-03, MAE_E_per_atom=28.7001 meV, MAE_F=56.2326 meV / A, MAE_stress_per_atom=0.1675 meV / A^3
2023-12-09 12:28:29.721 INFO: Epoch 101: loss=6.4597e-03, MAE_E_per_atom=28.4748 meV, MAE_F=56.3163 meV / A, MAE_stress_per_atom=0.1598 meV / A^3
2023-12-09 12:32:35.561 INFO: Epoch 102: loss=6.4087e-03, MAE_E_per_atom=28.8751 meV, MAE_F=56.2407 meV / A, MAE_stress_per_atom=0.1523 meV / A^3
2023-12-09 12:36:42.688 INFO: Epoch 103: loss=6.4215e-03, MAE_E_per_atom=28.8941 meV, MAE_F=56.2143 meV / A, MAE_stress_per_atom=0.1495 meV / A^3
2023-12-09 12:40:50.489 INFO: Epoch 104: loss=6.4426e-03, MAE_E_per_atom=28.5281 meV, MAE_F=56.2399 meV / A, MAE_stress_per_atom=0.1531 meV / A^3
2023-12-09 12:44:57.511 INFO: Epoch 105: loss=6.4283e-03, MAE_E_per_atom=28.9826 meV, MAE_F=55.9333 meV / A, MAE_stress_per_atom=0.1593 meV / A^3
2023-12-09 12:49:05.832 INFO: Epoch 106: loss=6.4474e-03, MAE_E_per_atom=28.9641 meV, MAE_F=55.9721 meV / A, MAE_stress_per_atom=0.1582 meV / A^3
2023-12-09 12:53:12.491 INFO: Epoch 107: loss=6.4278e-03, MAE_E_per_atom=28.5776 meV, MAE_F=56.0727 meV / A, MAE_stress_per_atom=0.1552 meV / A^3
2023-12-09 12:57:19.829 INFO: Epoch 108: loss=6.3534e-03, MAE_E_per_atom=28.3621 meV, MAE_F=55.8749 meV / A, MAE_stress_per_atom=0.1442 meV / A^3
2023-12-09 13:01:29.730 INFO: Epoch 109: loss=6.3959e-03, MAE_E_per_atom=28.7052 meV, MAE_F=55.6690 meV / A, MAE_stress_per_atom=0.1638 meV / A^3
2023-12-09 13:05:38.033 INFO: Epoch 110: loss=6.4020e-03, MAE_E_per_atom=28.8681 meV, MAE_F=55.6543 meV / A, MAE_stress_per_atom=0.1620 meV / A^3
2023-12-09 13:09:45.518 INFO: Epoch 111: loss=6.3708e-03, MAE_E_per_atom=28.5095 meV, MAE_F=55.4075 meV / A, MAE_stress_per_atom=0.1649 meV / A^3
2023-12-09 13:13:53.661 INFO: Epoch 112: loss=6.3400e-03, MAE_E_per_atom=28.6501 meV, MAE_F=55.5696 meV / A, MAE_stress_per_atom=0.1542 meV / A^3
2023-12-09 13:18:02.636 INFO: Epoch 113: loss=6.3865e-03, MAE_E_per_atom=28.6577 meV, MAE_F=55.4397 meV / A, MAE_stress_per_atom=0.1660 meV / A^3
2023-12-09 13:22:10.997 INFO: Epoch 114: loss=6.3322e-03, MAE_E_per_atom=28.3594 meV, MAE_F=55.4783 meV / A, MAE_stress_per_atom=0.1434 meV / A^3
2023-12-09 13:26:18.831 INFO: Epoch 115: loss=6.3794e-03, MAE_E_per_atom=28.4649 meV, MAE_F=55.3648 meV / A, MAE_stress_per_atom=0.1741 meV / A^3
2023-12-09 13:30:27.253 INFO: Epoch 116: loss=6.3222e-03, MAE_E_per_atom=28.5178 meV, MAE_F=55.3276 meV / A, MAE_stress_per_atom=0.1483 meV / A^3
2023-12-09 13:34:35.647 INFO: Epoch 117: loss=6.2964e-03, MAE_E_per_atom=28.4105 meV, MAE_F=55.1766 meV / A, MAE_stress_per_atom=0.1496 meV / A^3
2023-12-09 13:38:45.046 INFO: Epoch 118: loss=6.3291e-03, MAE_E_per_atom=28.5960 meV, MAE_F=55.2392 meV / A, MAE_stress_per_atom=0.1645 meV / A^3
2023-12-09 13:42:55.473 INFO: Epoch 119: loss=6.3100e-03, MAE_E_per_atom=28.6345 meV, MAE_F=55.1575 meV / A, MAE_stress_per_atom=0.1433 meV / A^3
2023-12-09 13:47:06.268 INFO: Epoch 120: loss=6.2657e-03, MAE_E_per_atom=28.3709 meV, MAE_F=55.0280 meV / A, MAE_stress_per_atom=0.1458 meV / A^3
2023-12-09 16:48:04.444 INFO: Process group initialized: True
2023-12-09 16:48:04.447 INFO: Processes: 80
2023-12-09 16:48:04.447 INFO: MACE version: 0.3.0
2023-12-09 16:48:04.447 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-09 16:48:04.447 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-09 16:48:04.448 INFO: Using statistics json file
2023-12-09 16:48:04.448 INFO: Using atomic numbers from statistics file
2023-12-09 16:48:04.448 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-09 16:48:04.449 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-09 16:48:04.449 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-09 16:48:36.845 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-09 16:48:36.847 INFO: Average number of neighbors: 61.964672446250916
2023-12-09 16:48:36.847 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-09 16:48:36.847 INFO: Building model
2023-12-09 16:48:36.848 INFO: Hidden irreps: 128x0e
2023-12-09 16:48:41.207 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-09 16:48:41.210 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-120.pt
2023-12-09 16:48:41.393 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-09 16:48:41.399 INFO: Number of parameters: 3847696
2023-12-09 16:48:41.399 INFO: Optimizer: Adam (
Parameter Group 0
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0032
maximize: False
name: embedding
weight_decay: 0.0
Parameter Group 1
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0032
maximize: False
name: interactions_decay
weight_decay: 1e-08
Parameter Group 2
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0032
maximize: False
name: interactions_no_decay
weight_decay: 0.0
Parameter Group 3
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0032
maximize: False
name: products
weight_decay: 1e-08
Parameter Group 4
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0032
maximize: False
name: readouts
weight_decay: 0.0
)
2023-12-09 16:48:41.399 INFO: Using Weights and Biases for logging
2023-12-09 16:49:00.797 INFO: Using gradient clipping with tolerance=100.000
2023-12-09 16:49:00.797 INFO: Started training
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.421 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 16:49:08.422 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 17:01:34.395 INFO: Epoch 120: loss=6.2381e-03, MAE_E_per_atom=28.4038 meV, MAE_F=55.0967 meV / A, MAE_stress_per_atom=0.1446 meV / A^3
2023-12-09 17:05:41.597 INFO: Epoch 121: loss=6.3244e-03, MAE_E_per_atom=28.5191 meV, MAE_F=54.9660 meV / A, MAE_stress_per_atom=0.1445 meV / A^3
2023-12-09 17:09:41.317 INFO: Epoch 122: loss=6.2873e-03, MAE_E_per_atom=28.3946 meV, MAE_F=54.8375 meV / A, MAE_stress_per_atom=0.1445 meV / A^3
2023-12-09 17:13:41.533 INFO: Epoch 123: loss=6.3244e-03, MAE_E_per_atom=28.3166 meV, MAE_F=55.0107 meV / A, MAE_stress_per_atom=0.1503 meV / A^3
2023-12-09 17:17:42.084 INFO: Epoch 124: loss=6.2941e-03, MAE_E_per_atom=28.5511 meV, MAE_F=54.5585 meV / A, MAE_stress_per_atom=0.1540 meV / A^3
2023-12-09 17:21:42.397 INFO: Epoch 125: loss=6.3849e-03, MAE_E_per_atom=28.5009 meV, MAE_F=54.9062 meV / A, MAE_stress_per_atom=0.1637 meV / A^3
2023-12-09 17:25:42.564 INFO: Epoch 126: loss=6.2517e-03, MAE_E_per_atom=28.1980 meV, MAE_F=54.4633 meV / A, MAE_stress_per_atom=0.1430 meV / A^3
2023-12-09 17:29:44.107 INFO: Epoch 127: loss=6.2729e-03, MAE_E_per_atom=28.3683 meV, MAE_F=54.6498 meV / A, MAE_stress_per_atom=0.1529 meV / A^3
2023-12-09 17:33:45.141 INFO: Epoch 128: loss=6.3193e-03, MAE_E_per_atom=28.1145 meV, MAE_F=54.4467 meV / A, MAE_stress_per_atom=0.1660 meV / A^3
2023-12-09 17:37:46.059 INFO: Epoch 129: loss=6.2470e-03, MAE_E_per_atom=28.1336 meV, MAE_F=54.5996 meV / A, MAE_stress_per_atom=0.1461 meV / A^3
2023-12-09 17:41:45.891 INFO: Epoch 130: loss=6.2481e-03, MAE_E_per_atom=28.0700 meV, MAE_F=54.2895 meV / A, MAE_stress_per_atom=0.1462 meV / A^3
2023-12-09 17:45:45.734 INFO: Epoch 131: loss=6.2323e-03, MAE_E_per_atom=28.2773 meV, MAE_F=54.2697 meV / A, MAE_stress_per_atom=0.1586 meV / A^3
2023-12-09 17:49:46.128 INFO: Epoch 132: loss=6.2550e-03, MAE_E_per_atom=28.0698 meV, MAE_F=54.2829 meV / A, MAE_stress_per_atom=0.1661 meV / A^3
2023-12-09 17:53:47.658 INFO: Epoch 133: loss=6.2071e-03, MAE_E_per_atom=28.3344 meV, MAE_F=54.1927 meV / A, MAE_stress_per_atom=0.1473 meV / A^3
2023-12-09 17:57:50.575 INFO: Epoch 134: loss=6.1466e-03, MAE_E_per_atom=28.1702 meV, MAE_F=54.0080 meV / A, MAE_stress_per_atom=0.1410 meV / A^3
2023-12-09 18:01:52.106 INFO: Epoch 135: loss=6.2181e-03, MAE_E_per_atom=28.1289 meV, MAE_F=54.2063 meV / A, MAE_stress_per_atom=0.1473 meV / A^3
2023-12-09 18:05:54.165 INFO: Epoch 136: loss=6.2579e-03, MAE_E_per_atom=28.2266 meV, MAE_F=54.0485 meV / A, MAE_stress_per_atom=0.1688 meV / A^3
2023-12-09 18:09:56.770 INFO: Epoch 137: loss=6.1476e-03, MAE_E_per_atom=28.1463 meV, MAE_F=53.7102 meV / A, MAE_stress_per_atom=0.1440 meV / A^3
2023-12-09 18:14:00.123 INFO: Epoch 138: loss=6.1820e-03, MAE_E_per_atom=27.9588 meV, MAE_F=53.8592 meV / A, MAE_stress_per_atom=0.1446 meV / A^3
2023-12-09 18:18:01.569 INFO: Epoch 139: loss=6.2206e-03, MAE_E_per_atom=27.9981 meV, MAE_F=54.1292 meV / A, MAE_stress_per_atom=0.1484 meV / A^3
2023-12-09 18:22:03.831 INFO: Epoch 140: loss=6.2169e-03, MAE_E_per_atom=28.3825 meV, MAE_F=53.9594 meV / A, MAE_stress_per_atom=0.1459 meV / A^3
2023-12-09 18:26:07.919 INFO: Epoch 141: loss=6.1708e-03, MAE_E_per_atom=27.8408 meV, MAE_F=53.9948 meV / A, MAE_stress_per_atom=0.1454 meV / A^3
2023-12-09 18:30:10.820 INFO: Epoch 142: loss=6.1939e-03, MAE_E_per_atom=27.9761 meV, MAE_F=53.7307 meV / A, MAE_stress_per_atom=0.1458 meV / A^3
2023-12-09 18:34:15.143 INFO: Epoch 143: loss=6.2026e-03, MAE_E_per_atom=27.9200 meV, MAE_F=53.5008 meV / A, MAE_stress_per_atom=0.1476 meV / A^3
2023-12-09 18:38:18.553 INFO: Epoch 144: loss=6.1903e-03, MAE_E_per_atom=27.6983 meV, MAE_F=53.6114 meV / A, MAE_stress_per_atom=0.1585 meV / A^3
2023-12-09 18:42:23.392 INFO: Epoch 145: loss=6.1711e-03, MAE_E_per_atom=27.9136 meV, MAE_F=53.4631 meV / A, MAE_stress_per_atom=0.1454 meV / A^3
2023-12-09 19:28:35.308 INFO: Process group initialized: True
2023-12-09 19:28:35.311 INFO: Processes: 80
2023-12-09 19:28:35.311 INFO: MACE version: 0.3.0
2023-12-09 19:28:35.311 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-09 19:28:35.311 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-09 19:28:35.311 INFO: Using statistics json file
2023-12-09 19:28:35.311 INFO: Using atomic numbers from statistics file
2023-12-09 19:28:35.311 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-09 19:28:35.312 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-09 19:28:35.312 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-09 19:29:08.110 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-09 19:29:08.112 INFO: Average number of neighbors: 61.964672446250916
2023-12-09 19:29:08.113 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-09 19:29:08.113 INFO: Building model
2023-12-09 19:29:08.113 INFO: Hidden irreps: 128x0e
2023-12-09 19:29:12.222 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-09 19:29:12.224 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-145.pt
2023-12-09 19:29:12.403 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-09 19:29:12.410 INFO: Number of parameters: 3847696
2023-12-09 19:29:12.410 INFO: Optimizer: Adam (
Parameter Group 0
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0020480000000000003
maximize: False
name: embedding
weight_decay: 0.0
Parameter Group 1
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0020480000000000003
maximize: False
name: interactions_decay
weight_decay: 1e-08
Parameter Group 2
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0020480000000000003
maximize: False
name: interactions_no_decay
weight_decay: 0.0
Parameter Group 3
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0020480000000000003
maximize: False
name: products
weight_decay: 1e-08
Parameter Group 4
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0020480000000000003
maximize: False
name: readouts
weight_decay: 0.0
)
2023-12-09 19:29:12.410 INFO: Using Weights and Biases for logging
2023-12-09 19:29:28.027 INFO: Using gradient clipping with tolerance=100.000
2023-12-09 19:29:28.028 INFO: Started training
2023-12-09 19:29:35.357 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-09 19:42:10.722 INFO: Epoch 145: loss=6.1601e-03, MAE_E_per_atom=28.0074 meV, MAE_F=53.4262 meV / A, MAE_stress_per_atom=0.1457 meV / A^3
2023-12-09 19:46:29.360 INFO: Epoch 146: loss=6.1904e-03, MAE_E_per_atom=27.5232 meV, MAE_F=53.4133 meV / A, MAE_stress_per_atom=0.1584 meV / A^3
2023-12-09 19:50:39.629 INFO: Epoch 147: loss=6.1711e-03, MAE_E_per_atom=27.6978 meV, MAE_F=53.3883 meV / A, MAE_stress_per_atom=0.1513 meV / A^3
2023-12-09 19:54:51.300 INFO: Epoch 148: loss=6.1416e-03, MAE_E_per_atom=27.4259 meV, MAE_F=53.4415 meV / A, MAE_stress_per_atom=0.1423 meV / A^3
2023-12-09 19:59:01.138 INFO: Epoch 149: loss=6.1729e-03, MAE_E_per_atom=27.6111 meV, MAE_F=53.6015 meV / A, MAE_stress_per_atom=0.1414 meV / A^3
2023-12-09 20:03:11.683 INFO: Epoch 150: loss=6.1783e-03, MAE_E_per_atom=27.6477 meV, MAE_F=53.3329 meV / A, MAE_stress_per_atom=0.1449 meV / A^3
2023-12-09 20:07:22.747 INFO: Epoch 151: loss=6.1461e-03, MAE_E_per_atom=27.4620 meV, MAE_F=53.2475 meV / A, MAE_stress_per_atom=0.1545 meV / A^3
2023-12-09 20:11:33.638 INFO: Epoch 152: loss=6.1542e-03, MAE_E_per_atom=27.7065 meV, MAE_F=53.0195 meV / A, MAE_stress_per_atom=0.1472 meV / A^3
2023-12-09 20:15:43.658 INFO: Epoch 153: loss=6.1751e-03, MAE_E_per_atom=27.4856 meV, MAE_F=53.2153 meV / A, MAE_stress_per_atom=0.1478 meV / A^3
2023-12-09 20:19:54.993 INFO: Epoch 154: loss=6.1541e-03, MAE_E_per_atom=27.6581 meV, MAE_F=53.1045 meV / A, MAE_stress_per_atom=0.1481 meV / A^3
2023-12-09 20:24:05.072 INFO: Epoch 155: loss=6.1733e-03, MAE_E_per_atom=27.3364 meV, MAE_F=53.0424 meV / A, MAE_stress_per_atom=0.1556 meV / A^3
2023-12-09 20:28:16.185 INFO: Epoch 156: loss=6.1128e-03, MAE_E_per_atom=27.5652 meV, MAE_F=52.8692 meV / A, MAE_stress_per_atom=0.1470 meV / A^3
2023-12-09 20:32:27.498 INFO: Epoch 157: loss=6.1206e-03, MAE_E_per_atom=27.4606 meV, MAE_F=52.8977 meV / A, MAE_stress_per_atom=0.1486 meV / A^3
2023-12-09 20:36:42.496 INFO: Epoch 158: loss=6.1082e-03, MAE_E_per_atom=27.4367 meV, MAE_F=53.0769 meV / A, MAE_stress_per_atom=0.1390 meV / A^3
2023-12-09 20:40:54.825 INFO: Epoch 159: loss=6.1641e-03, MAE_E_per_atom=27.5547 meV, MAE_F=53.1523 meV / A, MAE_stress_per_atom=0.1486 meV / A^3
2023-12-09 20:45:06.083 INFO: Epoch 160: loss=6.1130e-03, MAE_E_per_atom=27.3092 meV, MAE_F=52.8195 meV / A, MAE_stress_per_atom=0.1494 meV / A^3
2023-12-09 20:49:18.599 INFO: Epoch 161: loss=6.1361e-03, MAE_E_per_atom=27.4470 meV, MAE_F=52.9618 meV / A, MAE_stress_per_atom=0.1520 meV / A^3
2023-12-09 20:53:30.723 INFO: Epoch 162: loss=6.1198e-03, MAE_E_per_atom=27.5033 meV, MAE_F=52.9071 meV / A, MAE_stress_per_atom=0.1486 meV / A^3
2023-12-09 20:57:42.216 INFO: Epoch 163: loss=6.1085e-03, MAE_E_per_atom=27.1362 meV, MAE_F=52.8020 meV / A, MAE_stress_per_atom=0.1514 meV / A^3
2023-12-09 21:01:54.437 INFO: Epoch 164: loss=6.0962e-03, MAE_E_per_atom=27.2322 meV, MAE_F=52.7503 meV / A, MAE_stress_per_atom=0.1469 meV / A^3
2023-12-09 21:06:07.436 INFO: Epoch 165: loss=6.1095e-03, MAE_E_per_atom=27.3183 meV, MAE_F=52.7188 meV / A, MAE_stress_per_atom=0.1477 meV / A^3
2023-12-09 21:10:20.575 INFO: Epoch 166: loss=6.1139e-03, MAE_E_per_atom=27.2057 meV, MAE_F=52.8176 meV / A, MAE_stress_per_atom=0.1480 meV / A^3
2023-12-09 21:14:34.353 INFO: Epoch 167: loss=6.1349e-03, MAE_E_per_atom=27.3647 meV, MAE_F=52.8669 meV / A, MAE_stress_per_atom=0.1540 meV / A^3
2023-12-09 21:18:47.966 INFO: Epoch 168: loss=6.0893e-03, MAE_E_per_atom=26.9946 meV, MAE_F=52.5857 meV / A, MAE_stress_per_atom=0.1595 meV / A^3
2023-12-09 21:23:01.411 INFO: Epoch 169: loss=6.1166e-03, MAE_E_per_atom=27.2157 meV, MAE_F=52.8198 meV / A, MAE_stress_per_atom=0.1513 meV / A^3
2023-12-10 02:16:31.555 INFO: Process group initialized: True
2023-12-10 02:16:31.558 INFO: Processes: 80
2023-12-10 02:16:31.558 INFO: MACE version: 0.3.0
2023-12-10 02:16:31.558 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-10 02:16:31.558 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-10 02:16:31.558 INFO: Using statistics json file
2023-12-10 02:16:31.558 INFO: Using atomic numbers from statistics file
2023-12-10 02:16:31.558 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-10 02:16:31.558 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-10 02:16:31.559 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-10 02:17:02.563 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-10 02:17:02.565 INFO: Average number of neighbors: 61.964672446250916
2023-12-10 02:17:02.565 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-10 02:17:02.565 INFO: Building model
2023-12-10 02:17:02.566 INFO: Hidden irreps: 128x0e
2023-12-10 02:17:05.889 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-10 02:17:05.892 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-169.pt
2023-12-10 02:17:06.072 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-10 02:17:06.078 INFO: Number of parameters: 3847696
2023-12-10 02:17:06.078 INFO: Optimizer: Adam (
Parameter Group 0
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0013107200000000005
maximize: False
name: embedding
weight_decay: 0.0
Parameter Group 1
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0013107200000000005
maximize: False
name: interactions_decay
weight_decay: 1e-08
Parameter Group 2
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0013107200000000005
maximize: False
name: interactions_no_decay
weight_decay: 0.0
Parameter Group 3
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0013107200000000005
maximize: False
name: products
weight_decay: 1e-08
Parameter Group 4
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0013107200000000005
maximize: False
name: readouts
weight_decay: 0.0
)
2023-12-10 02:17:06.079 INFO: Using Weights and Biases for logging
2023-12-10 02:17:20.367 INFO: Using gradient clipping with tolerance=100.000
2023-12-10 02:17:20.367 INFO: Started training
2023-12-10 02:17:27.784 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-10 02:29:50.029 INFO: Epoch 169: loss=6.1295e-03, MAE_E_per_atom=27.2614 meV, MAE_F=52.9881 meV / A, MAE_stress_per_atom=0.1485 meV / A^3
2023-12-10 02:34:02.096 INFO: Epoch 170: loss=6.1159e-03, MAE_E_per_atom=27.2460 meV, MAE_F=52.7066 meV / A, MAE_stress_per_atom=0.1538 meV / A^3
2023-12-10 02:38:11.517 INFO: Epoch 171: loss=6.0913e-03, MAE_E_per_atom=27.1345 meV, MAE_F=52.4369 meV / A, MAE_stress_per_atom=0.1450 meV / A^3
2023-12-10 02:42:23.113 INFO: Epoch 172: loss=6.0994e-03, MAE_E_per_atom=27.1567 meV, MAE_F=52.6168 meV / A, MAE_stress_per_atom=0.1479 meV / A^3
2023-12-10 02:46:34.153 INFO: Epoch 173: loss=6.1156e-03, MAE_E_per_atom=27.1396 meV, MAE_F=52.6060 meV / A, MAE_stress_per_atom=0.1546 meV / A^3
2023-12-10 02:50:46.156 INFO: Epoch 174: loss=6.0883e-03, MAE_E_per_atom=27.1990 meV, MAE_F=52.5728 meV / A, MAE_stress_per_atom=0.1478 meV / A^3
2023-12-10 02:54:57.422 INFO: Epoch 175: loss=6.0970e-03, MAE_E_per_atom=27.3060 meV, MAE_F=52.5022 meV / A, MAE_stress_per_atom=0.1478 meV / A^3
2023-12-10 02:59:08.137 INFO: Epoch 176: loss=6.1478e-03, MAE_E_per_atom=27.2093 meV, MAE_F=52.6157 meV / A, MAE_stress_per_atom=0.1561 meV / A^3
2023-12-10 03:03:19.722 INFO: Epoch 177: loss=6.0720e-03, MAE_E_per_atom=27.0265 meV, MAE_F=52.5328 meV / A, MAE_stress_per_atom=0.1480 meV / A^3
2023-12-10 03:07:30.646 INFO: Epoch 178: loss=6.0520e-03, MAE_E_per_atom=26.8176 meV, MAE_F=52.4528 meV / A, MAE_stress_per_atom=0.1533 meV / A^3
2023-12-10 03:11:41.159 INFO: Epoch 179: loss=6.0964e-03, MAE_E_per_atom=27.0291 meV, MAE_F=52.6118 meV / A, MAE_stress_per_atom=0.1530 meV / A^3
2023-12-10 03:15:52.205 INFO: Epoch 180: loss=6.0735e-03, MAE_E_per_atom=26.8912 meV, MAE_F=52.5325 meV / A, MAE_stress_per_atom=0.1456 meV / A^3
2023-12-10 03:20:02.518 INFO: Epoch 181: loss=6.1244e-03, MAE_E_per_atom=27.1006 meV, MAE_F=52.6784 meV / A, MAE_stress_per_atom=0.1493 meV / A^3
2023-12-10 03:24:17.094 INFO: Epoch 182: loss=6.0583e-03, MAE_E_per_atom=26.9603 meV, MAE_F=52.2691 meV / A, MAE_stress_per_atom=0.1469 meV / A^3
2023-12-10 03:28:29.077 INFO: Epoch 183: loss=6.0530e-03, MAE_E_per_atom=26.9039 meV, MAE_F=52.1888 meV / A, MAE_stress_per_atom=0.1474 meV / A^3
2023-12-10 03:32:41.052 INFO: Epoch 184: loss=6.1376e-03, MAE_E_per_atom=26.8455 meV, MAE_F=52.6466 meV / A, MAE_stress_per_atom=0.1544 meV / A^3
2023-12-10 03:36:52.985 INFO: Epoch 185: loss=6.0598e-03, MAE_E_per_atom=27.0467 meV, MAE_F=52.3365 meV / A, MAE_stress_per_atom=0.1464 meV / A^3
2023-12-10 03:41:04.316 INFO: Epoch 186: loss=6.0610e-03, MAE_E_per_atom=26.9983 meV, MAE_F=52.2823 meV / A, MAE_stress_per_atom=0.1494 meV / A^3
2023-12-10 03:45:17.578 INFO: Epoch 187: loss=6.0779e-03, MAE_E_per_atom=26.9226 meV, MAE_F=52.1582 meV / A, MAE_stress_per_atom=0.1537 meV / A^3
2023-12-10 03:49:30.851 INFO: Epoch 188: loss=6.0992e-03, MAE_E_per_atom=26.9394 meV, MAE_F=52.2702 meV / A, MAE_stress_per_atom=0.1509 meV / A^3
2023-12-10 03:53:43.788 INFO: Epoch 189: loss=6.0795e-03, MAE_E_per_atom=26.9932 meV, MAE_F=52.1241 meV / A, MAE_stress_per_atom=0.1466 meV / A^3
2023-12-10 03:57:56.872 INFO: Epoch 190: loss=6.0594e-03, MAE_E_per_atom=26.7273 meV, MAE_F=52.0371 meV / A, MAE_stress_per_atom=0.1499 meV / A^3
2023-12-10 04:02:09.797 INFO: Epoch 191: loss=6.0889e-03, MAE_E_per_atom=26.9634 meV, MAE_F=52.2234 meV / A, MAE_stress_per_atom=0.1470 meV / A^3
2023-12-10 04:06:23.555 INFO: Epoch 192: loss=6.0737e-03, MAE_E_per_atom=26.8468 meV, MAE_F=52.1619 meV / A, MAE_stress_per_atom=0.1517 meV / A^3
2023-12-10 04:10:37.978 INFO: Epoch 193: loss=6.0751e-03, MAE_E_per_atom=26.7409 meV, MAE_F=52.1070 meV / A, MAE_stress_per_atom=0.1484 meV / A^3
2023-12-10 05:54:51.987 INFO: Process group initialized: True
2023-12-10 05:54:51.989 INFO: Processes: 80
2023-12-10 05:54:51.989 INFO: MACE version: 0.3.0
2023-12-10 05:54:51.989 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=10.0, swa_forces_weight=100.0, energy_weight=1.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=200, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-10 05:54:51.989 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-10 05:54:51.990 INFO: Using statistics json file
2023-12-10 05:54:51.990 INFO: Using atomic numbers from statistics file
2023-12-10 05:54:51.990 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-10 05:54:51.990 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-10 05:54:51.991 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-10 05:55:23.760 INFO: UniversalLoss(energy_weight=1.000, forces_weight=10.000, stress_weight=100.000)
2023-12-10 05:55:23.762 INFO: Average number of neighbors: 61.964672446250916
2023-12-10 05:55:23.762 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-10 05:55:23.762 INFO: Building model
2023-12-10 05:55:23.763 INFO: Hidden irreps: 128x0e
2023-12-10 05:55:27.764 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-10 05:55:27.768 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-193.pt
2023-12-10 05:55:27.947 INFO: ScaleShiftMACE(
  (node_embedding): LinearNodeEmbeddingBlock(
    (linear): Linear(89x0e -> 128x0e | 11392 weights)
  )
  (radial_embedding): RadialEmbeddingBlock(
    (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
    (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
  )
  (spherical_harmonics): SphericalHarmonics()
  (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
  (interactions): ModuleList(
    (0-1): 2 x RealAgnosticResidualInteractionBlock(
      (linear_up): Linear(128x0e -> 128x0e | 16384 weights)
      (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
      (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
      (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
      (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
      (reshape): reshape_irreps()
    )
  )
  (products): ModuleList(
    (0-1): 2 x EquivariantProductBasisBlock(
      (symmetric_contractions): SymmetricContraction(
        (contractions): ModuleList(
          (0): Contraction(
            (contractions_weighting): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (contractions_features): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (weights): ParameterList(
              (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
              (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
            )
            (graph_opt_main): GraphModule()
          )
        )
      )
      (linear): Linear(128x0e -> 128x0e | 16384 weights)
    )
  )
  (readouts): ModuleList(
    (0): LinearReadoutBlock(
      (linear): Linear(128x0e -> 1x0e | 128 weights)
    )
    (1): NonLinearReadoutBlock(
      (linear_1): Linear(128x0e -> 16x0e | 2048 weights)
      (non_linearity): Activation [x] (16x0e -> 16x0e)
      (linear_2): Linear(16x0e -> 1x0e | 16 weights)
    )
  )
  (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-10 05:55:27.953 INFO: Number of parameters: 3847696
2023-12-10 05:55:27.953 INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: embedding
    weight_decay: 0.0
Parameter Group 1
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: interactions_decay
    weight_decay: 1e-08
Parameter Group 2
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: interactions_no_decay
    weight_decay: 0.0
Parameter Group 3
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: products
    weight_decay: 1e-08
Parameter Group 4
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: readouts
    weight_decay: 0.0
)
2023-12-10 05:55:27.953 INFO: Using Weights and Biases for logging
2023-12-10 05:55:41.103 INFO: Using gradient clipping with tolerance=100.000
2023-12-10 05:55:41.103 INFO: Started training
2023-12-10 05:55:48.487 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-10 06:08:46.943 INFO: Epoch 193: loss=6.0901e-03, MAE_E_per_atom=26.9053 meV, MAE_F=52.1947 meV / A, MAE_stress_per_atom=0.1486 meV / A^3
2023-12-10 06:13:04.527 INFO: Epoch 194: loss=6.0664e-03, MAE_E_per_atom=26.8856 meV, MAE_F=51.9760 meV / A, MAE_stress_per_atom=0.1477 meV / A^3
2023-12-10 06:17:15.112 INFO: Epoch 195: loss=6.0731e-03, MAE_E_per_atom=26.7701 meV, MAE_F=51.9995 meV / A, MAE_stress_per_atom=0.1487 meV / A^3
2023-12-10 06:21:25.601 INFO: Epoch 196: loss=6.0480e-03, MAE_E_per_atom=26.7818 meV, MAE_F=52.0789 meV / A, MAE_stress_per_atom=0.1472 meV / A^3
2023-12-10 06:25:35.447 INFO: Epoch 197: loss=6.0685e-03, MAE_E_per_atom=26.7129 meV, MAE_F=52.1319 meV / A, MAE_stress_per_atom=0.1482 meV / A^3
2023-12-10 06:29:45.746 INFO: Epoch 198: loss=6.0640e-03, MAE_E_per_atom=26.9808 meV, MAE_F=51.9965 meV / A, MAE_stress_per_atom=0.1480 meV / A^3
2023-12-10 06:33:57.061 INFO: Epoch 199: loss=6.0469e-03, MAE_E_per_atom=26.6417 meV, MAE_F=51.9664 meV / A, MAE_stress_per_atom=0.1485 meV / A^3
2023-12-10 06:33:57.258 INFO: Training complete
2023-12-10 06:33:57.259 INFO: Computing metrics for training, validation, and test sets
2023-12-10 06:33:57.265 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-199.pt
2023-12-10 06:33:57.527 INFO: Loaded model from epoch 199
2023-12-10 06:33:57.527 INFO: Evaluating train ...
2023-12-10 06:35:23.495 INFO: Evaluating valid ...
2023-12-10 06:35:24.464 INFO:
+-------------+--------------------+-----------------+------------------+
| config_type | MAE E / meV / atom | MAE F / meV / A | relative F MAE % |
+-------------+--------------------+-----------------+------------------+
| train | 28.8 | 51.2 | 32.39 |
| valid | 26.6 | 52.0 | 36.97 |
+-------------+--------------------+-----------------+------------------+
2023-12-10 06:35:24.464 INFO: Saving model to checkpoints/05-128-L0_run-1.model
2023-12-10 06:35:24.623 INFO: Done
2023-12-10 12:59:52.400 INFO: Process group initialized: True
2023-12-10 12:59:52.402 INFO: Processes: 80
2023-12-10 12:59:52.402 INFO: MACE version: 0.3.0
2023-12-10 12:59:52.402 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=1.0, swa_forces_weight=100.0, energy_weight=10.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=250, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-10 12:59:52.403 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-10 12:59:52.404 INFO: Using statistics json file
2023-12-10 12:59:52.404 INFO: Using atomic numbers from statistics file
2023-12-10 12:59:52.404 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-10 12:59:52.404 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-10 12:59:52.404 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-10 13:00:24.656 INFO: UniversalLoss(energy_weight=10.000, forces_weight=1.000, stress_weight=100.000)
2023-12-10 13:00:24.659 INFO: Average number of neighbors: 61.964672446250916
2023-12-10 13:00:24.659 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-10 13:00:24.659 INFO: Building model
2023-12-10 13:00:24.660 INFO: Hidden irreps: 128x0e
2023-12-10 13:00:28.714 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-10 13:00:28.718 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-199.pt
2023-12-10 13:00:28.901 INFO: ScaleShiftMACE(
  (node_embedding): LinearNodeEmbeddingBlock(
    (linear): Linear(89x0e -> 128x0e | 11392 weights)
  )
  (radial_embedding): RadialEmbeddingBlock(
    (bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
    (cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
  )
  (spherical_harmonics): SphericalHarmonics()
  (atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
  (interactions): ModuleList(
    (0-1): 2 x RealAgnosticResidualInteractionBlock(
      (linear_up): Linear(128x0e -> 128x0e | 16384 weights)
      (conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
      (conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
      (linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
      (skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
      (reshape): reshape_irreps()
    )
  )
  (products): ModuleList(
    (0-1): 2 x EquivariantProductBasisBlock(
      (symmetric_contractions): SymmetricContraction(
        (contractions): ModuleList(
          (0): Contraction(
            (contractions_weighting): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (contractions_features): ModuleList(
              (0-1): 2 x GraphModule()
            )
            (weights): ParameterList(
              (0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
              (1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
            )
            (graph_opt_main): GraphModule()
          )
        )
      )
      (linear): Linear(128x0e -> 128x0e | 16384 weights)
    )
  )
  (readouts): ModuleList(
    (0): LinearReadoutBlock(
      (linear): Linear(128x0e -> 1x0e | 128 weights)
    )
    (1): NonLinearReadoutBlock(
      (linear_1): Linear(128x0e -> 16x0e | 2048 weights)
      (non_linearity): Activation [x] (16x0e -> 16x0e)
      (linear_2): Linear(16x0e -> 1x0e | 16 weights)
    )
  )
  (scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-10 13:00:28.908 INFO: Number of parameters: 3847696
2023-12-10 13:00:28.908 INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: embedding
    weight_decay: 0.0
Parameter Group 1
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: interactions_decay
    weight_decay: 1e-08
Parameter Group 2
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: interactions_no_decay
    weight_decay: 0.0
Parameter Group 3
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: products
    weight_decay: 1e-08
Parameter Group 4
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0008388608000000005
    maximize: False
    name: readouts
    weight_decay: 0.0
)
2023-12-10 13:00:28.908 INFO: Using Weights and Biases for logging
2023-12-10 13:00:42.646 INFO: Using gradient clipping with tolerance=100.000
2023-12-10 13:00:42.646 INFO: Started training
2023-12-10 13:00:50.300 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-10 13:13:13.539 INFO: Epoch 199: loss=2.9845e-03, MAE_E_per_atom=19.3483 meV, MAE_F=63.1037 meV / A, MAE_stress_per_atom=0.1498 meV / A^3
2023-12-10 13:17:37.117 INFO: Epoch 200: loss=2.8206e-03, MAE_E_per_atom=17.5633 meV, MAE_F=64.8166 meV / A, MAE_stress_per_atom=0.1558 meV / A^3
2023-12-10 13:21:48.416 INFO: Epoch 201: loss=2.7368e-03, MAE_E_per_atom=16.4721 meV, MAE_F=65.8262 meV / A, MAE_stress_per_atom=0.1558 meV / A^3
2023-12-10 13:26:00.912 INFO: Epoch 202: loss=2.7212e-03, MAE_E_per_atom=16.1409 meV, MAE_F=66.5151 meV / A, MAE_stress_per_atom=0.1603 meV / A^3
2023-12-10 13:30:11.398 INFO: Epoch 203: loss=2.6605e-03, MAE_E_per_atom=15.4348 meV, MAE_F=66.6152 meV / A, MAE_stress_per_atom=0.1610 meV / A^3
2023-12-10 13:34:22.205 INFO: Epoch 204: loss=2.6008e-03, MAE_E_per_atom=14.7786 meV, MAE_F=66.8671 meV / A, MAE_stress_per_atom=0.1583 meV / A^3
2023-12-10 13:38:33.185 INFO: Epoch 205: loss=2.6204e-03, MAE_E_per_atom=14.9108 meV, MAE_F=67.5783 meV / A, MAE_stress_per_atom=0.1612 meV / A^3
2023-12-10 13:42:44.622 INFO: Epoch 206: loss=2.5909e-03, MAE_E_per_atom=14.4830 meV, MAE_F=67.2339 meV / A, MAE_stress_per_atom=0.1646 meV / A^3
2023-12-10 13:46:55.074 INFO: Epoch 207: loss=2.5987e-03, MAE_E_per_atom=14.5117 meV, MAE_F=67.6949 meV / A, MAE_stress_per_atom=0.1672 meV / A^3
2023-12-10 13:51:07.342 INFO: Epoch 208: loss=2.5806e-03, MAE_E_per_atom=14.2735 meV, MAE_F=67.8321 meV / A, MAE_stress_per_atom=0.1655 meV / A^3
2023-12-10 13:55:17.478 INFO: Epoch 209: loss=2.5823e-03, MAE_E_per_atom=14.3900 meV, MAE_F=67.9699 meV / A, MAE_stress_per_atom=0.1651 meV / A^3
2023-12-10 13:59:28.234 INFO: Epoch 210: loss=2.5540e-03, MAE_E_per_atom=13.9833 meV, MAE_F=67.9470 meV / A, MAE_stress_per_atom=0.1654 meV / A^3
2023-12-10 14:03:38.616 INFO: Epoch 211: loss=2.5515e-03, MAE_E_per_atom=13.9243 meV, MAE_F=67.9955 meV / A, MAE_stress_per_atom=0.1660 meV / A^3
2023-12-10 14:07:52.084 INFO: Epoch 212: loss=2.5456e-03, MAE_E_per_atom=13.8126 meV, MAE_F=68.6091 meV / A, MAE_stress_per_atom=0.1647 meV / A^3
2023-12-10 14:12:04.416 INFO: Epoch 213: loss=2.5590e-03, MAE_E_per_atom=13.8158 meV, MAE_F=68.3543 meV / A, MAE_stress_per_atom=0.1689 meV / A^3
2023-12-10 14:16:16.060 INFO: Epoch 214: loss=2.5196e-03, MAE_E_per_atom=13.4943 meV, MAE_F=68.4538 meV / A, MAE_stress_per_atom=0.1672 meV / A^3
2023-12-10 14:20:27.875 INFO: Epoch 215: loss=2.5135e-03, MAE_E_per_atom=13.3082 meV, MAE_F=68.5487 meV / A, MAE_stress_per_atom=0.1714 meV / A^3
2023-12-10 14:24:39.607 INFO: Epoch 216: loss=2.5250e-03, MAE_E_per_atom=13.4014 meV, MAE_F=68.7036 meV / A, MAE_stress_per_atom=0.1700 meV / A^3
2023-12-10 14:28:52.058 INFO: Epoch 217: loss=2.5197e-03, MAE_E_per_atom=13.3838 meV, MAE_F=68.4496 meV / A, MAE_stress_per_atom=0.1719 meV / A^3
2023-12-10 14:33:04.018 INFO: Epoch 218: loss=2.5156e-03, MAE_E_per_atom=13.2671 meV, MAE_F=69.0225 meV / A, MAE_stress_per_atom=0.1708 meV / A^3
2023-12-10 14:37:16.167 INFO: Epoch 219: loss=2.4919e-03, MAE_E_per_atom=12.9119 meV, MAE_F=68.9507 meV / A, MAE_stress_per_atom=0.1731 meV / A^3
2023-12-10 14:41:29.196 INFO: Epoch 220: loss=2.4997e-03, MAE_E_per_atom=13.0661 meV, MAE_F=68.9661 meV / A, MAE_stress_per_atom=0.1743 meV / A^3
2023-12-10 14:45:42.673 INFO: Epoch 221: loss=2.4835e-03, MAE_E_per_atom=12.9883 meV, MAE_F=68.8325 meV / A, MAE_stress_per_atom=0.1733 meV / A^3
2023-12-10 14:49:55.667 INFO: Epoch 222: loss=2.4799e-03, MAE_E_per_atom=12.8845 meV, MAE_F=68.8826 meV / A, MAE_stress_per_atom=0.1711 meV / A^3
2023-12-10 14:54:10.505 INFO: Epoch 223: loss=2.5148e-03, MAE_E_per_atom=13.2011 meV, MAE_F=69.0581 meV / A, MAE_stress_per_atom=0.1751 meV / A^3
2023-12-10 15:06:33.861 INFO: Process group initialized: True
2023-12-10 15:06:33.863 INFO: Processes: 80
2023-12-10 15:06:33.863 INFO: MACE version: 0.3.0
2023-12-10 15:06:33.863 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=1.0, swa_forces_weight=100.0, energy_weight=10.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=250, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-10 15:06:33.863 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-10 15:06:33.864 INFO: Using statistics json file
2023-12-10 15:06:33.864 INFO: Using atomic numbers from statistics file
2023-12-10 15:06:33.864 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-10 15:06:33.864 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-10 15:06:33.865 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-10 15:07:06.279 INFO: UniversalLoss(energy_weight=10.000, forces_weight=1.000, stress_weight=100.000)
2023-12-10 15:07:06.281 INFO: Average number of neighbors: 61.964672446250916
2023-12-10 15:07:06.281 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-10 15:07:06.281 INFO: Building model
2023-12-10 15:07:06.281 INFO: Hidden irreps: 128x0e
2023-12-10 15:07:09.587 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-10 15:07:09.591 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-223.pt
2023-12-10 15:07:09.766 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-10 15:07:09.772 INFO: Number of parameters: 3847696
2023-12-10 15:07:09.772 INFO: Optimizer: Adam (
Parameter Group 0
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0008388608000000005
maximize: False
name: embedding
weight_decay: 0.0
Parameter Group 1
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0008388608000000005
maximize: False
name: interactions_decay
weight_decay: 1e-08
Parameter Group 2
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0008388608000000005
maximize: False
name: interactions_no_decay
weight_decay: 0.0
Parameter Group 3
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0008388608000000005
maximize: False
name: products
weight_decay: 1e-08
Parameter Group 4
amsgrad: True
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0008388608000000005
maximize: False
name: readouts
weight_decay: 0.0
)
2023-12-10 15:07:09.772 INFO: Using Weights and Biases for logging
2023-12-10 15:07:23.870 INFO: Using gradient clipping with tolerance=100.000
2023-12-10 15:07:23.870 INFO: Started training
2023-12-10 15:07:31.401 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-10 15:19:54.290 INFO: Epoch 223: loss=2.5266e-03, MAE_E_per_atom=13.3108 meV, MAE_F=69.0304 meV / A, MAE_stress_per_atom=0.1774 meV / A^3
2023-12-10 15:24:05.620 INFO: Epoch 224: loss=2.5063e-03, MAE_E_per_atom=13.0860 meV, MAE_F=69.2501 meV / A, MAE_stress_per_atom=0.1734 meV / A^3
2023-12-10 15:28:16.286 INFO: Epoch 225: loss=2.4891e-03, MAE_E_per_atom=12.9168 meV, MAE_F=69.0463 meV / A, MAE_stress_per_atom=0.1728 meV / A^3
2023-12-10 15:32:27.199 INFO: Epoch 226: loss=2.5005e-03, MAE_E_per_atom=12.9353 meV, MAE_F=69.3562 meV / A, MAE_stress_per_atom=0.1763 meV / A^3
2023-12-10 15:36:37.316 INFO: Epoch 227: loss=2.4865e-03, MAE_E_per_atom=12.9595 meV, MAE_F=69.4521 meV / A, MAE_stress_per_atom=0.1703 meV / A^3
2023-12-10 15:40:47.868 INFO: Epoch 228: loss=2.4903e-03, MAE_E_per_atom=12.8841 meV, MAE_F=69.4782 meV / A, MAE_stress_per_atom=0.1726 meV / A^3
2023-12-10 15:44:58.187 INFO: Epoch 229: loss=2.4650e-03, MAE_E_per_atom=12.7092 meV, MAE_F=69.4320 meV / A, MAE_stress_per_atom=0.1712 meV / A^3
2023-12-10 15:49:09.043 INFO: Epoch 230: loss=2.4793e-03, MAE_E_per_atom=12.8249 meV, MAE_F=69.5088 meV / A, MAE_stress_per_atom=0.1738 meV / A^3
2023-12-10 15:53:19.506 INFO: Epoch 231: loss=2.4816e-03, MAE_E_per_atom=12.7618 meV, MAE_F=69.4094 meV / A, MAE_stress_per_atom=0.1748 meV / A^3
2023-12-10 15:57:29.778 INFO: Epoch 232: loss=2.4761e-03, MAE_E_per_atom=12.7866 meV, MAE_F=69.5216 meV / A, MAE_stress_per_atom=0.1711 meV / A^3
2023-12-10 16:01:40.530 INFO: Epoch 233: loss=2.4413e-03, MAE_E_per_atom=12.5975 meV, MAE_F=69.5293 meV / A, MAE_stress_per_atom=0.1672 meV / A^3
2023-12-10 16:05:51.660 INFO: Epoch 234: loss=2.4759e-03, MAE_E_per_atom=12.7193 meV, MAE_F=69.6281 meV / A, MAE_stress_per_atom=0.1770 meV / A^3
2023-12-10 16:10:03.111 INFO: Epoch 235: loss=2.4628e-03, MAE_E_per_atom=12.6818 meV, MAE_F=69.4103 meV / A, MAE_stress_per_atom=0.1721 meV / A^3
2023-12-10 16:14:16.739 INFO: Epoch 236: loss=2.4583e-03, MAE_E_per_atom=12.6698 meV, MAE_F=69.6974 meV / A, MAE_stress_per_atom=0.1712 meV / A^3
2023-12-10 16:18:28.342 INFO: Epoch 237: loss=2.4640e-03, MAE_E_per_atom=12.6089 meV, MAE_F=69.7229 meV / A, MAE_stress_per_atom=0.1763 meV / A^3
2023-12-10 16:22:40.408 INFO: Epoch 238: loss=2.4656e-03, MAE_E_per_atom=12.5763 meV, MAE_F=69.5826 meV / A, MAE_stress_per_atom=0.1745 meV / A^3
2023-12-10 16:26:52.301 INFO: Epoch 239: loss=2.4381e-03, MAE_E_per_atom=12.4454 meV, MAE_F=69.6831 meV / A, MAE_stress_per_atom=0.1729 meV / A^3
2023-12-10 16:31:03.924 INFO: Epoch 240: loss=2.4571e-03, MAE_E_per_atom=12.4946 meV, MAE_F=69.5905 meV / A, MAE_stress_per_atom=0.1815 meV / A^3
2023-12-10 16:35:15.764 INFO: Epoch 241: loss=2.4624e-03, MAE_E_per_atom=12.7204 meV, MAE_F=69.7296 meV / A, MAE_stress_per_atom=0.1742 meV / A^3
2023-12-10 16:39:28.504 INFO: Epoch 242: loss=2.4628e-03, MAE_E_per_atom=12.7052 meV, MAE_F=69.6315 meV / A, MAE_stress_per_atom=0.1744 meV / A^3
2023-12-10 16:43:41.247 INFO: Epoch 243: loss=2.4632e-03, MAE_E_per_atom=12.5827 meV, MAE_F=69.8993 meV / A, MAE_stress_per_atom=0.1763 meV / A^3
2023-12-10 16:47:54.482 INFO: Epoch 244: loss=2.4483e-03, MAE_E_per_atom=12.5400 meV, MAE_F=69.7002 meV / A, MAE_stress_per_atom=0.1746 meV / A^3
2023-12-10 16:52:08.178 INFO: Epoch 245: loss=2.4522e-03, MAE_E_per_atom=12.5545 meV, MAE_F=69.7714 meV / A, MAE_stress_per_atom=0.1729 meV / A^3
2023-12-10 16:56:20.826 INFO: Epoch 246: loss=2.4398e-03, MAE_E_per_atom=12.5508 meV, MAE_F=69.5313 meV / A, MAE_stress_per_atom=0.1727 meV / A^3
2023-12-10 17:00:36.199 INFO: Epoch 247: loss=2.4319e-03, MAE_E_per_atom=12.3850 meV, MAE_F=69.7225 meV / A, MAE_stress_per_atom=0.1760 meV / A^3
2023-12-10 21:12:22.234 INFO: Process group initialized: True
2023-12-10 21:12:22.236 INFO: Processes: 80
2023-12-10 21:12:22.236 INFO: MACE version: 0.3.0
2023-12-10 21:12:22.236 INFO: Configuration: Namespace(name='05-128-L0', seed=1, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cuda', default_dtype='float64', distributed=True, log_level='INFO', error_table='PerAtomMAE', model='ScaleShiftMACE', r_max=6.0, radial_type='bessel', num_radial_basis=10, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='128x0e + 128x1o', num_channels=128, max_L=0, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=True, compute_forces=True, train_file='../../dataset/mptrj-gga-ggapu-train', valid_file='../../dataset/mptrj-gga-ggapu-val', valid_fraction=0.1, test_file=None, test_dir=None, multi_processed_test=False, num_workers=16, pin_memory=True, atomic_numbers=None, mean=None, std=None, statistics_file='../../dataset/mptrj-gga-ggapu-statistics.json', E0s=None, energy_key='energy', forces_key='forces', virials_key='virials', stress_key='stress', dipole_key='dipole', charges_key='charges', loss='universal', forces_weight=1.0, swa_forces_weight=100.0, energy_weight=10.0, swa_energy_weight=1000.0, virials_weight=1.0, swa_virials_weight=10.0, stress_weight=100.0, swa_stress_weight=10.0, dipole_weight=1.0, swa_dipole_weight=1.0, config_type_weights='{"Default":1.0}', huber_delta=0.01, optimizer='adam', batch_size=16, valid_batch_size=32, lr=0.005, swa_lr=0.001, weight_decay=1e-08, amsgrad=True, scheduler='ReduceLROnPlateau', lr_factor=0.8, scheduler_patience=5, lr_scheduler_gamma=0.9993, swa=False, start_swa=None, ema=True, ema_decay=0.995, max_num_epochs=250, patience=50, eval_interval=1, keep_checkpoints=True, restart_latest=True, save_cpu=True, clip_grad=100.0, wandb=True, wandb_project='mace-universal', 
wandb_entity='astagroup', wandb_name='05-128-L0', wandb_log_hypers=['num_channels', 'max_L', 'correlation', 'lr', 'swa_lr', 'weight_decay', 'batch_size', 'max_num_epochs', 'start_swa', 'energy_weight', 'forces_weight'])
2023-12-10 21:12:22.236 INFO: CUDA version: 11.8, CUDA device: 0
2023-12-10 21:12:22.236 INFO: Using statistics json file
2023-12-10 21:12:22.236 INFO: Using atomic numbers from statistics file
2023-12-10 21:12:22.237 INFO: AtomicNumberTable: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 89, 90, 91, 92, 93, 94)
2023-12-10 21:12:22.237 INFO: Atomic Energies not in training file, using command line argument E0s
2023-12-10 21:12:22.237 INFO: Atomic energies: [-3.667168021358939, -1.3320953124042916, -3.482100566595956, -4.736697230897597, -7.724935420523256, -8.405573550273285, -7.360100452662763, -7.28459863421322, -4.896490881731322, 1.3917755836700962e-12, -2.7593613569762425, -2.814047612069227, -4.846881245288104, -7.694793133351899, -6.9632957911820235, -4.672630400190884, -2.8116892814008096, -0.06259504416367478, -2.6176454856894793, -5.390461060484104, -7.8857952163517675, -10.268392986214433, -8.665147785496703, -9.233050763772013, -8.304951520770791, -7.0489865771593765, -5.577439766222147, -5.172747618813715, -3.2520726958619472, -1.2901611618726314, -3.527082192997912, -4.70845955030298, -3.9765109025623238, -3.886231055836541, -2.5184940099633986, 6.766947645687137, -2.5634958965928316, -4.938005211501922, -10.149818838085771, -11.846857579882572, -12.138896361658485, -8.791678800595722, -8.78694939675911, -7.78093221529871, -6.850021409115055, -4.891019073240479, -2.0634296773864045, -0.6395695518943755, -2.7887442084286693, -3.818604275441892, -3.587068329278862, -2.8804045971118897, -1.6355986842433357, 9.846723842807721, -2.765284507132287, -4.990956432167774, -8.933684809576345, -8.735591176647514, -8.018966025544966, -8.251491970213372, -7.591719594359237, -8.169659881166858, -13.592664636171698, -18.517523458456985, -7.647396572993602, -8.122981037851925, -7.607787319678067, -6.85029094445494, -7.8268821327130365, -3.584786591677161, -7.455406192077973, -12.796283502572146, -14.108127281277586, -9.354916969477486, -11.387537567890853, -9.621909492152557, -7.324393429417677, -5.3046964808341945, -2.380092582080244, 0.24948924158195362, -2.3239789120665026, -3.730042357127322, -3.438792347649683, -5.062878214511315, -11.02462566385297, -12.265613551943261, -13.855648206100362, -14.933092020258243, -15.282826131998245]
2023-12-10 21:12:54.424 INFO: UniversalLoss(energy_weight=10.000, forces_weight=1.000, stress_weight=100.000)
2023-12-10 21:12:54.426 INFO: Average number of neighbors: 61.964672446250916
2023-12-10 21:12:54.426 INFO: Selected the following outputs: {'energy': True, 'forces': True, 'virials': False, 'stress': True, 'dipoles': False}
2023-12-10 21:12:54.426 INFO: Building model
2023-12-10 21:12:54.427 INFO: Hidden irreps: 128x0e
2023-12-10 21:12:58.441 WARNING: No SWA checkpoint found, while SWA is enabled. Compare the swa_start parameter and the latest checkpoint.
2023-12-10 21:12:58.446 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-247.pt
2023-12-10 21:12:58.632 INFO: ScaleShiftMACE(
(node_embedding): LinearNodeEmbeddingBlock(
(linear): Linear(89x0e -> 128x0e | 11392 weights)
)
(radial_embedding): RadialEmbeddingBlock(
(bessel_fn): BesselBasis(r_max=6.0, num_basis=10, trainable=False)
(cutoff_fn): PolynomialCutoff(p=5.0, r_max=6.0)
)
(spherical_harmonics): SphericalHarmonics()
(atomic_energies_fn): AtomicEnergiesBlock(energies=[-3.6672, -1.3321, -3.4821, -4.7367, -7.7249, -8.4056, -7.3601, -7.2846, -4.8965, 0.0000, -2.7594, -2.8140, -4.8469, -7.6948, -6.9633, -4.6726, -2.8117, -0.0626, -2.6176, -5.3905, -7.8858, -10.2684, -8.6651, -9.2331, -8.3050, -7.0490, -5.5774, -5.1727, -3.2521, -1.2902, -3.5271, -4.7085, -3.9765, -3.8862, -2.5185, 6.7669, -2.5635, -4.9380, -10.1498, -11.8469, -12.1389, -8.7917, -8.7869, -7.7809, -6.8500, -4.8910, -2.0634, -0.6396, -2.7887, -3.8186, -3.5871, -2.8804, -1.6356, 9.8467, -2.7653, -4.9910, -8.9337, -8.7356, -8.0190, -8.2515, -7.5917, -8.1697, -13.5927, -18.5175, -7.6474, -8.1230, -7.6078, -6.8503, -7.8269, -3.5848, -7.4554, -12.7963, -14.1081, -9.3549, -11.3875, -9.6219, -7.3244, -5.3047, -2.3801, 0.2495, -2.3240, -3.7300, -3.4388, -5.0629, -11.0246, -12.2656, -13.8556, -14.9331, -15.2828])
(interactions): ModuleList(
(0-1): 2 x RealAgnosticResidualInteractionBlock(
(linear_up): Linear(128x0e -> 128x0e | 16384 weights)
(conv_tp): TensorProduct(128x0e x 1x0e+1x1o+1x2e+1x3o -> 128x0e+128x1o+128x2e+128x3o | 512 paths | 512 weights)
(conv_tp_weights): FullyConnectedNet[10, 64, 64, 64, 512]
(linear): Linear(128x0e+128x1o+128x2e+128x3o -> 128x0e+128x1o+128x2e+128x3o | 65536 weights)
(skip_tp): FullyConnectedTensorProduct(128x0e x 89x0e -> 128x0e | 1458176 paths | 1458176 weights)
(reshape): reshape_irreps()
)
)
(products): ModuleList(
(0-1): 2 x EquivariantProductBasisBlock(
(symmetric_contractions): SymmetricContraction(
(contractions): ModuleList(
(0): Contraction(
(contractions_weighting): ModuleList(
(0-1): 2 x GraphModule()
)
(contractions_features): ModuleList(
(0-1): 2 x GraphModule()
)
(weights): ParameterList(
(0): Parameter containing: [torch.float64 of size 89x4x128 (GPU 0)]
(1): Parameter containing: [torch.float64 of size 89x1x128 (GPU 0)]
)
(graph_opt_main): GraphModule()
)
)
)
(linear): Linear(128x0e -> 128x0e | 16384 weights)
)
)
(readouts): ModuleList(
(0): LinearReadoutBlock(
(linear): Linear(128x0e -> 1x0e | 128 weights)
)
(1): NonLinearReadoutBlock(
(linear_1): Linear(128x0e -> 16x0e | 2048 weights)
(non_linearity): Activation [x] (16x0e -> 16x0e)
(linear_2): Linear(16x0e -> 1x0e | 16 weights)
)
)
(scale_shift): ScaleShiftBlock(scale=0.804154, shift=0.164097)
)
2023-12-10 21:12:58.638 INFO: Number of parameters: 3847696
2023-12-10 21:12:58.638 INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0005368709120000003
    maximize: False
    name: embedding
    weight_decay: 0.0
Parameter Group 1
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0005368709120000003
    maximize: False
    name: interactions_decay
    weight_decay: 1e-08
Parameter Group 2
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0005368709120000003
    maximize: False
    name: interactions_no_decay
    weight_decay: 0.0
Parameter Group 3
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0005368709120000003
    maximize: False
    name: products
    weight_decay: 1e-08
Parameter Group 4
    amsgrad: True
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.0005368709120000003
    maximize: False
    name: readouts
    weight_decay: 0.0
)
2023-12-10 21:12:58.639 INFO: Using Weights and Biases for logging
2023-12-10 21:13:11.108 INFO: Using gradient clipping with tolerance=100.000
2023-12-10 21:13:11.108 INFO: Started training
2023-12-10 21:13:18.349 INFO: Reducer buckets have been rebuilt in this iteration.
2023-12-10 21:25:44.326 INFO: Epoch 247: loss=2.4249e-03, MAE_E_per_atom=12.3254 meV, MAE_F=69.6555 meV / A, MAE_stress_per_atom=0.1756 meV / A^3
2023-12-10 21:29:57.391 INFO: Epoch 248: loss=2.4369e-03, MAE_E_per_atom=12.4797 meV, MAE_F=69.6579 meV / A, MAE_stress_per_atom=0.1736 meV / A^3
2023-12-10 21:34:08.076 INFO: Epoch 249: loss=2.4476e-03, MAE_E_per_atom=12.4997 meV, MAE_F=69.8884 meV / A, MAE_stress_per_atom=0.1752 meV / A^3
2023-12-10 21:34:08.276 INFO: Training complete
2023-12-10 21:34:08.276 INFO: Computing metrics for training, validation, and test sets
2023-12-10 21:34:08.283 INFO: Loading checkpoint: checkpoints/05-128-L0_run-1_epoch-249.pt
2023-12-10 21:34:08.552 INFO: Loaded model from epoch 249
2023-12-10 21:34:08.552 INFO: Evaluating train ...
2023-12-10 21:35:34.929 INFO: Evaluating valid ...
2023-12-10 21:35:35.979 INFO:
+-------------+--------------------+-----------------+------------------+
| config_type | MAE E / meV / atom | MAE F / meV / A | relative F MAE % |
+-------------+--------------------+-----------------+------------------+
|    train    |        12.5        |       67.4      |      42.67       |
|    valid    |        13.0        |       69.8      |      49.64       |
+-------------+--------------------+-----------------+------------------+
2023-12-10 21:35:35.979 INFO: Saving model to checkpoints/05-128-L0_run-1.model
2023-12-10 21:35:36.140 INFO: Done