bigscience-bot committed
Commit · 90a29d1
1 Parent(s): a4e5b13
new data
logs/main_log.txt +144 -0
logs/main_log.txt CHANGED
@@ -106577,3 +106577,147 @@ time (ms)
106577 |
time (ms)
|
106578 |
iteration 2543/ 292968 | consumed samples: 5208064 | consumed tokens: 677871616 | elapsed time per iteration (ms): 133073.7 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.731538E+00 | loss scale: 131072.0 | grad norm: 53881.397 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
106579 |
time (ms)
|
106580 |
+
iteration 2544/ 292968 | consumed samples: 5210112 | consumed tokens: 678281216 | elapsed time per iteration (ms): 130908.9 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.711176E+00 | loss scale: 131072.0 | grad norm: 46917.614 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
106581 |
+
time (ms)
|
106582 |
+
iteration 2545/ 292968 | consumed samples: 5212160 | consumed tokens: 678690816 | elapsed time per iteration (ms): 130423.1 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.679188E+00 | loss scale: 131072.0 | grad norm: 44071.737 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
106583 |
+
time (ms)
|
106584 |
+
iteration 2546/ 292968 | consumed samples: 5214208 | consumed tokens: 679100416 | elapsed time per iteration (ms): 138117.6 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.696059E+00 | loss scale: 131072.0 | grad norm: 52838.337 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
106585 |
+
time (ms)
|
106586 |
+
iteration 2547/ 292968 | consumed samples: 5216256 | consumed tokens: 679510016 | elapsed time per iteration (ms): 134088.6 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.703378E+00 | loss scale: 131072.0 | grad norm: 60797.403 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
106587 |
+
time (ms)
|
106588 |
+
iteration 2548/ 292968 | consumed samples: 5218304 | consumed tokens: 679919616 | elapsed time per iteration (ms): 134911.2 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.702210E+00 | loss scale: 131072.0 | grad norm: 50331.478 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
106589 |
+
time (ms)
|
106590 |
+
saving checkpoint at iteration 2548 to /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints
|
106591 |
+
[2021-10-28 17:13:39,733] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/mp_rank_00_model_states.pt
|
106592 |
+
[2021-10-28 17:13:39,941] [INFO] [logging.py:68:log_dist] [Rank 1] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/mp_rank_01_model_states.pt
|
106593 |
+
[2021-10-28 17:13:53,455] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_20_optim_states.pt
|
106594 |
+
[2021-10-28 17:13:53,470] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_58_optim_states.pt
|
106595 |
+
[2021-10-28 17:13:53,486] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_13_optim_states.pt
|
106596 |
+
[2021-10-28 17:13:53,523] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_16_optim_states.pt
|
106597 |
+
[2021-10-28 17:13:53,532] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_55_optim_states.pt
|
106598 |
+
[2021-10-28 17:13:53,559] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_15_optim_states.pt
|
106599 |
+
[2021-10-28 17:13:53,578] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_25_optim_states.pt
|
106600 |
+
[2021-10-28 17:13:53,597] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_09_optim_states.pt
|
106601 |
+
[2021-10-28 17:13:53,607] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_24_optim_states.pt
|
106602 |
+
[2021-10-28 17:13:53,662] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_38_optim_states.pt
|
106603 |
+
[2021-10-28 17:13:53,665] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_49_optim_states.pt
|
106604 |
+
[2021-10-28 17:13:53,672] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_44_optim_states.pt
|
106605 |
+
[2021-10-28 17:13:53,688] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_53_optim_states.pt
|
106606 |
+
[2021-10-28 17:13:53,718] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_51_optim_states.pt
|
106607 |
+
[2021-10-28 17:13:53,753] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_22_optim_states.pt
|
106608 |
+
[2021-10-28 17:13:53,787] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_88_optim_states.pt
|
106609 |
+
[2021-10-28 17:13:53,798] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_62_optim_states.pt
|
106610 |
+
[2021-10-28 17:13:53,808] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_10_optim_states.pt
|
106611 |
+
[2021-10-28 17:13:53,817] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_60_optim_states.pt
|
106612 |
+
[2021-10-28 17:13:53,876] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_41_optim_states.pt
|
106613 |
+
[2021-10-28 17:13:53,882] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_46_optim_states.pt
|
106614 |
+
[2021-10-28 17:13:53,882] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_73_optim_states.pt
|
106615 |
+
[2021-10-28 17:13:53,884] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_56_optim_states.pt
|
106616 |
+
[2021-10-28 17:13:53,929] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_40_optim_states.pt
|
106617 |
+
[2021-10-28 17:13:53,954] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_35_optim_states.pt
|
106618 |
+
[2021-10-28 17:13:53,966] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_101_optim_states.pt
|
106619 |
+
[2021-10-28 17:13:54,034] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_91_optim_states.pt
|
106620 |
+
[2021-10-28 17:13:54,043] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_74_optim_states.pt
|
106621 |
+
[2021-10-28 17:13:54,206] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_99_optim_states.pt
|
106622 |
+
[2021-10-28 17:13:54,207] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_34_optim_states.pt
|
106623 |
+
[2021-10-28 17:13:54,220] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_17_optim_states.pt
|
106624 |
+
[2021-10-28 17:13:54,542] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_102_optim_states.pt
|
106625 |
+
[2021-10-28 17:13:54,564] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_43_optim_states.pt
|
106626 |
+
[2021-10-28 17:13:54,565] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_77_optim_states.pt
|
106627 |
+
[2021-10-28 17:13:54,589] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_28_optim_states.pt
|
106628 |
+
[2021-10-28 17:13:54,591] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_04_optim_states.pt
|
106629 |
+
[2021-10-28 17:13:54,600] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_70_optim_states.pt
|
106630 |
+
[2021-10-28 17:13:54,615] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_14_optim_states.pt
|
106631 |
+
[2021-10-28 17:13:54,620] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_107_optim_states.pt
|
106632 |
+
[2021-10-28 17:13:54,624] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_26_optim_states.pt
|
106633 |
+
[2021-10-28 17:13:54,631] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_37_optim_states.pt
|
106634 |
+
[2021-10-28 17:13:54,645] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_50_optim_states.pt
|
106635 |
+
[2021-10-28 17:13:54,650] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_57_optim_states.pt
|
106636 |
+
[2021-10-28 17:13:54,679] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_27_optim_states.pt
|
106637 |
+
[2021-10-28 17:13:54,679] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_23_optim_states.pt
|
106638 |
+
[2021-10-28 17:13:54,681] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_29_optim_states.pt
|
106639 |
+
[2021-10-28 17:13:54,689] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_84_optim_states.pt
|
106640 |
+
[2021-10-28 17:13:54,690] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_114_optim_states.pt
|
106641 |
+
[2021-10-28 17:13:54,698] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_52_optim_states.pt
|
106642 |
+
[2021-10-28 17:13:54,707] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_19_optim_states.pt
|
106643 |
+
[2021-10-28 17:13:54,717] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_61_optim_states.pt
|
106644 |
+
[2021-10-28 17:13:54,725] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_47_optim_states.pt
|
106645 |
+
[2021-10-28 17:13:54,780] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_59_optim_states.pt
|
106646 |
+
[2021-10-28 17:13:54,790] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_97_optim_states.pt
|
106647 |
+
[2021-10-28 17:13:54,796] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_63_optim_states.pt
|
106648 |
+
[2021-10-28 17:13:54,804] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_109_optim_states.pt
|
106649 |
+
[2021-10-28 17:13:54,819] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_11_optim_states.pt
|
106650 |
+
[2021-10-28 17:13:54,848] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_06_optim_states.pt
|
106651 |
+
[2021-10-28 17:13:54,859] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_12_optim_states.pt
|
106652 |
+
[2021-10-28 17:13:54,874] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_48_optim_states.pt
|
106653 |
+
[2021-10-28 17:13:54,895] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_21_optim_states.pt
|
106654 |
+
[2021-10-28 17:13:54,899] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_54_optim_states.pt
|
106655 |
+
[2021-10-28 17:13:54,914] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_18_optim_states.pt
|
106656 |
+
[2021-10-28 17:13:54,916] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_122_optim_states.pt
|
106657 |
+
[2021-10-28 17:13:54,941] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_111_optim_states.pt
|
106658 |
+
[2021-10-28 17:13:54,965] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_79_optim_states.pt
|
106659 |
+
[2021-10-28 17:13:54,974] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_33_optim_states.pt
|
106660 |
+
[2021-10-28 17:13:54,975] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_80_optim_states.pt
|
106661 |
+
[2021-10-28 17:13:54,978] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_42_optim_states.pt
|
106662 |
+
[2021-10-28 17:13:54,983] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_87_optim_states.pt
|
106663 |
+
[2021-10-28 17:13:55,002] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_68_optim_states.pt
|
106664 |
+
[2021-10-28 17:13:55,008] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_08_optim_states.pt
|
106665 |
+
[2021-10-28 17:13:55,015] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_32_optim_states.pt
|
106666 |
+
[2021-10-28 17:13:55,030] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_103_optim_states.pt
|
106667 |
+
[2021-10-28 17:13:55,034] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_100_optim_states.pt
|
106668 |
+
[2021-10-28 17:13:55,047] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_123_optim_states.pt
|
106669 |
+
[2021-10-28 17:13:55,079] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_106_optim_states.pt
|
106670 |
+
[2021-10-28 17:13:55,080] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_117_optim_states.pt
|
106671 |
+
[2021-10-28 17:13:55,092] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_65_optim_states.pt
|
106672 |
+
[2021-10-28 17:13:55,098] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_116_optim_states.pt
|
106673 |
+
[2021-10-28 17:13:55,101] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_113_optim_states.pt
|
106674 |
+
[2021-10-28 17:13:55,127] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_64_optim_states.pt
|
106675 |
+
[2021-10-28 17:13:55,144] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_45_optim_states.pt
|
106676 |
+
[2021-10-28 17:13:55,160] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_96_optim_states.pt
|
106677 |
+
[2021-10-28 17:13:55,248] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_98_optim_states.pt
|
106678 |
+
[2021-10-28 17:13:55,250] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_31_optim_states.pt
|
106679 |
+
[2021-10-28 17:13:55,252] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_30_optim_states.pt
|
106680 |
+
[2021-10-28 17:13:55,264] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_105_optim_states.pt
|
106681 |
+
[2021-10-28 17:13:55,392] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_104_optim_states.pt
|
106682 |
+
[2021-10-28 17:13:55,441] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_81_optim_states.pt
|
106683 |
+
[2021-10-28 17:13:55,464] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_83_optim_states.pt
|
106684 |
+
[2021-10-28 17:13:55,506] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_82_optim_states.pt
|
106685 |
+
[2021-10-28 17:13:55,520] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_112_optim_states.pt
|
106686 |
+
[2021-10-28 17:13:55,559] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_118_optim_states.pt
|
106687 |
+
[2021-10-28 17:13:55,563] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_69_optim_states.pt
|
106688 |
+
[2021-10-28 17:13:55,590] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_115_optim_states.pt
|
106689 |
+
[2021-10-28 17:13:55,596] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_71_optim_states.pt
|
106690 |
+
[2021-10-28 17:13:55,603] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_78_optim_states.pt
|
106691 |
+
[2021-10-28 17:13:55,668] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_121_optim_states.pt
|
106692 |
+
[2021-10-28 17:13:55,685] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_119_optim_states.pt
|
106693 |
+
[2021-10-28 17:13:55,693] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_120_optim_states.pt
|
106694 |
+
[2021-10-28 17:13:55,749] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_66_optim_states.pt
|
106695 |
+
[2021-10-28 17:13:55,789] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_76_optim_states.pt
|
106696 |
+
[2021-10-28 17:13:55,801] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_67_optim_states.pt
|
106697 |
+
[2021-10-28 17:13:56,296] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_00_optim_states.pt
|
106698 |
+
[2021-10-28 17:13:56,445] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_124_optim_states.pt
|
106699 |
+
[2021-10-28 17:13:56,612] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_03_optim_states.pt
|
106700 |
+
[2021-10-28 17:13:56,740] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_127_optim_states.pt
|
106701 |
+
[2021-10-28 17:13:57,647] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_07_optim_states.pt
|
106702 |
+
[2021-10-28 17:13:57,658] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_02_optim_states.pt
|
106703 |
+
[2021-10-28 17:13:57,909] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_126_optim_states.pt
|
106704 |
+
[2021-10-28 17:13:57,967] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_05_optim_states.pt
|
106705 |
+
[2021-10-28 17:13:58,010] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_125_optim_states.pt
|
106706 |
+
[2021-10-28 17:13:58,032] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_01_optim_states.pt
|
106707 |
+
[2021-10-28 17:14:00,209] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_36_optim_states.pt
|
106708 |
+
[2021-10-28 17:14:00,309] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_90_optim_states.pt
|
106709 |
+
[2021-10-28 17:14:00,312] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_110_optim_states.pt
|
106710 |
+
[2021-10-28 17:14:00,743] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_39_optim_states.pt
|
106711 |
+
[2021-10-28 17:14:00,748] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_89_optim_states.pt
|
106712 |
+
[2021-10-28 17:14:01,156] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_108_optim_states.pt
|
106713 |
+
[2021-10-28 17:14:01,205] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_93_optim_states.pt
|
106714 |
+
[2021-10-28 17:14:02,800] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_95_optim_states.pt
|
106715 |
+
[2021-10-28 17:14:03,558] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_94_optim_states.pt
|
106716 |
+
[2021-10-28 17:14:04,345] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_86_optim_states.pt
|
106717 |
+
[2021-10-28 17:14:04,864] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_75_optim_states.pt
|
106718 |
+
[2021-10-28 17:14:05,058] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_85_optim_states.pt
|
106719 |
+
[2021-10-28 17:14:05,492] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_92_optim_states.pt
|
106720 |
+
[2021-10-28 17:14:06,165] [INFO] [engine.py:2540:_save_zero_checkpoint] zero checkpoint saved /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints/global_step2548/zero_pp_rank_0_mp_rank_72_optim_states.pt
|
106721 |
+
successfully saved checkpoint at iteration 2548 to /gpfsscratch/rech/six/commun/checkpoints/tr8b-104B/checkpoints
|
106722 |
+
time (ms) | save-checkpoint: 29252.10
|
106723 |
+
[exiting program after 1190.94853798151 minutes] datetime: 2021-10-28 17:14:06
|
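The iteration lines in this log are pipe-separated `key: value` fields, so they can be checked mechanically. The following is a minimal sketch, not part of the training code: the helper name `parse_iteration_line` and the dictionary layout are assumptions made for illustration. It parses the last two iteration lines shown above and confirms that, for these lines, `consumed samples` advances by the global batch size and `consumed tokens` advances by the global batch size times `curriculum seqlen`.

```python
import re

# The last two iteration lines, copied verbatim from the log above.
LOG = """\
iteration 2547/ 292968 | consumed samples: 5216256 | consumed tokens: 679510016 | elapsed time per iteration (ms): 134088.6 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.703378E+00 | loss scale: 131072.0 | grad norm: 60797.403 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
iteration 2548/ 292968 | consumed samples: 5218304 | consumed tokens: 679919616 | elapsed time per iteration (ms): 134911.2 | learning rate: 1.000E-04 | global batch size: 2048 | lm loss: 3.702210E+00 | loss scale: 131072.0 | grad norm: 50331.478 | num zeros: 0.0 | curriculum seqlen: 200 | number of skipped iterations: 0 | number of nan iterations: 0 |
"""


def parse_iteration_line(line):
    """Hypothetical helper: parse one 'iteration N/ TOTAL | key: value | ...' line.

    Returns a dict of numeric fields, or None if the line is not an iteration line.
    """
    head = re.match(r"\s*iteration\s+(\d+)/\s*(\d+)\s*\|", line)
    if head is None:
        return None
    record = {
        "iteration": int(head.group(1)),
        "total_iterations": int(head.group(2)),
    }
    # Remaining fields look like ' lm loss: 3.703378E+00 '; split on the last colon
    # so keys such as 'elapsed time per iteration (ms)' stay intact.
    for field in line[head.end():].split("|"):
        if ":" not in field:
            continue
        key, value = field.rsplit(":", 1)
        record[key.strip()] = float(value)
    return record


records = [r for r in map(parse_iteration_line, LOG.splitlines()) if r]
prev, curr = records

# Per-iteration increments observed in the log.
sample_step = curr["consumed samples"] - prev["consumed samples"]
token_step = curr["consumed tokens"] - prev["consumed tokens"]

print("samples per iteration:", sample_step)
print("tokens per iteration:", token_step)
print("batch size x seqlen:", curr["global batch size"] * curr["curriculum seqlen"])
```

The same parser could be run over the whole log file to plot `lm loss` or `grad norm` against iteration; lines that are not iteration lines (for example `time (ms)` or the checkpoint messages) return `None` and are skipped.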