[2024-12-21 13:04:47,087][02089] Saving configuration to /content/train_dir/default_experiment/config.json...
[2024-12-21 13:04:47,090][02089] Rollout worker 0 uses device cpu
[2024-12-21 13:04:47,094][02089] Rollout worker 1 uses device cpu
[2024-12-21 13:04:47,095][02089] Rollout worker 2 uses device cpu
[2024-12-21 13:04:47,097][02089] Rollout worker 3 uses device cpu
[2024-12-21 13:04:47,098][02089] Rollout worker 4 uses device cpu
[2024-12-21 13:04:47,099][02089] Rollout worker 5 uses device cpu
[2024-12-21 13:04:47,102][02089] Rollout worker 6 uses device cpu
[2024-12-21 13:04:47,103][02089] Rollout worker 7 uses device cpu
[2024-12-21 13:04:47,304][02089] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-21 13:04:47,308][02089] InferenceWorker_p0-w0: min num requests: 2
[2024-12-21 13:04:47,352][02089] Starting all processes...
[2024-12-21 13:04:47,355][02089] Starting process learner_proc0
[2024-12-21 13:04:47,418][02089] Starting all processes...
[2024-12-21 13:04:47,435][02089] Starting process inference_proc0-0
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc0
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc1
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc2
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc3
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc4
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc5
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc6
[2024-12-21 13:04:47,436][02089] Starting process rollout_proc7
[2024-12-21 13:05:04,274][04429] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-21 13:05:04,284][04429] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-12-21 13:05:04,360][04429] Num visible devices: 1
[2024-12-21 13:05:04,414][04429] Starting seed is not provided
[2024-12-21 13:05:04,415][04429] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-21 13:05:04,416][04429] Initializing actor-critic model on device cuda:0
[2024-12-21 13:05:04,417][04429] RunningMeanStd input shape: (3, 72, 128)
[2024-12-21 13:05:04,420][04429] RunningMeanStd input shape: (1,)
[2024-12-21 13:05:04,528][04429] ConvEncoder: input_channels=3
[2024-12-21 13:05:05,174][04446] Worker 3 uses CPU cores [1]
[2024-12-21 13:05:05,180][04447] Worker 4 uses CPU cores [0]
[2024-12-21 13:05:05,226][04444] Worker 0 uses CPU cores [0]
[2024-12-21 13:05:05,350][04442] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-21 13:05:05,352][04442] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-12-21 13:05:05,362][04449] Worker 6 uses CPU cores [0]
[2024-12-21 13:05:05,383][04450] Worker 7 uses CPU cores [1]
[2024-12-21 13:05:05,409][04429] Conv encoder output size: 512
[2024-12-21 13:05:05,410][04429] Policy head output size: 512
[2024-12-21 13:05:05,418][04448] Worker 5 uses CPU cores [1]
[2024-12-21 13:05:05,417][04442] Num visible devices: 1
[2024-12-21 13:05:05,488][04443] Worker 1 uses CPU cores [1]
[2024-12-21 13:05:05,501][04445] Worker 2 uses CPU cores [0]
[2024-12-21 13:05:05,505][04429] Created Actor Critic model with architecture:
[2024-12-21 13:05:05,506][04429] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2024-12-21 13:05:05,917][04429] Using optimizer <class 'torch.optim.adam.Adam'>
[2024-12-21 13:05:07,293][02089] Heartbeat connected on Batcher_0
[2024-12-21 13:05:07,305][02089] Heartbeat connected on InferenceWorker_p0-w0
[2024-12-21 13:05:07,317][02089] Heartbeat connected on RolloutWorker_w0
[2024-12-21 13:05:07,323][02089] Heartbeat connected on RolloutWorker_w1
[2024-12-21 13:05:07,329][02089] Heartbeat connected on RolloutWorker_w2
[2024-12-21 13:05:07,332][02089] Heartbeat connected on RolloutWorker_w3
[2024-12-21 13:05:07,337][02089] Heartbeat connected on RolloutWorker_w4
[2024-12-21 13:05:07,343][02089] Heartbeat connected on RolloutWorker_w5
[2024-12-21 13:05:07,347][02089] Heartbeat connected on RolloutWorker_w6
[2024-12-21 13:05:07,355][02089] Heartbeat connected on RolloutWorker_w7
[2024-12-21 13:05:09,432][04429] No checkpoints found
[2024-12-21 13:05:09,432][04429] Did not load from checkpoint, starting from scratch!
[2024-12-21 13:05:09,432][04429] Initialized policy 0 weights for model version 0
[2024-12-21 13:05:09,435][04429] LearnerWorker_p0 finished initialization!
[2024-12-21 13:05:09,438][04429] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-21 13:05:09,436][02089] Heartbeat connected on LearnerWorker_p0
[2024-12-21 13:05:09,646][04442] RunningMeanStd input shape: (3, 72, 128)
[2024-12-21 13:05:09,648][04442] RunningMeanStd input shape: (1,)
[2024-12-21 13:05:09,661][04442] ConvEncoder: input_channels=3
[2024-12-21 13:05:09,766][04442] Conv encoder output size: 512
[2024-12-21 13:05:09,766][04442] Policy head output size: 512
[2024-12-21 13:05:09,818][02089] Inference worker 0-0 is ready!
[2024-12-21 13:05:09,822][02089] All inference workers are ready! Signal rollout workers to start!
[2024-12-21 13:05:10,021][04450] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,023][04448] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,018][04446] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,020][04443] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,018][04445] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,032][04447] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,035][04449] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:10,021][04444] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:05:11,576][04449] Decorrelating experience for 0 frames...
[2024-12-21 13:05:11,576][04446] Decorrelating experience for 0 frames...
[2024-12-21 13:05:11,578][04450] Decorrelating experience for 0 frames...
[2024-12-21 13:05:11,579][04444] Decorrelating experience for 0 frames...
[2024-12-21 13:05:11,581][04447] Decorrelating experience for 0 frames...
[2024-12-21 13:05:11,582][04443] Decorrelating experience for 0 frames...
[2024-12-21 13:05:11,651][02089] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-21 13:05:12,676][04443] Decorrelating experience for 32 frames...
[2024-12-21 13:05:12,678][04450] Decorrelating experience for 32 frames...
[2024-12-21 13:05:13,287][04449] Decorrelating experience for 32 frames...
[2024-12-21 13:05:13,291][04447] Decorrelating experience for 32 frames...
[2024-12-21 13:05:13,298][04445] Decorrelating experience for 0 frames...
[2024-12-21 13:05:13,324][04444] Decorrelating experience for 32 frames...
[2024-12-21 13:05:13,433][04450] Decorrelating experience for 64 frames...
[2024-12-21 13:05:15,035][04448] Decorrelating experience for 0 frames...
[2024-12-21 13:05:15,109][04446] Decorrelating experience for 32 frames...
[2024-12-21 13:05:15,169][04445] Decorrelating experience for 32 frames...
[2024-12-21 13:05:15,481][04450] Decorrelating experience for 96 frames...
[2024-12-21 13:05:15,658][04444] Decorrelating experience for 64 frames...
[2024-12-21 13:05:15,674][04449] Decorrelating experience for 64 frames...
[2024-12-21 13:05:16,651][02089] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-21 13:05:16,733][04447] Decorrelating experience for 64 frames...
[2024-12-21 13:05:17,164][04443] Decorrelating experience for 64 frames...
[2024-12-21 13:05:17,395][04448] Decorrelating experience for 32 frames...
[2024-12-21 13:05:17,460][04445] Decorrelating experience for 64 frames...
[2024-12-21 13:05:18,902][04446] Decorrelating experience for 64 frames...
[2024-12-21 13:05:19,467][04447] Decorrelating experience for 96 frames...
[2024-12-21 13:05:20,012][04443] Decorrelating experience for 96 frames...
[2024-12-21 13:05:20,079][04449] Decorrelating experience for 96 frames...
[2024-12-21 13:05:20,371][04445] Decorrelating experience for 96 frames...
[2024-12-21 13:05:20,930][04448] Decorrelating experience for 64 frames...
[2024-12-21 13:05:21,135][04444] Decorrelating experience for 96 frames...
[2024-12-21 13:05:21,651][02089] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 7.2. Samples: 72. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-21 13:05:21,653][02089] Avg episode reward: [(0, '2.205')]
[2024-12-21 13:05:22,326][04446] Decorrelating experience for 96 frames...
[2024-12-21 13:05:24,377][04429] Signal inference workers to stop experience collection...
[2024-12-21 13:05:24,406][04442] InferenceWorker_p0-w0: stopping experience collection
[2024-12-21 13:05:24,440][04448] Decorrelating experience for 96 frames...
[2024-12-21 13:05:26,651][02089] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 169.5. Samples: 2542. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-21 13:05:26,653][02089] Avg episode reward: [(0, '2.565')]
[2024-12-21 13:05:27,175][04429] Signal inference workers to resume experience collection...
[2024-12-21 13:05:27,178][04442] InferenceWorker_p0-w0: resuming experience collection
[2024-12-21 13:05:31,651][02089] Fps is (10 sec: 2048.0, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 20480. Throughput: 0: 311.8. Samples: 6236. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:05:31,655][02089] Avg episode reward: [(0, '3.579')]
[2024-12-21 13:05:36,651][02089] Fps is (10 sec: 3686.0, 60 sec: 1474.5, 300 sec: 1474.5). Total num frames: 36864. Throughput: 0: 331.4. Samples: 8286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:05:36,654][02089] Avg episode reward: [(0, '3.765')]
[2024-12-21 13:05:37,420][04442] Updated weights for policy 0, policy_version 10 (0.0154)
[2024-12-21 13:05:41,651][02089] Fps is (10 sec: 3686.4, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 57344. Throughput: 0: 452.5. Samples: 13574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:05:41,657][02089] Avg episode reward: [(0, '4.365')]
[2024-12-21 13:05:46,651][02089] Fps is (10 sec: 4096.3, 60 sec: 2223.5, 300 sec: 2223.5). Total num frames: 77824. Throughput: 0: 565.3. Samples: 19786. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:05:46,657][02089] Avg episode reward: [(0, '4.539')]
[2024-12-21 13:05:47,047][04442] Updated weights for policy 0, policy_version 20 (0.0022)
[2024-12-21 13:05:51,651][02089] Fps is (10 sec: 3686.4, 60 sec: 2355.2, 300 sec: 2355.2). Total num frames: 94208. Throughput: 0: 569.0. Samples: 22762. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:05:51,653][02089] Avg episode reward: [(0, '4.403')]
[2024-12-21 13:05:56,651][02089] Fps is (10 sec: 3276.8, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 110592. Throughput: 0: 601.8. Samples: 27080. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:05:56,657][02089] Avg episode reward: [(0, '4.304')]
[2024-12-21 13:05:56,659][04429] Saving new best policy, reward=4.304!
[2024-12-21 13:05:58,480][04442] Updated weights for policy 0, policy_version 30 (0.0025)
[2024-12-21 13:06:01,651][02089] Fps is (10 sec: 4096.0, 60 sec: 2703.4, 300 sec: 2703.4). Total num frames: 135168. Throughput: 0: 755.2. Samples: 33984. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:06:01,656][02089] Avg episode reward: [(0, '4.341')]
[2024-12-21 13:06:01,663][04429] Saving new best policy, reward=4.341!
[2024-12-21 13:06:06,651][02089] Fps is (10 sec: 4505.6, 60 sec: 2830.0, 300 sec: 2830.0). Total num frames: 155648. Throughput: 0: 831.3. Samples: 37482. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:06:06,656][02089] Avg episode reward: [(0, '4.315')]
[2024-12-21 13:06:08,778][04442] Updated weights for policy 0, policy_version 40 (0.0019)
[2024-12-21 13:06:11,651][02089] Fps is (10 sec: 3686.4, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 172032. Throughput: 0: 879.8. Samples: 42134. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-21 13:06:11,659][02089] Avg episode reward: [(0, '4.315')]
[2024-12-21 13:06:16,650][02089] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2961.7). Total num frames: 192512. Throughput: 0: 929.3. Samples: 48056. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:06:16,653][02089] Avg episode reward: [(0, '4.364')]
[2024-12-21 13:06:16,658][04429] Saving new best policy, reward=4.364!
[2024-12-21 13:06:18,927][04442] Updated weights for policy 0, policy_version 50 (0.0035)
[2024-12-21 13:06:21,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3042.7). Total num frames: 212992. Throughput: 0: 961.7. Samples: 51562. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:06:21,653][02089] Avg episode reward: [(0, '4.384')]
[2024-12-21 13:06:21,670][04429] Saving new best policy, reward=4.384!
[2024-12-21 13:06:26,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3058.3). Total num frames: 229376. Throughput: 0: 969.4. Samples: 57196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:06:26,653][02089] Avg episode reward: [(0, '4.423')]
[2024-12-21 13:06:26,655][04429] Saving new best policy, reward=4.423!
[2024-12-21 13:06:30,610][04442] Updated weights for policy 0, policy_version 60 (0.0024)
[2024-12-21 13:06:31,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3123.2). Total num frames: 249856. Throughput: 0: 942.5. Samples: 62200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:06:31,652][02089] Avg episode reward: [(0, '4.441')]
[2024-12-21 13:06:31,664][04429] Saving new best policy, reward=4.441!
[2024-12-21 13:06:36,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3891.3, 300 sec: 3180.4). Total num frames: 270336. Throughput: 0: 953.3. Samples: 65662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:06:36,653][02089] Avg episode reward: [(0, '4.496')]
[2024-12-21 13:06:36,659][04429] Saving new best policy, reward=4.496!
[2024-12-21 13:06:40,186][04442] Updated weights for policy 0, policy_version 70 (0.0021)
[2024-12-21 13:06:41,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3231.3). Total num frames: 290816. Throughput: 0: 997.8. Samples: 71980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:06:41,653][02089] Avg episode reward: [(0, '4.420')]
[2024-12-21 13:06:41,664][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000071_290816.pth...
[2024-12-21 13:06:46,651][02089] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3190.6). Total num frames: 303104. Throughput: 0: 934.4. Samples: 76032. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:06:46,657][02089] Avg episode reward: [(0, '4.457')]
[2024-12-21 13:06:51,568][04442] Updated weights for policy 0, policy_version 80 (0.0015)
[2024-12-21 13:06:51,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3276.8). Total num frames: 327680. Throughput: 0: 926.6. Samples: 79178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:06:51,658][02089] Avg episode reward: [(0, '4.511')]
[2024-12-21 13:06:51,673][04429] Saving new best policy, reward=4.511!
[2024-12-21 13:06:56,652][02089] Fps is (10 sec: 4505.0, 60 sec: 3959.4, 300 sec: 3315.8). Total num frames: 348160. Throughput: 0: 975.0. Samples: 86010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:06:56,657][02089] Avg episode reward: [(0, '4.370')]
[2024-12-21 13:07:01,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3276.8). Total num frames: 360448. Throughput: 0: 945.5. Samples: 90602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:07:01,653][02089] Avg episode reward: [(0, '4.417')]
[2024-12-21 13:07:04,735][04442] Updated weights for policy 0, policy_version 90 (0.0018)
[2024-12-21 13:07:06,654][02089] Fps is (10 sec: 2457.1, 60 sec: 3617.9, 300 sec: 3241.1). Total num frames: 372736. Throughput: 0: 904.8. Samples: 92280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:07:06,662][02089] Avg episode reward: [(0, '4.565')]
[2024-12-21 13:07:06,668][04429] Saving new best policy, reward=4.565!
[2024-12-21 13:07:11,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3276.8). Total num frames: 393216. Throughput: 0: 881.6. Samples: 96868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:07:11,653][02089] Avg episode reward: [(0, '4.783')]
[2024-12-21 13:07:11,666][04429] Saving new best policy, reward=4.783!
[2024-12-21 13:07:15,168][04442] Updated weights for policy 0, policy_version 100 (0.0014)
[2024-12-21 13:07:16,651][02089] Fps is (10 sec: 4097.4, 60 sec: 3686.4, 300 sec: 3309.6). Total num frames: 413696. Throughput: 0: 919.1. Samples: 103558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:07:16,655][02089] Avg episode reward: [(0, '4.709')]
[2024-12-21 13:07:21,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 425984. Throughput: 0: 895.2. Samples: 105948. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-21 13:07:21,659][02089] Avg episode reward: [(0, '4.630')]
[2024-12-21 13:07:26,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3307.1). Total num frames: 446464. Throughput: 0: 856.2. Samples: 110508. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:07:26,657][02089] Avg episode reward: [(0, '4.662')]
[2024-12-21 13:07:27,090][04442] Updated weights for policy 0, policy_version 110 (0.0034)
[2024-12-21 13:07:31,651][02089] Fps is (10 sec: 4505.5, 60 sec: 3686.4, 300 sec: 3364.6). Total num frames: 471040. Throughput: 0: 919.6. Samples: 117416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:07:31,654][02089] Avg episode reward: [(0, '4.481')]
[2024-12-21 13:07:36,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3361.5). Total num frames: 487424. Throughput: 0: 927.8. Samples: 120930. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2024-12-21 13:07:36,653][02089] Avg episode reward: [(0, '4.483')]
[2024-12-21 13:07:37,083][04442] Updated weights for policy 0, policy_version 120 (0.0021)
[2024-12-21 13:07:41,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3549.8, 300 sec: 3358.7). Total num frames: 503808. Throughput: 0: 868.3. Samples: 125082. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-21 13:07:41,653][02089] Avg episode reward: [(0, '4.726')]
[2024-12-21 13:07:46,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3382.5). Total num frames: 524288. Throughput: 0: 906.9. Samples: 131412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:07:46,653][02089] Avg episode reward: [(0, '4.780')]
[2024-12-21 13:07:47,852][04442] Updated weights for policy 0, policy_version 130 (0.0017)
[2024-12-21 13:07:51,651][02089] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3430.4). Total num frames: 548864. Throughput: 0: 944.3. Samples: 134770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:07:51,653][02089] Avg episode reward: [(0, '4.650')]
[2024-12-21 13:07:56,651][02089] Fps is (10 sec: 3686.1, 60 sec: 3549.9, 300 sec: 3400.9). Total num frames: 561152. Throughput: 0: 957.4. Samples: 139952. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:07:56,656][02089] Avg episode reward: [(0, '4.554')]
[2024-12-21 13:07:59,729][04442] Updated weights for policy 0, policy_version 140 (0.0034)
[2024-12-21 13:08:01,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3421.4). Total num frames: 581632. Throughput: 0: 926.0. Samples: 145226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:08:01,657][02089] Avg episode reward: [(0, '4.525')]
[2024-12-21 13:08:06,651][02089] Fps is (10 sec: 4096.4, 60 sec: 3823.1, 300 sec: 3440.6). Total num frames: 602112. Throughput: 0: 947.3. Samples: 148576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:08:06,653][02089] Avg episode reward: [(0, '4.576')]
[2024-12-21 13:08:08,521][04442] Updated weights for policy 0, policy_version 150 (0.0015)
[2024-12-21 13:08:11,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3458.8). Total num frames: 622592. Throughput: 0: 989.5. Samples: 155034. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:08:11,655][02089] Avg episode reward: [(0, '4.414')]
[2024-12-21 13:08:16,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3431.8). Total num frames: 634880. Throughput: 0: 931.7. Samples: 159340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:08:16,654][02089] Avg episode reward: [(0, '4.510')]
[2024-12-21 13:08:20,302][04442] Updated weights for policy 0, policy_version 160 (0.0018)
[2024-12-21 13:08:21,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3470.8). Total num frames: 659456. Throughput: 0: 928.7. Samples: 162720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:08:21,655][02089] Avg episode reward: [(0, '4.551')]
[2024-12-21 13:08:26,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3486.9). Total num frames: 679936. Throughput: 0: 992.8. Samples: 169758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:08:26,656][02089] Avg episode reward: [(0, '4.719')]
[2024-12-21 13:08:31,109][04442] Updated weights for policy 0, policy_version 170 (0.0027)
[2024-12-21 13:08:31,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3481.6). Total num frames: 696320. Throughput: 0: 955.1. Samples: 174392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:08:31,656][02089] Avg episode reward: [(0, '4.789')]
[2024-12-21 13:08:31,666][04429] Saving new best policy, reward=4.789!
[2024-12-21 13:08:36,652][02089] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3496.6). Total num frames: 716800. Throughput: 0: 936.5. Samples: 176916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:08:36,657][02089] Avg episode reward: [(0, '4.597')]
[2024-12-21 13:08:40,735][04442] Updated weights for policy 0, policy_version 180 (0.0026)
[2024-12-21 13:08:41,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3530.4). Total num frames: 741376. Throughput: 0: 977.6. Samples: 183944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:08:41,656][02089] Avg episode reward: [(0, '4.829')]
[2024-12-21 13:08:41,668][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000181_741376.pth...
[2024-12-21 13:08:41,800][04429] Saving new best policy, reward=4.829!
[2024-12-21 13:08:46,651][02089] Fps is (10 sec: 4096.7, 60 sec: 3891.2, 300 sec: 3524.5). Total num frames: 757760. Throughput: 0: 982.1. Samples: 189420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:08:46,656][02089] Avg episode reward: [(0, '5.042')]
[2024-12-21 13:08:46,659][04429] Saving new best policy, reward=5.042!
[2024-12-21 13:08:51,651][02089] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3500.2). Total num frames: 770048. Throughput: 0: 952.1. Samples: 191420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:08:51,658][02089] Avg episode reward: [(0, '4.987')]
[2024-12-21 13:08:52,565][04442] Updated weights for policy 0, policy_version 190 (0.0014)
[2024-12-21 13:08:56,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3531.7). Total num frames: 794624. Throughput: 0: 949.8. Samples: 197776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:08:56,658][02089] Avg episode reward: [(0, '5.158')]
[2024-12-21 13:08:56,661][04429] Saving new best policy, reward=5.158!
[2024-12-21 13:09:01,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3543.9). Total num frames: 815104. Throughput: 0: 1001.5. Samples: 204406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:09:01,653][02089] Avg episode reward: [(0, '5.051')]
[2024-12-21 13:09:01,977][04442] Updated weights for policy 0, policy_version 200 (0.0014)
[2024-12-21 13:09:06,651][02089] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3538.2). Total num frames: 831488. Throughput: 0: 973.0. Samples: 206504. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:09:06,656][02089] Avg episode reward: [(0, '5.111')]
[2024-12-21 13:09:11,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3549.9). Total num frames: 851968. Throughput: 0: 938.9. Samples: 212008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:11,654][02089] Avg episode reward: [(0, '5.073')]
[2024-12-21 13:09:12,974][04442] Updated weights for policy 0, policy_version 210 (0.0041)
[2024-12-21 13:09:16,651][02089] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3577.7). Total num frames: 876544. Throughput: 0: 992.7. Samples: 219062. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:16,653][02089] Avg episode reward: [(0, '5.167')]
[2024-12-21 13:09:16,657][04429] Saving new best policy, reward=5.167!
[2024-12-21 13:09:21,652][02089] Fps is (10 sec: 3685.9, 60 sec: 3822.8, 300 sec: 3555.3). Total num frames: 888832. Throughput: 0: 995.7. Samples: 221722. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:21,659][02089] Avg episode reward: [(0, '5.343')]
[2024-12-21 13:09:21,667][04429] Saving new best policy, reward=5.343!
[2024-12-21 13:09:24,804][04442] Updated weights for policy 0, policy_version 220 (0.0032)
[2024-12-21 13:09:26,650][02089] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3565.9). Total num frames: 909312. Throughput: 0: 933.2. Samples: 225938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:26,654][02089] Avg episode reward: [(0, '5.234')]
[2024-12-21 13:09:31,651][02089] Fps is (10 sec: 4096.5, 60 sec: 3891.2, 300 sec: 3576.1). Total num frames: 929792. Throughput: 0: 969.0. Samples: 233024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:31,655][02089] Avg episode reward: [(0, '5.326')]
[2024-12-21 13:09:33,555][04442] Updated weights for policy 0, policy_version 230 (0.0020)
[2024-12-21 13:09:36,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3891.3, 300 sec: 3585.9). Total num frames: 950272. Throughput: 0: 1001.1. Samples: 236468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:36,657][02089] Avg episode reward: [(0, '5.220')]
[2024-12-21 13:09:41,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3580.2). Total num frames: 966656. Throughput: 0: 960.8. Samples: 241012. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:09:41,653][02089] Avg episode reward: [(0, '5.122')]
[2024-12-21 13:09:45,068][04442] Updated weights for policy 0, policy_version 240 (0.0033)
[2024-12-21 13:09:46,650][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3589.6). Total num frames: 987136. Throughput: 0: 950.5. Samples: 247178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:09:46,653][02089] Avg episode reward: [(0, '4.885')]
[2024-12-21 13:09:51,652][02089] Fps is (10 sec: 4504.8, 60 sec: 4027.6, 300 sec: 3613.2). Total num frames: 1011712. Throughput: 0: 977.7. Samples: 250500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:09:51,658][02089] Avg episode reward: [(0, '5.059')]
[2024-12-21 13:09:56,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3578.6). Total num frames: 1019904. Throughput: 0: 964.7. Samples: 255418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:09:56,664][02089] Avg episode reward: [(0, '5.079')]
[2024-12-21 13:09:56,928][04442] Updated weights for policy 0, policy_version 250 (0.0018)
[2024-12-21 13:10:01,651][02089] Fps is (10 sec: 2048.3, 60 sec: 3618.1, 300 sec: 3559.3). Total num frames: 1032192. Throughput: 0: 884.4. Samples: 258860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:10:01,653][02089] Avg episode reward: [(0, '5.231')]
[2024-12-21 13:10:06,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1052672. Throughput: 0: 881.9. Samples: 261406. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:10:06,655][02089] Avg episode reward: [(0, '4.993')]
[2024-12-21 13:10:08,413][04442] Updated weights for policy 0, policy_version 260 (0.0025)
[2024-12-21 13:10:11,651][02089] Fps is (10 sec: 4505.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 1077248. Throughput: 0: 941.7. Samples: 268314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:10:11,653][02089] Avg episode reward: [(0, '4.861')]
[2024-12-21 13:10:16,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 1093632. Throughput: 0: 904.4. Samples: 273724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:10:16,656][02089] Avg episode reward: [(0, '4.945')]
[2024-12-21 13:10:20,275][04442] Updated weights for policy 0, policy_version 270 (0.0018)
[2024-12-21 13:10:21,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3686.5, 300 sec: 3762.8). Total num frames: 1110016. Throughput: 0: 873.0. Samples: 275754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:10:21,653][02089] Avg episode reward: [(0, '5.283')]
[2024-12-21 13:10:26,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1134592. Throughput: 0: 920.6. Samples: 282438. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:10:26,655][02089] Avg episode reward: [(0, '5.577')]
[2024-12-21 13:10:26,658][04429] Saving new best policy, reward=5.577!
[2024-12-21 13:10:29,199][04442] Updated weights for policy 0, policy_version 280 (0.0031)
[2024-12-21 13:10:31,653][02089] Fps is (10 sec: 4504.7, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 1155072. Throughput: 0: 924.4. Samples: 288778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:10:31,659][02089] Avg episode reward: [(0, '5.511')]
[2024-12-21 13:10:36,653][02089] Fps is (10 sec: 3276.0, 60 sec: 3618.0, 300 sec: 3762.7). Total num frames: 1167360. Throughput: 0: 898.3. Samples: 290926. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:10:36,655][02089] Avg episode reward: [(0, '5.379')]
[2024-12-21 13:10:40,781][04442] Updated weights for policy 0, policy_version 290 (0.0019)
[2024-12-21 13:10:41,651][02089] Fps is (10 sec: 3687.1, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1191936. Throughput: 0: 912.2. Samples: 296468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:10:41,655][02089] Avg episode reward: [(0, '5.203')]
[2024-12-21 13:10:41,664][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000291_1191936.pth...
[2024-12-21 13:10:41,793][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000071_290816.pth
[2024-12-21 13:10:46,651][02089] Fps is (10 sec: 4506.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1212416. Throughput: 0: 991.7. Samples: 303486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:10:46,656][02089] Avg episode reward: [(0, '5.729')]
[2024-12-21 13:10:46,660][04429] Saving new best policy, reward=5.729!
[2024-12-21 13:10:51,304][04442] Updated weights for policy 0, policy_version 300 (0.0033)
[2024-12-21 13:10:51,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3790.5). Total num frames: 1228800. Throughput: 0: 990.9. Samples: 305996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:10:51,655][02089] Avg episode reward: [(0, '6.153')]
[2024-12-21 13:10:51,666][04429] Saving new best policy, reward=6.153!
[2024-12-21 13:10:56,651][02089] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 1245184. Throughput: 0: 933.4. Samples: 310318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:10:56,653][02089] Avg episode reward: [(0, '6.416')]
[2024-12-21 13:10:56,661][04429] Saving new best policy, reward=6.416!
[2024-12-21 13:11:01,336][04442] Updated weights for policy 0, policy_version 310 (0.0027)
[2024-12-21 13:11:01,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1269760. Throughput: 0: 967.9. Samples: 317278. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:11:01,658][02089] Avg episode reward: [(0, '6.091')]
[2024-12-21 13:11:06,651][02089] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1290240. Throughput: 0: 1001.7. Samples: 320830. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:11:06,655][02089] Avg episode reward: [(0, '6.308')]
[2024-12-21 13:11:11,660][02089] Fps is (10 sec: 3273.8, 60 sec: 3754.1, 300 sec: 3762.7). Total num frames: 1302528. Throughput: 0: 955.7. Samples: 325454. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-21 13:11:11,663][02089] Avg episode reward: [(0, '6.546')]
[2024-12-21 13:11:11,676][04429] Saving new best policy, reward=6.546!
[2024-12-21 13:11:12,849][04442] Updated weights for policy 0, policy_version 320 (0.0035)
[2024-12-21 13:11:16,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 1327104. Throughput: 0: 952.2. Samples: 331626. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:11:16,653][02089] Avg episode reward: [(0, '6.799')]
[2024-12-21 13:11:16,659][04429] Saving new best policy, reward=6.799!
[2024-12-21 13:11:21,651][02089] Fps is (10 sec: 4509.7, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1347584. Throughput: 0: 979.4. Samples: 334996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:11:21,656][02089] Avg episode reward: [(0, '6.661')]
[2024-12-21 13:11:22,124][04442] Updated weights for policy 0, policy_version 330 (0.0029)
[2024-12-21 13:11:26,652][02089] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3776.6). Total num frames: 1363968. Throughput: 0: 976.6. Samples: 340418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:11:26,658][02089] Avg episode reward: [(0, '6.705')]
[2024-12-21 13:11:31,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3762.8). Total num frames: 1380352. Throughput: 0: 933.4. Samples: 345488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:11:31,655][02089] Avg episode reward: [(0, '6.799')]
[2024-12-21 13:11:33,685][04442] Updated weights for policy 0, policy_version 340 (0.0027)
[2024-12-21 13:11:36,651][02089] Fps is (10 sec: 4096.7, 60 sec: 3959.6, 300 sec: 3776.7). Total num frames: 1404928. Throughput: 0: 953.7. Samples: 348914. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:11:36,655][02089] Avg episode reward: [(0, '6.699')]
[2024-12-21 13:11:41,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1421312. Throughput: 0: 1006.4. Samples: 355606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:11:41,653][02089] Avg episode reward: [(0, '7.332')]
[2024-12-21 13:11:41,664][04429] Saving new best policy, reward=7.332!
[2024-12-21 13:11:44,732][04442] Updated weights for policy 0, policy_version 350 (0.0025)
[2024-12-21 13:11:46,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 1437696. Throughput: 0: 944.8. Samples: 359794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:11:46,653][02089] Avg episode reward: [(0, '7.116')]
[2024-12-21 13:11:51,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 1458176. Throughput: 0: 937.9. Samples: 363036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:11:51,653][02089] Avg episode reward: [(0, '6.966')]
[2024-12-21 13:11:54,340][04442] Updated weights for policy 0, policy_version 360 (0.0021)
[2024-12-21 13:11:56,651][02089] Fps is (10 sec: 4505.2, 60 sec: 3959.4, 300 sec: 3804.4). Total num frames: 1482752. Throughput: 0: 986.1. Samples: 369820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:11:56,656][02089] Avg episode reward: [(0, '6.730')]
[2024-12-21 13:12:01,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1499136. Throughput: 0: 956.7. Samples: 374678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:12:01,655][02089] Avg episode reward: [(0, '6.928')]
[2024-12-21 13:12:05,759][04442] Updated weights for policy 0, policy_version 370 (0.0017)
[2024-12-21 13:12:06,651][02089] Fps is (10 sec: 3686.7, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1519616. Throughput: 0: 935.2. Samples: 377078. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:12:06,653][02089] Avg episode reward: [(0, '7.321')]
[2024-12-21 13:12:11,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3960.1, 300 sec: 3818.3). Total num frames: 1540096. Throughput: 0: 972.2. Samples: 384164. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:12:11,653][02089] Avg episode reward: [(0, '7.786')]
[2024-12-21 13:12:11,659][04429] Saving new best policy, reward=7.786!
[2024-12-21 13:12:15,469][04442] Updated weights for policy 0, policy_version 380 (0.0032)
[2024-12-21 13:12:16,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1556480. Throughput: 0: 991.2. Samples: 390094. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-21 13:12:16,656][02089] Avg episode reward: [(0, '8.341')]
[2024-12-21 13:12:16,661][04429] Saving new best policy, reward=8.341!
[2024-12-21 13:12:21,651][02089] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 1572864. Throughput: 0: 960.6. Samples: 392140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:12:21,653][02089] Avg episode reward: [(0, '8.641')]
[2024-12-21 13:12:21,660][04429] Saving new best policy, reward=8.641!
[2024-12-21 13:12:26,395][04442] Updated weights for policy 0, policy_version 390 (0.0015)
[2024-12-21 13:12:26,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3891.3, 300 sec: 3818.3). Total num frames: 1597440. Throughput: 0: 943.9. Samples: 398082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:12:26,653][02089] Avg episode reward: [(0, '8.420')]
[2024-12-21 13:12:31,651][02089] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 1617920. Throughput: 0: 1004.6. Samples: 405000. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:12:31,661][02089] Avg episode reward: [(0, '8.340')]
[2024-12-21 13:12:36,653][02089] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3832.2). Total num frames: 1634304. Throughput: 0: 980.7. Samples: 407170. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:12:36,658][02089] Avg episode reward: [(0, '8.705')]
[2024-12-21 13:12:36,663][04429] Saving new best policy, reward=8.705!
[2024-12-21 13:12:37,950][04442] Updated weights for policy 0, policy_version 400 (0.0023)
[2024-12-21 13:12:41,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 1654784. Throughput: 0: 940.7. Samples: 412150. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:12:41,653][02089] Avg episode reward: [(0, '9.085')]
[2024-12-21 13:12:41,659][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000404_1654784.pth...
[2024-12-21 13:12:41,783][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000181_741376.pth
[2024-12-21 13:12:41,798][04429] Saving new best policy, reward=9.085!
[2024-12-21 13:12:46,651][02089] Fps is (10 sec: 4096.8, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 1675264. Throughput: 0: 985.4. Samples: 419022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:12:46,654][02089] Avg episode reward: [(0, '9.015')]
[2024-12-21 13:12:47,683][04442] Updated weights for policy 0, policy_version 410 (0.0030)
[2024-12-21 13:12:51,651][02089] Fps is (10 sec: 3276.5, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1687552. Throughput: 0: 978.7. Samples: 421120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:12:51,659][02089] Avg episode reward: [(0, '8.740')]
[2024-12-21 13:12:56,651][02089] Fps is (10 sec: 2457.6, 60 sec: 3618.2, 300 sec: 3790.5). Total num frames: 1699840. Throughput: 0: 897.2. Samples: 424538. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:12:56,657][02089] Avg episode reward: [(0, '9.030')]
[2024-12-21 13:13:01,554][04442] Updated weights for policy 0, policy_version 420 (0.0015)
[2024-12-21 13:13:01,651][02089] Fps is (10 sec: 3277.1, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 1720320. Throughput: 0: 881.7. Samples: 429770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:13:01,654][02089] Avg episode reward: [(0, '9.148')]
[2024-12-21 13:13:01,661][04429] Saving new best policy, reward=9.148!
[2024-12-21 13:13:06,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 1740800. Throughput: 0: 914.6. Samples: 433298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:13:06,653][02089] Avg episode reward: [(0, '10.247')]
[2024-12-21 13:13:06,657][04429] Saving new best policy, reward=10.247!
[2024-12-21 13:13:11,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1757184. Throughput: 0: 920.6. Samples: 439508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:13:11,653][02089] Avg episode reward: [(0, '11.028')]
[2024-12-21 13:13:11,661][04429] Saving new best policy, reward=11.028!
[2024-12-21 13:13:11,994][04442] Updated weights for policy 0, policy_version 430 (0.0016)
[2024-12-21 13:13:16,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3776.7). Total num frames: 1773568. Throughput: 0: 863.6. Samples: 443864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:13:16,653][02089] Avg episode reward: [(0, '11.603')]
[2024-12-21 13:13:16,676][04429] Saving new best policy, reward=11.603!
[2024-12-21 13:13:21,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1798144. Throughput: 0: 891.9. Samples: 447304. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-21 13:13:21,654][02089] Avg episode reward: [(0, '11.603')]
[2024-12-21 13:13:21,995][04442] Updated weights for policy 0, policy_version 440 (0.0026)
[2024-12-21 13:13:26,656][02089] Fps is (10 sec: 4503.2, 60 sec: 3686.1, 300 sec: 3804.4). Total num frames: 1818624. Throughput: 0: 936.0. Samples: 454274. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:13:26,658][02089] Avg episode reward: [(0, '12.364')]
[2024-12-21 13:13:26,660][04429] Saving new best policy, reward=12.364!
[2024-12-21 13:13:31,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3790.6). Total num frames: 1835008. Throughput: 0: 884.4. Samples: 458818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:13:31,657][02089] Avg episode reward: [(0, '11.531')]
[2024-12-21 13:13:33,473][04442] Updated weights for policy 0, policy_version 450 (0.0022)
[2024-12-21 13:13:36,651][02089] Fps is (10 sec: 3688.3, 60 sec: 3686.5, 300 sec: 3776.7). Total num frames: 1855488. Throughput: 0: 898.4. Samples: 461548. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-21 13:13:36,658][02089] Avg episode reward: [(0, '11.508')]
[2024-12-21 13:13:41,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 1880064. Throughput: 0: 980.0. Samples: 468638. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:13:41,653][02089] Avg episode reward: [(0, '10.928')]
[2024-12-21 13:13:42,397][04442] Updated weights for policy 0, policy_version 460 (0.0024)
[2024-12-21 13:13:46,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 1896448. Throughput: 0: 989.0. Samples: 474276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:13:46,654][02089] Avg episode reward: [(0, '10.922')]
[2024-12-21 13:13:51,651][02089] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1912832. Throughput: 0: 958.7. Samples: 476438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:13:51,658][02089] Avg episode reward: [(0, '10.991')]
[2024-12-21 13:13:53,831][04442] Updated weights for policy 0, policy_version 470 (0.0025)
[2024-12-21 13:13:56,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 1937408. Throughput: 0: 967.4. Samples: 483040. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:13:56,655][02089] Avg episode reward: [(0, '10.941')]
[2024-12-21 13:14:01,651][02089] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 1957888. Throughput: 0: 1017.2. Samples: 489640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:14:01,658][02089] Avg episode reward: [(0, '11.616')]
[2024-12-21 13:14:04,228][04442] Updated weights for policy 0, policy_version 480 (0.0023)
[2024-12-21 13:14:06,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1970176. Throughput: 0: 986.9. Samples: 491716. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:14:06,658][02089] Avg episode reward: [(0, '12.185')]
[2024-12-21 13:14:11,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1994752. Throughput: 0: 954.5. Samples: 497222. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:14:11,653][02089] Avg episode reward: [(0, '13.573')]
[2024-12-21 13:14:11,664][04429] Saving new best policy, reward=13.573!
[2024-12-21 13:14:14,186][04442] Updated weights for policy 0, policy_version 490 (0.0021)
[2024-12-21 13:14:16,651][02089] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3818.3). Total num frames: 2015232. Throughput: 0: 1008.8. Samples: 504216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:14:16,653][02089] Avg episode reward: [(0, '14.389')]
[2024-12-21 13:14:16,656][04429] Saving new best policy, reward=14.389!
[2024-12-21 13:14:21,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2031616. Throughput: 0: 1004.0. Samples: 506730. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:14:21,657][02089] Avg episode reward: [(0, '14.509')]
[2024-12-21 13:14:21,675][04429] Saving new best policy, reward=14.509!
[2024-12-21 13:14:26,143][04442] Updated weights for policy 0, policy_version 500 (0.0035)
[2024-12-21 13:14:26,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3823.3, 300 sec: 3790.5). Total num frames: 2048000. Throughput: 0: 940.0. Samples: 510936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:14:26,653][02089] Avg episode reward: [(0, '14.418')]
[2024-12-21 13:14:31,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2072576. Throughput: 0: 970.5. Samples: 517948. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:14:31,653][02089] Avg episode reward: [(0, '14.793')]
[2024-12-21 13:14:31,661][04429] Saving new best policy, reward=14.793!
[2024-12-21 13:14:35,395][04442] Updated weights for policy 0, policy_version 510 (0.0013)
[2024-12-21 13:14:36,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2088960. Throughput: 0: 998.9. Samples: 521388. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:14:36,655][02089] Avg episode reward: [(0, '15.020')]
[2024-12-21 13:14:36,659][04429] Saving new best policy, reward=15.020!
[2024-12-21 13:14:41,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 2105344. Throughput: 0: 949.6. Samples: 525774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:14:41,657][02089] Avg episode reward: [(0, '15.257')]
[2024-12-21 13:14:41,672][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000514_2105344.pth...
[2024-12-21 13:14:41,800][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000291_1191936.pth
[2024-12-21 13:14:41,814][04429] Saving new best policy, reward=15.257!
[2024-12-21 13:14:46,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2125824. Throughput: 0: 938.4. Samples: 531868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:14:46,653][02089] Avg episode reward: [(0, '14.796')]
[2024-12-21 13:14:46,952][04442] Updated weights for policy 0, policy_version 520 (0.0021)
[2024-12-21 13:14:51,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2150400. Throughput: 0: 969.2. Samples: 535328. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:14:51,660][02089] Avg episode reward: [(0, '13.730')]
[2024-12-21 13:14:56,650][02089] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 2162688. Throughput: 0: 969.2. Samples: 540836. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:14:56,655][02089] Avg episode reward: [(0, '14.116')]
[2024-12-21 13:14:58,411][04442] Updated weights for policy 0, policy_version 530 (0.0017)
[2024-12-21 13:15:01,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 2183168. Throughput: 0: 930.5. Samples: 546088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:15:01,653][02089] Avg episode reward: [(0, '14.397')]
[2024-12-21 13:15:06,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2207744. Throughput: 0: 953.1. Samples: 549620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:15:06,656][02089] Avg episode reward: [(0, '14.834')]
[2024-12-21 13:15:07,123][04442] Updated weights for policy 0, policy_version 540 (0.0015)
[2024-12-21 13:15:11,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2228224. Throughput: 0: 1010.4. Samples: 556406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:15:11,653][02089] Avg episode reward: [(0, '15.354')]
[2024-12-21 13:15:11,663][04429] Saving new best policy, reward=15.354!
[2024-12-21 13:15:16,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 2240512. Throughput: 0: 948.2. Samples: 560616. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:15:16,653][02089] Avg episode reward: [(0, '14.526')]
[2024-12-21 13:15:18,731][04442] Updated weights for policy 0, policy_version 550 (0.0022)
[2024-12-21 13:15:21,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2265088. Throughput: 0: 944.0. Samples: 563870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:15:21,653][02089] Avg episode reward: [(0, '14.317')]
[2024-12-21 13:15:26,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2285568. Throughput: 0: 1000.0. Samples: 570774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:15:26,657][02089] Avg episode reward: [(0, '15.262')]
[2024-12-21 13:15:28,436][04442] Updated weights for policy 0, policy_version 560 (0.0030)
[2024-12-21 13:15:31,652][02089] Fps is (10 sec: 3685.9, 60 sec: 3822.8, 300 sec: 3846.1). Total num frames: 2301952. Throughput: 0: 969.8. Samples: 575512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:15:31,659][02089] Avg episode reward: [(0, '15.964')]
[2024-12-21 13:15:31,669][04429] Saving new best policy, reward=15.964!
[2024-12-21 13:15:36,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2318336. Throughput: 0: 944.4. Samples: 577828. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:15:36,653][02089] Avg episode reward: [(0, '15.729')]
[2024-12-21 13:15:39,505][04442] Updated weights for policy 0, policy_version 570 (0.0030)
[2024-12-21 13:15:41,658][02089] Fps is (10 sec: 4093.6, 60 sec: 3959.0, 300 sec: 3832.1). Total num frames: 2342912. Throughput: 0: 973.8. Samples: 584664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:15:41,665][02089] Avg episode reward: [(0, '15.435')]
[2024-12-21 13:15:46,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2355200. Throughput: 0: 958.5. Samples: 589220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:15:46,653][02089] Avg episode reward: [(0, '16.851')]
[2024-12-21 13:15:46,659][04429] Saving new best policy, reward=16.851!
[2024-12-21 13:15:51,651][02089] Fps is (10 sec: 2459.4, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2367488. Throughput: 0: 916.8. Samples: 590878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:15:51,654][02089] Avg episode reward: [(0, '15.926')]
[2024-12-21 13:15:54,028][04442] Updated weights for policy 0, policy_version 580 (0.0019)
[2024-12-21 13:15:56,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 2387968. Throughput: 0: 870.2. Samples: 595564. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:15:56,653][02089] Avg episode reward: [(0, '15.993')]
[2024-12-21 13:16:01,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 2408448. Throughput: 0: 933.5. Samples: 602624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:16:01,653][02089] Avg episode reward: [(0, '16.110')]
[2024-12-21 13:16:02,817][04442] Updated weights for policy 0, policy_version 590 (0.0019)
[2024-12-21 13:16:06,654][02089] Fps is (10 sec: 3685.2, 60 sec: 3617.9, 300 sec: 3804.5). Total num frames: 2424832. Throughput: 0: 931.8. Samples: 605806. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:16:06,656][02089] Avg episode reward: [(0, '16.781')]
[2024-12-21 13:16:11,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 2445312. Throughput: 0: 873.3. Samples: 610074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:16:11,658][02089] Avg episode reward: [(0, '16.586')]
[2024-12-21 13:16:14,178][04442] Updated weights for policy 0, policy_version 600 (0.0025)
[2024-12-21 13:16:16,651][02089] Fps is (10 sec: 4097.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 2465792. Throughput: 0: 924.3. Samples: 617106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:16:16,657][02089] Avg episode reward: [(0, '17.190')]
[2024-12-21 13:16:16,660][04429] Saving new best policy, reward=17.190!
[2024-12-21 13:16:21,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 2486272. Throughput: 0: 949.7. Samples: 620566. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-21 13:16:21,657][02089] Avg episode reward: [(0, '17.535')]
[2024-12-21 13:16:21,719][04429] Saving new best policy, reward=17.535!
[2024-12-21 13:16:24,892][04442] Updated weights for policy 0, policy_version 610 (0.0028)
[2024-12-21 13:16:26,652][02089] Fps is (10 sec: 3685.9, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2502656. Throughput: 0: 903.7. Samples: 625326. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:16:26,658][02089] Avg episode reward: [(0, '18.857')]
[2024-12-21 13:16:26,660][04429] Saving new best policy, reward=18.857!
[2024-12-21 13:16:31,651][02089] Fps is (10 sec: 3686.3, 60 sec: 3686.5, 300 sec: 3790.5). Total num frames: 2523136. Throughput: 0: 930.6. Samples: 631096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:16:31,658][02089] Avg episode reward: [(0, '18.558')]
[2024-12-21 13:16:34,492][04442] Updated weights for policy 0, policy_version 620 (0.0033)
[2024-12-21 13:16:36,651][02089] Fps is (10 sec: 4506.2, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2547712. Throughput: 0: 971.8. Samples: 634608. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:16:36,653][02089] Avg episode reward: [(0, '19.826')]
[2024-12-21 13:16:36,658][04429] Saving new best policy, reward=19.826!
[2024-12-21 13:16:41,653][02089] Fps is (10 sec: 4095.2, 60 sec: 3686.7, 300 sec: 3818.3). Total num frames: 2564096. Throughput: 0: 999.1. Samples: 640524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:16:41,661][02089] Avg episode reward: [(0, '19.759')]
[2024-12-21 13:16:41,669][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000626_2564096.pth...
[2024-12-21 13:16:41,829][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000404_1654784.pth
[2024-12-21 13:16:46,152][04442] Updated weights for policy 0, policy_version 630 (0.0023)
[2024-12-21 13:16:46,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 2580480. Throughput: 0: 947.3. Samples: 645252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:16:46,653][02089] Avg episode reward: [(0, '19.322')]
[2024-12-21 13:16:51,651][02089] Fps is (10 sec: 4096.9, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2605056. Throughput: 0: 955.5. Samples: 648800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:16:51,653][02089] Avg episode reward: [(0, '19.121')]
[2024-12-21 13:16:55,038][04442] Updated weights for policy 0, policy_version 640 (0.0033)
[2024-12-21 13:16:56,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 2625536. Throughput: 0: 1015.3. Samples: 655762. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:16:56,653][02089] Avg episode reward: [(0, '19.066')]
[2024-12-21 13:17:01,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2637824. Throughput: 0: 953.6. Samples: 660020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:17:01,658][02089] Avg episode reward: [(0, '19.009')]
[2024-12-21 13:17:06,386][04442] Updated weights for policy 0, policy_version 650 (0.0013)
[2024-12-21 13:17:06,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3959.7, 300 sec: 3804.4). Total num frames: 2662400. Throughput: 0: 945.2. Samples: 663100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:17:06,656][02089] Avg episode reward: [(0, '19.652')]
[2024-12-21 13:17:11,651][02089] Fps is (10 sec: 4505.5, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 2682880. Throughput: 0: 996.0. Samples: 670146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:17:11,653][02089] Avg episode reward: [(0, '19.061')]
[2024-12-21 13:17:16,651][02089] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2699264. Throughput: 0: 985.5. Samples: 675442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:17:16,658][02089] Avg episode reward: [(0, '19.493')]
[2024-12-21 13:17:17,349][04442] Updated weights for policy 0, policy_version 660 (0.0035)
[2024-12-21 13:17:21,651][02089] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2719744. Throughput: 0: 955.3. Samples: 677596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:17:21,652][02089] Avg episode reward: [(0, '19.997')]
[2024-12-21 13:17:21,669][04429] Saving new best policy, reward=19.997!
[2024-12-21 13:17:26,651][02089] Fps is (10 sec: 4096.1, 60 sec: 3959.6, 300 sec: 3804.4). Total num frames: 2740224. Throughput: 0: 976.6. Samples: 684468. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:17:26,660][02089] Avg episode reward: [(0, '20.870')]
[2024-12-21 13:17:26,662][04429] Saving new best policy, reward=20.870!
[2024-12-21 13:17:26,933][04442] Updated weights for policy 0, policy_version 670 (0.0023)
[2024-12-21 13:17:31,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 2760704. Throughput: 0: 1005.0. Samples: 690478. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:17:31,655][02089] Avg episode reward: [(0, '21.093')]
[2024-12-21 13:17:31,668][04429] Saving new best policy, reward=21.093!
[2024-12-21 13:17:36,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2777088. Throughput: 0: 971.0. Samples: 692496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:17:36,655][02089] Avg episode reward: [(0, '21.764')]
[2024-12-21 13:17:36,660][04429] Saving new best policy, reward=21.764!
[2024-12-21 13:17:38,500][04442] Updated weights for policy 0, policy_version 680 (0.0023)
[2024-12-21 13:17:41,651][02089] Fps is (10 sec: 3686.3, 60 sec: 3891.3, 300 sec: 3804.4). Total num frames: 2797568. Throughput: 0: 947.0. Samples: 698378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:17:41,659][02089] Avg episode reward: [(0, '21.110')]
[2024-12-21 13:17:46,651][02089] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3846.1). Total num frames: 2822144. Throughput: 0: 1011.0. Samples: 705514. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2024-12-21 13:17:46,653][02089] Avg episode reward: [(0, '19.614')]
[2024-12-21 13:17:47,958][04442] Updated weights for policy 0, policy_version 690 (0.0016)
[2024-12-21 13:17:51,651][02089] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2834432. Throughput: 0: 995.6. Samples: 707900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:17:51,655][02089] Avg episode reward: [(0, '20.158')]
[2024-12-21 13:17:56,651][02089] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2854912. Throughput: 0: 948.7. Samples: 712836. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:17:56,654][02089] Avg episode reward: [(0, '20.088')]
[2024-12-21 13:17:58,776][04442] Updated weights for policy 0, policy_version 700 (0.0016)
[2024-12-21 13:18:01,651][02089] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 2879488. Throughput: 0: 985.5. Samples: 719788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:18:01,656][02089] Avg episode reward: [(0, '20.800')]
[2024-12-21 13:18:06,652][02089] Fps is (10 sec: 4095.4, 60 sec: 3891.1, 300 sec: 3859.9). Total num frames: 2895872. Throughput: 0: 1011.4. Samples: 723112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:18:06,655][02089] Avg episode reward: [(0, '21.599')]
[2024-12-21 13:18:10,295][04442] Updated weights for policy 0, policy_version 710 (0.0016)
[2024-12-21 13:18:11,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3860.0). Total num frames: 2912256. Throughput: 0: 952.4. Samples: 727326. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:18:11,653][02089] Avg episode reward: [(0, '23.324')]
[2024-12-21 13:18:11,660][04429] Saving new best policy, reward=23.324!
[2024-12-21 13:18:16,651][02089] Fps is (10 sec: 4096.7, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2936832. Throughput: 0: 964.4. Samples: 733878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:18:16,652][02089] Avg episode reward: [(0, '23.174')]
[2024-12-21 13:18:19,275][04442] Updated weights for policy 0, policy_version 720 (0.0014)
[2024-12-21 13:18:21,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2957312. Throughput: 0: 997.8. Samples: 737396. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:18:21,656][02089] Avg episode reward: [(0, '24.064')]
[2024-12-21 13:18:21,665][04429] Saving new best policy, reward=24.064!
[2024-12-21 13:18:26,651][02089] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2969600. Throughput: 0: 979.2. Samples: 742440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:18:26,653][02089] Avg episode reward: [(0, '23.467')]
[2024-12-21 13:18:30,879][04442] Updated weights for policy 0, policy_version 730 (0.0018)
[2024-12-21 13:18:31,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2990080. Throughput: 0: 942.1. Samples: 747908. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:18:31,657][02089] Avg episode reward: [(0, '23.563')]
[2024-12-21 13:18:36,651][02089] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3014656. Throughput: 0: 968.0. Samples: 751460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:18:36,653][02089] Avg episode reward: [(0, '22.777')]
[2024-12-21 13:18:41,652][02089] Fps is (10 sec: 3686.0, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3026944. Throughput: 0: 977.8. Samples: 756838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:18:41,657][02089] Avg episode reward: [(0, '23.222')]
[2024-12-21 13:18:41,665][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000739_3026944.pth...
[2024-12-21 13:18:41,869][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000514_2105344.pth
[2024-12-21 13:18:42,359][04442] Updated weights for policy 0, policy_version 740 (0.0021)
[2024-12-21 13:18:46,652][02089] Fps is (10 sec: 2457.3, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 3039232. Throughput: 0: 898.4. Samples: 760216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:18:46,654][02089] Avg episode reward: [(0, '22.428')]
[2024-12-21 13:18:51,651][02089] Fps is (10 sec: 3277.1, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3059712. Throughput: 0: 872.0. Samples: 762352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:18:51,657][02089] Avg episode reward: [(0, '22.496')]
[2024-12-21 13:18:54,002][04442] Updated weights for policy 0, policy_version 750 (0.0041)
[2024-12-21 13:18:56,651][02089] Fps is (10 sec: 4096.6, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3080192. Throughput: 0: 935.9. Samples: 769440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:18:56,653][02089] Avg episode reward: [(0, '22.760')]
[2024-12-21 13:19:01,653][02089] Fps is (10 sec: 4095.1, 60 sec: 3686.3, 300 sec: 3832.2). Total num frames: 3100672. Throughput: 0: 921.6. Samples: 775352. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:19:01,659][02089] Avg episode reward: [(0, '22.509')]
[2024-12-21 13:19:05,392][04442] Updated weights for policy 0, policy_version 760 (0.0034)
[2024-12-21 13:19:06,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3686.5, 300 sec: 3804.4). Total num frames: 3117056. Throughput: 0: 891.7. Samples: 777522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:19:06,655][02089] Avg episode reward: [(0, '21.932')]
[2024-12-21 13:19:11,651][02089] Fps is (10 sec: 3687.2, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3137536. Throughput: 0: 920.3. Samples: 783852. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:19:11,662][02089] Avg episode reward: [(0, '22.111')]
[2024-12-21 13:19:14,198][04442] Updated weights for policy 0, policy_version 770 (0.0018)
[2024-12-21 13:19:16,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3162112. Throughput: 0: 956.8. Samples: 790966. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-21 13:19:16,653][02089] Avg episode reward: [(0, '21.730')]
[2024-12-21 13:19:21,656][02089] Fps is (10 sec: 3684.5, 60 sec: 3617.8, 300 sec: 3818.2). Total num frames: 3174400. Throughput: 0: 923.4. Samples: 793020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:19:21,659][02089] Avg episode reward: [(0, '21.948')]
[2024-12-21 13:19:25,942][04442] Updated weights for policy 0, policy_version 780 (0.0039)
[2024-12-21 13:19:26,652][02089] Fps is (10 sec: 3276.3, 60 sec: 3754.6, 300 sec: 3804.4). Total num frames: 3194880. Throughput: 0: 916.6. Samples: 798084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:19:26,659][02089] Avg episode reward: [(0, '22.425')]
[2024-12-21 13:19:31,651][02089] Fps is (10 sec: 4508.0, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3219456. Throughput: 0: 996.1. Samples: 805040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:19:31,656][02089] Avg episode reward: [(0, '22.355')]
[2024-12-21 13:19:35,637][04442] Updated weights for policy 0, policy_version 790 (0.0023)
[2024-12-21 13:19:36,651][02089] Fps is (10 sec: 4096.6, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 3235840. Throughput: 0: 1019.2. Samples: 808214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:19:36,656][02089] Avg episode reward: [(0, '23.016')]
[2024-12-21 13:19:41,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3252224. Throughput: 0: 956.5. Samples: 812484. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:19:41,652][02089] Avg episode reward: [(0, '23.147')]
[2024-12-21 13:19:46,039][04442] Updated weights for policy 0, policy_version 800 (0.0030)
[2024-12-21 13:19:46,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3818.3). Total num frames: 3276800. Throughput: 0: 980.3. Samples: 819462. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:19:46,653][02089] Avg episode reward: [(0, '23.383')]
[2024-12-21 13:19:51,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3297280. Throughput: 0: 1010.0. Samples: 822974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:19:51,658][02089] Avg episode reward: [(0, '23.072')]
[2024-12-21 13:19:56,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3313664. Throughput: 0: 978.4. Samples: 827878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:19:56,656][02089] Avg episode reward: [(0, '22.422')]
[2024-12-21 13:19:57,760][04442] Updated weights for policy 0, policy_version 810 (0.0021)
[2024-12-21 13:20:01,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3818.3). Total num frames: 3334144. Throughput: 0: 948.9. Samples: 833666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:20:01,658][02089] Avg episode reward: [(0, '23.699')]
[2024-12-21 13:20:06,299][04442] Updated weights for policy 0, policy_version 820 (0.0013)
[2024-12-21 13:20:06,651][02089] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3832.2). Total num frames: 3358720. Throughput: 0: 982.9. Samples: 837246. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:20:06,653][02089] Avg episode reward: [(0, '23.744')]
[2024-12-21 13:20:11,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3375104. Throughput: 0: 1008.6. Samples: 843468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:20:11,658][02089] Avg episode reward: [(0, '24.413')]
[2024-12-21 13:20:11,670][04429] Saving new best policy, reward=24.413!
[2024-12-21 13:20:16,651][02089] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3391488. Throughput: 0: 957.5. Samples: 848130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:20:16,655][02089] Avg episode reward: [(0, '24.077')]
[2024-12-21 13:20:18,005][04442] Updated weights for policy 0, policy_version 830 (0.0015)
[2024-12-21 13:20:21,651][02089] Fps is (10 sec: 4096.0, 60 sec: 4028.1, 300 sec: 3832.2). Total num frames: 3416064. Throughput: 0: 964.2. Samples: 851602. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:20:21,653][02089] Avg episode reward: [(0, '25.436')]
[2024-12-21 13:20:21,668][04429] Saving new best policy, reward=25.436!
[2024-12-21 13:20:26,651][02089] Fps is (10 sec: 4505.7, 60 sec: 4027.8, 300 sec: 3846.1). Total num frames: 3436544. Throughput: 0: 1026.5. Samples: 858678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:20:26,653][02089] Avg episode reward: [(0, '25.542')]
[2024-12-21 13:20:26,657][04429] Saving new best policy, reward=25.542!
[2024-12-21 13:20:27,937][04442] Updated weights for policy 0, policy_version 840 (0.0017)
[2024-12-21 13:20:31,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3448832. Throughput: 0: 962.5. Samples: 862774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:20:31,656][02089] Avg episode reward: [(0, '24.694')]
[2024-12-21 13:20:36,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3832.3). Total num frames: 3473408. Throughput: 0: 950.8. Samples: 865762. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:20:36,655][02089] Avg episode reward: [(0, '24.985')]
[2024-12-21 13:20:38,209][04442] Updated weights for policy 0, policy_version 850 (0.0022)
[2024-12-21 13:20:41,651][02089] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 3493888. Throughput: 0: 997.4. Samples: 872760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:20:41,653][02089] Avg episode reward: [(0, '27.069')]
[2024-12-21 13:20:41,687][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000854_3497984.pth...
[2024-12-21 13:20:41,829][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000626_2564096.pth
[2024-12-21 13:20:41,846][04429] Saving new best policy, reward=27.069!
[2024-12-21 13:20:46,651][02089] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3510272. Throughput: 0: 985.4. Samples: 878008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:20:46,654][02089] Avg episode reward: [(0, '26.860')]
[2024-12-21 13:20:49,876][04442] Updated weights for policy 0, policy_version 860 (0.0018)
[2024-12-21 13:20:51,650][02089] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3530752. Throughput: 0: 951.7. Samples: 880072. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-21 13:20:51,652][02089] Avg episode reward: [(0, '25.681')]
[2024-12-21 13:20:56,651][02089] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 3551232. Throughput: 0: 964.5. Samples: 886872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:20:56,652][02089] Avg episode reward: [(0, '25.962')]
[2024-12-21 13:20:58,649][04442] Updated weights for policy 0, policy_version 870 (0.0025)
[2024-12-21 13:21:01,653][02089] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 3887.7). Total num frames: 3571712. Throughput: 0: 1000.4. Samples: 893148. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:21:01,656][02089] Avg episode reward: [(0, '26.209')]
[2024-12-21 13:21:06,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 3584000. Throughput: 0: 969.9. Samples: 895246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:21:06,657][02089] Avg episode reward: [(0, '25.261')]
[2024-12-21 13:21:10,414][04442] Updated weights for policy 0, policy_version 880 (0.0019)
[2024-12-21 13:21:11,651][02089] Fps is (10 sec: 3687.2, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3608576. Throughput: 0: 940.7. Samples: 901008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:21:11,652][02089] Avg episode reward: [(0, '25.801')]
[2024-12-21 13:21:16,651][02089] Fps is (10 sec: 4915.2, 60 sec: 4027.8, 300 sec: 3887.7). Total num frames: 3633152. Throughput: 0: 1007.3. Samples: 908102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:21:16,655][02089] Avg episode reward: [(0, '25.901')]
[2024-12-21 13:21:20,858][04442] Updated weights for policy 0, policy_version 890 (0.0018)
[2024-12-21 13:21:21,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3873.9). Total num frames: 3645440. Throughput: 0: 997.3. Samples: 910640. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:21:21,657][02089] Avg episode reward: [(0, '24.794')]
[2024-12-21 13:21:26,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 3665920. Throughput: 0: 946.3. Samples: 915342. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:21:26,658][02089] Avg episode reward: [(0, '25.052')]
[2024-12-21 13:21:30,845][04442] Updated weights for policy 0, policy_version 900 (0.0016)
[2024-12-21 13:21:31,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 3686400. Throughput: 0: 983.8. Samples: 922280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:21:31,657][02089] Avg episode reward: [(0, '25.001')]
[2024-12-21 13:21:36,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3702784. Throughput: 0: 1000.7. Samples: 925102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-21 13:21:36,657][02089] Avg episode reward: [(0, '25.418')]
[2024-12-21 13:21:41,651][02089] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 3715072. Throughput: 0: 925.7. Samples: 928528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:21:41,656][02089] Avg episode reward: [(0, '24.349')]
[2024-12-21 13:21:45,066][04442] Updated weights for policy 0, policy_version 910 (0.0031)
[2024-12-21 13:21:46,651][02089] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 3731456. Throughput: 0: 893.7. Samples: 933362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:21:46,655][02089] Avg episode reward: [(0, '24.855')]
[2024-12-21 13:21:51,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3756032. Throughput: 0: 925.0. Samples: 936872. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:21:51,652][02089] Avg episode reward: [(0, '25.519')]
[2024-12-21 13:21:53,870][04442] Updated weights for policy 0, policy_version 920 (0.0015)
[2024-12-21 13:21:56,651][02089] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 3776512. Throughput: 0: 943.5. Samples: 943466. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:21:56,659][02089] Avg episode reward: [(0, '25.111')]
[2024-12-21 13:22:01,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3618.3, 300 sec: 3818.3). Total num frames: 3788800. Throughput: 0: 880.8. Samples: 947738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:22:01,652][02089] Avg episode reward: [(0, '24.334')]
[2024-12-21 13:22:05,433][04442] Updated weights for policy 0, policy_version 930 (0.0026)
[2024-12-21 13:22:06,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3813376. Throughput: 0: 899.3. Samples: 951108. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2024-12-21 13:22:06,656][02089] Avg episode reward: [(0, '23.610')]
[2024-12-21 13:22:11,651][02089] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 3833856. Throughput: 0: 952.2. Samples: 958190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-21 13:22:11,658][02089] Avg episode reward: [(0, '21.513')]
[2024-12-21 13:22:16,012][04442] Updated weights for policy 0, policy_version 940 (0.0020)
[2024-12-21 13:22:16,655][02089] Fps is (10 sec: 3684.8, 60 sec: 3617.9, 300 sec: 3832.1). Total num frames: 3850240. Throughput: 0: 907.0. Samples: 963100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:22:16,658][02089] Avg episode reward: [(0, '21.820')]
[2024-12-21 13:22:21,651][02089] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3870720. Throughput: 0: 896.5. Samples: 965444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:22:21,652][02089] Avg episode reward: [(0, '23.100')]
[2024-12-21 13:22:25,733][04442] Updated weights for policy 0, policy_version 950 (0.0017)
[2024-12-21 13:22:26,651][02089] Fps is (10 sec: 4507.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3895296. Throughput: 0: 976.1. Samples: 972454. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-21 13:22:26,653][02089] Avg episode reward: [(0, '23.432')]
[2024-12-21 13:22:31,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 3911680. Throughput: 0: 998.9. Samples: 978314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:22:31,656][02089] Avg episode reward: [(0, '24.270')]
[2024-12-21 13:22:36,651][02089] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3928064. Throughput: 0: 967.5. Samples: 980410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-21 13:22:36,659][02089] Avg episode reward: [(0, '24.854')]
[2024-12-21 13:22:37,245][04442] Updated weights for policy 0, policy_version 960 (0.0020)
[2024-12-21 13:22:41,651][02089] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 3952640. Throughput: 0: 959.6. Samples: 986648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:22:41,653][02089] Avg episode reward: [(0, '26.388')]
[2024-12-21 13:22:41,662][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000965_3952640.pth...
[2024-12-21 13:22:41,784][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000739_3026944.pth
[2024-12-21 13:22:46,101][04442] Updated weights for policy 0, policy_version 970 (0.0036)
[2024-12-21 13:22:46,651][02089] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 3973120. Throughput: 0: 1019.8. Samples: 993628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-21 13:22:46,653][02089] Avg episode reward: [(0, '25.888')]
[2024-12-21 13:22:51,655][02089] Fps is (10 sec: 3275.4, 60 sec: 3822.7, 300 sec: 3832.1). Total num frames: 3985408. Throughput: 0: 992.5. Samples: 995774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-21 13:22:51,657][02089] Avg episode reward: [(0, '25.927')]
[2024-12-21 13:22:55,848][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-12-21 13:22:55,848][02089] Component Batcher_0 stopped!
[2024-12-21 13:22:55,848][04429] Stopping Batcher_0...
[2024-12-21 13:22:55,858][04429] Loop batcher_evt_loop terminating...
[2024-12-21 13:22:55,916][04442] Weights refcount: 2 0
[2024-12-21 13:22:55,920][04442] Stopping InferenceWorker_p0-w0...
[2024-12-21 13:22:55,921][04442] Loop inference_proc0-0_evt_loop terminating...
[2024-12-21 13:22:55,921][02089] Component InferenceWorker_p0-w0 stopped!
[2024-12-21 13:22:55,980][04429] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000854_3497984.pth
[2024-12-21 13:22:56,008][04429] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-12-21 13:22:56,190][02089] Component LearnerWorker_p0 stopped!
[2024-12-21 13:22:56,192][04429] Stopping LearnerWorker_p0...
[2024-12-21 13:22:56,193][04429] Loop learner_proc0_evt_loop terminating...
[2024-12-21 13:22:56,230][04446] Stopping RolloutWorker_w3...
[2024-12-21 13:22:56,230][02089] Component RolloutWorker_w3 stopped!
[2024-12-21 13:22:56,232][04446] Loop rollout_proc3_evt_loop terminating...
[2024-12-21 13:22:56,243][02089] Component RolloutWorker_w7 stopped!
[2024-12-21 13:22:56,242][04450] Stopping RolloutWorker_w7...
[2024-12-21 13:22:56,248][02089] Component RolloutWorker_w1 stopped!
[2024-12-21 13:22:56,254][04443] Stopping RolloutWorker_w1...
[2024-12-21 13:22:56,256][02089] Component RolloutWorker_w5 stopped!
[2024-12-21 13:22:56,246][04450] Loop rollout_proc7_evt_loop terminating...
[2024-12-21 13:22:56,260][04448] Stopping RolloutWorker_w5...
[2024-12-21 13:22:56,255][04443] Loop rollout_proc1_evt_loop terminating...
[2024-12-21 13:22:56,262][04448] Loop rollout_proc5_evt_loop terminating...
[2024-12-21 13:22:56,272][04444] Stopping RolloutWorker_w0...
[2024-12-21 13:22:56,272][02089] Component RolloutWorker_w0 stopped!
[2024-12-21 13:22:56,280][04447] Stopping RolloutWorker_w4...
[2024-12-21 13:22:56,283][04447] Loop rollout_proc4_evt_loop terminating...
[2024-12-21 13:22:56,280][02089] Component RolloutWorker_w4 stopped!
[2024-12-21 13:22:56,295][04444] Loop rollout_proc0_evt_loop terminating...
[2024-12-21 13:22:56,313][04449] Stopping RolloutWorker_w6...
[2024-12-21 13:22:56,316][04449] Loop rollout_proc6_evt_loop terminating...
[2024-12-21 13:22:56,314][02089] Component RolloutWorker_w6 stopped!
[2024-12-21 13:22:56,333][04445] Stopping RolloutWorker_w2...
[2024-12-21 13:22:56,333][02089] Component RolloutWorker_w2 stopped!
[2024-12-21 13:22:56,336][02089] Waiting for process learner_proc0 to stop...
[2024-12-21 13:22:56,341][04445] Loop rollout_proc2_evt_loop terminating...
[2024-12-21 13:22:57,891][02089] Waiting for process inference_proc0-0 to join...
[2024-12-21 13:22:57,900][02089] Waiting for process rollout_proc0 to join...
[2024-12-21 13:22:59,817][02089] Waiting for process rollout_proc1 to join...
[2024-12-21 13:22:59,826][02089] Waiting for process rollout_proc2 to join...
[2024-12-21 13:22:59,852][02089] Waiting for process rollout_proc3 to join...
[2024-12-21 13:22:59,856][02089] Waiting for process rollout_proc4 to join...
[2024-12-21 13:22:59,860][02089] Waiting for process rollout_proc5 to join...
[2024-12-21 13:22:59,863][02089] Waiting for process rollout_proc6 to join...
[2024-12-21 13:22:59,866][02089] Waiting for process rollout_proc7 to join...
[2024-12-21 13:22:59,870][02089] Batcher 0 profile tree view:
batching: 26.4837, releasing_batches: 0.0294
[2024-12-21 13:22:59,872][02089] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 427.2956
update_model: 8.7427
weight_update: 0.0050
one_step: 0.0025
handle_policy_step: 584.2219
deserialize: 14.7512, stack: 3.1527, obs_to_device_normalize: 123.8642, forward: 292.5321, send_messages: 28.8992
prepare_outputs: 91.1658
to_cpu: 55.2367
[2024-12-21 13:22:59,874][02089] Learner 0 profile tree view:
misc: 0.0048, prepare_batch: 14.0818
train: 74.1073
epoch_init: 0.0134, minibatch_init: 0.0128, losses_postprocess: 0.6504, kl_divergence: 0.6554, after_optimizer: 34.2816
calculate_losses: 25.9485
losses_init: 0.0037, forward_head: 1.3813, bptt_initial: 16.9322, tail: 1.1103, advantages_returns: 0.2894, losses: 3.9151
bptt: 1.9496
bptt_forward_core: 1.8604
update: 11.9060
clip: 0.8906
[2024-12-21 13:22:59,877][02089] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3260, enqueue_policy_requests: 102.2030, env_step: 830.5280, overhead: 13.2464, complete_rollouts: 7.6569
save_policy_outputs: 20.7380
split_output_tensors: 8.4482
[2024-12-21 13:22:59,879][02089] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.2967, enqueue_policy_requests: 106.8665, env_step: 830.7750, overhead: 12.9086, complete_rollouts: 6.4772
save_policy_outputs: 20.5492
split_output_tensors: 8.3398
[2024-12-21 13:22:59,880][02089] Loop Runner_EvtLoop terminating...
[2024-12-21 13:22:59,881][02089] Runner profile tree view:
main_loop: 1092.5302
[2024-12-21 13:22:59,883][02089] Collected {0: 4005888}, FPS: 3666.6
[2024-12-21 13:23:23,986][02089] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-12-21 13:23:23,987][02089] Overriding arg 'num_workers' with value 1 passed from command line
[2024-12-21 13:23:23,990][02089] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-12-21 13:23:23,992][02089] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-12-21 13:23:23,994][02089] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-12-21 13:23:23,996][02089] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-12-21 13:23:23,997][02089] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2024-12-21 13:23:23,998][02089] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-12-21 13:23:24,000][02089] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2024-12-21 13:23:24,001][02089] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2024-12-21 13:23:24,002][02089] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-12-21 13:23:24,003][02089] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-12-21 13:23:24,004][02089] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-12-21 13:23:24,005][02089] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-12-21 13:23:24,006][02089] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-12-21 13:23:24,041][02089] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-21 13:23:24,044][02089] RunningMeanStd input shape: (3, 72, 128)
[2024-12-21 13:23:24,046][02089] RunningMeanStd input shape: (1,)
[2024-12-21 13:23:24,061][02089] ConvEncoder: input_channels=3
[2024-12-21 13:23:24,166][02089] Conv encoder output size: 512
[2024-12-21 13:23:24,169][02089] Policy head output size: 512
[2024-12-21 13:23:24,452][02089] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-12-21 13:23:25,251][02089] Num frames 100...
[2024-12-21 13:23:25,384][02089] Num frames 200...
[2024-12-21 13:23:25,511][02089] Num frames 300...
[2024-12-21 13:23:25,635][02089] Num frames 400...
[2024-12-21 13:23:25,754][02089] Num frames 500...
[2024-12-21 13:23:25,873][02089] Num frames 600...
[2024-12-21 13:23:26,028][02089] Num frames 700...
[2024-12-21 13:23:26,204][02089] Num frames 800...
[2024-12-21 13:23:26,383][02089] Num frames 900...
[2024-12-21 13:23:26,565][02089] Num frames 1000...
[2024-12-21 13:23:26,727][02089] Num frames 1100...
[2024-12-21 13:23:26,895][02089] Num frames 1200...
[2024-12-21 13:23:27,060][02089] Num frames 1300...
[2024-12-21 13:23:27,227][02089] Num frames 1400...
[2024-12-21 13:23:27,398][02089] Num frames 1500...
[2024-12-21 13:23:27,480][02089] Avg episode rewards: #0: 41.130, true rewards: #0: 15.130
[2024-12-21 13:23:27,482][02089] Avg episode reward: 41.130, avg true_objective: 15.130
[2024-12-21 13:23:27,644][02089] Num frames 1600...
[2024-12-21 13:23:27,814][02089] Num frames 1700...
[2024-12-21 13:23:27,987][02089] Num frames 1800...
[2024-12-21 13:23:28,172][02089] Num frames 1900...
[2024-12-21 13:23:28,353][02089] Num frames 2000...
[2024-12-21 13:23:28,537][02089] Num frames 2100...
[2024-12-21 13:23:28,663][02089] Num frames 2200...
[2024-12-21 13:23:28,788][02089] Num frames 2300...
[2024-12-21 13:23:28,910][02089] Num frames 2400...
[2024-12-21 13:23:29,032][02089] Num frames 2500...
[2024-12-21 13:23:29,152][02089] Num frames 2600...
[2024-12-21 13:23:29,278][02089] Num frames 2700...
[2024-12-21 13:23:29,402][02089] Num frames 2800...
[2024-12-21 13:23:29,540][02089] Num frames 2900...
[2024-12-21 13:23:29,668][02089] Num frames 3000...
[2024-12-21 13:23:29,794][02089] Num frames 3100...
[2024-12-21 13:23:29,924][02089] Num frames 3200...
[2024-12-21 13:23:30,048][02089] Num frames 3300...
[2024-12-21 13:23:30,172][02089] Num frames 3400...
[2024-12-21 13:23:30,299][02089] Num frames 3500...
[2024-12-21 13:23:30,424][02089] Num frames 3600...
[2024-12-21 13:23:30,499][02089] Avg episode rewards: #0: 48.564, true rewards: #0: 18.065
[2024-12-21 13:23:30,501][02089] Avg episode reward: 48.564, avg true_objective: 18.065
[2024-12-21 13:23:30,622][02089] Num frames 3700...
[2024-12-21 13:23:30,746][02089] Num frames 3800...
[2024-12-21 13:23:30,868][02089] Num frames 3900...
[2024-12-21 13:23:30,990][02089] Num frames 4000...
[2024-12-21 13:23:31,136][02089] Num frames 4100...
[2024-12-21 13:23:31,270][02089] Num frames 4200...
[2024-12-21 13:23:31,396][02089] Num frames 4300...
[2024-12-21 13:23:31,523][02089] Num frames 4400...
[2024-12-21 13:23:31,658][02089] Num frames 4500...
[2024-12-21 13:23:31,778][02089] Num frames 4600...
[2024-12-21 13:23:31,911][02089] Avg episode rewards: #0: 39.203, true rewards: #0: 15.537
[2024-12-21 13:23:31,913][02089] Avg episode reward: 39.203, avg true_objective: 15.537
[2024-12-21 13:23:31,963][02089] Num frames 4700...
[2024-12-21 13:23:32,083][02089] Num frames 4800...
[2024-12-21 13:23:32,205][02089] Num frames 4900...
[2024-12-21 13:23:32,331][02089] Num frames 5000...
[2024-12-21 13:23:32,452][02089] Num frames 5100...
[2024-12-21 13:23:32,593][02089] Num frames 5200...
[2024-12-21 13:23:32,657][02089] Avg episode rewards: #0: 31.262, true rewards: #0: 13.012
[2024-12-21 13:23:32,659][02089] Avg episode reward: 31.262, avg true_objective: 13.012
[2024-12-21 13:23:32,773][02089] Num frames 5300...
[2024-12-21 13:23:32,894][02089] Num frames 5400...
[2024-12-21 13:23:33,014][02089] Num frames 5500...
[2024-12-21 13:23:33,143][02089] Num frames 5600...
[2024-12-21 13:23:33,267][02089] Num frames 5700...
[2024-12-21 13:23:33,388][02089] Num frames 5800...
[2024-12-21 13:23:33,511][02089] Num frames 5900...
[2024-12-21 13:23:33,583][02089] Avg episode rewards: #0: 27.818, true rewards: #0: 11.818
[2024-12-21 13:23:33,585][02089] Avg episode reward: 27.818, avg true_objective: 11.818
[2024-12-21 13:23:33,701][02089] Num frames 6000...
[2024-12-21 13:23:33,821][02089] Num frames 6100...
[2024-12-21 13:23:33,941][02089] Num frames 6200...
[2024-12-21 13:23:34,107][02089] Avg episode rewards: #0: 23.821, true rewards: #0: 10.488
[2024-12-21 13:23:34,109][02089] Avg episode reward: 23.821, avg true_objective: 10.488
[2024-12-21 13:23:34,120][02089] Num frames 6300...
[2024-12-21 13:23:34,243][02089] Num frames 6400...
[2024-12-21 13:23:34,364][02089] Num frames 6500...
[2024-12-21 13:23:34,485][02089] Num frames 6600...
[2024-12-21 13:23:34,616][02089] Num frames 6700...
[2024-12-21 13:23:34,747][02089] Num frames 6800...
[2024-12-21 13:23:34,868][02089] Num frames 6900...
[2024-12-21 13:23:34,991][02089] Num frames 7000...
[2024-12-21 13:23:35,116][02089] Num frames 7100...
[2024-12-21 13:23:35,238][02089] Num frames 7200...
[2024-12-21 13:23:35,359][02089] Num frames 7300...
[2024-12-21 13:23:35,480][02089] Num frames 7400...
[2024-12-21 13:23:35,606][02089] Num frames 7500...
[2024-12-21 13:23:35,738][02089] Num frames 7600...
[2024-12-21 13:23:35,855][02089] Num frames 7700...
[2024-12-21 13:23:35,976][02089] Num frames 7800...
[2024-12-21 13:23:36,099][02089] Num frames 7900...
[2024-12-21 13:23:36,226][02089] Avg episode rewards: #0: 26.653, true rewards: #0: 11.367
[2024-12-21 13:23:36,228][02089] Avg episode reward: 26.653, avg true_objective: 11.367
[2024-12-21 13:23:36,283][02089] Num frames 8000...
[2024-12-21 13:23:36,409][02089] Num frames 8100...
[2024-12-21 13:23:36,535][02089] Num frames 8200...
[2024-12-21 13:23:36,666][02089] Num frames 8300...
[2024-12-21 13:23:36,795][02089] Num frames 8400...
[2024-12-21 13:23:36,916][02089] Num frames 8500...
[2024-12-21 13:23:37,043][02089] Num frames 8600...
[2024-12-21 13:23:37,168][02089] Num frames 8700...
[2024-12-21 13:23:37,292][02089] Num frames 8800...
[2024-12-21 13:23:37,416][02089] Num frames 8900...
[2024-12-21 13:23:37,543][02089] Num frames 9000...
[2024-12-21 13:23:37,701][02089] Avg episode rewards: #0: 26.609, true rewards: #0: 11.359
[2024-12-21 13:23:37,703][02089] Avg episode reward: 26.609, avg true_objective: 11.359
[2024-12-21 13:23:37,727][02089] Num frames 9100...
[2024-12-21 13:23:37,847][02089] Num frames 9200...
[2024-12-21 13:23:37,963][02089] Num frames 9300...
[2024-12-21 13:23:38,084][02089] Num frames 9400...
[2024-12-21 13:23:38,213][02089] Avg episode rewards: #0: 24.401, true rewards: #0: 10.512
[2024-12-21 13:23:38,215][02089] Avg episode reward: 24.401, avg true_objective: 10.512
[2024-12-21 13:23:38,266][02089] Num frames 9500...
[2024-12-21 13:23:38,390][02089] Num frames 9600...
[2024-12-21 13:23:38,534][02089] Num frames 9700...
[2024-12-21 13:23:38,714][02089] Num frames 9800...
[2024-12-21 13:23:38,897][02089] Num frames 9900...
[2024-12-21 13:23:39,064][02089] Num frames 10000...
[2024-12-21 13:23:39,232][02089] Num frames 10100...
[2024-12-21 13:23:39,398][02089] Num frames 10200...
[2024-12-21 13:23:39,568][02089] Num frames 10300...
[2024-12-21 13:23:39,737][02089] Num frames 10400...
[2024-12-21 13:23:39,910][02089] Num frames 10500...
[2024-12-21 13:23:40,088][02089] Num frames 10600...
[2024-12-21 13:23:40,264][02089] Num frames 10700...
[2024-12-21 13:23:40,446][02089] Num frames 10800...
[2024-12-21 13:23:40,630][02089] Num frames 10900...
[2024-12-21 13:23:40,691][02089] Avg episode rewards: #0: 25.701, true rewards: #0: 10.901
[2024-12-21 13:23:40,693][02089] Avg episode reward: 25.701, avg true_objective: 10.901
[2024-12-21 13:24:45,234][02089] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-12-21 13:30:11,250][02089] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-12-21 13:30:11,252][02089] Overriding arg 'num_workers' with value 1 passed from command line
[2024-12-21 13:30:11,254][02089] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-12-21 13:30:11,256][02089] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-12-21 13:30:11,258][02089] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-12-21 13:30:11,261][02089] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-12-21 13:30:11,263][02089] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2024-12-21 13:30:11,265][02089] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-12-21 13:30:11,266][02089] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2024-12-21 13:30:11,268][02089] Adding new argument 'hf_repository'='husseinmo/vizdoom_health_gathering_supreme' that is not in the saved config file!
[2024-12-21 13:30:11,269][02089] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-12-21 13:30:11,270][02089] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-12-21 13:30:11,271][02089] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-12-21 13:30:11,272][02089] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-12-21 13:30:11,273][02089] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-12-21 13:30:11,312][02089] RunningMeanStd input shape: (3, 72, 128)
[2024-12-21 13:30:11,314][02089] RunningMeanStd input shape: (1,)
[2024-12-21 13:30:11,333][02089] ConvEncoder: input_channels=3
[2024-12-21 13:30:11,395][02089] Conv encoder output size: 512
[2024-12-21 13:30:11,398][02089] Policy head output size: 512
[2024-12-21 13:30:11,438][02089] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-12-21 13:30:12,087][02089] Num frames 100...
[2024-12-21 13:30:12,263][02089] Num frames 200...
[2024-12-21 13:30:12,443][02089] Num frames 300...
[2024-12-21 13:30:12,584][02089] Num frames 400...
[2024-12-21 13:30:12,664][02089] Avg episode rewards: #0: 6.190, true rewards: #0: 4.190
[2024-12-21 13:30:12,665][02089] Avg episode reward: 6.190, avg true_objective: 4.190
[2024-12-21 13:30:12,770][02089] Num frames 500...
[2024-12-21 13:30:12,893][02089] Num frames 600...
[2024-12-21 13:30:13,016][02089] Num frames 700...
[2024-12-21 13:30:13,135][02089] Num frames 800...
[2024-12-21 13:30:13,262][02089] Num frames 900...
[2024-12-21 13:30:13,389][02089] Num frames 1000...
[2024-12-21 13:30:13,528][02089] Num frames 1100...
[2024-12-21 13:30:13,651][02089] Num frames 1200...
[2024-12-21 13:30:13,775][02089] Num frames 1300...
[2024-12-21 13:30:13,902][02089] Num frames 1400...
[2024-12-21 13:30:14,025][02089] Num frames 1500...
[2024-12-21 13:30:14,148][02089] Num frames 1600...
[2024-12-21 13:30:14,274][02089] Num frames 1700...
[2024-12-21 13:30:14,340][02089] Avg episode rewards: #0: 16.540, true rewards: #0: 8.540
[2024-12-21 13:30:14,341][02089] Avg episode reward: 16.540, avg true_objective: 8.540
[2024-12-21 13:30:14,458][02089] Num frames 1800...
[2024-12-21 13:30:14,597][02089] Num frames 1900...
[2024-12-21 13:30:14,717][02089] Num frames 2000...
[2024-12-21 13:30:14,837][02089] Num frames 2100...
[2024-12-21 13:30:14,958][02089] Num frames 2200...
[2024-12-21 13:30:15,080][02089] Num frames 2300...
[2024-12-21 13:30:15,205][02089] Num frames 2400...
[2024-12-21 13:30:15,328][02089] Num frames 2500...
[2024-12-21 13:30:15,457][02089] Num frames 2600...
[2024-12-21 13:30:15,595][02089] Num frames 2700...
[2024-12-21 13:30:15,647][02089] Avg episode rewards: #0: 18.333, true rewards: #0: 9.000
[2024-12-21 13:30:15,649][02089] Avg episode reward: 18.333, avg true_objective: 9.000
[2024-12-21 13:30:15,770][02089] Num frames 2800...
[2024-12-21 13:30:15,891][02089] Num frames 2900...
[2024-12-21 13:30:16,015][02089] Num frames 3000...
[2024-12-21 13:30:16,139][02089] Num frames 3100...
[2024-12-21 13:30:16,263][02089] Num frames 3200...
[2024-12-21 13:30:16,388][02089] Num frames 3300...
[2024-12-21 13:30:16,515][02089] Num frames 3400...
[2024-12-21 13:30:16,651][02089] Num frames 3500...
[2024-12-21 13:30:16,772][02089] Num frames 3600...
[2024-12-21 13:30:16,894][02089] Num frames 3700...
[2024-12-21 13:30:16,982][02089] Avg episode rewards: #0: 19.065, true rewards: #0: 9.315
[2024-12-21 13:30:16,983][02089] Avg episode reward: 19.065, avg true_objective: 9.315
[2024-12-21 13:30:17,074][02089] Num frames 3800...
[2024-12-21 13:30:17,199][02089] Num frames 3900...
[2024-12-21 13:30:17,322][02089] Num frames 4000...
[2024-12-21 13:30:17,445][02089] Num frames 4100...
[2024-12-21 13:30:17,574][02089] Num frames 4200...
[2024-12-21 13:30:17,836][02089] Num frames 4300...
[2024-12-21 13:30:17,999][02089] Num frames 4400...
[2024-12-21 13:30:18,128][02089] Num frames 4500...
[2024-12-21 13:30:18,263][02089] Num frames 4600...
[2024-12-21 13:30:18,389][02089] Num frames 4700...
[2024-12-21 13:30:18,697][02089] Num frames 4800...
[2024-12-21 13:30:18,842][02089] Num frames 4900...
[2024-12-21 13:30:18,968][02089] Num frames 5000...
[2024-12-21 13:30:19,093][02089] Num frames 5100...
[2024-12-21 13:30:19,178][02089] Avg episode rewards: #0: 22.448, true rewards: #0: 10.248
[2024-12-21 13:30:19,179][02089] Avg episode reward: 22.448, avg true_objective: 10.248
[2024-12-21 13:30:19,277][02089] Num frames 5200...
[2024-12-21 13:30:19,588][02089] Num frames 5300...
[2024-12-21 13:30:19,716][02089] Num frames 5400...
[2024-12-21 13:30:19,839][02089] Num frames 5500...
[2024-12-21 13:30:19,959][02089] Num frames 5600...
[2024-12-21 13:30:20,057][02089] Avg episode rewards: #0: 20.227, true rewards: #0: 9.393
[2024-12-21 13:30:20,058][02089] Avg episode reward: 20.227, avg true_objective: 9.393
[2024-12-21 13:30:20,141][02089] Num frames 5700...
[2024-12-21 13:30:20,269][02089] Num frames 5800...
[2024-12-21 13:30:20,404][02089] Num frames 5900...
[2024-12-21 13:30:20,533][02089] Num frames 6000...
[2024-12-21 13:30:20,660][02089] Num frames 6100...
[2024-12-21 13:30:20,787][02089] Num frames 6200...
[2024-12-21 13:30:20,906][02089] Num frames 6300...
[2024-12-21 13:30:21,034][02089] Num frames 6400...
[2024-12-21 13:30:21,161][02089] Num frames 6500...
[2024-12-21 13:30:21,290][02089] Num frames 6600...
[2024-12-21 13:30:21,417][02089] Num frames 6700...
[2024-12-21 13:30:21,550][02089] Num frames 6800...
[2024-12-21 13:30:21,683][02089] Num frames 6900...
[2024-12-21 13:30:21,820][02089] Num frames 7000...
[2024-12-21 13:30:21,945][02089] Num frames 7100...
[2024-12-21 13:30:22,068][02089] Num frames 7200...
[2024-12-21 13:30:22,192][02089] Num frames 7300...
[2024-12-21 13:30:22,325][02089] Num frames 7400...
[2024-12-21 13:30:22,449][02089] Num frames 7500...
[2024-12-21 13:30:22,628][02089] Num frames 7600...
[2024-12-21 13:30:22,813][02089] Num frames 7700...
[2024-12-21 13:30:22,932][02089] Avg episode rewards: #0: 25.766, true rewards: #0: 11.051
[2024-12-21 13:30:22,934][02089] Avg episode reward: 25.766, avg true_objective: 11.051
[2024-12-21 13:30:23,037][02089] Num frames 7800...
[2024-12-21 13:30:23,207][02089] Num frames 7900...
[2024-12-21 13:30:23,377][02089] Num frames 8000...
[2024-12-21 13:30:23,562][02089] Num frames 8100...
[2024-12-21 13:30:23,723][02089] Num frames 8200...
[2024-12-21 13:30:23,833][02089] Avg episode rewards: #0: 23.540, true rewards: #0: 10.290
[2024-12-21 13:30:23,835][02089] Avg episode reward: 23.540, avg true_objective: 10.290
[2024-12-21 13:30:23,952][02089] Num frames 8300...
[2024-12-21 13:30:24,126][02089] Num frames 8400...
[2024-12-21 13:30:24,297][02089] Num frames 8500...
[2024-12-21 13:30:24,476][02089] Num frames 8600...
[2024-12-21 13:30:24,658][02089] Num frames 8700...
[2024-12-21 13:30:24,832][02089] Num frames 8800...
[2024-12-21 13:30:25,008][02089] Num frames 8900...
[2024-12-21 13:30:25,167][02089] Num frames 9000...
[2024-12-21 13:30:25,219][02089] Avg episode rewards: #0: 22.555, true rewards: #0: 10.000
[2024-12-21 13:30:25,220][02089] Avg episode reward: 22.555, avg true_objective: 10.000
[2024-12-21 13:30:25,345][02089] Num frames 9100...
[2024-12-21 13:30:25,471][02089] Num frames 9200...
[2024-12-21 13:30:25,597][02089] Num frames 9300...
[2024-12-21 13:30:25,718][02089] Avg episode rewards: #0: 20.856, true rewards: #0: 9.356
[2024-12-21 13:30:25,720][02089] Avg episode reward: 20.856, avg true_objective: 9.356
[2024-12-21 13:31:20,323][02089] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-12-21 13:31:40,245][02089] The model has been pushed to https://huggingface.co/husseinmo/vizdoom_health_gathering_supreme
[2024-12-21 13:33:30,556][02089] Loading legacy config file train_dir/doom_health_gathering_supreme_2222/cfg.json instead of train_dir/doom_health_gathering_supreme_2222/config.json
[2024-12-21 13:33:30,558][02089] Loading existing experiment configuration from train_dir/doom_health_gathering_supreme_2222/config.json
[2024-12-21 13:33:30,560][02089] Overriding arg 'experiment' with value 'doom_health_gathering_supreme_2222' passed from command line
[2024-12-21 13:33:30,561][02089] Overriding arg 'train_dir' with value 'train_dir' passed from command line
[2024-12-21 13:33:30,563][02089] Overriding arg 'num_workers' with value 1 passed from command line
[2024-12-21 13:33:30,564][02089] Adding new argument 'lr_adaptive_min'=1e-06 that is not in the saved config file!
[2024-12-21 13:33:30,566][02089] Adding new argument 'lr_adaptive_max'=0.01 that is not in the saved config file!
[2024-12-21 13:33:30,567][02089] Adding new argument 'env_gpu_observations'=True that is not in the saved config file!
[2024-12-21 13:33:30,569][02089] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-12-21 13:33:30,570][02089] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-12-21 13:33:30,572][02089] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-12-21 13:33:30,574][02089] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-12-21 13:33:30,575][02089] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2024-12-21 13:33:30,578][02089] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-12-21 13:33:30,580][02089] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2024-12-21 13:33:30,581][02089] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2024-12-21 13:33:30,582][02089] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-12-21 13:33:30,583][02089] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-12-21 13:33:30,587][02089] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-12-21 13:33:30,592][02089] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-12-21 13:33:30,594][02089] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-12-21 13:33:30,618][02089] RunningMeanStd input shape: (3, 72, 128)
[2024-12-21 13:33:30,619][02089] RunningMeanStd input shape: (1,)
[2024-12-21 13:33:30,632][02089] ConvEncoder: input_channels=3
[2024-12-21 13:33:30,679][02089] Conv encoder output size: 512
[2024-12-21 13:33:30,682][02089] Policy head output size: 512
[2024-12-21 13:33:30,704][02089] Loading state from checkpoint train_dir/doom_health_gathering_supreme_2222/checkpoint_p0/checkpoint_000539850_4422451200.pth...
[2024-12-21 13:33:31,138][02089] Num frames 100...
[2024-12-21 13:33:31,263][02089] Num frames 200...
[2024-12-21 13:33:31,385][02089] Num frames 300...
[2024-12-21 13:33:31,524][02089] Num frames 400...
[2024-12-21 13:33:31,669][02089] Num frames 500...
[2024-12-21 13:33:31,790][02089] Num frames 600...
[2024-12-21 13:33:31,914][02089] Num frames 700...
[2024-12-21 13:33:32,040][02089] Num frames 800...
[2024-12-21 13:33:32,169][02089] Num frames 900...
[2024-12-21 13:33:32,295][02089] Num frames 1000...
[2024-12-21 13:33:32,424][02089] Num frames 1100...
[2024-12-21 13:33:32,567][02089] Num frames 1200...
[2024-12-21 13:33:32,697][02089] Num frames 1300...
[2024-12-21 13:33:32,824][02089] Num frames 1400...
[2024-12-21 13:33:32,946][02089] Num frames 1500...
[2024-12-21 13:33:33,071][02089] Num frames 1600...
[2024-12-21 13:33:33,196][02089] Num frames 1700...
[2024-12-21 13:33:33,327][02089] Num frames 1800...
[2024-12-21 13:33:33,451][02089] Num frames 1900...
[2024-12-21 13:33:33,594][02089] Num frames 2000...
[2024-12-21 13:33:33,726][02089] Num frames 2100...
[2024-12-21 13:33:33,778][02089] Avg episode rewards: #0: 60.998, true rewards: #0: 21.000
[2024-12-21 13:33:33,780][02089] Avg episode reward: 60.998, avg true_objective: 21.000
[2024-12-21 13:33:33,908][02089] Num frames 2200...
[2024-12-21 13:33:34,036][02089] Num frames 2300...
[2024-12-21 13:33:34,161][02089] Num frames 2400...
[2024-12-21 13:33:34,287][02089] Num frames 2500...
[2024-12-21 13:33:34,414][02089] Num frames 2600...
[2024-12-21 13:33:34,550][02089] Num frames 2700...
[2024-12-21 13:33:34,686][02089] Num frames 2800...
[2024-12-21 13:33:34,813][02089] Num frames 2900...
[2024-12-21 13:33:34,940][02089] Num frames 3000...
[2024-12-21 13:33:35,071][02089] Num frames 3100...
[2024-12-21 13:33:35,199][02089] Num frames 3200...
[2024-12-21 13:33:35,326][02089] Num frames 3300...
[2024-12-21 13:33:35,450][02089] Num frames 3400...
[2024-12-21 13:33:35,582][02089] Num frames 3500...
[2024-12-21 13:33:35,715][02089] Num frames 3600...
[2024-12-21 13:33:35,838][02089] Num frames 3700...
[2024-12-21 13:33:35,964][02089] Num frames 3800...
[2024-12-21 13:33:36,090][02089] Num frames 3900...
[2024-12-21 13:33:36,220][02089] Num frames 4000...
[2024-12-21 13:33:36,344][02089] Num frames 4100...
[2024-12-21 13:33:36,473][02089] Num frames 4200...
[2024-12-21 13:33:36,527][02089] Avg episode rewards: #0: 63.999, true rewards: #0: 21.000
[2024-12-21 13:33:36,529][02089] Avg episode reward: 63.999, avg true_objective: 21.000
[2024-12-21 13:33:36,674][02089] Num frames 4300...
[2024-12-21 13:33:36,796][02089] Num frames 4400...
[2024-12-21 13:33:36,920][02089] Num frames 4500...
[2024-12-21 13:33:37,114][02089] Num frames 4600...
[2024-12-21 13:33:37,293][02089] Num frames 4700...
[2024-12-21 13:33:37,464][02089] Num frames 4800...
[2024-12-21 13:33:37,669][02089] Num frames 4900...
[2024-12-21 13:33:37,847][02089] Num frames 5000...
[2024-12-21 13:33:38,013][02089] Num frames 5100...
[2024-12-21 13:33:38,186][02089] Num frames 5200...
[2024-12-21 13:33:38,377][02089] Num frames 5300...
[2024-12-21 13:33:38,576][02089] Num frames 5400...
[2024-12-21 13:33:38,772][02089] Num frames 5500...
[2024-12-21 13:33:38,977][02089] Num frames 5600...
[2024-12-21 13:33:39,156][02089] Num frames 5700...
[2024-12-21 13:33:39,335][02089] Num frames 5800...
[2024-12-21 13:33:39,524][02089] Num frames 5900...
[2024-12-21 13:33:39,709][02089] Num frames 6000...
[2024-12-21 13:33:39,840][02089] Num frames 6100...
[2024-12-21 13:33:39,963][02089] Num frames 6200...
[2024-12-21 13:33:40,095][02089] Num frames 6300...
[2024-12-21 13:33:40,148][02089] Avg episode rewards: #0: 62.332, true rewards: #0: 21.000
[2024-12-21 13:33:40,150][02089] Avg episode reward: 62.332, avg true_objective: 21.000
[2024-12-21 13:33:40,277][02089] Num frames 6400...
[2024-12-21 13:33:40,401][02089] Num frames 6500...
[2024-12-21 13:33:40,534][02089] Num frames 6600...
[2024-12-21 13:33:40,656][02089] Num frames 6700...
[2024-12-21 13:33:40,778][02089] Num frames 6800...
[2024-12-21 13:33:40,910][02089] Num frames 6900...
[2024-12-21 13:33:41,039][02089] Num frames 7000...
[2024-12-21 13:33:41,164][02089] Num frames 7100...
[2024-12-21 13:33:41,288][02089] Num frames 7200...
[2024-12-21 13:33:41,417][02089] Num frames 7300...
[2024-12-21 13:33:41,548][02089] Num frames 7400...
[2024-12-21 13:33:41,671][02089] Num frames 7500...
[2024-12-21 13:33:41,794][02089] Num frames 7600...
[2024-12-21 13:33:41,927][02089] Num frames 7700...
[2024-12-21 13:33:42,049][02089] Num frames 7800...
[2024-12-21 13:33:42,181][02089] Num frames 7900...
[2024-12-21 13:33:42,310][02089] Num frames 8000...
[2024-12-21 13:33:42,441][02089] Num frames 8100...
[2024-12-21 13:33:42,582][02089] Num frames 8200...
[2024-12-21 13:33:42,706][02089] Num frames 8300...
[2024-12-21 13:33:42,832][02089] Num frames 8400...
[2024-12-21 13:33:42,885][02089] Avg episode rewards: #0: 63.749, true rewards: #0: 21.000
[2024-12-21 13:33:42,887][02089] Avg episode reward: 63.749, avg true_objective: 21.000
[2024-12-21 13:33:43,015][02089] Num frames 8500...
[2024-12-21 13:33:43,143][02089] Num frames 8600...
[2024-12-21 13:33:43,264][02089] Num frames 8700...
[2024-12-21 13:33:43,386][02089] Num frames 8800...
[2024-12-21 13:33:43,519][02089] Num frames 8900...
[2024-12-21 13:33:43,654][02089] Num frames 9000...
[2024-12-21 13:33:43,778][02089] Num frames 9100...
[2024-12-21 13:33:43,911][02089] Num frames 9200...
[2024-12-21 13:33:44,036][02089] Num frames 9300...
[2024-12-21 13:33:44,166][02089] Num frames 9400...
[2024-12-21 13:33:44,296][02089] Num frames 9500...
[2024-12-21 13:33:44,423][02089] Num frames 9600...
[2024-12-21 13:33:44,509][02089] Avg episode rewards: #0: 57.843, true rewards: #0: 19.244
[2024-12-21 13:33:44,511][02089] Avg episode reward: 57.843, avg true_objective: 19.244
[2024-12-21 13:33:44,619][02089] Num frames 9700...
[2024-12-21 13:33:44,746][02089] Num frames 9800...
[2024-12-21 13:33:44,870][02089] Num frames 9900...
[2024-12-21 13:33:45,007][02089] Num frames 10000...
[2024-12-21 13:33:45,138][02089] Num frames 10100...
[2024-12-21 13:33:45,272][02089] Num frames 10200...
[2024-12-21 13:33:45,396][02089] Num frames 10300...
[2024-12-21 13:33:45,528][02089] Num frames 10400...
[2024-12-21 13:33:45,656][02089] Num frames 10500...
[2024-12-21 13:33:45,780][02089] Num frames 10600...
[2024-12-21 13:33:45,905][02089] Num frames 10700...
[2024-12-21 13:33:46,041][02089] Num frames 10800...
[2024-12-21 13:33:46,168][02089] Num frames 10900...
[2024-12-21 13:33:46,291][02089] Num frames 11000...
[2024-12-21 13:33:46,420][02089] Num frames 11100...
[2024-12-21 13:33:46,553][02089] Num frames 11200...
[2024-12-21 13:33:46,677][02089] Num frames 11300...
[2024-12-21 13:33:46,799][02089] Num frames 11400...
[2024-12-21 13:33:46,931][02089] Num frames 11500...
[2024-12-21 13:33:47,068][02089] Num frames 11600...
[2024-12-21 13:33:47,193][02089] Num frames 11700...
[2024-12-21 13:33:47,278][02089] Avg episode rewards: #0: 59.369, true rewards: #0: 19.537
[2024-12-21 13:33:47,279][02089] Avg episode reward: 59.369, avg true_objective: 19.537
[2024-12-21 13:33:47,383][02089] Num frames 11800...
[2024-12-21 13:33:47,522][02089] Num frames 11900...
[2024-12-21 13:33:47,651][02089] Num frames 12000...
[2024-12-21 13:33:47,774][02089] Num frames 12100...
[2024-12-21 13:33:47,901][02089] Num frames 12200...
[2024-12-21 13:33:48,035][02089] Num frames 12300...
[2024-12-21 13:33:48,160][02089] Num frames 12400...
[2024-12-21 13:33:48,292][02089] Num frames 12500...
[2024-12-21 13:33:48,419][02089] Num frames 12600...
[2024-12-21 13:33:48,567][02089] Num frames 12700...
[2024-12-21 13:33:48,694][02089] Num frames 12800...
[2024-12-21 13:33:48,821][02089] Num frames 12900...
[2024-12-21 13:33:48,946][02089] Num frames 13000...
[2024-12-21 13:33:49,081][02089] Num frames 13100...
[2024-12-21 13:33:49,215][02089] Num frames 13200...
[2024-12-21 13:33:49,342][02089] Num frames 13300...
[2024-12-21 13:33:49,422][02089] Avg episode rewards: #0: 57.738, true rewards: #0: 19.024
[2024-12-21 13:33:49,424][02089] Avg episode reward: 57.738, avg true_objective: 19.024
[2024-12-21 13:33:49,534][02089] Num frames 13400...
[2024-12-21 13:33:49,659][02089] Num frames 13500...
[2024-12-21 13:33:49,824][02089] Num frames 13600...
[2024-12-21 13:33:50,002][02089] Num frames 13700...
[2024-12-21 13:33:50,195][02089] Num frames 13800...
[2024-12-21 13:33:50,369][02089] Num frames 13900...
[2024-12-21 13:33:50,547][02089] Num frames 14000...
[2024-12-21 13:33:50,714][02089] Num frames 14100...
[2024-12-21 13:33:50,886][02089] Num frames 14200...
[2024-12-21 13:33:51,057][02089] Num frames 14300...
[2024-12-21 13:33:51,240][02089] Num frames 14400...
[2024-12-21 13:33:51,426][02089] Num frames 14500...
[2024-12-21 13:33:51,606][02089] Num frames 14600...
[2024-12-21 13:33:51,787][02089] Num frames 14700...
[2024-12-21 13:33:51,972][02089] Num frames 14800...
[2024-12-21 13:33:52,161][02089] Num frames 14900...
[2024-12-21 13:33:52,353][02089] Num frames 15000...
[2024-12-21 13:33:52,524][02089] Num frames 15100...
[2024-12-21 13:33:52,650][02089] Num frames 15200...
[2024-12-21 13:33:52,775][02089] Num frames 15300...
[2024-12-21 13:33:52,900][02089] Num frames 15400...
[2024-12-21 13:33:52,978][02089] Avg episode rewards: #0: 58.645, true rewards: #0: 19.271
[2024-12-21 13:33:52,981][02089] Avg episode reward: 58.645, avg true_objective: 19.271
[2024-12-21 13:33:53,087][02089] Num frames 15500...
[2024-12-21 13:33:53,223][02089] Num frames 15600...
[2024-12-21 13:33:53,358][02089] Num frames 15700...
[2024-12-21 13:33:53,495][02089] Num frames 15800...
[2024-12-21 13:33:53,629][02089] Num frames 15900...
[2024-12-21 13:33:53,762][02089] Num frames 16000...
[2024-12-21 13:33:53,884][02089] Num frames 16100...
[2024-12-21 13:33:54,012][02089] Num frames 16200...
[2024-12-21 13:33:54,138][02089] Num frames 16300...
[2024-12-21 13:33:54,280][02089] Num frames 16400...
[2024-12-21 13:33:54,410][02089] Num frames 16500...
[2024-12-21 13:33:54,548][02089] Num frames 16600...
[2024-12-21 13:33:54,676][02089] Num frames 16700...
[2024-12-21 13:33:54,804][02089] Num frames 16800...
[2024-12-21 13:33:54,929][02089] Num frames 16900...
[2024-12-21 13:33:55,062][02089] Num frames 17000...
[2024-12-21 13:33:55,201][02089] Num frames 17100...
[2024-12-21 13:33:55,337][02089] Num frames 17200...
[2024-12-21 13:33:55,463][02089] Num frames 17300...
[2024-12-21 13:33:55,599][02089] Num frames 17400...
[2024-12-21 13:33:55,725][02089] Num frames 17500...
[2024-12-21 13:33:55,806][02089] Avg episode rewards: #0: 59.240, true rewards: #0: 19.463
[2024-12-21 13:33:55,807][02089] Avg episode reward: 59.240, avg true_objective: 19.463
[2024-12-21 13:33:55,917][02089] Num frames 17600...
[2024-12-21 13:33:56,054][02089] Num frames 17700...
[2024-12-21 13:33:56,181][02089] Num frames 17800...
[2024-12-21 13:33:56,316][02089] Num frames 17900...
[2024-12-21 13:33:56,442][02089] Num frames 18000...
[2024-12-21 13:33:56,579][02089] Num frames 18100...
[2024-12-21 13:33:56,705][02089] Num frames 18200...
[2024-12-21 13:33:56,832][02089] Num frames 18300...
[2024-12-21 13:33:56,956][02089] Num frames 18400...
[2024-12-21 13:33:57,085][02089] Num frames 18500...
[2024-12-21 13:33:57,211][02089] Num frames 18600...
[2024-12-21 13:33:57,349][02089] Num frames 18700...
[2024-12-21 13:33:57,479][02089] Num frames 18800...
[2024-12-21 13:33:57,617][02089] Num frames 18900...
[2024-12-21 13:33:57,747][02089] Num frames 19000...
[2024-12-21 13:33:57,882][02089] Num frames 19100...
[2024-12-21 13:33:58,009][02089] Num frames 19200...
[2024-12-21 13:33:58,137][02089] Num frames 19300...
[2024-12-21 13:33:58,250][02089] Avg episode rewards: #0: 58.640, true rewards: #0: 19.341
[2024-12-21 13:33:58,253][02089] Avg episode reward: 58.640, avg true_objective: 19.341
[2024-12-21 13:35:53,937][02089] Replay video saved to train_dir/doom_health_gathering_supreme_2222/replay.mp4!