[2023-02-23 00:31:49,359][05422] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 00:31:49,362][05422] Rollout worker 0 uses device cpu
[2023-02-23 00:31:49,364][05422] Rollout worker 1 uses device cpu
[2023-02-23 00:31:49,366][05422] Rollout worker 2 uses device cpu
[2023-02-23 00:31:49,367][05422] Rollout worker 3 uses device cpu
[2023-02-23 00:31:49,369][05422] Rollout worker 4 uses device cpu
[2023-02-23 00:31:49,370][05422] Rollout worker 5 uses device cpu
[2023-02-23 00:31:49,371][05422] Rollout worker 6 uses device cpu
[2023-02-23 00:31:49,373][05422] Rollout worker 7 uses device cpu
[2023-02-23 00:31:49,553][05422] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 00:31:49,555][05422] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 00:31:49,586][05422] Starting all processes...
[2023-02-23 00:31:49,588][05422] Starting process learner_proc0
[2023-02-23 00:31:49,641][05422] Starting all processes...
[2023-02-23 00:31:49,652][05422] Starting process inference_proc0-0
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc0
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc1
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc2
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc3
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc4
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc5
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc6
[2023-02-23 00:31:49,654][05422] Starting process rollout_proc7
[2023-02-23 00:32:01,167][11215] Worker 0 uses CPU cores [0]
[2023-02-23 00:32:01,315][11223] Worker 7 uses CPU cores [1]
[2023-02-23 00:32:01,400][11219] Worker 3 uses CPU cores [1]
[2023-02-23 00:32:01,406][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 00:32:01,406][11201] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 00:32:01,410][11222] Worker 6 uses CPU cores [0]
[2023-02-23 00:32:01,465][11221] Worker 5 uses CPU cores [1]
[2023-02-23 00:32:01,478][11217] Worker 1 uses CPU cores [1]
[2023-02-23 00:32:01,488][11220] Worker 4 uses CPU cores [0]
[2023-02-23 00:32:01,535][11216] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 00:32:01,535][11216] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 00:32:01,561][11218] Worker 2 uses CPU cores [0]
[2023-02-23 00:32:02,049][11201] Num visible devices: 1
[2023-02-23 00:32:02,049][11216] Num visible devices: 1
[2023-02-23 00:32:02,059][11201] Starting seed is not provided
[2023-02-23 00:32:02,059][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 00:32:02,059][11201] Initializing actor-critic model on device cuda:0
[2023-02-23 00:32:02,060][11201] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 00:32:02,061][11201] RunningMeanStd input shape: (1,)
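The two lines above show the learner creating running mean/std normalizers, one for image observations of shape (3, 72, 128) and one for scalar returns. A minimal sketch of the running mean/variance update such a normalizer typically performs (Chan-style parallel update; the class below is illustrative, not Sample Factory's `RunningMeanStdInPlace` API, which does the same bookkeeping with in-place torch ops):

```python
import numpy as np

class RunningMeanStd:
    """Tracks running mean and variance of a stream of batches."""

    def __init__(self, shape, epsilon=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = epsilon  # avoids division by zero on the first update

    def update(self, batch):
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]
        # parallel-variance combination of (mean, var, count) pairs
        delta = batch_mean - self.mean
        total = self.count + batch_count
        new_mean = self.mean + delta * batch_count / total
        m2 = (self.var * self.count + batch_var * batch_count
              + delta**2 * self.count * batch_count / total)
        self.mean, self.var, self.count = new_mean, m2 / total, total

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)
```

Feeding batches through `update` and then `normalize` yields approximately zero-mean, unit-variance data regardless of the input scale, which keeps observation and return magnitudes stable for the actor-critic.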
[2023-02-23 00:32:02,073][11201] ConvEncoder: input_channels=3
[2023-02-23 00:32:02,335][11201] Conv encoder output size: 512
[2023-02-23 00:32:02,336][11201] Policy head output size: 512
[2023-02-23 00:32:02,381][11201] Created Actor Critic model with architecture:
[2023-02-23 00:32:02,381][11201] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
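The "Conv encoder output size: 512" line earlier in the log can be sanity-checked by walking a conv stack over the (3, 72, 128) observation. The filter spec below (kernel/stride 8/4, 4/2, 3/2 with 32/64/128 channels, then a 512-unit linear layer) is an assumption based on Sample Factory's default VizDoom encoder; the actual values live in config.json:

```python
def conv2d_out(size, kernel, stride, padding=0):
    """Output spatial extent of a Conv2d along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_stack_flat_size(h, w, filters):
    """Flattened feature count after a stack of (channels, kernel, stride) convs."""
    channels = None
    for channels, kernel, stride in filters:
        h = conv2d_out(h, kernel, stride)
        w = conv2d_out(w, kernel, stride)
    return channels * h * w

# Assumed default VizDoom conv_head, followed by Linear(flat -> 512)
FILTERS = [(32, 8, 4), (64, 4, 2), (128, 3, 2)]
flat = conv_stack_flat_size(72, 128, FILTERS)
mlp_out = 512  # the Linear layer in mlp_layers maps flat -> 512
print(flat, "->", mlp_out)
```

The flattened conv output feeds the single Linear+ELU pair shown as `mlp_layers` in the printout, which is what produces the 512-dimensional encoder output the log reports.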
[2023-02-23 00:32:08,692][11201] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 00:32:08,694][11201] No checkpoints found
[2023-02-23 00:32:08,694][11201] Did not load from checkpoint, starting from scratch!
[2023-02-23 00:32:08,695][11201] Initialized policy 0 weights for model version 0
[2023-02-23 00:32:08,698][11201] LearnerWorker_p0 finished initialization!
[2023-02-23 00:32:08,699][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 00:32:08,916][11216] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 00:32:08,917][11216] RunningMeanStd input shape: (1,)
[2023-02-23 00:32:08,929][11216] ConvEncoder: input_channels=3
[2023-02-23 00:32:09,023][11216] Conv encoder output size: 512
[2023-02-23 00:32:09,024][11216] Policy head output size: 512
[2023-02-23 00:32:09,546][05422] Heartbeat connected on Batcher_0
[2023-02-23 00:32:09,550][05422] Heartbeat connected on LearnerWorker_p0
[2023-02-23 00:32:09,565][05422] Heartbeat connected on RolloutWorker_w0
[2023-02-23 00:32:09,569][05422] Heartbeat connected on RolloutWorker_w1
[2023-02-23 00:32:09,573][05422] Heartbeat connected on RolloutWorker_w2
[2023-02-23 00:32:09,578][05422] Heartbeat connected on RolloutWorker_w3
[2023-02-23 00:32:09,579][05422] Heartbeat connected on RolloutWorker_w5
[2023-02-23 00:32:09,582][05422] Heartbeat connected on RolloutWorker_w4
[2023-02-23 00:32:09,585][05422] Heartbeat connected on RolloutWorker_w6
[2023-02-23 00:32:09,589][05422] Heartbeat connected on RolloutWorker_w7
[2023-02-23 00:32:10,488][05422] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
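The periodic "Fps is (10 sec: …, 60 sec: …, 300 sec: …)" lines report throughput averaged over three trailing time windows, printing nan (as above) until a window has enough history. A hedged sketch of that kind of windowed FPS computation (illustrative only, not Sample Factory's actual reporting code):

```python
import math
from collections import deque

class FpsTracker:
    """Frames-per-second averaged over several trailing time windows."""

    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.history = deque()  # (timestamp, total_frames) samples

    def record(self, timestamp, total_frames):
        self.history.append((timestamp, total_frames))
        # keep just enough history to cover the largest window
        while self.history[0][0] < timestamp - max(self.windows):
            self.history.popleft()

    def fps(self):
        now, frames_now = self.history[-1]
        result = {}
        for w in self.windows:
            # most recent sample at least w seconds old, if any
            past = [(t, f) for t, f in self.history if t <= now - w]
            if not past:
                result[w] = math.nan  # window not yet covered, like the log
            else:
                t0, f0 = past[-1]
                result[w] = (frames_now - f0) / (now - t0)
        return result
```

With samples every 5 seconds and 8192 frames per interval this reproduces figures like the 1638.4 FPS seen in the log once the 10-second window fills, while the 60- and 300-second windows still read nan.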
[2023-02-23 00:32:11,793][05422] Inference worker 0-0 is ready!
[2023-02-23 00:32:11,798][05422] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 00:32:11,800][05422] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 00:32:11,907][11220] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:11,939][11222] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:11,963][11215] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:11,970][11218] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:12,038][11223] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:12,067][11221] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:12,082][11217] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:12,213][11219] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:32:13,698][11223] Decorrelating experience for 0 frames...
[2023-02-23 00:32:13,699][11221] Decorrelating experience for 0 frames...
[2023-02-23 00:32:13,707][11222] Decorrelating experience for 0 frames...
[2023-02-23 00:32:13,707][11220] Decorrelating experience for 0 frames...
[2023-02-23 00:32:13,722][11215] Decorrelating experience for 0 frames...
[2023-02-23 00:32:13,745][11218] Decorrelating experience for 0 frames...
[2023-02-23 00:32:14,804][11223] Decorrelating experience for 32 frames...
[2023-02-23 00:32:14,810][11221] Decorrelating experience for 32 frames...
[2023-02-23 00:32:14,938][11219] Decorrelating experience for 0 frames...
[2023-02-23 00:32:15,271][11220] Decorrelating experience for 32 frames...
[2023-02-23 00:32:15,276][11222] Decorrelating experience for 32 frames...
[2023-02-23 00:32:15,281][11215] Decorrelating experience for 32 frames...
[2023-02-23 00:32:15,488][05422] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 00:32:16,252][11219] Decorrelating experience for 32 frames...
[2023-02-23 00:32:16,264][11217] Decorrelating experience for 0 frames...
[2023-02-23 00:32:16,325][11218] Decorrelating experience for 32 frames...
[2023-02-23 00:32:16,395][11223] Decorrelating experience for 64 frames...
[2023-02-23 00:32:16,562][11215] Decorrelating experience for 64 frames...
[2023-02-23 00:32:17,203][11217] Decorrelating experience for 32 frames...
[2023-02-23 00:32:17,403][11223] Decorrelating experience for 96 frames...
[2023-02-23 00:32:17,774][11222] Decorrelating experience for 64 frames...
[2023-02-23 00:32:17,880][11220] Decorrelating experience for 64 frames...
[2023-02-23 00:32:18,067][11218] Decorrelating experience for 64 frames...
[2023-02-23 00:32:18,157][11221] Decorrelating experience for 64 frames...
[2023-02-23 00:32:18,271][11215] Decorrelating experience for 96 frames...
[2023-02-23 00:32:18,770][11217] Decorrelating experience for 64 frames...
[2023-02-23 00:32:18,850][11219] Decorrelating experience for 64 frames...
[2023-02-23 00:32:19,265][11217] Decorrelating experience for 96 frames...
[2023-02-23 00:32:19,475][11220] Decorrelating experience for 96 frames...
[2023-02-23 00:32:19,573][11222] Decorrelating experience for 96 frames...
[2023-02-23 00:32:19,837][11219] Decorrelating experience for 96 frames...
[2023-02-23 00:32:20,153][11221] Decorrelating experience for 96 frames...
[2023-02-23 00:32:20,359][11218] Decorrelating experience for 96 frames...
[2023-02-23 00:32:20,488][05422] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 00:32:23,539][11201] Signal inference workers to stop experience collection...
[2023-02-23 00:32:23,562][11216] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 00:32:25,488][05422] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 148.5. Samples: 2228. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 00:32:25,490][05422] Avg episode reward: [(0, '2.063')]
[2023-02-23 00:32:26,047][11201] Signal inference workers to resume experience collection...
[2023-02-23 00:32:26,049][11216] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 00:32:30,488][05422] Fps is (10 sec: 1638.4, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 214.1. Samples: 4282. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-02-23 00:32:30,493][05422] Avg episode reward: [(0, '3.157')]
[2023-02-23 00:32:35,488][05422] Fps is (10 sec: 3276.8, 60 sec: 1310.7, 300 sec: 1310.7). Total num frames: 32768. Throughput: 0: 259.7. Samples: 6492. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 00:32:35,490][05422] Avg episode reward: [(0, '3.782')]
[2023-02-23 00:32:37,105][11216] Updated weights for policy 0, policy_version 10 (0.0012)
[2023-02-23 00:32:40,488][05422] Fps is (10 sec: 3686.4, 60 sec: 1774.9, 300 sec: 1774.9). Total num frames: 53248. Throughput: 0: 424.9. Samples: 12748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:32:40,490][05422] Avg episode reward: [(0, '4.381')]
[2023-02-23 00:32:45,491][05422] Fps is (10 sec: 4504.3, 60 sec: 2223.4, 300 sec: 2223.4). Total num frames: 77824. Throughput: 0: 573.8. Samples: 20086. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:32:45,498][05422] Avg episode reward: [(0, '4.412')]
[2023-02-23 00:32:45,755][11216] Updated weights for policy 0, policy_version 20 (0.0015)
[2023-02-23 00:32:50,490][05422] Fps is (10 sec: 4095.2, 60 sec: 2355.1, 300 sec: 2355.1). Total num frames: 94208. Throughput: 0: 563.2. Samples: 22530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:32:50,496][05422] Avg episode reward: [(0, '4.195')]
[2023-02-23 00:32:55,488][05422] Fps is (10 sec: 3277.7, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 110592. Throughput: 0: 601.8. Samples: 27082. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 00:32:55,491][05422] Avg episode reward: [(0, '4.247')]
[2023-02-23 00:32:55,497][11201] Saving new best policy, reward=4.247!
[2023-02-23 00:32:57,855][11216] Updated weights for policy 0, policy_version 30 (0.0014)
[2023-02-23 00:33:00,488][05422] Fps is (10 sec: 4096.7, 60 sec: 2703.4, 300 sec: 2703.4). Total num frames: 135168. Throughput: 0: 748.2. Samples: 33670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:33:00,496][05422] Avg episode reward: [(0, '4.374')]
[2023-02-23 00:33:00,499][11201] Saving new best policy, reward=4.374!
[2023-02-23 00:33:05,499][05422] Fps is (10 sec: 4500.7, 60 sec: 2829.4, 300 sec: 2829.4). Total num frames: 155648. Throughput: 0: 827.9. Samples: 37266. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 00:33:05,507][05422] Avg episode reward: [(0, '4.275')]
[2023-02-23 00:33:07,245][11216] Updated weights for policy 0, policy_version 40 (0.0015)
[2023-02-23 00:33:10,488][05422] Fps is (10 sec: 3686.4, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 172032. Throughput: 0: 898.7. Samples: 42668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:33:10,491][05422] Avg episode reward: [(0, '4.472')]
[2023-02-23 00:33:10,493][11201] Saving new best policy, reward=4.472!
[2023-02-23 00:33:15,488][05422] Fps is (10 sec: 3280.4, 60 sec: 3140.3, 300 sec: 2898.7). Total num frames: 188416. Throughput: 0: 959.1. Samples: 47442. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:33:15,490][05422] Avg episode reward: [(0, '4.576')]
[2023-02-23 00:33:15,497][11201] Saving new best policy, reward=4.576!
[2023-02-23 00:33:18,395][11216] Updated weights for policy 0, policy_version 50 (0.0025)
[2023-02-23 00:33:20,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3042.7). Total num frames: 212992. Throughput: 0: 986.8. Samples: 50898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:33:20,490][05422] Avg episode reward: [(0, '4.417')]
[2023-02-23 00:33:25,488][05422] Fps is (10 sec: 4505.4, 60 sec: 3891.2, 300 sec: 3112.9). Total num frames: 233472. Throughput: 0: 1010.6. Samples: 58226. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:33:25,490][05422] Avg episode reward: [(0, '4.368')]
[2023-02-23 00:33:28,194][11216] Updated weights for policy 0, policy_version 60 (0.0040)
[2023-02-23 00:33:30,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3123.2). Total num frames: 249856. Throughput: 0: 953.9. Samples: 63010. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:33:30,493][05422] Avg episode reward: [(0, '4.546')]
[2023-02-23 00:33:35,488][05422] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3180.4). Total num frames: 270336. Throughput: 0: 951.6. Samples: 65350. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 00:33:35,489][05422] Avg episode reward: [(0, '4.477')]
[2023-02-23 00:33:39,020][11216] Updated weights for policy 0, policy_version 70 (0.0012)
[2023-02-23 00:33:40,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3231.3). Total num frames: 290816. Throughput: 0: 998.8. Samples: 72028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:33:40,496][05422] Avg episode reward: [(0, '4.382')]
[2023-02-23 00:33:45,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.7, 300 sec: 3319.9). Total num frames: 315392. Throughput: 0: 1007.9. Samples: 79026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:33:45,494][05422] Avg episode reward: [(0, '4.430')]
[2023-02-23 00:33:45,510][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000077_315392.pth...
[2023-02-23 00:33:49,562][11216] Updated weights for policy 0, policy_version 80 (0.0022)
[2023-02-23 00:33:50,489][05422] Fps is (10 sec: 3686.0, 60 sec: 3891.3, 300 sec: 3276.8). Total num frames: 327680. Throughput: 0: 976.9. Samples: 81216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:33:50,494][05422] Avg episode reward: [(0, '4.358')]
[2023-02-23 00:33:55,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3315.8). Total num frames: 348160. Throughput: 0: 959.4. Samples: 85840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:33:55,494][05422] Avg episode reward: [(0, '4.352')]
[2023-02-23 00:33:59,485][11216] Updated weights for policy 0, policy_version 90 (0.0011)
[2023-02-23 00:34:00,488][05422] Fps is (10 sec: 4506.1, 60 sec: 3959.5, 300 sec: 3388.5). Total num frames: 372736. Throughput: 0: 1012.9. Samples: 93024. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:34:00,490][05422] Avg episode reward: [(0, '4.369')]
[2023-02-23 00:34:05,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3960.2, 300 sec: 3419.3). Total num frames: 393216. Throughput: 0: 1018.1. Samples: 96714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:34:05,499][05422] Avg episode reward: [(0, '4.411')]
[2023-02-23 00:34:10,448][11216] Updated weights for policy 0, policy_version 100 (0.0032)
[2023-02-23 00:34:10,488][05422] Fps is (10 sec: 3686.1, 60 sec: 3959.4, 300 sec: 3413.3). Total num frames: 409600. Throughput: 0: 964.9. Samples: 101648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:34:10,493][05422] Avg episode reward: [(0, '4.520')]
[2023-02-23 00:34:15,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3407.9). Total num frames: 425984. Throughput: 0: 976.7. Samples: 106962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:34:15,494][05422] Avg episode reward: [(0, '4.604')]
[2023-02-23 00:34:15,499][11201] Saving new best policy, reward=4.604!
[2023-02-23 00:34:19,961][11216] Updated weights for policy 0, policy_version 110 (0.0020)
[2023-02-23 00:34:20,488][05422] Fps is (10 sec: 4096.3, 60 sec: 3959.5, 300 sec: 3465.8). Total num frames: 450560. Throughput: 0: 1004.7. Samples: 110560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:34:20,493][05422] Avg episode reward: [(0, '4.377')]
[2023-02-23 00:34:25,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3489.2). Total num frames: 471040. Throughput: 0: 1010.4. Samples: 117494. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:34:25,494][05422] Avg episode reward: [(0, '4.520')]
[2023-02-23 00:34:30,488][05422] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3481.6). Total num frames: 487424. Throughput: 0: 955.6. Samples: 122026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:34:30,490][05422] Avg episode reward: [(0, '4.542')]
[2023-02-23 00:34:31,594][11216] Updated weights for policy 0, policy_version 120 (0.0033)
[2023-02-23 00:34:35,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3474.5). Total num frames: 503808. Throughput: 0: 958.4. Samples: 124342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:34:35,494][05422] Avg episode reward: [(0, '4.557')]
[2023-02-23 00:34:40,488][05422] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3522.6). Total num frames: 528384. Throughput: 0: 1011.8. Samples: 131370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:34:40,495][05422] Avg episode reward: [(0, '4.442')]
[2023-02-23 00:34:40,766][11216] Updated weights for policy 0, policy_version 130 (0.0025)
[2023-02-23 00:34:45,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3541.1). Total num frames: 548864. Throughput: 0: 995.8. Samples: 137836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:34:45,493][05422] Avg episode reward: [(0, '4.619')]
[2023-02-23 00:34:45,506][11201] Saving new best policy, reward=4.619!
[2023-02-23 00:34:50,489][05422] Fps is (10 sec: 3686.0, 60 sec: 3959.5, 300 sec: 3532.8). Total num frames: 565248. Throughput: 0: 963.6. Samples: 140078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:34:50,491][05422] Avg episode reward: [(0, '4.621')]
[2023-02-23 00:34:50,497][11201] Saving new best policy, reward=4.621!
[2023-02-23 00:34:52,760][11216] Updated weights for policy 0, policy_version 140 (0.0034)
[2023-02-23 00:34:55,488][05422] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3549.9). Total num frames: 585728. Throughput: 0: 966.3. Samples: 145132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:34:55,496][05422] Avg episode reward: [(0, '4.546')]
[2023-02-23 00:35:00,488][05422] Fps is (10 sec: 4096.3, 60 sec: 3891.2, 300 sec: 3565.9). Total num frames: 606208. Throughput: 0: 1007.3. Samples: 152290. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:35:00,495][05422] Avg episode reward: [(0, '4.507')]
[2023-02-23 00:35:01,334][11216] Updated weights for policy 0, policy_version 150 (0.0015)
[2023-02-23 00:35:05,489][05422] Fps is (10 sec: 4095.6, 60 sec: 3891.1, 300 sec: 3581.0). Total num frames: 626688. Throughput: 0: 1005.4. Samples: 155804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:35:05,501][05422] Avg episode reward: [(0, '4.651')]
[2023-02-23 00:35:05,518][11201] Saving new best policy, reward=4.651!
[2023-02-23 00:35:10,488][05422] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3572.6). Total num frames: 643072. Throughput: 0: 950.2. Samples: 160252. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:35:10,495][05422] Avg episode reward: [(0, '4.751')]
[2023-02-23 00:35:10,496][11201] Saving new best policy, reward=4.751!
[2023-02-23 00:35:13,208][11216] Updated weights for policy 0, policy_version 160 (0.0021)
[2023-02-23 00:35:15,488][05422] Fps is (10 sec: 3686.8, 60 sec: 3959.5, 300 sec: 3586.8). Total num frames: 663552. Throughput: 0: 983.3. Samples: 166274. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:35:15,495][05422] Avg episode reward: [(0, '4.696')]
[2023-02-23 00:35:20,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3621.7). Total num frames: 688128. Throughput: 0: 1012.9. Samples: 169924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:35:20,490][05422] Avg episode reward: [(0, '4.607')]
[2023-02-23 00:35:21,694][11216] Updated weights for policy 0, policy_version 170 (0.0019)
[2023-02-23 00:35:25,488][05422] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3612.9). Total num frames: 704512. Throughput: 0: 999.3. Samples: 176338. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:35:25,495][05422] Avg episode reward: [(0, '4.675')]
[2023-02-23 00:35:30,488][05422] Fps is (10 sec: 3276.5, 60 sec: 3891.2, 300 sec: 3604.5). Total num frames: 720896. Throughput: 0: 956.0. Samples: 180856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:35:30,493][05422] Avg episode reward: [(0, '4.637')]
[2023-02-23 00:35:33,682][11216] Updated weights for policy 0, policy_version 180 (0.0019)
[2023-02-23 00:35:35,488][05422] Fps is (10 sec: 4096.2, 60 sec: 4027.7, 300 sec: 3636.4). Total num frames: 745472. Throughput: 0: 973.8. Samples: 183900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:35:35,495][05422] Avg episode reward: [(0, '4.954')]
[2023-02-23 00:35:35,506][11201] Saving new best policy, reward=4.954!
[2023-02-23 00:35:40,488][05422] Fps is (10 sec: 4506.0, 60 sec: 3959.5, 300 sec: 3647.4). Total num frames: 765952. Throughput: 0: 1023.1. Samples: 191172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:35:40,494][05422] Avg episode reward: [(0, '5.142')]
[2023-02-23 00:35:40,501][11201] Saving new best policy, reward=5.142!
[2023-02-23 00:35:42,661][11216] Updated weights for policy 0, policy_version 190 (0.0015)
[2023-02-23 00:35:45,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3657.8). Total num frames: 786432. Throughput: 0: 991.7. Samples: 196916. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:35:45,493][05422] Avg episode reward: [(0, '5.108')]
[2023-02-23 00:35:45,510][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000192_786432.pth...
[2023-02-23 00:35:50,489][05422] Fps is (10 sec: 3276.5, 60 sec: 3891.2, 300 sec: 3630.5). Total num frames: 798720. Throughput: 0: 963.5. Samples: 199160. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:35:50,493][05422] Avg episode reward: [(0, '5.375')]
[2023-02-23 00:35:50,500][11201] Saving new best policy, reward=5.375!
[2023-02-23 00:35:54,078][11216] Updated weights for policy 0, policy_version 200 (0.0020)
[2023-02-23 00:35:55,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3659.1). Total num frames: 823296. Throughput: 0: 995.6. Samples: 205056. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 00:35:55,494][05422] Avg episode reward: [(0, '5.607')]
[2023-02-23 00:35:55,507][11201] Saving new best policy, reward=5.607!
[2023-02-23 00:36:00,492][05422] Fps is (10 sec: 4913.3, 60 sec: 4027.4, 300 sec: 3686.3). Total num frames: 847872. Throughput: 0: 1021.7. Samples: 212254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:36:00,503][05422] Avg episode reward: [(0, '5.957')]
[2023-02-23 00:36:00,508][11201] Saving new best policy, reward=5.957!
[2023-02-23 00:36:03,691][11216] Updated weights for policy 0, policy_version 210 (0.0013)
[2023-02-23 00:36:05,488][05422] Fps is (10 sec: 4095.8, 60 sec: 3959.5, 300 sec: 3677.7). Total num frames: 864256. Throughput: 0: 1000.2. Samples: 214932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:36:05,491][05422] Avg episode reward: [(0, '6.104')]
[2023-02-23 00:36:05,504][11201] Saving new best policy, reward=6.104!
[2023-02-23 00:36:10,488][05422] Fps is (10 sec: 3278.3, 60 sec: 3959.5, 300 sec: 3669.3). Total num frames: 880640. Throughput: 0: 959.5. Samples: 219514. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:36:10,493][05422] Avg episode reward: [(0, '6.021')]
[2023-02-23 00:36:14,516][11216] Updated weights for policy 0, policy_version 220 (0.0018)
[2023-02-23 00:36:15,488][05422] Fps is (10 sec: 4096.2, 60 sec: 4027.7, 300 sec: 3694.8). Total num frames: 905216. Throughput: 0: 1009.1. Samples: 226264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:36:15,490][05422] Avg episode reward: [(0, '6.272')]
[2023-02-23 00:36:15,502][11201] Saving new best policy, reward=6.272!
[2023-02-23 00:36:20,488][05422] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3702.8). Total num frames: 925696. Throughput: 0: 1020.8. Samples: 229836. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:36:20,494][05422] Avg episode reward: [(0, '6.347')]
[2023-02-23 00:36:20,526][11201] Saving new best policy, reward=6.347!
[2023-02-23 00:36:24,524][11216] Updated weights for policy 0, policy_version 230 (0.0012)
[2023-02-23 00:36:25,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3694.4). Total num frames: 942080. Throughput: 0: 985.8. Samples: 235532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:36:25,493][05422] Avg episode reward: [(0, '6.051')]
[2023-02-23 00:36:30,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3686.4). Total num frames: 958464. Throughput: 0: 960.1. Samples: 240122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:36:30,494][05422] Avg episode reward: [(0, '5.714')]
[2023-02-23 00:36:34,909][11216] Updated weights for policy 0, policy_version 240 (0.0012)
[2023-02-23 00:36:35,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3709.6). Total num frames: 983040. Throughput: 0: 987.2. Samples: 243582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:36:35,495][05422] Avg episode reward: [(0, '5.960')]
[2023-02-23 00:36:40,488][05422] Fps is (10 sec: 4915.1, 60 sec: 4027.7, 300 sec: 3731.9). Total num frames: 1007616. Throughput: 0: 1023.1. Samples: 251094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:36:40,490][05422] Avg episode reward: [(0, '6.455')]
[2023-02-23 00:36:40,497][11201] Saving new best policy, reward=6.455!
[2023-02-23 00:36:45,067][11216] Updated weights for policy 0, policy_version 250 (0.0014)
[2023-02-23 00:36:45,490][05422] Fps is (10 sec: 4095.2, 60 sec: 3959.3, 300 sec: 3723.6). Total num frames: 1024000. Throughput: 0: 977.9. Samples: 256256. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:36:45,493][05422] Avg episode reward: [(0, '6.378')]
[2023-02-23 00:36:50,488][05422] Fps is (10 sec: 3276.8, 60 sec: 4027.8, 300 sec: 3715.7). Total num frames: 1040384. Throughput: 0: 967.9. Samples: 258488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:36:50,491][05422] Avg episode reward: [(0, '6.737')]
[2023-02-23 00:36:50,497][11201] Saving new best policy, reward=6.737!
[2023-02-23 00:36:55,171][11216] Updated weights for policy 0, policy_version 260 (0.0017)
[2023-02-23 00:36:55,488][05422] Fps is (10 sec: 4096.8, 60 sec: 4027.7, 300 sec: 3736.7). Total num frames: 1064960. Throughput: 0: 1011.8. Samples: 265044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:36:55,491][05422] Avg episode reward: [(0, '6.783')]
[2023-02-23 00:36:55,505][11201] Saving new best policy, reward=6.783!
[2023-02-23 00:37:00,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.8, 300 sec: 3742.9). Total num frames: 1085440. Throughput: 0: 1019.7. Samples: 272152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:37:00,493][05422] Avg episode reward: [(0, '6.669')]
[2023-02-23 00:37:05,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 1101824. Throughput: 0: 991.0. Samples: 274430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:37:05,493][05422] Avg episode reward: [(0, '7.199')]
[2023-02-23 00:37:05,503][11201] Saving new best policy, reward=7.199!
[2023-02-23 00:37:06,276][11216] Updated weights for policy 0, policy_version 270 (0.0012)
[2023-02-23 00:37:10,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1118208. Throughput: 0: 967.2. Samples: 279056. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:37:10,491][05422] Avg episode reward: [(0, '7.036')]
[2023-02-23 00:37:15,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1142784. Throughput: 0: 1022.8. Samples: 286146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:37:15,490][05422] Avg episode reward: [(0, '6.957')]
[2023-02-23 00:37:15,823][11216] Updated weights for policy 0, policy_version 280 (0.0029)
[2023-02-23 00:37:20,489][05422] Fps is (10 sec: 4914.4, 60 sec: 4027.6, 300 sec: 3957.1). Total num frames: 1167360. Throughput: 0: 1026.9. Samples: 289794. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:37:20,495][05422] Avg episode reward: [(0, '6.942')]
[2023-02-23 00:37:25,491][05422] Fps is (10 sec: 3685.1, 60 sec: 3959.2, 300 sec: 3943.2). Total num frames: 1179648. Throughput: 0: 977.7. Samples: 295092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:37:25,497][05422] Avg episode reward: [(0, '7.381')]
[2023-02-23 00:37:25,511][11201] Saving new best policy, reward=7.381!
[2023-02-23 00:37:27,081][11216] Updated weights for policy 0, policy_version 290 (0.0020)
[2023-02-23 00:37:30,488][05422] Fps is (10 sec: 3277.4, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1200128. Throughput: 0: 969.9. Samples: 299898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:37:30,494][05422] Avg episode reward: [(0, '7.528')]
[2023-02-23 00:37:30,499][11201] Saving new best policy, reward=7.528!
[2023-02-23 00:37:35,488][05422] Fps is (10 sec: 4507.2, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1224704. Throughput: 0: 1001.2. Samples: 303542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:37:35,490][05422] Avg episode reward: [(0, '8.936')]
[2023-02-23 00:37:35,502][11201] Saving new best policy, reward=8.936!
[2023-02-23 00:37:36,302][11216] Updated weights for policy 0, policy_version 300 (0.0017)
[2023-02-23 00:37:40,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1245184. Throughput: 0: 1018.3. Samples: 310866. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:37:40,493][05422] Avg episode reward: [(0, '9.182')]
[2023-02-23 00:37:40,498][11201] Saving new best policy, reward=9.182!
[2023-02-23 00:37:45,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3957.2). Total num frames: 1261568. Throughput: 0: 965.8. Samples: 315614. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:37:45,497][05422] Avg episode reward: [(0, '9.474')]
[2023-02-23 00:37:45,511][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000308_1261568.pth...
[2023-02-23 00:37:45,644][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000077_315392.pth
[2023-02-23 00:37:45,662][11201] Saving new best policy, reward=9.474!
[2023-02-23 00:37:47,925][11216] Updated weights for policy 0, policy_version 310 (0.0024)
[2023-02-23 00:37:50,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1277952. Throughput: 0: 964.8. Samples: 317844. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:37:50,498][05422] Avg episode reward: [(0, '8.774')]
[2023-02-23 00:37:55,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1302528. Throughput: 0: 1016.0. Samples: 324776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:37:55,495][05422] Avg episode reward: [(0, '8.957')]
[2023-02-23 00:37:56,645][11216] Updated weights for policy 0, policy_version 320 (0.0020)
[2023-02-23 00:38:00,491][05422] Fps is (10 sec: 4504.2, 60 sec: 3959.3, 300 sec: 3957.3). Total num frames: 1323008. Throughput: 0: 1007.8. Samples: 331502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:38:00,493][05422] Avg episode reward: [(0, '8.858')]
[2023-02-23 00:38:05,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1339392. Throughput: 0: 979.5. Samples: 333868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:38:05,493][05422] Avg episode reward: [(0, '8.496')]
[2023-02-23 00:38:08,601][11216] Updated weights for policy 0, policy_version 330 (0.0025)
[2023-02-23 00:38:10,488][05422] Fps is (10 sec: 3687.5, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1359872. Throughput: 0: 969.6. Samples: 338722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:38:10,490][05422] Avg episode reward: [(0, '8.330')]
[2023-02-23 00:38:15,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1384448. Throughput: 0: 1026.4. Samples: 346088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:38:15,490][05422] Avg episode reward: [(0, '8.186')]
[2023-02-23 00:38:16,936][11216] Updated weights for policy 0, policy_version 340 (0.0012)
[2023-02-23 00:38:20,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3971.0). Total num frames: 1404928. Throughput: 0: 1026.8. Samples: 349748. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 00:38:20,490][05422] Avg episode reward: [(0, '8.229')]
[2023-02-23 00:38:25,488][05422] Fps is (10 sec: 3686.4, 60 sec: 4028.0, 300 sec: 3971.0). Total num frames: 1421312. Throughput: 0: 971.2. Samples: 354568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:38:25,494][05422] Avg episode reward: [(0, '8.576')]
[2023-02-23 00:38:29,049][11216] Updated weights for policy 0, policy_version 350 (0.0021)
[2023-02-23 00:38:30,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1437696. Throughput: 0: 985.1. Samples: 359942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:38:30,490][05422] Avg episode reward: [(0, '9.087')]
[2023-02-23 00:38:35,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1462272. Throughput: 0: 1015.7. Samples: 363552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:38:35,490][05422] Avg episode reward: [(0, '10.211')]
[2023-02-23 00:38:35,502][11201] Saving new best policy, reward=10.211!
[2023-02-23 00:38:37,517][11216] Updated weights for policy 0, policy_version 360 (0.0013)
[2023-02-23 00:38:40,491][05422] Fps is (10 sec: 4504.3, 60 sec: 3959.3, 300 sec: 3957.1). Total num frames: 1482752. Throughput: 0: 1015.9. Samples: 370496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:38:40,495][05422] Avg episode reward: [(0, '9.502')]
[2023-02-23 00:38:45,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 1499136. Throughput: 0: 968.0. Samples: 375060. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:38:45,493][05422] Avg episode reward: [(0, '9.801')]
[2023-02-23 00:38:49,731][11216] Updated weights for policy 0, policy_version 370 (0.0022)
[2023-02-23 00:38:50,488][05422] Fps is (10 sec: 3277.7, 60 sec: 3959.5, 300 sec: 3957.1). Total num frames: 1515520. Throughput: 0: 967.6. Samples: 377408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:38:50,493][05422] Avg episode reward: [(0, '9.484')]
[2023-02-23 00:38:55,488][05422] Fps is (10 sec: 4095.9, 60 sec: 3959.4, 300 sec: 3957.1). Total num frames: 1540096. Throughput: 0: 1019.0. Samples: 384578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:38:55,491][05422] Avg episode reward: [(0, '10.178')]
[2023-02-23 00:38:58,013][11216] Updated weights for policy 0, policy_version 380 (0.0020)
[2023-02-23 00:39:00,488][05422] Fps is (10 sec: 4505.7, 60 sec: 3959.7, 300 sec: 3957.2). Total num frames: 1560576. Throughput: 0: 1000.4. Samples: 391106. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:39:00,491][05422] Avg episode reward: [(0, '9.981')]
[2023-02-23 00:39:05,490][05422] Fps is (10 sec: 3685.8, 60 sec: 3959.3, 300 sec: 3957.1). Total num frames: 1576960. Throughput: 0: 969.4. Samples: 393372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:05,492][05422] Avg episode reward: [(0, '9.917')]
[2023-02-23 00:39:10,109][11216] Updated weights for policy 0, policy_version 390 (0.0028)
[2023-02-23 00:39:10,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1597440. Throughput: 0: 974.7. Samples: 398430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:39:10,490][05422] Avg episode reward: [(0, '10.314')]
[2023-02-23 00:39:10,492][11201] Saving new best policy, reward=10.314!
[2023-02-23 00:39:15,488][05422] Fps is (10 sec: 4506.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1622016. Throughput: 0: 1019.5. Samples: 405818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:15,490][05422] Avg episode reward: [(0, '10.825')]
[2023-02-23 00:39:15,499][11201] Saving new best policy, reward=10.825!
[2023-02-23 00:39:18,731][11216] Updated weights for policy 0, policy_version 400 (0.0012)
[2023-02-23 00:39:20,488][05422] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1642496. Throughput: 0: 1019.0. Samples: 409408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:20,490][05422] Avg episode reward: [(0, '10.868')]
[2023-02-23 00:39:20,493][11201] Saving new best policy, reward=10.868!
[2023-02-23 00:39:25,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 1654784. Throughput: 0: 967.8. Samples: 414044. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:25,493][05422] Avg episode reward: [(0, '11.587')]
[2023-02-23 00:39:25,513][11201] Saving new best policy, reward=11.587!
[2023-02-23 00:39:30,488][05422] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1675264. Throughput: 0: 990.6. Samples: 419638. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:39:30,490][05422] Avg episode reward: [(0, '10.919')]
[2023-02-23 00:39:30,504][11216] Updated weights for policy 0, policy_version 410 (0.0031)
[2023-02-23 00:39:35,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1699840. Throughput: 0: 1017.4. Samples: 423192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:35,490][05422] Avg episode reward: [(0, '11.477')]
[2023-02-23 00:39:39,961][11216] Updated weights for policy 0, policy_version 420 (0.0012)
[2023-02-23 00:39:40,489][05422] Fps is (10 sec: 4505.1, 60 sec: 3959.6, 300 sec: 3971.0). Total num frames: 1720320. Throughput: 0: 1005.7. Samples: 429836. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:40,494][05422] Avg episode reward: [(0, '11.722')]
[2023-02-23 00:39:40,501][11201] Saving new best policy, reward=11.722!
[2023-02-23 00:39:45,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 1736704. Throughput: 0: 961.0. Samples: 434352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:39:45,492][05422] Avg episode reward: [(0, '12.240')]
[2023-02-23 00:39:45,503][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000424_1736704.pth...
[2023-02-23 00:39:45,635][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000192_786432.pth
[2023-02-23 00:39:45,649][11201] Saving new best policy, reward=12.240!
[2023-02-23 00:39:50,488][05422] Fps is (10 sec: 3686.8, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1757184. Throughput: 0: 965.8. Samples: 436830. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:39:50,492][05422] Avg episode reward: [(0, '12.671')]
[2023-02-23 00:39:50,495][11201] Saving new best policy, reward=12.671!
[2023-02-23 00:39:51,275][11216] Updated weights for policy 0, policy_version 430 (0.0021)
[2023-02-23 00:39:55,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3984.9). Total num frames: 1781760. Throughput: 0: 1012.2. Samples: 443980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:39:55,490][05422] Avg episode reward: [(0, '13.277')]
[2023-02-23 00:39:55,501][11201] Saving new best policy, reward=13.277!
[2023-02-23 00:40:00,489][05422] Fps is (10 sec: 4095.4, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1798144. Throughput: 0: 985.1. Samples: 450148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:40:00,494][05422] Avg episode reward: [(0, '13.504')]
[2023-02-23 00:40:00,504][11201] Saving new best policy, reward=13.504!
[2023-02-23 00:40:01,156][11216] Updated weights for policy 0, policy_version 440 (0.0013)
[2023-02-23 00:40:05,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.6, 300 sec: 3971.0). Total num frames: 1814528. Throughput: 0: 954.6. Samples: 452366. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:40:05,490][05422] Avg episode reward: [(0, '13.061')]
[2023-02-23 00:40:10,488][05422] Fps is (10 sec: 3686.9, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1835008. Throughput: 0: 971.9. Samples: 457778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:40:10,494][05422] Avg episode reward: [(0, '13.560')]
[2023-02-23 00:40:10,500][11201] Saving new best policy, reward=13.560!
[2023-02-23 00:40:11,713][11216] Updated weights for policy 0, policy_version 450 (0.0025)
[2023-02-23 00:40:15,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1859584. Throughput: 0: 1011.8. Samples: 465170. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:40:15,494][05422] Avg episode reward: [(0, '14.121')]
[2023-02-23 00:40:15,504][11201] Saving new best policy, reward=14.121!
[2023-02-23 00:40:20,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3971.0). Total num frames: 1875968. Throughput: 0: 1004.6. Samples: 468398. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:40:20,490][05422] Avg episode reward: [(0, '14.239')]
[2023-02-23 00:40:20,508][11201] Saving new best policy, reward=14.239!
[2023-02-23 00:40:22,068][11216] Updated weights for policy 0, policy_version 460 (0.0027)
[2023-02-23 00:40:25,488][05422] Fps is (10 sec: 3276.6, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1892352. Throughput: 0: 958.7. Samples: 472978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:40:25,493][05422] Avg episode reward: [(0, '14.583')]
[2023-02-23 00:40:25,506][11201] Saving new best policy, reward=14.583!
[2023-02-23 00:40:30,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1916928. Throughput: 0: 992.9. Samples: 479034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:40:30,490][05422] Avg episode reward: [(0, '14.893')]
[2023-02-23 00:40:30,495][11201] Saving new best policy, reward=14.893!
[2023-02-23 00:40:32,149][11216] Updated weights for policy 0, policy_version 470 (0.0023)
[2023-02-23 00:40:35,488][05422] Fps is (10 sec: 4915.3, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1941504. Throughput: 0: 1018.8. Samples: 482674. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:40:35,491][05422] Avg episode reward: [(0, '15.143')]
[2023-02-23 00:40:35,501][11201] Saving new best policy, reward=15.143!
[2023-02-23 00:40:40,489][05422] Fps is (10 sec: 4095.3, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 1957888. Throughput: 0: 998.5. Samples: 488916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:40:40,496][05422] Avg episode reward: [(0, '16.094')]
[2023-02-23 00:40:40,500][11201] Saving new best policy, reward=16.094!
[2023-02-23 00:40:43,111][11216] Updated weights for policy 0, policy_version 480 (0.0014)
[2023-02-23 00:40:45,488][05422] Fps is (10 sec: 2867.3, 60 sec: 3891.2, 300 sec: 3971.1). Total num frames: 1970176. Throughput: 0: 963.7. Samples: 493514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:40:45,491][05422] Avg episode reward: [(0, '16.300')]
[2023-02-23 00:40:45,570][11201] Saving new best policy, reward=16.300!
[2023-02-23 00:40:50,488][05422] Fps is (10 sec: 3687.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1994752. Throughput: 0: 980.2. Samples: 496476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:40:50,490][05422] Avg episode reward: [(0, '17.453')]
[2023-02-23 00:40:50,492][11201] Saving new best policy, reward=17.453!
[2023-02-23 00:40:52,640][11216] Updated weights for policy 0, policy_version 490 (0.0017)
[2023-02-23 00:40:55,488][05422] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 2019328. Throughput: 0: 1021.2. Samples: 503732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:40:55,490][05422] Avg episode reward: [(0, '18.803')]
[2023-02-23 00:40:55,507][11201] Saving new best policy, reward=18.803!
[2023-02-23 00:41:00,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3971.0). Total num frames: 2035712. Throughput: 0: 985.2. Samples: 509502. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:41:00,494][05422] Avg episode reward: [(0, '19.180')]
[2023-02-23 00:41:00,496][11201] Saving new best policy, reward=19.180!
[2023-02-23 00:41:03,966][11216] Updated weights for policy 0, policy_version 500 (0.0012)
[2023-02-23 00:41:05,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2052096. Throughput: 0: 962.9. Samples: 511728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:41:05,491][05422] Avg episode reward: [(0, '18.818')]
[2023-02-23 00:41:10,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2072576. Throughput: 0: 990.5. Samples: 517550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:41:10,494][05422] Avg episode reward: [(0, '17.612')]
[2023-02-23 00:41:13,063][11216] Updated weights for policy 0, policy_version 510 (0.0025)
[2023-02-23 00:41:15,491][05422] Fps is (10 sec: 4504.3, 60 sec: 3959.3, 300 sec: 3971.0). Total num frames: 2097152. Throughput: 0: 1019.5. Samples: 524916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:41:15,493][05422] Avg episode reward: [(0, '17.544')]
[2023-02-23 00:41:20,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2113536. Throughput: 0: 1002.8. Samples: 527802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:41:20,496][05422] Avg episode reward: [(0, '18.016')]
[2023-02-23 00:41:24,731][11216] Updated weights for policy 0, policy_version 520 (0.0024)
[2023-02-23 00:41:25,488][05422] Fps is (10 sec: 3277.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2129920. Throughput: 0: 963.3. Samples: 532262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:41:25,493][05422] Avg episode reward: [(0, '19.215')]
[2023-02-23 00:41:25,504][11201] Saving new best policy, reward=19.215!
[2023-02-23 00:41:30,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2154496. Throughput: 0: 1007.1. Samples: 538834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:41:30,493][05422] Avg episode reward: [(0, '19.725')]
[2023-02-23 00:41:30,496][11201] Saving new best policy, reward=19.725!
[2023-02-23 00:41:33,618][11216] Updated weights for policy 0, policy_version 530 (0.0021)
[2023-02-23 00:41:35,488][05422] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2179072. Throughput: 0: 1021.9. Samples: 542462. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:41:35,493][05422] Avg episode reward: [(0, '19.342')]
[2023-02-23 00:41:40,488][05422] Fps is (10 sec: 4096.1, 60 sec: 3959.6, 300 sec: 3971.1). Total num frames: 2195456. Throughput: 0: 989.3. Samples: 548252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:41:40,491][05422] Avg episode reward: [(0, '21.316')]
[2023-02-23 00:41:40,498][11201] Saving new best policy, reward=21.316!
[2023-02-23 00:41:45,488][05422] Fps is (10 sec: 2867.0, 60 sec: 3959.4, 300 sec: 3957.1). Total num frames: 2207744. Throughput: 0: 964.0. Samples: 552882. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:41:45,491][05422] Avg episode reward: [(0, '20.291')]
[2023-02-23 00:41:45,573][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000540_2211840.pth...
[2023-02-23 00:41:45,575][11216] Updated weights for policy 0, policy_version 540 (0.0017)
[2023-02-23 00:41:45,682][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000308_1261568.pth
[2023-02-23 00:41:50,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2232320. Throughput: 0: 986.5. Samples: 556122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:41:50,495][05422] Avg episode reward: [(0, '21.022')]
[2023-02-23 00:41:54,083][11216] Updated weights for policy 0, policy_version 550 (0.0011)
[2023-02-23 00:41:55,493][05422] Fps is (10 sec: 4912.9, 60 sec: 3959.1, 300 sec: 3971.0). Total num frames: 2256896. Throughput: 0: 1019.5. Samples: 563434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:41:55,495][05422] Avg episode reward: [(0, '22.037')]
[2023-02-23 00:41:55,508][11201] Saving new best policy, reward=22.037!
[2023-02-23 00:42:00,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2273280. Throughput: 0: 972.7. Samples: 568684. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:42:00,490][05422] Avg episode reward: [(0, '22.558')]
[2023-02-23 00:42:00,495][11201] Saving new best policy, reward=22.558!
[2023-02-23 00:42:05,488][05422] Fps is (10 sec: 3278.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2289664. Throughput: 0: 956.8. Samples: 570858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:42:05,495][05422] Avg episode reward: [(0, '21.289')]
[2023-02-23 00:42:06,350][11216] Updated weights for policy 0, policy_version 560 (0.0012)
[2023-02-23 00:42:10,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2310144. Throughput: 0: 998.1. Samples: 577178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:42:10,493][05422] Avg episode reward: [(0, '20.907')]
[2023-02-23 00:42:14,583][11216] Updated weights for policy 0, policy_version 570 (0.0014)
[2023-02-23 00:42:15,489][05422] Fps is (10 sec: 4504.8, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2334720. Throughput: 0: 1016.0. Samples: 584554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:42:15,492][05422] Avg episode reward: [(0, '22.423')]
[2023-02-23 00:42:20,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 2351104. Throughput: 0: 989.3. Samples: 586982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:42:20,499][05422] Avg episode reward: [(0, '22.625')]
[2023-02-23 00:42:20,507][11201] Saving new best policy, reward=22.625!
[2023-02-23 00:42:25,488][05422] Fps is (10 sec: 3277.3, 60 sec: 3959.4, 300 sec: 3957.1). Total num frames: 2367488. Throughput: 0: 964.0. Samples: 591632. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:42:25,491][05422] Avg episode reward: [(0, '22.469')]
[2023-02-23 00:42:26,595][11216] Updated weights for policy 0, policy_version 580 (0.0015)
[2023-02-23 00:42:30,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2392064. Throughput: 0: 1014.0. Samples: 598510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:42:30,493][05422] Avg episode reward: [(0, '21.748')]
[2023-02-23 00:42:34,766][11216] Updated weights for policy 0, policy_version 590 (0.0018)
[2023-02-23 00:42:35,489][05422] Fps is (10 sec: 4914.8, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 2416640. Throughput: 0: 1024.7. Samples: 602234. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:42:35,495][05422] Avg episode reward: [(0, '21.653')]
[2023-02-23 00:42:40,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2433024. Throughput: 0: 983.1. Samples: 607668. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:42:40,499][05422] Avg episode reward: [(0, '21.643')]
[2023-02-23 00:42:45,488][05422] Fps is (10 sec: 3277.1, 60 sec: 4027.8, 300 sec: 3971.0). Total num frames: 2449408. Throughput: 0: 972.4. Samples: 612442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:42:45,495][05422] Avg episode reward: [(0, '21.726')]
[2023-02-23 00:42:46,908][11216] Updated weights for policy 0, policy_version 600 (0.0030)
[2023-02-23 00:42:50,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2473984. Throughput: 0: 1003.8. Samples: 616030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:42:50,495][05422] Avg episode reward: [(0, '23.192')]
[2023-02-23 00:42:50,501][11201] Saving new best policy, reward=23.192!
[2023-02-23 00:42:55,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.8, 300 sec: 3971.1). Total num frames: 2494464. Throughput: 0: 1027.0. Samples: 623394. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:42:55,498][05422] Avg episode reward: [(0, '24.235')]
[2023-02-23 00:42:55,512][11201] Saving new best policy, reward=24.235!
[2023-02-23 00:42:55,528][11216] Updated weights for policy 0, policy_version 610 (0.0018)
[2023-02-23 00:43:00,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2510848. Throughput: 0: 969.6. Samples: 628184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:43:00,489][05422] Avg episode reward: [(0, '24.888')]
[2023-02-23 00:43:00,497][11201] Saving new best policy, reward=24.888!
[2023-02-23 00:43:05,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2527232. Throughput: 0: 966.0. Samples: 630452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:43:05,496][05422] Avg episode reward: [(0, '26.120')]
[2023-02-23 00:43:05,514][11201] Saving new best policy, reward=26.120!
[2023-02-23 00:43:07,368][11216] Updated weights for policy 0, policy_version 620 (0.0022)
[2023-02-23 00:43:10,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 2551808. Throughput: 0: 1011.7. Samples: 637156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:43:10,493][05422] Avg episode reward: [(0, '27.034')]
[2023-02-23 00:43:10,497][11201] Saving new best policy, reward=27.034!
[2023-02-23 00:43:15,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3957.2). Total num frames: 2572288. Throughput: 0: 1013.5. Samples: 644116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:43:15,493][05422] Avg episode reward: [(0, '25.373')]
[2023-02-23 00:43:16,731][11216] Updated weights for policy 0, policy_version 630 (0.0018)
[2023-02-23 00:43:20,488][05422] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2588672. Throughput: 0: 982.0. Samples: 646422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:43:20,496][05422] Avg episode reward: [(0, '25.465')]
[2023-02-23 00:43:25,488][05422] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2609152. Throughput: 0: 965.9. Samples: 651134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:43:25,495][05422] Avg episode reward: [(0, '24.594')]
[2023-02-23 00:43:27,739][11216] Updated weights for policy 0, policy_version 640 (0.0017)
[2023-02-23 00:43:30,488][05422] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2633728. Throughput: 0: 1022.6. Samples: 658458. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:43:30,491][05422] Avg episode reward: [(0, '23.370')]
[2023-02-23 00:43:35,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 2654208. Throughput: 0: 1024.0. Samples: 662112. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:43:35,495][05422] Avg episode reward: [(0, '21.802')]
[2023-02-23 00:43:37,699][11216] Updated weights for policy 0, policy_version 650 (0.0012)
[2023-02-23 00:43:40,495][05422] Fps is (10 sec: 3683.8, 60 sec: 3959.0, 300 sec: 3970.9). Total num frames: 2670592. Throughput: 0: 971.5. Samples: 667118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:43:40,497][05422] Avg episode reward: [(0, '21.880')]
[2023-02-23 00:43:45,490][05422] Fps is (10 sec: 3276.2, 60 sec: 3959.3, 300 sec: 3971.0). Total num frames: 2686976. Throughput: 0: 979.4. Samples: 672260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:43:45,493][05422] Avg episode reward: [(0, '21.545')]
[2023-02-23 00:43:45,502][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000656_2686976.pth...
[2023-02-23 00:43:45,618][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000424_1736704.pth
[2023-02-23 00:43:48,320][11216] Updated weights for policy 0, policy_version 660 (0.0019)
[2023-02-23 00:43:50,488][05422] Fps is (10 sec: 4098.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2711552. Throughput: 0: 1006.6. Samples: 675750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:43:50,491][05422] Avg episode reward: [(0, '20.230')]
[2023-02-23 00:43:55,488][05422] Fps is (10 sec: 4506.2, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 2732032. Throughput: 0: 1021.6. Samples: 683128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:43:55,494][05422] Avg episode reward: [(0, '20.512')]
[2023-02-23 00:43:58,353][11216] Updated weights for policy 0, policy_version 670 (0.0011)
[2023-02-23 00:44:00,488][05422] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 2748416. Throughput: 0: 967.0. Samples: 687632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:44:00,494][05422] Avg episode reward: [(0, '20.971')]
[2023-02-23 00:44:05,488][05422] Fps is (10 sec: 3686.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2768896. Throughput: 0: 968.5. Samples: 690004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:44:05,491][05422] Avg episode reward: [(0, '20.765')]
[2023-02-23 00:44:08,967][11216] Updated weights for policy 0, policy_version 680 (0.0014)
[2023-02-23 00:44:10,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2789376. Throughput: 0: 1016.5. Samples: 696876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:44:10,490][05422] Avg episode reward: [(0, '19.482')]
[2023-02-23 00:44:15,489][05422] Fps is (10 sec: 4095.6, 60 sec: 3959.4, 300 sec: 3957.1). Total num frames: 2809856. Throughput: 0: 1003.0. Samples: 703594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:44:15,495][05422] Avg episode reward: [(0, '19.989')]
[2023-02-23 00:44:19,360][11216] Updated weights for policy 0, policy_version 690 (0.0020)
[2023-02-23 00:44:20,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2826240. Throughput: 0: 973.0. Samples: 705898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:44:20,490][05422] Avg episode reward: [(0, '22.313')]
[2023-02-23 00:44:25,488][05422] Fps is (10 sec: 3686.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2846720. Throughput: 0: 973.1. Samples: 710900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:44:25,489][05422] Avg episode reward: [(0, '22.552')]
[2023-02-23 00:44:29,102][11216] Updated weights for policy 0, policy_version 700 (0.0040)
[2023-02-23 00:44:30,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2871296. Throughput: 0: 1020.3. Samples: 718172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:44:30,495][05422] Avg episode reward: [(0, '24.520')]
[2023-02-23 00:44:35,490][05422] Fps is (10 sec: 4504.7, 60 sec: 3959.3, 300 sec: 3971.0). Total num frames: 2891776. Throughput: 0: 1024.7. Samples: 721862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:44:35,492][05422] Avg episode reward: [(0, '24.967')]
[2023-02-23 00:44:40,048][11216] Updated weights for policy 0, policy_version 710 (0.0011)
[2023-02-23 00:44:40,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.9, 300 sec: 3971.0). Total num frames: 2908160. Throughput: 0: 967.5. Samples: 726664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:44:40,491][05422] Avg episode reward: [(0, '25.739')]
[2023-02-23 00:44:45,488][05422] Fps is (10 sec: 3687.1, 60 sec: 4027.9, 300 sec: 3971.0). Total num frames: 2928640. Throughput: 0: 988.5. Samples: 732114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:44:45,492][05422] Avg episode reward: [(0, '26.508')]
[2023-02-23 00:44:49,539][11216] Updated weights for policy 0, policy_version 720 (0.0011)
[2023-02-23 00:44:50,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2953216. Throughput: 0: 1017.1. Samples: 735774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:44:50,494][05422] Avg episode reward: [(0, '26.296')]
[2023-02-23 00:44:55,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3984.9). Total num frames: 2973696. Throughput: 0: 1021.1. Samples: 742824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:44:55,492][05422] Avg episode reward: [(0, '27.919')]
[2023-02-23 00:44:55,508][11201] Saving new best policy, reward=27.919!
[2023-02-23 00:45:00,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2985984. Throughput: 0: 971.8. Samples: 747324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:45:00,490][05422] Avg episode reward: [(0, '26.484')]
[2023-02-23 00:45:00,675][11216] Updated weights for policy 0, policy_version 730 (0.0013)
[2023-02-23 00:45:05,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3006464. Throughput: 0: 974.4. Samples: 749746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:45:05,493][05422] Avg episode reward: [(0, '25.442')]
[2023-02-23 00:45:10,230][11216] Updated weights for policy 0, policy_version 740 (0.0015)
[2023-02-23 00:45:10,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 3031040. Throughput: 0: 1017.3. Samples: 756678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:45:10,490][05422] Avg episode reward: [(0, '25.029')]
[2023-02-23 00:45:15,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3984.9). Total num frames: 3051520. Throughput: 0: 1000.8. Samples: 763210. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:45:15,490][05422] Avg episode reward: [(0, '25.032')]
[2023-02-23 00:45:20,488][05422] Fps is (10 sec: 3276.7, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 3063808. Throughput: 0: 970.6. Samples: 765536. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:45:20,491][05422] Avg episode reward: [(0, '24.345')]
[2023-02-23 00:45:21,838][11216] Updated weights for policy 0, policy_version 750 (0.0025)
[2023-02-23 00:45:25,488][05422] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 3088384. Throughput: 0: 976.4. Samples: 770600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:45:25,496][05422] Avg episode reward: [(0, '22.208')]
[2023-02-23 00:45:30,483][11216] Updated weights for policy 0, policy_version 760 (0.0018)
[2023-02-23 00:45:30,494][05422] Fps is (10 sec: 4912.4, 60 sec: 4027.3, 300 sec: 3971.0). Total num frames: 3112960. Throughput: 0: 1017.3. Samples: 777900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:45:30,516][05422] Avg episode reward: [(0, '21.786')]
[2023-02-23 00:45:35,488][05422] Fps is (10 sec: 4095.9, 60 sec: 3959.6, 300 sec: 3971.1). Total num frames: 3129344. Throughput: 0: 1017.3. Samples: 781552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:45:35,497][05422] Avg episode reward: [(0, '21.745')]
[2023-02-23 00:45:40,490][05422] Fps is (10 sec: 3278.0, 60 sec: 3959.3, 300 sec: 3984.9). Total num frames: 3145728. Throughput: 0: 959.8. Samples: 786018. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:45:40,500][05422] Avg episode reward: [(0, '21.269')]
[2023-02-23 00:45:42,494][11216] Updated weights for policy 0, policy_version 770 (0.0046)
[2023-02-23 00:45:45,488][05422] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3166208. Throughput: 0: 986.9. Samples: 791736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:45:45,493][05422] Avg episode reward: [(0, '19.545')]
[2023-02-23 00:45:45,507][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000773_3166208.pth...
[2023-02-23 00:45:45,634][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000540_2211840.pth
[2023-02-23 00:45:50,488][05422] Fps is (10 sec: 4506.7, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3190784. Throughput: 0: 1013.1. Samples: 795336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:45:50,495][05422] Avg episode reward: [(0, '20.559')]
[2023-02-23 00:45:50,915][11216] Updated weights for policy 0, policy_version 780 (0.0025)
[2023-02-23 00:45:55,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3211264. Throughput: 0: 1010.6. Samples: 802154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:45:55,493][05422] Avg episode reward: [(0, '22.321')]
[2023-02-23 00:46:00,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3223552. Throughput: 0: 967.1. Samples: 806728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:46:00,493][05422] Avg episode reward: [(0, '22.366')]
[2023-02-23 00:46:03,037][11216] Updated weights for policy 0, policy_version 790 (0.0037)
[2023-02-23 00:46:05,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3244032. Throughput: 0: 973.2. Samples: 809328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:46:05,490][05422] Avg episode reward: [(0, '21.878')]
[2023-02-23 00:46:10,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 3268608. Throughput: 0: 1019.9. Samples: 816496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:46:10,490][05422] Avg episode reward: [(0, '24.348')]
[2023-02-23 00:46:11,547][11216] Updated weights for policy 0, policy_version 800 (0.0023)
[2023-02-23 00:46:15,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3289088. Throughput: 0: 997.4. Samples: 822778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:46:15,493][05422] Avg episode reward: [(0, '25.191')]
[2023-02-23 00:46:20,488][05422] Fps is (10 sec: 3686.4, 60 sec: 4027.8, 300 sec: 3984.9). Total num frames: 3305472. Throughput: 0: 967.5. Samples: 825088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:46:20,496][05422] Avg episode reward: [(0, '24.570')]
[2023-02-23 00:46:23,352][11216] Updated weights for policy 0, policy_version 810 (0.0023)
[2023-02-23 00:46:25,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3325952. Throughput: 0: 990.2. Samples: 830576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:46:25,495][05422] Avg episode reward: [(0, '24.721')]
[2023-02-23 00:46:30,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.9, 300 sec: 3971.0). Total num frames: 3350528. Throughput: 0: 1024.2. Samples: 837824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:46:30,490][05422] Avg episode reward: [(0, '25.579')]
[2023-02-23 00:46:31,814][11216] Updated weights for policy 0, policy_version 820 (0.0015)
[2023-02-23 00:46:35,490][05422] Fps is (10 sec: 4095.2, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 3366912. Throughput: 0: 1020.1. Samples: 841242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:46:35,492][05422] Avg episode reward: [(0, '22.954')]
[2023-02-23 00:46:40,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.6, 300 sec: 3984.9). Total num frames: 3383296. Throughput: 0: 968.2. Samples: 845724. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:46:40,494][05422] Avg episode reward: [(0, '22.674')]
[2023-02-23 00:46:43,747][11216] Updated weights for policy 0, policy_version 830 (0.0021)
[2023-02-23 00:46:45,488][05422] Fps is (10 sec: 4096.8, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3407872. Throughput: 0: 1002.8. Samples: 851856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:46:45,490][05422] Avg episode reward: [(0, '22.881')]
[2023-02-23 00:46:50,488][05422] Fps is (10 sec: 4915.3, 60 sec: 4027.7, 300 sec: 3985.0). Total num frames: 3432448. Throughput: 0: 1026.2. Samples: 855506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:46:50,495][05422] Avg episode reward: [(0, '22.059')]
[2023-02-23 00:46:52,219][11216] Updated weights for policy 0, policy_version 840 (0.0016)
[2023-02-23 00:46:55,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3448832. Throughput: 0: 1006.8. Samples: 861800. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:46:55,491][05422] Avg episode reward: [(0, '21.616')]
[2023-02-23 00:47:00,488][05422] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3465216. Throughput: 0: 970.4. Samples: 866448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:47:00,494][05422] Avg episode reward: [(0, '21.640')]
[2023-02-23 00:47:04,158][11216] Updated weights for policy 0, policy_version 850 (0.0038)
[2023-02-23 00:47:05,488][05422] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3485696. Throughput: 0: 986.5. Samples: 869480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:47:05,490][05422] Avg episode reward: [(0, '22.373')]
[2023-02-23 00:47:10,488][05422] Fps is (10 sec: 4505.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3510272. Throughput: 0: 1027.3. Samples: 876804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 00:47:10,491][05422] Avg episode reward: [(0, '23.329')]
[2023-02-23 00:47:13,071][11216] Updated weights for policy 0, policy_version 860 (0.0012)
[2023-02-23 00:47:15,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3526656. Throughput: 0: 990.8. Samples: 882412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:47:15,497][05422] Avg episode reward: [(0, '24.037')]
[2023-02-23 00:47:20,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.4, 300 sec: 3984.9). Total num frames: 3543040. Throughput: 0: 966.4. Samples: 884730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:47:20,495][05422] Avg episode reward: [(0, '25.005')]
[2023-02-23 00:47:24,422][11216] Updated weights for policy 0, policy_version 870 (0.0014)
[2023-02-23 00:47:25,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3567616. Throughput: 0: 999.3. Samples: 890692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:47:25,490][05422] Avg episode reward: [(0, '25.160')]
[2023-02-23 00:47:30,488][05422] Fps is (10 sec: 4915.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3592192. Throughput: 0: 1026.9. Samples: 898066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:47:30,490][05422] Avg episode reward: [(0, '26.276')]
[2023-02-23 00:47:33,976][11216] Updated weights for policy 0, policy_version 880 (0.0022)
[2023-02-23 00:47:35,490][05422] Fps is (10 sec: 4095.2, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3608576. Throughput: 0: 1009.0. Samples: 900912. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:47:35,492][05422] Avg episode reward: [(0, '25.671')]
[2023-02-23 00:47:40,491][05422] Fps is (10 sec: 2866.1, 60 sec: 3959.2, 300 sec: 3971.0). Total num frames: 3620864. Throughput: 0: 973.1. Samples: 905594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:47:40,495][05422] Avg episode reward: [(0, '25.055')]
[2023-02-23 00:47:44,734][11216] Updated weights for policy 0, policy_version 890 (0.0027)
[2023-02-23 00:47:45,488][05422] Fps is (10 sec: 3687.1, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3645440. Throughput: 0: 1013.0. Samples: 912032. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:47:45,494][05422] Avg episode reward: [(0, '23.571')]
[2023-02-23 00:47:45,591][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000891_3649536.pth...
[2023-02-23 00:47:45,742][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000656_2686976.pth
[2023-02-23 00:47:50,488][05422] Fps is (10 sec: 4917.0, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3670016. Throughput: 0: 1025.8. Samples: 915642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:47:50,497][05422] Avg episode reward: [(0, '23.293')]
[2023-02-23 00:47:54,420][11216] Updated weights for policy 0, policy_version 900 (0.0023)
[2023-02-23 00:47:55,489][05422] Fps is (10 sec: 4095.6, 60 sec: 3959.4, 300 sec: 3984.9). Total num frames: 3686400. Throughput: 0: 994.9. Samples: 921576. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:47:55,491][05422] Avg episode reward: [(0, '22.794')]
[2023-02-23 00:48:00,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3702784. Throughput: 0: 973.7. Samples: 926228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:48:00,497][05422] Avg episode reward: [(0, '21.682')]
[2023-02-23 00:48:05,113][11216] Updated weights for policy 0, policy_version 910 (0.0016)
[2023-02-23 00:48:05,488][05422] Fps is (10 sec: 4096.4, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3727360. Throughput: 0: 995.2. Samples: 929512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:48:05,490][05422] Avg episode reward: [(0, '23.644')]
[2023-02-23 00:48:10,488][05422] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 3751936. Throughput: 0: 1024.5. Samples: 936796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:48:10,490][05422] Avg episode reward: [(0, '25.304')]
[2023-02-23 00:48:15,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3764224. Throughput: 0: 976.0. Samples: 941986. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:48:15,496][05422] Avg episode reward: [(0, '25.331')]
[2023-02-23 00:48:15,734][11216] Updated weights for policy 0, policy_version 920 (0.0018)
[2023-02-23 00:48:20,488][05422] Fps is (10 sec: 2867.2, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3780608. Throughput: 0: 963.7. Samples: 944276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:48:20,490][05422] Avg episode reward: [(0, '25.845')]
[2023-02-23 00:48:25,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3805184. Throughput: 0: 999.9. Samples: 950584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:48:25,491][05422] Avg episode reward: [(0, '25.904')]
[2023-02-23 00:48:25,688][11216] Updated weights for policy 0, policy_version 930 (0.0016)
[2023-02-23 00:48:30,490][05422] Fps is (10 sec: 4914.2, 60 sec: 3959.3, 300 sec: 3984.9). Total num frames: 3829760. Throughput: 0: 1018.9. Samples: 957884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:48:30,493][05422] Avg episode reward: [(0, '24.940')]
[2023-02-23 00:48:35,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3985.0). Total num frames: 3846144. Throughput: 0: 994.8. Samples: 960408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:48:35,500][05422] Avg episode reward: [(0, '24.071')]
[2023-02-23 00:48:36,229][11216] Updated weights for policy 0, policy_version 940 (0.0021)
[2023-02-23 00:48:40,488][05422] Fps is (10 sec: 3277.5, 60 sec: 4028.0, 300 sec: 3984.9). Total num frames: 3862528. Throughput: 0: 965.2. Samples: 965010. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:48:40,495][05422] Avg episode reward: [(0, '24.124')]
[2023-02-23 00:48:45,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 3887104. Throughput: 0: 1012.1. Samples: 971772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:48:45,491][05422] Avg episode reward: [(0, '22.902')]
[2023-02-23 00:48:46,356][11216] Updated weights for policy 0, policy_version 950 (0.0016)
[2023-02-23 00:48:50,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3907584. Throughput: 0: 1019.3. Samples: 975380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:48:50,493][05422] Avg episode reward: [(0, '23.587')]
[2023-02-23 00:48:55,488][05422] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3923968. Throughput: 0: 980.4. Samples: 980916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:48:55,489][05422] Avg episode reward: [(0, '22.551')]
[2023-02-23 00:48:57,271][11216] Updated weights for policy 0, policy_version 960 (0.0011)
[2023-02-23 00:49:00,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3940352. Throughput: 0: 970.1. Samples: 985640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:00,496][05422] Avg episode reward: [(0, '24.621')]
[2023-02-23 00:49:05,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3964928. Throughput: 0: 998.8. Samples: 989222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:05,495][05422] Avg episode reward: [(0, '23.496')]
[2023-02-23 00:49:06,532][11216] Updated weights for policy 0, policy_version 970 (0.0026)
[2023-02-23 00:49:10,488][05422] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 3989504. Throughput: 0: 1019.2. Samples: 996446. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:49:10,493][05422] Avg episode reward: [(0, '25.519')]
[2023-02-23 00:49:15,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 4001792. Throughput: 0: 964.0. Samples: 1001264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:15,496][05422] Avg episode reward: [(0, '26.034')]
[2023-02-23 00:49:18,357][11216] Updated weights for policy 0, policy_version 980 (0.0015)
[2023-02-23 00:49:20,488][05422] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 4022272. Throughput: 0: 958.6. Samples: 1003546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:20,489][05422] Avg episode reward: [(0, '26.995')]
[2023-02-23 00:49:25,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4042752. Throughput: 0: 1003.7. Samples: 1010176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:25,490][05422] Avg episode reward: [(0, '29.030')]
[2023-02-23 00:49:25,526][11201] Saving new best policy, reward=29.030!
[2023-02-23 00:49:27,281][11216] Updated weights for policy 0, policy_version 990 (0.0016)
[2023-02-23 00:49:30,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3984.9). Total num frames: 4067328. Throughput: 0: 1010.3. Samples: 1017234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:49:30,491][05422] Avg episode reward: [(0, '29.094')]
[2023-02-23 00:49:30,494][11201] Saving new best policy, reward=29.094!
[2023-02-23 00:49:35,491][05422] Fps is (10 sec: 4094.8, 60 sec: 3959.3, 300 sec: 3984.9). Total num frames: 4083712. Throughput: 0: 979.7. Samples: 1019470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:49:35,495][05422] Avg episode reward: [(0, '28.854')]
[2023-02-23 00:49:39,179][11216] Updated weights for policy 0, policy_version 1000 (0.0013)
[2023-02-23 00:49:40,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4100096. Throughput: 0: 962.3. Samples: 1024218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:40,495][05422] Avg episode reward: [(0, '27.556')]
[2023-02-23 00:49:45,488][05422] Fps is (10 sec: 4097.2, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4124672. Throughput: 0: 1012.3. Samples: 1031192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:49:45,490][05422] Avg episode reward: [(0, '27.241')]
[2023-02-23 00:49:45,500][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001007_4124672.pth...
[2023-02-23 00:49:45,613][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000773_3166208.pth
[2023-02-23 00:49:47,918][11216] Updated weights for policy 0, policy_version 1010 (0.0025)
[2023-02-23 00:49:50,490][05422] Fps is (10 sec: 4504.4, 60 sec: 3959.3, 300 sec: 3971.0). Total num frames: 4145152. Throughput: 0: 1012.3. Samples: 1034780. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:50,510][05422] Avg episode reward: [(0, '26.702')]
[2023-02-23 00:49:55,491][05422] Fps is (10 sec: 3685.1, 60 sec: 3959.2, 300 sec: 3984.9). Total num frames: 4161536. Throughput: 0: 966.7. Samples: 1039952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:49:55,497][05422] Avg episode reward: [(0, '27.615')]
[2023-02-23 00:49:59,985][11216] Updated weights for policy 0, policy_version 1020 (0.0011)
[2023-02-23 00:50:00,488][05422] Fps is (10 sec: 3277.7, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4177920. Throughput: 0: 971.2. Samples: 1044968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:50:00,491][05422] Avg episode reward: [(0, '27.115')]
[2023-02-23 00:50:05,488][05422] Fps is (10 sec: 4097.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4202496. Throughput: 0: 1000.0. Samples: 1048544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:50:05,494][05422] Avg episode reward: [(0, '26.556')]
[2023-02-23 00:50:08,347][11216] Updated weights for policy 0, policy_version 1030 (0.0013)
[2023-02-23 00:50:10,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3971.0). Total num frames: 4222976. Throughput: 0: 1014.4. Samples: 1055822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:50:10,495][05422] Avg episode reward: [(0, '27.447')]
[2023-02-23 00:50:15,492][05422] Fps is (10 sec: 3684.9, 60 sec: 3959.2, 300 sec: 3984.9). Total num frames: 4239360. Throughput: 0: 957.4. Samples: 1060320. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:50:15,499][05422] Avg episode reward: [(0, '28.444')]
[2023-02-23 00:50:20,404][11216] Updated weights for policy 0, policy_version 1040 (0.0018)
[2023-02-23 00:50:20,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4259840. Throughput: 0: 959.3. Samples: 1062634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:50:20,490][05422] Avg episode reward: [(0, '28.743')]
[2023-02-23 00:50:25,488][05422] Fps is (10 sec: 4097.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 4280320. Throughput: 0: 1010.2. Samples: 1069676. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:50:25,491][05422] Avg episode reward: [(0, '27.220')]
[2023-02-23 00:50:29,145][11216] Updated weights for policy 0, policy_version 1050 (0.0013)
[2023-02-23 00:50:30,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 4304896. Throughput: 0: 1001.7. Samples: 1076270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:50:30,493][05422] Avg episode reward: [(0, '27.395')]
[2023-02-23 00:50:35,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3891.4, 300 sec: 3971.1). Total num frames: 4317184. Throughput: 0: 974.8. Samples: 1078644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:50:35,495][05422] Avg episode reward: [(0, '27.487')]
[2023-02-23 00:50:40,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4337664. Throughput: 0: 970.8. Samples: 1083634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:50:40,490][05422] Avg episode reward: [(0, '28.467')]
[2023-02-23 00:50:40,891][11216] Updated weights for policy 0, policy_version 1060 (0.0013)
[2023-02-23 00:50:45,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4362240. Throughput: 0: 1018.0. Samples: 1090778. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:50:45,490][05422] Avg episode reward: [(0, '26.105')]
[2023-02-23 00:50:50,066][11216] Updated weights for policy 0, policy_version 1070 (0.0011)
[2023-02-23 00:50:50,488][05422] Fps is (10 sec: 4505.5, 60 sec: 3959.6, 300 sec: 3971.0). Total num frames: 4382720. Throughput: 0: 1018.8. Samples: 1094390. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:50:50,498][05422] Avg episode reward: [(0, '26.765')]
[2023-02-23 00:50:55,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.7, 300 sec: 3984.9). Total num frames: 4399104. Throughput: 0: 959.9. Samples: 1099016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:50:55,495][05422] Avg episode reward: [(0, '26.238')]
[2023-02-23 00:51:00,488][05422] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 4419584. Throughput: 0: 986.9. Samples: 1104726. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:51:00,491][05422] Avg episode reward: [(0, '27.297')]
[2023-02-23 00:51:01,413][11216] Updated weights for policy 0, policy_version 1080 (0.0022)
[2023-02-23 00:51:05,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 4444160. Throughput: 0: 1018.0. Samples: 1108446. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:51:05,497][05422] Avg episode reward: [(0, '25.537')]
[2023-02-23 00:51:10,488][05422] Fps is (10 sec: 4095.9, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4460544. Throughput: 0: 1007.6. Samples: 1115018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:51:10,494][05422] Avg episode reward: [(0, '25.350')]
[2023-02-23 00:51:11,244][11216] Updated weights for policy 0, policy_version 1090 (0.0022)
[2023-02-23 00:51:15,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.7, 300 sec: 3971.0). Total num frames: 4476928. Throughput: 0: 964.6. Samples: 1119676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:51:15,493][05422] Avg episode reward: [(0, '24.688')]
[2023-02-23 00:51:20,488][05422] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4497408. Throughput: 0: 969.1. Samples: 1122252. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 00:51:20,490][05422] Avg episode reward: [(0, '24.854')]
[2023-02-23 00:51:21,649][11216] Updated weights for policy 0, policy_version 1100 (0.0034)
[2023-02-23 00:51:25,488][05422] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 4521984. Throughput: 0: 1020.4. Samples: 1129552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:51:25,495][05422] Avg episode reward: [(0, '22.971')]
[2023-02-23 00:51:30,493][05422] Fps is (10 sec: 4503.4, 60 sec: 3959.1, 300 sec: 3984.9). Total num frames: 4542464. Throughput: 0: 995.6. Samples: 1135584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:51:30,495][05422] Avg episode reward: [(0, '23.899')]
[2023-02-23 00:51:31,709][11216] Updated weights for policy 0, policy_version 1110 (0.0014)
[2023-02-23 00:51:35,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4554752. Throughput: 0: 968.0. Samples: 1137950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:51:35,491][05422] Avg episode reward: [(0, '25.080')]
[2023-02-23 00:51:40,488][05422] Fps is (10 sec: 3688.2, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 4579328. Throughput: 0: 989.1. Samples: 1143524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:51:40,490][05422] Avg episode reward: [(0, '26.435')]
[2023-02-23 00:51:42,102][11216] Updated weights for policy 0, policy_version 1120 (0.0030)
[2023-02-23 00:51:45,488][05422] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 4603904. Throughput: 0: 1027.7. Samples: 1150972. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:51:45,491][05422] Avg episode reward: [(0, '25.343')]
[2023-02-23 00:51:45,502][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001124_4603904.pth...
[2023-02-23 00:51:45,627][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000891_3649536.pth
[2023-02-23 00:51:50,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4620288. Throughput: 0: 1011.5. Samples: 1153964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:51:50,495][05422] Avg episode reward: [(0, '24.393')]
[2023-02-23 00:51:52,842][11216] Updated weights for policy 0, policy_version 1130 (0.0012)
[2023-02-23 00:51:55,488][05422] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 4632576. Throughput: 0: 966.8. Samples: 1158522. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:51:55,496][05422] Avg episode reward: [(0, '24.711')]
[2023-02-23 00:52:00,488][05422] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4657152. Throughput: 0: 1002.0. Samples: 1164764. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:52:00,495][05422] Avg episode reward: [(0, '23.945')]
[2023-02-23 00:52:02,373][11216] Updated weights for policy 0, policy_version 1140 (0.0026)
[2023-02-23 00:52:05,488][05422] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4681728. Throughput: 0: 1026.0. Samples: 1168424. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:52:05,494][05422] Avg episode reward: [(0, '25.402')]
[2023-02-23 00:52:10,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4698112. Throughput: 0: 997.4. Samples: 1174436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 00:52:10,494][05422] Avg episode reward: [(0, '24.969')]
[2023-02-23 00:52:13,560][11216] Updated weights for policy 0, policy_version 1150 (0.0016)
[2023-02-23 00:52:15,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4714496. Throughput: 0: 967.0. Samples: 1179092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:52:15,495][05422] Avg episode reward: [(0, '26.152')]
[2023-02-23 00:52:20,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 4739072. Throughput: 0: 983.6. Samples: 1182214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:52:20,497][05422] Avg episode reward: [(0, '26.900')]
[2023-02-23 00:52:22,933][11216] Updated weights for policy 0, policy_version 1160 (0.0023)
[2023-02-23 00:52:25,488][05422] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 4759552. Throughput: 0: 1021.6. Samples: 1189498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:52:25,494][05422] Avg episode reward: [(0, '29.178')]
[2023-02-23 00:52:25,509][11201] Saving new best policy, reward=29.178!
[2023-02-23 00:52:30,488][05422] Fps is (10 sec: 3686.3, 60 sec: 3891.5, 300 sec: 3957.2). Total num frames: 4775936. Throughput: 0: 979.1. Samples: 1195032. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 00:52:30,493][05422] Avg episode reward: [(0, '29.569')]
[2023-02-23 00:52:30,500][11201] Saving new best policy, reward=29.569!
[2023-02-23 00:52:34,859][11216] Updated weights for policy 0, policy_version 1170 (0.0022)
[2023-02-23 00:52:35,488][05422] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 4792320. Throughput: 0: 961.5. Samples: 1197230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:52:35,493][05422] Avg episode reward: [(0, '29.787')]
[2023-02-23 00:52:35,507][11201] Saving new best policy, reward=29.787!
[2023-02-23 00:52:40,488][05422] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4816896. Throughput: 0: 995.2. Samples: 1203308. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 00:52:40,495][05422] Avg episode reward: [(0, '29.138')]
[2023-02-23 00:52:43,437][11216] Updated weights for policy 0, policy_version 1180 (0.0014)
[2023-02-23 00:52:45,488][05422] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4841472. Throughput: 0: 1021.6. Samples: 1210736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 00:52:45,494][05422] Avg episode reward: [(0, '29.574')]
[2023-02-23 00:52:50,488][05422] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4857856. Throughput: 0: 994.9. Samples: 1213196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:52:50,491][05422] Avg episode reward: [(0, '30.228')]
[2023-02-23 00:52:50,493][11201] Saving new best policy, reward=30.228!
[2023-02-23 00:52:55,416][11216] Updated weights for policy 0, policy_version 1190 (0.0015)
[2023-02-23 00:52:55,488][05422] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 4874240. Throughput: 0: 963.3. Samples: 1217784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 00:52:55,496][05422] Avg episode reward: [(0, '30.731')]
[2023-02-23 00:52:55,507][11201] Saving new best policy, reward=30.731!
[2023-02-23 00:53:00,488][05422] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 4898816. Throughput: 0: 1009.6. Samples: 1224526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:53:00,490][05422] Avg episode reward: [(0, '28.819')]
[2023-02-23 00:53:03,842][11216] Updated weights for policy 0, policy_version 1200 (0.0016)
[2023-02-23 00:53:05,494][05422] Fps is (10 sec: 4502.8, 60 sec: 3959.1, 300 sec: 3957.1). Total num frames: 4919296. Throughput: 0: 1021.1. Samples: 1228172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:53:05,497][05422] Avg episode reward: [(0, '28.916')]
[2023-02-23 00:53:10,488][05422] Fps is (10 sec: 3686.2, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 4935680. Throughput: 0: 983.2. Samples: 1233744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:53:10,493][05422] Avg episode reward: [(0, '29.610')]
[2023-02-23 00:53:15,493][05422] Fps is (10 sec: 3277.2, 60 sec: 3959.1, 300 sec: 3971.0). Total num frames: 4952064. Throughput: 0: 965.7. Samples: 1238492. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:53:15,495][05422] Avg episode reward: [(0, '28.477')]
[2023-02-23 00:53:15,604][11216] Updated weights for policy 0, policy_version 1210 (0.0015)
[2023-02-23 00:53:20,488][05422] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 4976640. Throughput: 0: 996.0. Samples: 1242050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 00:53:20,496][05422] Avg episode reward: [(0, '26.695')]
[2023-02-23 00:53:24,181][11216] Updated weights for policy 0, policy_version 1220 (0.0011)
[2023-02-23 00:53:25,491][05422] Fps is (10 sec: 4916.2, 60 sec: 4027.5, 300 sec: 3971.0). Total num frames: 5001216. Throughput: 0: 1024.4. Samples: 1249408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 00:53:25,497][05422] Avg episode reward: [(0, '25.489')]
[2023-02-23 00:53:26,659][11201] Stopping Batcher_0...
[2023-02-23 00:53:26,659][11201] Loop batcher_evt_loop terminating...
[2023-02-23 00:53:26,661][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
[2023-02-23 00:53:26,660][05422] Component Batcher_0 stopped!
[2023-02-23 00:53:26,719][11216] Weights refcount: 2 0
[2023-02-23 00:53:26,736][11216] Stopping InferenceWorker_p0-w0...
[2023-02-23 00:53:26,736][05422] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 00:53:26,742][11216] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 00:53:26,773][11223] Stopping RolloutWorker_w7...
[2023-02-23 00:53:26,774][05422] Component RolloutWorker_w7 stopped!
[2023-02-23 00:53:26,779][11221] Stopping RolloutWorker_w5...
[2023-02-23 00:53:26,782][11217] Stopping RolloutWorker_w1...
[2023-02-23 00:53:26,788][11219] Stopping RolloutWorker_w3...
[2023-02-23 00:53:26,780][05422] Component RolloutWorker_w5 stopped!
[2023-02-23 00:53:26,790][05422] Component RolloutWorker_w1 stopped!
[2023-02-23 00:53:26,791][05422] Component RolloutWorker_w3 stopped!
[2023-02-23 00:53:26,780][11221] Loop rollout_proc5_evt_loop terminating...
[2023-02-23 00:53:26,798][11219] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 00:53:26,778][11223] Loop rollout_proc7_evt_loop terminating...
[2023-02-23 00:53:26,783][11217] Loop rollout_proc1_evt_loop terminating...
[2023-02-23 00:53:26,825][11218] Stopping RolloutWorker_w2...
[2023-02-23 00:53:26,825][11218] Loop rollout_proc2_evt_loop terminating...
[2023-02-23 00:53:26,825][05422] Component RolloutWorker_w2 stopped!
[2023-02-23 00:53:26,852][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001007_4124672.pth
[2023-02-23 00:53:26,856][05422] Component RolloutWorker_w6 stopped!
[2023-02-23 00:53:26,856][11222] Stopping RolloutWorker_w6...
[2023-02-23 00:53:26,862][11222] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 00:53:26,869][11220] Stopping RolloutWorker_w4...
[2023-02-23 00:53:26,869][11220] Loop rollout_proc4_evt_loop terminating...
[2023-02-23 00:53:26,869][05422] Component RolloutWorker_w4 stopped!
[2023-02-23 00:53:26,877][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
[2023-02-23 00:53:26,918][11215] Stopping RolloutWorker_w0...
[2023-02-23 00:53:26,919][11215] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 00:53:26,918][05422] Component RolloutWorker_w0 stopped!
[2023-02-23 00:53:27,207][11201] Stopping LearnerWorker_p0...
[2023-02-23 00:53:27,208][11201] Loop learner_proc0_evt_loop terminating...
[2023-02-23 00:53:27,208][05422] Component LearnerWorker_p0 stopped!
[2023-02-23 00:53:27,210][05422] Waiting for process learner_proc0 to stop...
[2023-02-23 00:53:29,603][05422] Waiting for process inference_proc0-0 to join...
[2023-02-23 00:53:30,165][05422] Waiting for process rollout_proc0 to join...
[2023-02-23 00:53:30,715][05422] Waiting for process rollout_proc1 to join...
[2023-02-23 00:53:30,718][05422] Waiting for process rollout_proc2 to join...
[2023-02-23 00:53:30,720][05422] Waiting for process rollout_proc3 to join...
[2023-02-23 00:53:30,724][05422] Waiting for process rollout_proc4 to join...
[2023-02-23 00:53:30,726][05422] Waiting for process rollout_proc5 to join...
[2023-02-23 00:53:30,728][05422] Waiting for process rollout_proc6 to join...
[2023-02-23 00:53:30,733][05422] Waiting for process rollout_proc7 to join...
[2023-02-23 00:53:30,734][05422] Batcher 0 profile tree view:
batching: 31.8950, releasing_batches: 0.0278
[2023-02-23 00:53:30,736][05422] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 629.9773
update_model: 8.8445
weight_update: 0.0011
one_step: 0.0119
handle_policy_step: 588.5976
deserialize: 17.1312, stack: 3.4701, obs_to_device_normalize: 132.6656, forward: 279.8045, send_messages: 30.6976
prepare_outputs: 95.2883
to_cpu: 60.5109
[2023-02-23 00:53:30,737][05422] Learner 0 profile tree view:
misc: 0.0076, prepare_batch: 18.7716
train: 92.7187
epoch_init: 0.0230, minibatch_init: 0.0122, losses_postprocess: 0.7026, kl_divergence: 0.6492, after_optimizer: 41.1845
calculate_losses: 32.8179
losses_init: 0.0042, forward_head: 1.9767, bptt_initial: 21.9087, tail: 1.2376, advantages_returns: 0.3497, losses: 4.2405
bptt: 2.6913
bptt_forward_core: 2.6027
update: 16.6400
clip: 1.6279
[2023-02-23 00:53:30,738][05422] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4551, enqueue_policy_requests: 169.2354, env_step: 962.0005, overhead: 22.5413, complete_rollouts: 8.2889
save_policy_outputs: 22.7289
split_output_tensors: 10.9194
[2023-02-23 00:53:30,739][05422] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4209, enqueue_policy_requests: 162.3719, env_step: 970.4761, overhead: 22.6631, complete_rollouts: 8.4823
save_policy_outputs: 23.0393
split_output_tensors: 11.4559
[2023-02-23 00:53:30,741][05422] Loop Runner_EvtLoop terminating...
[2023-02-23 00:53:30,744][05422] Runner profile tree view:
main_loop: 1301.1577
[2023-02-23 00:53:30,745][05422] Collected {0: 5005312}, FPS: 3846.8
[2023-02-23 00:53:30,809][05422] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 00:53:30,810][05422] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 00:53:30,812][05422] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 00:53:30,814][05422] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 00:53:30,816][05422] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 00:53:30,819][05422] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 00:53:30,821][05422] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 00:53:30,822][05422] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 00:53:30,823][05422] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 00:53:30,824][05422] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 00:53:30,826][05422] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 00:53:30,827][05422] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 00:53:30,828][05422] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 00:53:30,829][05422] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 00:53:30,830][05422] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 00:53:30,854][05422] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 00:53:30,859][05422] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 00:53:30,862][05422] RunningMeanStd input shape: (1,)
[2023-02-23 00:53:30,878][05422] ConvEncoder: input_channels=3
[2023-02-23 00:53:31,562][05422] Conv encoder output size: 512
[2023-02-23 00:53:31,567][05422] Policy head output size: 512
[2023-02-23 00:53:33,921][05422] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
[2023-02-23 00:53:35,134][05422] Num frames 100...
[2023-02-23 00:53:35,245][05422] Num frames 200...
[2023-02-23 00:53:35,361][05422] Num frames 300...
[2023-02-23 00:53:35,470][05422] Num frames 400...
[2023-02-23 00:53:35,613][05422] Avg episode rewards: #0: 6.800, true rewards: #0: 4.800
[2023-02-23 00:53:35,615][05422] Avg episode reward: 6.800, avg true_objective: 4.800
[2023-02-23 00:53:35,641][05422] Num frames 500...
[2023-02-23 00:53:35,754][05422] Num frames 600...
[2023-02-23 00:53:35,861][05422] Num frames 700...
[2023-02-23 00:53:35,970][05422] Num frames 800...
[2023-02-23 00:53:36,081][05422] Num frames 900...
[2023-02-23 00:53:36,190][05422] Num frames 1000...
[2023-02-23 00:53:36,305][05422] Num frames 1100...
[2023-02-23 00:53:36,413][05422] Num frames 1200...
[2023-02-23 00:53:36,524][05422] Num frames 1300...
[2023-02-23 00:53:36,644][05422] Num frames 1400...
[2023-02-23 00:53:36,750][05422] Avg episode rewards: #0: 14.220, true rewards: #0: 7.220
[2023-02-23 00:53:36,751][05422] Avg episode reward: 14.220, avg true_objective: 7.220
[2023-02-23 00:53:36,814][05422] Num frames 1500...
[2023-02-23 00:53:36,930][05422] Num frames 1600...
[2023-02-23 00:53:37,041][05422] Num frames 1700...
[2023-02-23 00:53:37,152][05422] Num frames 1800...
[2023-02-23 00:53:37,263][05422] Num frames 1900...
[2023-02-23 00:53:37,377][05422] Num frames 2000...
[2023-02-23 00:53:37,496][05422] Num frames 2100...
[2023-02-23 00:53:37,611][05422] Num frames 2200...
[2023-02-23 00:53:37,713][05422] Avg episode rewards: #0: 15.143, true rewards: #0: 7.477
[2023-02-23 00:53:37,716][05422] Avg episode reward: 15.143, avg true_objective: 7.477
[2023-02-23 00:53:37,780][05422] Num frames 2300...
[2023-02-23 00:53:37,889][05422] Num frames 2400...
[2023-02-23 00:53:37,998][05422] Num frames 2500...
[2023-02-23 00:53:38,109][05422] Num frames 2600...
[2023-02-23 00:53:38,218][05422] Num frames 2700...
[2023-02-23 00:53:38,333][05422] Num frames 2800...
[2023-02-23 00:53:38,441][05422] Num frames 2900...
[2023-02-23 00:53:38,548][05422] Num frames 3000...
[2023-02-23 00:53:38,654][05422] Num frames 3100...
[2023-02-23 00:53:38,770][05422] Num frames 3200...
[2023-02-23 00:53:38,880][05422] Num frames 3300...
[2023-02-23 00:53:39,038][05422] Avg episode rewards: #0: 17.488, true rewards: #0: 8.487
[2023-02-23 00:53:39,040][05422] Avg episode reward: 17.488, avg true_objective: 8.487
[2023-02-23 00:53:39,049][05422] Num frames 3400...
[2023-02-23 00:53:39,160][05422] Num frames 3500...
[2023-02-23 00:53:39,271][05422] Num frames 3600...
[2023-02-23 00:53:39,386][05422] Num frames 3700...
[2023-02-23 00:53:39,494][05422] Num frames 3800...
[2023-02-23 00:53:39,563][05422] Avg episode rewards: #0: 15.622, true rewards: #0: 7.622
[2023-02-23 00:53:39,564][05422] Avg episode reward: 15.622, avg true_objective: 7.622
[2023-02-23 00:53:39,663][05422] Num frames 3900...
[2023-02-23 00:53:39,770][05422] Num frames 4000...
[2023-02-23 00:53:39,879][05422] Num frames 4100...
[2023-02-23 00:53:39,986][05422] Num frames 4200...
[2023-02-23 00:53:40,138][05422] Avg episode rewards: #0: 14.152, true rewards: #0: 7.152
[2023-02-23 00:53:40,141][05422] Avg episode reward: 14.152, avg true_objective: 7.152
[2023-02-23 00:53:40,154][05422] Num frames 4300...
[2023-02-23 00:53:40,263][05422] Num frames 4400...
[2023-02-23 00:53:40,375][05422] Num frames 4500...
[2023-02-23 00:53:40,482][05422] Num frames 4600...
[2023-02-23 00:53:40,589][05422] Num frames 4700...
[2023-02-23 00:53:40,700][05422] Num frames 4800...
[2023-02-23 00:53:40,863][05422] Num frames 4900...
[2023-02-23 00:53:41,016][05422] Avg episode rewards: #0: 14.090, true rewards: #0: 7.090
[2023-02-23 00:53:41,018][05422] Avg episode reward: 14.090, avg true_objective: 7.090
[2023-02-23 00:53:41,076][05422] Num frames 5000...
[2023-02-23 00:53:41,227][05422] Num frames 5100...
[2023-02-23 00:53:41,383][05422] Num frames 5200...
[2023-02-23 00:53:41,532][05422] Num frames 5300...
[2023-02-23 00:53:41,683][05422] Num frames 5400...
[2023-02-23 00:53:41,833][05422] Num frames 5500...
[2023-02-23 00:53:41,901][05422] Avg episode rewards: #0: 13.509, true rewards: #0: 6.884
[2023-02-23 00:53:41,903][05422] Avg episode reward: 13.509, avg true_objective: 6.884
[2023-02-23 00:53:42,043][05422] Num frames 5600...
[2023-02-23 00:53:42,196][05422] Num frames 5700...
[2023-02-23 00:53:42,351][05422] Num frames 5800...
[2023-02-23 00:53:42,553][05422] Avg episode rewards: #0: 12.768, true rewards: #0: 6.546
[2023-02-23 00:53:42,555][05422] Avg episode reward: 12.768, avg true_objective: 6.546
[2023-02-23 00:53:42,572][05422] Num frames 5900...
[2023-02-23 00:53:42,725][05422] Num frames 6000...
[2023-02-23 00:53:42,881][05422] Num frames 6100...
[2023-02-23 00:53:43,040][05422] Num frames 6200...
[2023-02-23 00:53:43,199][05422] Num frames 6300...
[2023-02-23 00:53:43,356][05422] Num frames 6400...
[2023-02-23 00:53:43,515][05422] Num frames 6500...
[2023-02-23 00:53:43,669][05422] Num frames 6600...
[2023-02-23 00:53:43,830][05422] Avg episode rewards: #0: 13.266, true rewards: #0: 6.666
[2023-02-23 00:53:43,833][05422] Avg episode reward: 13.266, avg true_objective: 6.666
[2023-02-23 00:54:21,822][05422] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 00:54:22,095][05422] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 00:54:22,097][05422] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 00:54:22,099][05422] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 00:54:22,101][05422] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 00:54:22,103][05422] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 00:54:22,104][05422] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 00:54:22,105][05422] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 00:54:22,107][05422] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 00:54:22,108][05422] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 00:54:22,109][05422] Adding new argument 'hf_repository'='saikiranp/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 00:54:22,110][05422] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 00:54:22,111][05422] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 00:54:22,112][05422] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 00:54:22,113][05422] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 00:54:22,114][05422] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 00:54:22,135][05422] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 00:54:22,138][05422] RunningMeanStd input shape: (1,)
[2023-02-23 00:54:22,156][05422] ConvEncoder: input_channels=3
[2023-02-23 00:54:22,214][05422] Conv encoder output size: 512
[2023-02-23 00:54:22,216][05422] Policy head output size: 512
[2023-02-23 00:54:22,243][05422] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
[2023-02-23 00:54:23,012][05422] Num frames 100...
[2023-02-23 00:54:23,159][05422] Num frames 200...
[2023-02-23 00:54:23,303][05422] Num frames 300...
[2023-02-23 00:54:23,511][05422] Num frames 400...
[2023-02-23 00:54:23,724][05422] Avg episode rewards: #0: 7.800, true rewards: #0: 4.800
[2023-02-23 00:54:23,727][05422] Avg episode reward: 7.800, avg true_objective: 4.800
[2023-02-23 00:54:23,771][05422] Num frames 500...
[2023-02-23 00:54:23,929][05422] Num frames 600...
[2023-02-23 00:54:24,119][05422] Num frames 700...
[2023-02-23 00:54:24,300][05422] Num frames 800...
[2023-02-23 00:54:24,472][05422] Num frames 900...
[2023-02-23 00:54:24,652][05422] Num frames 1000...
[2023-02-23 00:54:24,818][05422] Num frames 1100...
[2023-02-23 00:54:24,980][05422] Num frames 1200...
[2023-02-23 00:54:25,149][05422] Num frames 1300...
[2023-02-23 00:54:25,315][05422] Num frames 1400...
[2023-02-23 00:54:25,461][05422] Num frames 1500...
[2023-02-23 00:54:25,647][05422] Num frames 1600...
[2023-02-23 00:54:25,702][05422] Avg episode rewards: #0: 17.000, true rewards: #0: 8.000
[2023-02-23 00:54:25,705][05422] Avg episode reward: 17.000, avg true_objective: 8.000
[2023-02-23 00:54:25,892][05422] Num frames 1700...
[2023-02-23 00:54:26,113][05422] Num frames 1800...
[2023-02-23 00:54:26,309][05422] Num frames 1900...
[2023-02-23 00:54:26,501][05422] Num frames 2000...
[2023-02-23 00:54:26,690][05422] Num frames 2100...
[2023-02-23 00:54:26,874][05422] Num frames 2200...
[2023-02-23 00:54:27,065][05422] Num frames 2300...
[2023-02-23 00:54:27,267][05422] Num frames 2400...
[2023-02-23 00:54:27,448][05422] Num frames 2500...
[2023-02-23 00:54:27,634][05422] Num frames 2600...
[2023-02-23 00:54:27,819][05422] Num frames 2700...
[2023-02-23 00:54:27,985][05422] Num frames 2800...
[2023-02-23 00:54:28,139][05422] Num frames 2900...
[2023-02-23 00:54:28,303][05422] Avg episode rewards: #0: 20.897, true rewards: #0: 9.897
[2023-02-23 00:54:28,305][05422] Avg episode reward: 20.897, avg true_objective: 9.897
[2023-02-23 00:54:28,341][05422] Num frames 3000...
[2023-02-23 00:54:28,449][05422] Num frames 3100...
[2023-02-23 00:54:28,557][05422] Num frames 3200...
[2023-02-23 00:54:28,668][05422] Num frames 3300...
[2023-02-23 00:54:28,785][05422] Num frames 3400...
[2023-02-23 00:54:28,896][05422] Num frames 3500...
[2023-02-23 00:54:29,007][05422] Num frames 3600...
[2023-02-23 00:54:29,116][05422] Num frames 3700...
[2023-02-23 00:54:29,225][05422] Num frames 3800...
[2023-02-23 00:54:29,336][05422] Num frames 3900...
[2023-02-23 00:54:29,444][05422] Num frames 4000...
[2023-02-23 00:54:29,552][05422] Num frames 4100...
[2023-02-23 00:54:29,661][05422] Num frames 4200...
[2023-02-23 00:54:29,778][05422] Num frames 4300...
[2023-02-23 00:54:29,889][05422] Num frames 4400...
[2023-02-23 00:54:30,048][05422] Avg episode rewards: #0: 24.990, true rewards: #0: 11.240
[2023-02-23 00:54:30,049][05422] Avg episode reward: 24.990, avg true_objective: 11.240
[2023-02-23 00:54:30,059][05422] Num frames 4500...
[2023-02-23 00:54:30,167][05422] Num frames 4600...
[2023-02-23 00:54:30,275][05422] Num frames 4700...
[2023-02-23 00:54:30,385][05422] Num frames 4800...
[2023-02-23 00:54:30,494][05422] Num frames 4900...
[2023-02-23 00:54:30,608][05422] Num frames 5000...
[2023-02-23 00:54:30,724][05422] Num frames 5100...
[2023-02-23 00:54:30,785][05422] Avg episode rewards: #0: 22.608, true rewards: #0: 10.208
[2023-02-23 00:54:30,788][05422] Avg episode reward: 22.608, avg true_objective: 10.208
[2023-02-23 00:54:30,894][05422] Num frames 5200...
[2023-02-23 00:54:31,000][05422] Num frames 5300...
[2023-02-23 00:54:31,108][05422] Num frames 5400...
[2023-02-23 00:54:31,216][05422] Num frames 5500...
[2023-02-23 00:54:31,326][05422] Num frames 5600...
[2023-02-23 00:54:31,442][05422] Num frames 5700...
[2023-02-23 00:54:31,566][05422] Num frames 5800...
[2023-02-23 00:54:31,688][05422] Num frames 5900...
[2023-02-23 00:54:31,805][05422] Num frames 6000...
[2023-02-23 00:54:31,922][05422] Num frames 6100...
[2023-02-23 00:54:32,053][05422] Num frames 6200...
[2023-02-23 00:54:32,169][05422] Num frames 6300...
[2023-02-23 00:54:32,284][05422] Num frames 6400...
[2023-02-23 00:54:32,393][05422] Num frames 6500...
[2023-02-23 00:54:32,502][05422] Num frames 6600...
[2023-02-23 00:54:32,613][05422] Num frames 6700...
[2023-02-23 00:54:32,724][05422] Num frames 6800...
[2023-02-23 00:54:32,839][05422] Num frames 6900...
[2023-02-23 00:54:32,957][05422] Num frames 7000...
[2023-02-23 00:54:33,068][05422] Num frames 7100...
[2023-02-23 00:54:33,177][05422] Num frames 7200...
[2023-02-23 00:54:33,238][05422] Avg episode rewards: #0: 29.173, true rewards: #0: 12.007
[2023-02-23 00:54:33,240][05422] Avg episode reward: 29.173, avg true_objective: 12.007
[2023-02-23 00:54:33,345][05422] Num frames 7300...
[2023-02-23 00:54:33,453][05422] Num frames 7400...
[2023-02-23 00:54:33,561][05422] Num frames 7500...
[2023-02-23 00:54:33,669][05422] Num frames 7600...
[2023-02-23 00:54:33,782][05422] Num frames 7700...
[2023-02-23 00:54:33,892][05422] Num frames 7800...
[2023-02-23 00:54:33,999][05422] Num frames 7900...
[2023-02-23 00:54:34,106][05422] Num frames 8000...
[2023-02-23 00:54:34,214][05422] Num frames 8100...
[2023-02-23 00:54:34,304][05422] Avg episode rewards: #0: 27.760, true rewards: #0: 11.617
[2023-02-23 00:54:34,306][05422] Avg episode reward: 27.760, avg true_objective: 11.617
[2023-02-23 00:54:34,383][05422] Num frames 8200...
[2023-02-23 00:54:34,492][05422] Num frames 8300...
[2023-02-23 00:54:34,601][05422] Num frames 8400...
[2023-02-23 00:54:34,709][05422] Num frames 8500...
[2023-02-23 00:54:34,822][05422] Num frames 8600...
[2023-02-23 00:54:34,931][05422] Num frames 8700...
[2023-02-23 00:54:35,042][05422] Num frames 8800...
[2023-02-23 00:54:35,157][05422] Num frames 8900...
[2023-02-23 00:54:35,248][05422] Avg episode rewards: #0: 26.165, true rewards: #0: 11.165
[2023-02-23 00:54:35,249][05422] Avg episode reward: 26.165, avg true_objective: 11.165
[2023-02-23 00:54:35,327][05422] Num frames 9000...
[2023-02-23 00:54:35,436][05422] Num frames 9100...
[2023-02-23 00:54:35,545][05422] Num frames 9200...
[2023-02-23 00:54:35,655][05422] Num frames 9300...
[2023-02-23 00:54:35,764][05422] Num frames 9400...
[2023-02-23 00:54:35,876][05422] Num frames 9500...
[2023-02-23 00:54:35,984][05422] Num frames 9600...
[2023-02-23 00:54:36,097][05422] Num frames 9700...
[2023-02-23 00:54:36,207][05422] Num frames 9800...
[2023-02-23 00:54:36,316][05422] Num frames 9900...
[2023-02-23 00:54:36,425][05422] Num frames 10000...
[2023-02-23 00:54:36,538][05422] Num frames 10100...
[2023-02-23 00:54:36,656][05422] Num frames 10200...
[2023-02-23 00:54:36,769][05422] Num frames 10300...
[2023-02-23 00:54:36,887][05422] Num frames 10400...
[2023-02-23 00:54:36,996][05422] Num frames 10500...
[2023-02-23 00:54:37,146][05422] Avg episode rewards: #0: 27.874, true rewards: #0: 11.763
[2023-02-23 00:54:37,148][05422] Avg episode reward: 27.874, avg true_objective: 11.763
[2023-02-23 00:54:37,165][05422] Num frames 10600...
[2023-02-23 00:54:37,272][05422] Num frames 10700...
[2023-02-23 00:54:37,380][05422] Num frames 10800...
[2023-02-23 00:54:37,486][05422] Num frames 10900...
[2023-02-23 00:54:37,595][05422] Num frames 11000...
[2023-02-23 00:54:37,704][05422] Num frames 11100...
[2023-02-23 00:54:37,864][05422] Num frames 11200...
[2023-02-23 00:54:38,019][05422] Num frames 11300...
[2023-02-23 00:54:38,175][05422] Num frames 11400...
[2023-02-23 00:54:38,366][05422] Avg episode rewards: #0: 27.083, true rewards: #0: 11.483
[2023-02-23 00:54:38,371][05422] Avg episode reward: 27.083, avg true_objective: 11.483
[2023-02-23 00:55:44,973][05422] Replay video saved to /content/train_dir/default_experiment/replay.mp4!