PPO Agent playing PongNoFrameskip-v4

This is a trained model of a PPO agent playing PongNoFrameskip-v4 using the stable-baselines3 library and the RL Zoo.

The RL Zoo is a training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

Usage (with SB3 RL Zoo)

RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo
SB3: https://github.com/DLR-RM/stable-baselines3
SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib

Install the RL Zoo (with SB3 and SB3-Contrib):

pip install rl_zoo3
# Download model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo ppo --env PongNoFrameskip-v4 -orga MattStammers -f logs/
python -m rl_zoo3.enjoy --algo ppo --env PongNoFrameskip-v4 -f logs/
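
To use the downloaded agent from Python rather than through the enjoy script, something like the following sketch could work. It assumes the RL Zoo stores checkpoints as logs/<algo>/<env>_<run id>/<env>.zip, which is the usual layout but is not stated in this card. The helpers find_latest_model and load_agent are hypothetical names used for illustration only. Loading the model does not recreate the Atari preprocessing and frame stacking that the enjoy script applies.

```python
from pathlib import Path


def find_latest_model(log_dir, algo, env_id):
    """Return the checkpoint from the highest-numbered run folder, or None.

    Assumes the layout <log_dir>/<algo>/<env_id>_<run id>/<env_id>.zip.
    """
    runs = []
    for zip_path in Path(log_dir).glob(f"{algo}/{env_id}_*/{env_id}.zip"):
        suffix = zip_path.parent.name.rsplit("_", 1)[-1]
        if suffix.isdigit():
            runs.append((int(suffix), zip_path))
    if not runs:
        return None
    # Compare run ids numerically so that run 10 ranks above run 2.
    return max(runs, key=lambda item: item[0])[1]


def load_agent(model_path):
    """Load a PPO checkpoint with Stable Baselines3 (imported lazily)."""
    from stable_baselines3 import PPO

    return PPO.load(model_path)


model_path = find_latest_model("logs/", "ppo", "PongNoFrameskip-v4")
if model_path is None:
    print("No checkpoint found; run rl_zoo3.load_from_hub first.")
else:
    agent = load_agent(model_path)
    print(agent.policy)
```

The lazy import keeps the path lookup usable even where Stable Baselines3 is not installed.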

Training (with the RL Zoo)

python -m rl_zoo3.train --algo ppo --env PongNoFrameskip-v4 -f logs/
# Upload the model and generate video (when possible)
python -m rl_zoo3.push_to_hub --algo ppo --env PongNoFrameskip-v4 -f logs/ -orga MattStammers


Hyperparameters

OrderedDict([('batch_size', 256),
             ('clip_range', 'lin_0.1'),
             ('ent_coef', 0.01),
             ('frame_stack', 4),
             ('learning_rate', 'lin_2.5e-4'),
             ('n_envs', 8),
             ('n_epochs', 4),
             ('n_steps', 128),
             ('n_timesteps', 100000000.0),
             ('normalize', False),
             ('policy', 'CnnPolicy'),
             ('vf_coef', 0.5)])
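
The 'lin_' prefix on learning_rate and clip_range is the RL Zoo shorthand for a value that decays linearly to zero over training. The sketch below illustrates that convention and the update arithmetic implied by the settings above. The function names linear_schedule and parse_hyperparam are illustrative and are not claimed to match the Zoo's internals.

```python
def linear_schedule(initial_value):
    """Return f(progress_remaining) that decays linearly from initial_value to 0."""

    def schedule(progress_remaining):
        # progress_remaining runs from 1.0 (start of training) to 0.0 (end).
        return progress_remaining * initial_value

    return schedule


def parse_hyperparam(value):
    """Turn a 'lin_<x>' string into a linear schedule, or a number into a constant one."""
    if isinstance(value, str) and value.startswith("lin_"):
        return linear_schedule(float(value[len("lin_"):]))
    return lambda _progress_remaining: float(value)


learning_rate = parse_hyperparam("lin_2.5e-4")
clip_range = parse_hyperparam("lin_0.1")

for progress in (1.0, 0.5, 0.0):
    print(progress, learning_rate(progress), clip_range(progress))

# Update arithmetic from the hyperparameters listed above.
n_envs, n_steps, batch_size, n_epochs = 8, 128, 256, 4
n_timesteps = int(100000000.0)

rollout_size = n_envs * n_steps                      # transitions collected per update
minibatches_per_epoch = rollout_size // batch_size   # minibatches in one pass over the rollout
grad_steps_per_update = minibatches_per_epoch * n_epochs
total_updates = n_timesteps // rollout_size

print(rollout_size, minibatches_per_epoch, grad_steps_per_update, total_updates)
```

With these settings each policy update uses 1024 transitions, split into 4 minibatches and revisited for 4 epochs, which gives 16 gradient steps per update.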

Environment Arguments

{'render_mode': 'rgb_array'}

This agent, like the others, plays almost perfectly against Pong's built-in computer opponent. The next step is to have the agents play each other to improve further!
