What is the training command for this model (mainly about the 2B)?
Hey @edbeeching ,
Thank you very much for these policies and for the JAT dataset created with these policies.
I'm trying to create similar policies for some Atari envs outside the Atari 57.
Is the training command used to create these with Sample Factory the following?

```shell
python -m sf_examples.atari.train_atari --algo=APPO --env=${ENV} --train_for_env_steps=2000000000 --experiment="atari_2B_${ENV}_1111"
```
In particular, does the 2B in the model name mean 2 billion train_for_env_steps? The default value of that argument appears to be 100 million.
Thank you,
Kaustubh
Actually, nvm, I think I found the training command inside here: https://huggingface.co/edbeeching/atari_2B_atari_mspacman_1111/blob/main/cfg.json#L124
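For anyone else checking hyperparameters this way, here is a minimal sketch for reading train_for_env_steps out of a locally saved copy of that cfg.json (the local path and helper name are my own assumptions, not from the repo):

```python
import json

def read_env_steps(cfg_path: str = "cfg.json") -> int:
    """Return the train_for_env_steps value stored in a Sample Factory cfg.json."""
    with open(cfg_path) as f:
        cfg = json.load(f)
    return cfg["train_for_env_steps"]
```

For the 2B checkpoints this should come back as 2000000000, matching the "2B" in the model name.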
One last question: when I try to run the command above, I get the following error:

```
train_atari.py: error: unrecognized arguments: --env_agents=512
```

Can I just remove --env_agents=512?
@edbeeching
Edit:
The reason I got the above error was because I was using sf_examples.atari.train_atari instead of sf_examples.envpool.atari.train_envpool_atari.
The correct training command is:

```shell
python -m sf_examples.envpool.atari.train_envpool_atari \
  --seed=1111 --experiment=atari_2B_${ENV}_1111 --env=${ENV} \
  --train_for_seconds=3600000 --algo=APPO --gamma=0.99 \
  --num_workers=4 --num_envs_per_worker=1 --worker_num_splits=1 \
  --env_agents=512 --benchmark=False --max_grad_norm=0.0 \
  --decorrelate_experience_max_seconds=1 \
  --encoder_conv_architecture=convnet_atari --encoder_conv_mlp_layers 512 \
  --nonlinearity=relu --num_policies=1 \
  --normalize_input=True --normalize_input_keys obs --normalize_returns=True \
  --async_rl=True --batched_sampling=True \
  --train_for_env_steps=2000000000 --save_milestones_sec=1200 \
  --train_dir train_dir --rollout 64 \
  --exploration_loss_coeff 0.0004677351413 \
  --num_epochs 2 --batch_size 1024 --num_batches_per_epoch 8 \
  --learning_rate 0.0003033891184
```
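Before committing to a full 2B-step run on a new env, it may help to smoke-test the setup with a much smaller budget. This is an unverified sketch that reuses flags from the command above (the experiment name and reduced values are my own choices), assuming sample-factory with envpool support is installed:

```shell
# Quick sanity check: same entry point, tiny step budget and fewer env agents.
ENV=atari_mspacman
python -m sf_examples.envpool.atari.train_envpool_atari \
  --env=${ENV} --experiment=smoke_${ENV} \
  --algo=APPO --async_rl=True --batched_sampling=True \
  --env_agents=64 --train_for_env_steps=100000
```

If this runs without the unrecognized-arguments error, the full command should work too.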