Lew committed on
Commit e3bb541 · 1 Parent(s): 625bc69
Files changed (3)
  1. README.md +15 -34
  2. replay.mp4 +0 -0
  3. results.json +1 -1
README.md CHANGED
@@ -1,11 +1,10 @@
  ---
+ library_name: stable-baselines3
  tags:
  - LunarLander-v2
- - ppo
  - deep-reinforcement-learning
  - reinforcement-learning
- - custom-implementation
- - deep-rl-course
+ - stable-baselines3
  model-index:
  - name: PPO
  results:
@@ -17,40 +16,22 @@ model-index:
  type: LunarLander-v2
  metrics:
  - type: mean_reward
- value: 191.79 +/- 96.38
+ value: 288.87 +/- 17.85
  name: mean_reward
  verified: false
  ---

- # PPO Agent Playing LunarLander-v2
+ # **PPO** Agent playing **LunarLander-v2**
+ This is a trained model of a **PPO** agent playing **LunarLander-v2**
+ using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).

- This is a trained model of a PPO agent playing LunarLander-v2.
+ ## Usage (with Stable-baselines3)
+ TODO: Add your code

- # Hyperparameters
- ```python
- {'path': '/content/drive/MyDrive/Colab Notebooks/HuggingFace/RL/Unit08'
- 'name': 'ppo-LunaLander_1.pt'
- 'env-id': 'LunarLander-v2'
- 'agent_properties': {'num_layers': 2
- 'hidden': 128
- 'activation': 'Tanh'}
- 'seed': ''
- 'device': 'cuda'
- 'total_timesteps': 100000
- 'num_steps': 32768
- 'batch_size': 64
- 'update_epochs': 2
- 'learning_rate': 1e-05
- 'lr_schedule': 'Exp'
- 'lr_final': 1e-06
- 'gamma': 0.995
- 'gae_lambda': 0.99
- 'norm_adv': 'True'
- 'clip_coef': 0.2
- 'clip_vloss': 'False'
- 'entropy_loss_coef': 0.01
- 'value_loss_coef': 0.5
- 'max_grad_norm': 0.5
- 'n_eval_episodes': 10}
- ```
-
+
+ ```python
+ from stable_baselines3 import ...
+ from huggingface_sb3 import load_from_hub
+
+ ...
+ ```
replay.mp4 CHANGED
Binary files a/replay.mp4 and b/replay.mp4 differ
 
results.json CHANGED
@@ -1 +1 @@
- {"env_id": "LunarLander-v2", "mean_reward": 191.7946600494358, "std_reward": 96.37600757418993, "n_evaluation_episodes": 10, "eval_datetime": "2023-12-30T10:53:53.358557"}
+ {"mean_reward": 288.8737752, "std_reward": 17.847514688393968, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-11-08T10:22:42.851796"}
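
Review note: the `value` strings in the README metadata (`191.79 +/- 96.38` before, `288.87 +/- 17.85` after) look like the `mean_reward` and `std_reward` fields of `results.json` rounded to two decimals. The sketch below is one way to check that the two files agree. The helper name `format_mean_reward` is hypothetical and not part of this repository or of any library, and the two-decimal rounding is inferred from the values in this diff rather than from documented tooling behavior.

```python
import json


def format_mean_reward(results_json: str) -> str:
    """Build a 'mean +/- std' string from a results.json payload.

    Assumes the payload has 'mean_reward' and 'std_reward' keys, as both
    versions of results.json in this commit do.
    """
    results = json.loads(results_json)
    return f"{results['mean_reward']:.2f} +/- {results['std_reward']:.2f}"


# Payloads copied from the two sides of the results.json diff.
old_results = (
    '{"env_id": "LunarLander-v2", "mean_reward": 191.7946600494358, '
    '"std_reward": 96.37600757418993, "n_evaluation_episodes": 10, '
    '"eval_datetime": "2023-12-30T10:53:53.358557"}'
)
new_results = (
    '{"mean_reward": 288.8737752, "std_reward": 17.847514688393968, '
    '"is_deterministic": true, "n_eval_episodes": 10, '
    '"eval_datetime": "2023-11-08T10:22:42.851796"}'
)

print(format_mean_reward(old_results))
print(format_mean_reward(new_results))
```

Both formatted strings match the corresponding `value` lines in the README diff, so the metadata and `results.json` are consistent on each side of this commit.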