Andrei Cozma committed
Commit 0c8eecb · 1 Parent(s): 879176c
Updates
Files changed:
- MonteCarloAgent.py +0 -1
- README.md +7 -1
- policy_mc_CliffWalking-v0_e2000_s500_g0.99_e0.1.npy +0 -0
MonteCarloAgent.py CHANGED

@@ -116,7 +116,6 @@ class MonteCarloAgent:
 
             if e % test_every == 0:
                 test_success_rate = self.test(verbose=False, **kwargs)
-
                 if log_wandb:
                     self.wandb_log_img(episode=e)
 
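For context, the hunk above only removes a blank line between the periodic evaluation and the optional WandB image logging. A minimal sketch of how such a loop might look is below; `test_every`, `log_wandb`, `self.test`, and `self.wandb_log_img` come from the diff, while the method signature, `n_episodes`, and `run_episode` are illustrative assumptions, not the repository's actual code:

```python
# Hypothetical training loop around the diffed lines (a sketch, not the
# repo's MonteCarloAgent.train): every test_every episodes the agent is
# evaluated, and the result is optionally logged to WandB.
def train(self, n_episodes=2000, test_every=100, log_wandb=False, **kwargs):
    for e in range(n_episodes):
        self.run_episode(**kwargs)  # assumed rollout + MC update helper
        if e % test_every == 0:
            test_success_rate = self.test(verbose=False, **kwargs)
            if log_wandb:
                self.wandb_log_img(episode=e)
```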
README.md CHANGED

@@ -4,9 +4,15 @@
 
 Evolution of Reinforcement Learning methods from pure Dynamic Programming-based methods to Monte Carlo methods + Bellman Optimization Comparison
 
+## Requirements
+
+- Python 3
+- Gymnasium: <https://pypi.org/project/gymnasium/>
+- WandB: <https://pypi.org/project/wandb/> (optional for logging)
+
 ## Monte-Carlo Agent
 
-The implementation of the epsilon-greedy Monte-Carlo agent for the [Cliff Walking](https://gymnasium.farama.org/environments/toy_text/cliff_walking/) toy environment.
+The implementation of the epsilon-greedy Monte-Carlo agent for the [Cliff Walking](https://gymnasium.farama.org/environments/toy_text/cliff_walking/) toy environment as part of Gymnasium.
 
 ### Training
 
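The new Requirements section points at Gymnasium and (optionally) WandB. As an illustration of what the README's "epsilon-greedy Monte-Carlo agent" refers to, here is a self-contained first-visit Monte-Carlo control sketch for CliffWalking-v0. It is a generic rendering of the technique, with hyperparameters borrowed from the saved policy's filename, not the repository's MonteCarloAgent code:

```python
# Illustrative epsilon-greedy first-visit Monte-Carlo control on
# CliffWalking-v0. A sketch, not the repo's MonteCarloAgent; hyperparameters
# mirror the policy filename (e2000 episodes, s500 steps, g0.99, e0.1).
import numpy as np
import gymnasium as gym

env = gym.make("CliffWalking-v0")
n_states, n_actions = env.observation_space.n, env.action_space.n

gamma, epsilon = 0.99, 0.1
Q = np.zeros((n_states, n_actions))       # action-value estimates
visits = np.zeros((n_states, n_actions))  # first-visit counts for averaging

for episode in range(2000):
    # Roll out one episode with an epsilon-greedy policy derived from Q.
    trajectory = []
    state, _ = env.reset()
    for _ in range(500):
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
        if terminated or truncated:
            break

    # Walk the episode backwards, accumulating the discounted return G,
    # and update Q only at the first visit of each (state, action) pair.
    first_visit = {}
    for t, (s, a, _) in enumerate(trajectory):
        first_visit.setdefault((s, a), t)
    G = 0.0
    for t in range(len(trajectory) - 1, -1, -1):
        s, a, r = trajectory[t]
        G = gamma * G + r
        if first_visit[(s, a)] == t:
            visits[s, a] += 1
            Q[s, a] += (G - Q[s, a]) / visits[s, a]

policy = np.argmax(Q, axis=1)  # greedy policy recovered from Q
```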
policy_mc_CliffWalking-v0_e2000_s500_g0.99_e0.1.npy CHANGED

Binary files a/policy_mc_CliffWalking-v0_e2000_s500_g0.99_e0.1.npy and b/policy_mc_CliffWalking-v0_e2000_s500_g0.99_e0.1.npy differ
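The policy file's name encodes the run's hyperparameters (e2000 episodes, s500 max steps per episode, g0.99 discount, e0.1 epsilon). Assuming it stores the trained policy as a NumPy array, which the diff itself does not confirm, it could be inspected like this:

```python
import numpy as np

# Assumption: the .npy file holds the trained policy as a NumPy array;
# the commit only shows that its binary content changed.
policy = np.load("policy_mc_CliffWalking-v0_e2000_s500_g0.99_e0.1.npy")
print(policy.shape, policy.dtype)
```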