Update README.md
README.md
CHANGED
@@ -43,7 +43,7 @@ Hyperparameters for DIPO have been shown as follow for easily reproducing our re
 | No. of hidden nodes | 256 | 256 | 256 | 256 |
 | Activation | mish | relu | relu | tanh |
 | Batch size | 256 | 256 | 256 | 256 |
-| Discount for reward
+| Discount for reward $$\gamma$$ | 0.99 | 0.99 | 0.99 | 0.99 |
 | Target smoothing coefficient $\tau$ | 0.005 | 0.005 | 0.005 | 0.005 |
 | Learning rate for actor | $3 × 10^{-4}$ | $3 × 10^{-4}$ | $3 × 10^{-4}$ | $7 × 10^{-4}$ |
 | Learning rate for actor | $3 × 10^{-4}$ | $3 × 10^{-4}$ | $3 × 10^{-4}$ | $7 × 10^{-4}$ |
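
For readers unfamiliar with the two coefficients in the rows of this hunk, the following is a minimal sketch of how a discount $\gamma$ and a target smoothing coefficient $\tau$ are conventionally used in off-policy actor-critic training. It is not taken from the DIPO code base: the function names `discounted_return` and `soft_update` are hypothetical, and only the values 0.99 and 0.005 come from the table.

```python
# Values shared by all four columns of the table in this hunk.
GAMMA = 0.99   # discount for reward
TAU = 0.005    # target smoothing coefficient

def discounted_return(rewards, gamma=GAMMA):
    """Return sum_t gamma**t * rewards[t], accumulated from the last step back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

def soft_update(target, online, tau=TAU):
    """Polyak averaging (assumed convention): target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]

if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0]))
    print(soft_update([0.0, 2.0], [1.0, 2.0]))
```

With $\tau = 0.005$, each update moves the target parameters only 0.5% of the way toward the online parameters, which is why the target network changes slowly.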