sugarfreez committed
Commit · df2b142
1 Parent(s): 3c336c3
style(nyz): add status emoji and env link
README.md
CHANGED
@@ -25,22 +25,19 @@ If you want to contact us & join us, you can ✉️ to our team : <opendilab@p


# Overview of Model Zoo
-
-<sup>(1): "-" means that this algorithm doesn't support this environment.</sup>
-<sup>(2): "W" means that the corresponding model is in the upload waitinglist.</sup>
-
+<sup>(1): "🙅" means that this algorithm doesn't support this environment.</sup>
+<sup>(2): "🔮" means that the corresponding model is in the upload waitinglist.</sup>
### Deep Reinforcement Learning
-[old Model Zoo table: header and algorithm rows not preserved in this view; only the SAC row below was recovered]
-| [SAC](https://arxiv.org/pdf/1801.01290.pdf) | | | | - | - | - | | | |
+| Algo.\Env. | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
+| :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
+| [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-ppo) | | | | | | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-PPO) | | |
+| [PG](https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf) | 🔮 | | | | | | 🔮 | | |
+| [A2C](https://arxiv.org/pdf/1602.01783.pdf) | 🔮 | | | | | | 🔮 | | |
+| [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) | 🔮 | | | | | | 🔮 | | |
+| [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | 🔮 | | | | | | 🙅 | 🙅 | 🙅 |
+| [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | 🔮 | | | 🙅 | 🙅 | 🙅 | 🔮 | | |
+| [TD3](https://arxiv.org/pdf/1802.09477.pdf) | 🔮 | | | 🙅 | 🙅 | 🙅 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-TD3) | | |
+| [SAC](https://arxiv.org/pdf/1801.01290.pdf) | 🔮 | | | 🙅 | 🙅 | 🙅 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-SAC) | | |


### Multi-Agent Reinforcement Learning
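Each ✅ cell in the updated table links to a pretrained checkpoint repository on the Hugging Face Hub. As a minimal sketch (not part of this commit), one of the linked repos, `OpenDILabCommunity/LunarLander-v2-ppo`, can be fetched with the standard `huggingface_hub` client; how the downloaded files are then loaded into a DI-engine policy depends on the model card and is not shown here.

```python
# Minimal sketch: download a Model Zoo checkpoint linked from the table above.
# Assumes the `huggingface_hub` package is installed; the repo id is taken from
# the PPO/LunarLander cell. The directory listing is only for inspection.
from pathlib import Path

from huggingface_hub import snapshot_download

# Download the full repository snapshot (config, policy weights, model card, ...).
local_dir = snapshot_download(repo_id="OpenDILabCommunity/LunarLander-v2-ppo")

# Show what the repo contains before wiring it into a policy.
for path in sorted(Path(local_dir).rglob("*")):
    if path.is_file():
        print(path.relative_to(local_dir))
```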