---
title: README
emoji: πŸ’»
colorFrom: red
colorTo: gray
sdk: static
pinned: false
license: apache-2.0
---



---

OpenDILab is the **first decision intelligence platform** covering **the most comprehensive set of algorithms and applications in both academia and industry.** OpenDILab was officially open-sourced at the **World Artificial Intelligence Conference (WAIC)** opening ceremony in July 2021.

As an important part of OpenXLab from Shanghai AI Laboratory, OpenDILab provides a complete set of training and deployment frameworks for decision intelligence, consisting of the **application ecological layer, algorithm abstraction layer, distributed management layer, and distributed execution layer.** OpenDILab also supports full-scale scheduling optimization, from a single machine up to the joint training of thousands of CPUs/GPUs.

![feature](./opendilab1.0_feature.png)

OpenDILab contributes to the integration of the latest and most comprehensive achievements in academia as well as the standardization of complex problems in the industry. Our future vision is to promote the development of AI **from perceptual intelligence to decision intelligence,** taking AI technology to a higher level of the general intelligence era.

If you would like to contact us or join us, please βœ‰οΈ our team at <[email protected]>.



# Overview of Model Zoo
<sup>(1): "πŸ”’" means that this algorithm doesn't support this environment.</sup>
<sup>(2): "⏳" means that the corresponding model is on the upload waiting list (Work In Progress).</sup>
### Deep Reinforcement Learning
<details open>
<summary>(Click to Collapse)</summary>

| Algo.\Env.   | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [LunarLanderContinuous](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
| :-------------: | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
| [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [βœ…](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/LunarLanderContinuous-v2-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/Hopper-v3-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-PPO) | [βœ…](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-PPO) |
| [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | [βœ…](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-DQN) | πŸ”’ | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-DQN) | [βœ…](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-DQN) | [βœ…](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-DQN) | πŸ”’ | πŸ”’ | πŸ”’ |
| [C51](https://arxiv.org/pdf/1707.06887.pdf) | [βœ…](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-C51) | πŸ”’ | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-C51) | [βœ…](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-C51) | [βœ…](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-C51) | πŸ”’ | πŸ”’ | πŸ”’ |
| [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-DDPG) | [βœ…](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-DDPG) | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-DDPG) | πŸ”’ | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/Hopper-v3-DDPG) | [βœ…](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-DDPG) | [βœ…](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-DDPG) |
| [TD3](https://arxiv.org/pdf/1802.09477.pdf) | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-TD3) | [βœ…](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-TD3) | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-TD3) | πŸ”’ | πŸ”’ | πŸ”’ |[βœ…](https://huggingface.co/OpenDILabCommunity/Hopper-v3-TD3)  | [βœ…](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-TD3) | [βœ…](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-TD3) |
| [SAC](https://arxiv.org/pdf/1801.01290.pdf) | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-SAC) | [βœ…](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-SAC) | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-SAC) | πŸ”’ | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/Hopper-v3-SAC) | [βœ…](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-SAC) | [βœ…](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-SAC) |
| [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) | [βœ…](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-IMPALA) |  |  |  | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-IMPALA)  | [βœ…](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-IMPALA)  |  |  |  |  |

</details>
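Each βœ… above links to a model repo on the Hugging Face Hub, so checkpoints can also be fetched programmatically. A minimal sketch follows, assuming the `huggingface_hub` client library is installed (`pip install huggingface_hub`); the helper names `zoo_repo_id` and `download_policy` are illustrative, not part of any OpenDILab API. The repo ids follow the `OpenDILabCommunity/<Env>-<Algo>` pattern seen in the tables:

```python
def zoo_repo_id(env_id: str, algo: str) -> str:
    """Build a Model Zoo repo id following the naming used in the tables above."""
    return f"OpenDILabCommunity/{env_id}-{algo}"


def download_policy(env_id: str, algo: str) -> str:
    """Download all files of a zoo repo; returns the local snapshot path."""
    from huggingface_hub import snapshot_download  # lazy import: needs huggingface_hub
    return snapshot_download(repo_id=zoo_repo_id(env_id, algo))


if __name__ == "__main__":
    # e.g. the DQN agent for LunarLander from the first table
    print(download_policy("Lunarlander-v2", "DQN"))
```

The downloaded snapshot contains the model card and checkpoint files exactly as stored in the linked repo.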

### Monte Carlo Tree Search
<details open>
<summary>(Click to Collapse)</summary>

| Algo.\Env.   | [CartPole](https://di-engine-docs.readthedocs.io/en/latest/13_envs/cartpole.html) | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [LunarLanderContinuous](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Breakout](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [MsPacman](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | TicTacToe |
| :-------------: | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: |
| [AlphaZero](https://www.science.org/doi/10.1126/science.aar6404) | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-AlphaZero) |
| [Sampled AlphaZero](https://www.science.org/doi/10.1126/science.aar6404) | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-SampledAlphaZero) |
| [MuZero](https://arxiv.org/abs/1911.08265) | [βœ…](https://huggingface.co/OpenDILabCommunity/CartPole-v0-MuZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-MuZero) | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-MuZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-MuZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/BreakoutNoFrameskip-v4-MuZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-MuZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-MuZero) |
| [EfficientZero](https://arxiv.org/abs/2111.00210) | [βœ…](https://huggingface.co/OpenDILabCommunity/CartPole-v0-EfficientZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-EfficientZero) | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-EfficientZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-EfficientZero) |  | [βœ…](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-EfficientZero) | πŸ”’ |
| [Gumbel MuZero](https://openreview.net/pdf?id=bERaNdoegnO&) | [βœ…](https://huggingface.co/OpenDILabCommunity/CartPole-v0-GumbelMuZero) |  | πŸ”’ |  |  | πŸ”’ | πŸ”’ | [βœ…](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-GumbelMuZero) |
| [Sampled EfficientZero](https://arxiv.org/abs/2104.06303) | [βœ…](https://huggingface.co/OpenDILabCommunity/CartPole-v0-SampledEfficientZero) |  |  | [βœ…](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-SampledEfficientZero) | [βœ…](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-SampledEfficientZero) |  | [βœ…](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-SampledEfficientZero) | πŸ”’ |
| [Stochastic MuZero](https://openreview.net/pdf?id=X6D9bAHhBQ1) | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ | πŸ”’ |

</details>

### Multi-Agent Reinforcement Learning
<details close>
<summary>(Click for Details)</summary>
TBD
</details>

### Offline Reinforcement Learning
<details close>
<summary>(Click for Details)</summary>
TBD
</details>

### Model-Based Reinforcement Learning
<details close>
<summary>(Click for Details)</summary>
TBD
</details>