Feat: add DeepMD pretrain model (#12)
* feat: add deepmd pretrain model
* separate model into different file in externals
* chore: add deepmd dependency
* skip hf model download for testing external fork
* change class name
* add HF HTTPError
* chore: downgrade deepmd to match pt version
* chore: try install deepmd from repo
* skip missing json on leaderboard; add installation instruction
* fix callout render
* fix readme path in pyproject.toml
---------
Co-authored-by: Yuan Chiang <[email protected]>
- .github/README.md +29 -3
- .github/workflows/test.yaml +2 -2
- mlip_arena/models/externals/deepmd.py +47 -0
- mlip_arena/models/registry.yaml +19 -1
- pyproject.toml +6 -4
- serve/leaderboard.py +12 -12
- tests/test_external_calculators.py +8 -1
.github/README.md
CHANGED
@@ -12,6 +12,25 @@
 
 MLIP Arena is a platform for evaluating foundation machine learning interatomic potentials (MLIPs) beyond conventional energy and force error metrics. It focuses on revealing the underlying physics and chemistry learned by these models and assessing their performance in molecular dynamics (MD) simulations. The platform's benchmarks are specifically designed to evaluate the readiness and reliability of open-source, open-weight models in accurately reproducing both qualitative and quantitative behaviors of atomic systems.
 
+## Installation
+
+### From PyPI (without model running capability)
+
+```bash
+pip install mlip-arena
+```
+
+### From source
+
+```bash
+git clone https://github.com/atomind-ai/mlip-arena.git
+pip install torch==2.2.0
+bash scripts/install-pyg.sh
+bash scripts/install-dgl.sh
+pip install .[test]
+pip install .[mace]
+```
+
 ## Contribute
 
 MLIP Arena is now in pre-alpha. If you're interested in joining the effort, please reach out to Yuan at [[email protected]](mailto:[email protected]). See [project page](https://github.com/orgs/atomind-ai/projects/1) for some outstanding tasks.
@@ -22,18 +41,25 @@ MLIP Arena is now in pre-alpha. If you're interested in joining the effort, plea
 streamlit run serve/app.py
 ```
 
-### Add new benchmark tasks
+### Add new benchmark tasks (WIP)
+
+> [!NOTE]
+> Please reuse or extend the general tasks defined as Prefect / Atomate2 workflow.
+> The following are some tasks implemented:
+> - [Prefect structure optimization (OPT)](../mlip_arena/tasks/optimize.py)
+> - [Prefect molecular dynamics (MD)](../mlip_arena/tasks/md.py)
+> - [Prefect equation of states (EOS)](../mlip_arena/tasks/eos/run.py)
 
 1. Follow the task template to implement the task class and upload the script along with metadata to the MLIP Arena [here](../mlip_arena/tasks/README.md).
 2. Code a benchmark script to evaluate the performance of your model on the task. The script should be able to load the model and the dataset, and output the evaluation metrics.
 
-### Add new MLIP models
+### Add new MLIP models
 
 If you have pretrained MLIP models that you would like to contribute to the MLIP Arena and show benchmark in real-time, there are two ways:
 
 #### External ASE Calculator (easy)
 
-1. Implement new ASE Calculator class in [mlip_arena/models/
+1. Implement new ASE Calculator class in [mlip_arena/models/externals](../mlip_arena/models/externals).
 2. Name your class with awesome model name and add the same name to [registry](../mlip_arena/models/registry.yaml) with metadata.
 
 > [!CAUTION]
.github/workflows/test.yaml
CHANGED
@@ -28,14 +28,14 @@ jobs:
           pip install torch==2.2.0
           bash scripts/install-pyg.sh
           bash scripts/install-dgl.sh
-          pip install .[mace]
           pip install .[test]
-          pip install
+          pip install .[mace]
 
       - name: List dependencies
         run: pip list
 
       - name: Login huggingface
+        if: ${{ github.event.pull_request.head.repo.full_name == github.repository }}
         env:
           HF_TOKEN: ${{ secrets.HF_TOKEN_READ_ONLY }}
         run:
mlip_arena/models/externals/deepmd.py
ADDED
@@ -0,0 +1,47 @@
+from __future__ import annotations
+
+from pathlib import Path
+
+import yaml
+import requests
+from deepmd.calculator import DP as DPCalculator
+
+from mlip_arena.models.utils import get_freer_device
+
+with open(Path(__file__).parents[1] / "registry.yaml", encoding="utf-8") as f:
+    REGISTRY = yaml.safe_load(f)
+
+class DeepMD(DPCalculator):
+    def __init__(
+        self,
+        checkpoint=REGISTRY["DeepMD"]["checkpoint"],
+        device=None,
+        **kwargs,
+    ):
+        device = device or get_freer_device()
+
+        cache_dir = Path.home() / ".cache" / "deepmd"
+        cache_dir.mkdir(parents=True, exist_ok=True)
+        model_path = cache_dir / checkpoint
+
+        url = "https://bohrium-api.dp.tech/ds-dl/mlip-arena-tfpk-v1.zip"
+
+        if not model_path.exists():
+            import zipfile
+
+            print(f"Downloading DeepMD model from {url} to {model_path}...")
+            try:
+                response = requests.get(url, stream=True, timeout=120)
+                response.raise_for_status()
+                with open(cache_dir/"temp.zip", "wb") as f:
+                    for chunk in response.iter_content(chunk_size=8192):
+                        f.write(chunk)
+                print("Download completed.")
+                with zipfile.ZipFile(cache_dir/"temp.zip", "r") as zip_ref:
+                    zip_ref.extractall(cache_dir)
+                print("Unzip completed.")
+            except requests.exceptions.RequestException as e:
+                raise RuntimeError("Failed to download DeepMD model.") from e
+
+
+        super().__init__(model_path, device=device, **kwargs)
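The download-and-unzip step in `DeepMD.__init__` can be illustrated as a standalone helper. The following is a minimal sketch under stated assumptions, not the code from this commit: `ensure_checkpoint` and the `fetch_zip` callback are hypothetical names, and the archive is written to a temporary directory (rather than a fixed `temp.zip`) so that an interrupted download leaves no partial file in the cache.

```python
from __future__ import annotations

import tempfile
import zipfile
from pathlib import Path
from typing import Callable


def ensure_checkpoint(
    cache_dir: Path,
    checkpoint: str,
    fetch_zip: Callable[[Path], None],
) -> Path:
    """Return the cached checkpoint path, fetching and unpacking the archive if needed.

    `fetch_zip` is a hypothetical callback that writes the zip archive to the
    path it receives (for example, by streaming an HTTP response to disk).
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    model_path = cache_dir / checkpoint
    if model_path.exists():
        return model_path  # cache hit: nothing to download

    # The temporary directory is removed on exit, even if fetch_zip raises.
    with tempfile.TemporaryDirectory(dir=cache_dir) as tmp:
        archive = Path(tmp) / "model.zip"
        fetch_zip(archive)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(cache_dir)

    # Fail early if the archive did not contain the expected file.
    if not model_path.exists():
        raise FileNotFoundError(f"{checkpoint} was not found in the downloaded archive")
    return model_path
```

Passing the fetch step as a callback keeps the caching logic testable without network access; in the calculator above, the callback would wrap the `requests.get(..., stream=True)` loop.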
mlip_arena/models/registry.yaml
CHANGED
@@ -245,4 +245,22 @@ ALIGNN:
   npt: true
   github: https://github.com/usnistgov/alignn
   doi: https://doi.org/10.1038/s41524-021-00650-1
-  date: 2021-11-15
+  date: 2021-11-15
+
+DeepMD:
+  module: externals
+  class: DeepMD
+  family: deepmd
+  package: deepmd-kit==v3.0.0b4
+  checkpoint: dp0808c_v024mixu.pth
+  username:
+  last-update: 2024-10-09T00:00:00
+  datetime: 2024-03-25T14:30:00 # TODO: Fake datetime
+  datasets:
+    - MPTrj # TODO: fake HF dataset repo
+  github: https://github.com/deepmodeling/deepmd-kit/
+  doi: https://arxiv.org/abs/2312.15492
+  date: 2024-10-09
+  prediction: EFS
+  nvt: true
+  npt: true
pyproject.toml
CHANGED
@@ -3,13 +3,13 @@ requires=["flit_core >=3.2,<4"]
 build-backend="flit_core.buildapi"
 
 [project]
-name="
+name="mlip-arena"
 version="0.0.1a1"
 authors=[
     {name="Yuan Chiang", email="[email protected]"},
 ]
-description=""
-readme="README.md"
+description="Fair and transparent benchmark of machine-learned interatomic potentials (MLIPs), beyond basic error metrics"
+readme=".github/README.md"
 requires-python=">=3.10"
 keywords=[
     "pytorch",
@@ -66,9 +66,11 @@ test = [
     "fairchem-core==1.2.0",
     "sevenn==0.9.3.post1",
     "orb-models==0.3.1",
+    "pynanoflann@git+https://github.com/dwastberg/pynanoflann#egg=af434039ae14bedcbb838a7808924d6689274168",
     "alignn==2024.5.27",
     "pytest",
-    "prefect>=3.0.4"
+    "prefect>=3.0.4",
+    "deepmd-kit@git+https://github.com/deepmodeling/[email protected]"
 ]
 mace = [
     "mace-torch==0.3.4",
serve/leaderboard.py
CHANGED
@@ -7,21 +7,21 @@ import streamlit as st
 from mlip_arena.models import REGISTRY as MODELS
 from mlip_arena.tasks import REGISTRY as TASKS
 
+# Read the data
 DATA_DIR = Path("mlip_arena/tasks/diatomics")
 
-dfs = [
-
-
-
+dfs = []
+for model in MODELS:
+    fpath = DATA_DIR / MODELS[model].get("family") / "homonuclear-diatomics.json"
+    if fpath.exists():
+        dfs.append(pd.read_json(fpath))
 df = pd.concat(dfs, ignore_index=True)
 
-
+# Create a table
 table = pd.DataFrame(
     columns=[
         "Model",
         "Element Coverage",
-        # "No. of reversed forces",
-        # "Energy-consistent forces",
         "Prediction",
         "NVT",
         "NPT",
@@ -39,8 +39,6 @@ for model in MODELS:
     new_row = {
         "Model": model,
         "Element Coverage": len(rows["name"].unique()),
-        # "No. of reversed forces": None, # Replace with actual logic if available
-        # "Energy-consistent forces": None, # Replace with actual logic if available
         "Prediction": metadata.get("prediction", None),
         "NVT": "✅" if metadata.get("nvt", False) else "❌",
         "NPT": "✅" if metadata.get("npt", False) else "❌",
@@ -122,10 +120,12 @@ for task in TASKS:
         # if st.button(f"Go to task page"):
         #     st.switch_page(f"tasks/{TASKS[task]['task-page']}.py")
     else:
-        st.write(
-
+        st.write(
+            "Rank metrics are not available yet but the task has been implemented. Please see the following task page for more information."
+        )
+
         st.page_link(
             f"tasks/{TASKS[task]['task-page']}.py",
             label="Task page",
             icon=":material/link:",
-        )
+        )
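The new loop in `serve/leaderboard.py` skips models whose result file is missing. A minimal standalone sketch of that lookup is shown here; `collect_result_files` is a hypothetical helper, not part of this commit. Two behaviors are added assumptions: entries without a `family` key are skipped (in the diff, `MODELS[model].get("family")` would return `None` and the path join would raise `TypeError`), and a file shared by several models of the same family is listed only once.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping


def collect_result_files(
    data_dir: Path,
    models: Mapping[str, Mapping[str, object]],
    filename: str = "homonuclear-diatomics.json",
) -> list[Path]:
    """Return the result files that exist on disk, at most one per model family."""
    found: list[Path] = []
    for metadata in models.values():
        family = metadata.get("family")
        if not isinstance(family, str):
            continue  # assumption: an entry without a family has no results
        fpath = data_dir / family / filename
        # Skip missing files, and avoid listing a shared family file twice.
        if fpath.exists() and fpath not in found:
            found.append(fpath)
    return found
```

If the returned list is empty, `pd.concat` raises `ValueError` when given no frames, so a caller would likely want to check for that case before concatenating.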
tests/test_external_calculators.py
CHANGED
@@ -3,6 +3,8 @@ from ase import Atoms
 
 from mlip_arena.models import MLIPEnum
 
+from requests import HTTPError
+from huggingface_hub.errors import LocalTokenNotFoundError
 
 @pytest.mark.parametrize("model", MLIPEnum)
 def test_calculate(model: MLIPEnum):
@@ -10,7 +12,12 @@ def test_calculate(model: MLIPEnum):
     if model.name == "ALIGNN":
         pytest.xfail("ALIGNN has poor file download mechanism")
 
-
+    try:
+        calc = MLIPEnum[model.name].value()
+
+    except (LocalTokenNotFoundError, HTTPError):
+        # Gracefully skip the test if HF_TOKEN is not available
+        pytest.skip("Skipping test because HF_TOKEN is not available for downloading the model.")
 
     atoms = Atoms(
         "OO",