import importlib
from pathlib import Path
import pandas as pd
import streamlit as st
from mlip_arena import PKG_DIR
from mlip_arena.models import REGISTRY as MODELS
from mlip_arena.tasks import REGISTRY as TASKS
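# Read the diatomics results (one JSON file per model family)
DATA_DIR = PKG_DIR / "tasks" / "diatomics"

dfs = []
for model in MODELS:
    fpath = DATA_DIR / MODELS[model].get("family") / "homonuclear-diatomics.json"
    if fpath.exists():
        dfs.append(pd.read_json(fpath))

# pd.concat raises ValueError on an empty list, so fall back to an empty frame
# that still carries the two columns used below.
df = (
    pd.concat(dfs, ignore_index=True)
    if dfs
    else pd.DataFrame(columns=["name", "method"])
)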
# Create a table
table = pd.DataFrame(
    columns=[
        "Model",
        "Element Coverage",
        "Prediction",
        "NVT",
        "NPT",
        "Training Set",
        "Code",
        "Paper",
        "Checkpoint",
        "First Release",
        "License",
    ]
)
records = []
for model in MODELS:
    rows = df[df["method"] == model]
    metadata = MODELS.get(model, {})
    records.append(
        {
            "Model": model,
            "Element Coverage": len(rows["name"].unique()),
            "Prediction": metadata.get("prediction", None),
            "NVT": "✅" if metadata.get("nvt", False) else "❌",
            "NPT": "✅" if metadata.get("npt", False) else "❌",
            "Training Set": metadata.get("datasets", []),
            "Code": metadata.get("github", None),
            "Paper": metadata.get("doi", None),
            "Checkpoint": metadata.get("checkpoint", None),
            "First Release": metadata.get("date", None),
            "License": metadata.get("license", None),
        }
    )

# Build the frame once instead of concatenating inside the loop
table = pd.DataFrame(records, columns=table.columns).set_index("Model")

s = table.style.background_gradient(
    cmap="PuRd", subset=["Element Coverage"], vmin=0, vmax=120
)

st.warning(
    "MLIP Arena is currently in **pre-alpha**. The results are not stable. Please interpret them with care.",
    icon="⚠️",
)
st.info(
    "Contributions are welcome. For more information, visit https://github.com/atomind-ai/mlip-arena.",
    icon="🤗",
)

st.markdown(
    """
<h1 style='text-align: center;'>⚔️ MLIP Arena Leaderboard ⚔️</h1>

> MLIP Arena is a platform for evaluating foundation machine learning interatomic potentials (MLIPs) beyond conventional energy and force error metrics. It focuses on revealing the underlying physics and chemistry learned by these models and assessing their performance in molecular dynamics (MD) simulations. The platform's benchmarks are specifically designed to evaluate the readiness and reliability of open-source, open-weight models in accurately reproducing both qualitative and quantitative behaviors of atomic systems.

### :red[Introduction]

Foundation machine learning interatomic potentials (fMLIPs), trained on extensive databases containing millions of density functional theory (DFT) calculations, have demonstrated remarkable zero-shot predictive capabilities for complex atomic interactions. These potentials derive quantum mechanical insights with high accuracy, expressivity, and generalizability, significantly outperforming classical empirical force fields while maintaining comparable computational efficiency.

However, MLIPs trained on atomic energy and force labels do not necessarily capture the correct atomic interactions, even though they often excel in error-based metrics for bulk systems. To drive further advancements in this field, it is crucial to establish mechanisms that ensure fair and transparent benchmarking practices beyond basic regression metrics.

MLIP Arena aims to provide a fair and transparent platform for benchmarking MLIPs in a crowdsourced setting. Its primary goal is to uncover the learned physics and chemistry of open-source, open-weight MLIPs. The benchmarks are designed to be agnostic to both the underlying architecture and specific training targets, such as density functionals, ensuring a cross-comparable and unbiased evaluation.
""",
    unsafe_allow_html=True,
)

st.subheader(":red[Supported Models]")
st.dataframe(
    s,
    use_container_width=True,
    column_config={
        "Code": st.column_config.LinkColumn(
            # validate="^https://[a-z]+\.streamlit\.app$",
            width="medium",
            display_text="Link",
        ),
        "Paper": st.column_config.LinkColumn(
            # validate="^https://[a-z]+\.streamlit\.app$",
            width="medium",
            display_text="Link",
        ),
    },
)

# st.markdown("<h2 style='text-align: center;'>🏆 Task Ranks 🏆</h2>", unsafe_allow_html=True)
st.subheader(":red[Task Ranks]")

for task in TASKS:
    if TASKS[task]["rank-page"] is None:
        continue

    st.subheader(task, divider=True)

    task_module = importlib.import_module(f"ranks.{TASKS[task]['rank-page']}")

    # Call the function from the imported module
    if hasattr(task_module, "render"):
        task_module.render()
        # if st.button(f"Go to task page"):
        #     st.switch_page(f"tasks/{TASKS[task]['task-page']}.py")
    else:
        st.write(
            "Rank metrics are not available yet but the task has been implemented. Please see the following task page for more information."
        )
        st.page_link(
            f"tasks/{TASKS[task]['task-page']}.py",
            label="Go to the associated task page",
            icon=":material/link:",
        )
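

# A minimal sketch of the contract the task loop above appears to rely on: each
# module imported from `ranks.<rank-page>` is expected to expose a zero-argument
# `render()` callable. The two helpers below are hypothetical illustrations and
# are not part of mlip_arena; the in-memory module only stands in for a real
# rank page, and nothing here is called by the app itself.
import types


def _has_render(module: types.ModuleType) -> bool:
    """Return True when `module` exposes a callable `render` attribute."""
    return callable(getattr(module, "render", None))


def _demo_rank_page() -> types.ModuleType:
    """Build a stand-in for a `ranks.<page>` module (assumed layout)."""
    page = types.ModuleType("ranks.example")  # hypothetical module name
    # A real rank page would draw Streamlit widgets here; the stand-in just
    # returns a marker string so the behavior can be checked without Streamlit.
    page.render = lambda: "rendered"
    return page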