merge: branch 'main' of https://huggingface.co/spaces/holistic-ai/LibVulnWatch
Files changed:
- README.md (+3 -1)
- app.py (+1 -1)
- assessment-results/agent_development_kit.json (+1 -1)
- src/about.py (+32 -19)
README.md
CHANGED

@@ -7,7 +7,7 @@ sdk: gradio
 app_file: app.py
 pinned: true
 license: mit
-short_description:
+short_description: Vulnerability scores for AI libraries (ACL '25, ICML '25)
 sdk_version: 5.19.0
 ---
 
@@ -46,3 +46,5 @@ You'll find
 - the main table' columns names and properties in `src/display/utils.py`
 - the logic to read all results and request files, then convert them in dataframe lines, in `src/leaderboard/read_evals.py`, and `src/populate.py`
 - the logic to allow or filter submissions in `src/submission/submit.py` and `src/submission/check_validity.py`
+
+> **LibVulnWatch** was presented at the **ACL 2025 Student Research Workshop** and accepted to the **ICML 2025 Technical AI Governance workshop**. The system uncovers hidden security, licensing, maintenance, dependency and regulatory risks in popular AI libraries and publishes a public leaderboard for transparent ecosystem monitoring.
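For orientation, here is a minimal sketch of the kind of logic the README points to in `src/leaderboard/read_evals.py` and `src/populate.py`: reading each result JSON and turning it into a DataFrame row. The directory name and field names are taken from the JSON file touched later in this commit; the `load_results` helper itself is hypothetical, not the repository's implementation.

```python
# Minimal sketch (not the actual read_evals.py/populate.py code): load every
# assessment-results/*.json file and flatten it into one leaderboard DataFrame.
# Field names mirror the JSON shown later in this commit and are assumptions.
import json
from pathlib import Path

import pandas as pd


def load_results(results_dir: str = "assessment-results") -> pd.DataFrame:
    rows = []
    for path in Path(results_dir).glob("*.json"):
        data = json.loads(path.read_text())
        rows.append(
            {
                "library": path.stem,
                "license": data.get("license"),
                "github_stars": data.get("github_stars"),
                "report_url": data.get("report_url"),
            }
        )
    return pd.DataFrame(rows)


if __name__ == "__main__":
    print(load_results().head())
```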
app.py
CHANGED

@@ -255,7 +255,7 @@ with demo:
     citation_button = gr.Code(
         value=CITATION_BUTTON_TEXT,
         label=CITATION_BUTTON_LABEL,
-        lines=
+        lines=14,
         elem_id="citation-button",
         language="yaml",
     )
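The change pins the citation box to 14 visible lines, presumably so the longer BibTeX block added in `src/about.py` fits without scrolling. A standalone sketch of the same widget in isolation (placeholder citation text, not the app's constants):

```python
# Standalone sketch showing the gr.Code citation box on its own; the real app
# builds it inside a larger Blocks layout. The BibTeX string is a placeholder.
import gradio as gr

CITATION_TEXT = "@inproceedings{example2025,\n  title={...},\n  year={2025}\n}"

with gr.Blocks() as demo:
    gr.Code(
        value=CITATION_TEXT,
        label="Copy the following snippet to cite these results",
        lines=14,          # fixes the height of the read-only snippet box
        elem_id="citation-button",
        language="yaml",   # YAML highlighting is a close-enough fit for BibTeX
    )

if __name__ == "__main__":
    demo.launch()
```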
assessment-results/agent_development_kit.json
CHANGED

@@ -8,7 +8,7 @@
     "last_updated": "2024-06-07T12:00:00Z",
     "active_maintenance": true,
     "independently_verified": true,
-    "report_url": "https://github.
+    "report_url": "https://981526092.github.io/LibVulnWatch/google_adk-python_v1.4.2.html",
     "repository_url": "https://github.com/google/adk-python",
     "github_stars": 3800,
     "license": "MIT",
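This fix points `report_url` at the published HTML report instead of a truncated GitHub URL. A small, hypothetical validation sketch for entries of this shape; the required-field list is inferred from the lines visible in this hunk, not from the app's actual schema:

```python
# Illustrative only: read one assessment-results entry and sanity-check the
# fields this commit touches. The REQUIRED list is an assumption, not a schema.
import json
from pathlib import Path

REQUIRED = ["last_updated", "active_maintenance", "independently_verified",
            "report_url", "repository_url", "github_stars", "license"]


def check_result(path: str) -> dict:
    data = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"{path} is missing fields: {missing}")
    if not data["report_url"].startswith("https://"):
        raise ValueError(f"{path} has a non-HTTPS report_url")
    return data


if __name__ == "__main__":
    result = check_result("assessment-results/agent_development_kit.json")
    print(result["report_url"])
```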
src/about.py
CHANGED

@@ -28,28 +28,32 @@ TITLE = """<h1 align="center" id="space-title">LibVulnWatch: Vulnerability Asses
 
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-##
+## LibVulnWatch – Continuous, Multi-Domain Risk Scoring for AI Libraries
 
-
-- **License Validation**: Legal risks based on license type, compatibility, and requirements
-- **Security Assessment**: Vulnerability severity and patch responsiveness
-- **Maintenance Health**: Sustainability and governance practices
-- **Dependency Management**: Vulnerability inheritance and supply chain security
-- **Regulatory Compliance**: Compliance readiness for various frameworks
+_As presented at the **ACL 2025 Student Research Workshop** and the **ICML 2025 Technical AI Governance (TAIG) workshop**_, LibVulnWatch provides an evidence-based, end-to-end pipeline that uncovers **hidden vulnerabilities** in open-source AI libraries across five governance-aligned domains:
+
+• **License Validation** – compatibility, provenance, obligations
+• **Security Assessment** – CVEs, patch latency, exploit primitives
+• **Maintenance Health** – bus-factor, release cadence, contributor diversity
+• **Dependency Management** – transitive risk, SBOM completeness
+• **Regulatory Compliance** – privacy/export controls, policy documentation
 
-
+In the paper we apply the framework to **20 popular libraries**, achieving **88% coverage of OpenSSF Scorecard checks** and surfacing **up to 19 previously unreported risks per library**.
+Lower scores indicate lower risk, and the **Trust Score** is the equal-weight average of the five domains.
 """
 
 # Which evaluations are you running? how can people reproduce what you have?
 LLM_BENCHMARKS_TEXT = """
-##
+## Methodology at a Glance
+
+LibVulnWatch orchestrates a **graph of specialised agents** powered by large language models. Each agent contributes one evidence layer and writes structured findings to a shared memory:
 
-
-
-
-
+1️⃣ **Static agents** – licence parsing, secret scanning, call-graph reachability
+2️⃣ **Dynamic agents** – fuzzing harnesses, dependency-confusion probes, CVE replay
+3️⃣ **Metadata agents** – GitHub mining, release-cadence modelling, community health
+4️⃣ **Policy agents** – mapping evidence to NIST SSDF, EU AI Act, and related frameworks
 
-
+The aggregator agent converts raw findings into 0–10 scores per domain, producing a reproducible JSON result that is **88% compatible with OpenSSF Scorecard checks**. All artefacts (SBOMs, logs, annotated evidence) are archived and linked in the public report.
 """
 
 EVALUATION_QUEUE_TEXT = """
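The Trust Score arithmetic stated in the new INTRODUCTION_TEXT is simple enough to show directly. This is a hedged sketch of the stated definition (equal-weight average of the five 0–10 domain risk scores, lower is better), not code from the repository:

```python
# Sketch of the Trust Score as described above: the equal-weight mean of the
# five 0-10 domain risk scores (lower is better). Not taken from the repo.
from statistics import mean

DOMAINS = ["license", "security", "maintenance", "dependency", "regulatory"]


def trust_score(domain_scores: dict[str, float]) -> float:
    missing = set(DOMAINS) - domain_scores.keys()
    if missing:
        raise ValueError(f"missing domain scores: {sorted(missing)}")
    return round(mean(domain_scores[d] for d in DOMAINS), 2)


# Example: a library with moderate licence risk but low risk elsewhere.
print(trust_score({"license": 4.0, "security": 1.5, "maintenance": 2.0,
                   "dependency": 3.0, "regulatory": 2.5}))  # -> 2.6
```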
@@ -80,9 +84,18 @@ If your library shows as "FAILED" in the assessment queue, check that:
 """
 
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
-CITATION_BUTTON_TEXT = r"""@
-    title={LibVulnWatch:
-    author={
-
-    year={2025}
+CITATION_BUTTON_TEXT = r"""@inproceedings{wu2025libvulnwatch,
+  title={LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source {AI} Libraries},
+  author={Zekun Wu and Seonglae Cho and Umar Mohammed and CRISTIAN ENRIQUE MUNOZ VILLALOBOS and Kleyton Da Costa and Xin Guan and Theo King and Ze Wang and Emre Kazim and Adriano Koshiyama},
+  booktitle={ACL 2025 Student Research Workshop},
+  year={2025},
+  url={https://openreview.net/forum?id=yQzYEAL0BT}
+}
+
+@inproceedings{anonymous2025libvulnwatch,
+  title={LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source {AI} Libraries},
+  author={Zekun Wu and Seonglae Cho and Umar Mohammed and CRISTIAN ENRIQUE MUNOZ VILLALOBOS and Kleyton Da Costa and Xin Guan and Theo King and Ze Wang and Emre Kazim and Adriano Koshiyama},
+  booktitle={ICML Workshop on Technical AI Governance (TAIG)},
+  year={2025},
+  url={https://openreview.net/forum?id=MHhrr8QHgR}
 }"""
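To make the methodology text in LLM_BENCHMARKS_TEXT concrete, here is a hedged sketch of the aggregation step it describes: structured findings from the agent layers are reduced to a 0–10 score per domain and serialised to JSON. The `Finding` shape, the `aggregate` name, and the "sum of severities, capped at 10" rule are illustrative assumptions, not the system's actual implementation.

```python
# Hypothetical aggregator sketch: reduce per-agent findings to 0-10 domain risk
# scores and emit a reproducible JSON blob, as the methodology text describes.
# The Finding shape and the capped-sum scoring rule are assumptions.
import json
from dataclasses import dataclass

DOMAINS = ["license", "security", "maintenance", "dependency", "regulatory"]


@dataclass
class Finding:
    domain: str       # one of DOMAINS
    severity: float   # evidence weight contributed by one agent
    evidence: str     # link or note backing the finding


def aggregate(findings: list[Finding]) -> dict:
    scores = {d: 0.0 for d in DOMAINS}
    for f in findings:
        scores[f.domain] = min(10.0, scores[f.domain] + f.severity)
    trust = round(sum(scores.values()) / len(DOMAINS), 2)
    return {"domain_scores": scores, "trust_score": trust,
            "evidence": [f.evidence for f in findings]}


if __name__ == "__main__":
    report = aggregate([
        Finding("security", 3.5, "CVE left unpatched for 90 days"),
        Finding("dependency", 2.0, "no SBOM published"),
    ])
    print(json.dumps(report, indent=2))
```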