Commit 3c3ce5c by seonglae-holistic · 2 Parent(s): fdddab8 f198cca

merge: branch 'main' of https://huggingface.co/spaces/holistic-ai/LibVulnWatch

README.md CHANGED
@@ -7,7 +7,7 @@ sdk: gradio
  app_file: app.py
  pinned: true
  license: mit
- short_description: Duplicate this leaderboard to initialize your own!
+ short_description: Vulnerability scores for AI libraries (ACL '25, ICML '25)
  sdk_version: 5.19.0
  ---

@@ -46,3 +46,5 @@ You'll find
  - the main table' columns names and properties in `src/display/utils.py`
  - the logic to read all results and request files, then convert them in dataframe lines, in `src/leaderboard/read_evals.py`, and `src/populate.py`
  - the logic to allow or filter submissions in `src/submission/submit.py` and `src/submission/check_validity.py`
+
+ > **LibVulnWatch** was presented at the **ACL 2025 Student Research Workshop** and accepted to the **ICML 2025 Technical AI Governance workshop**. The system uncovers hidden security, licensing, maintenance, dependency and regulatory risks in popular AI libraries and publishes a public leaderboard for transparent ecosystem monitoring.
app.py CHANGED
@@ -255,7 +255,7 @@ with demo:
  citation_button = gr.Code(
  value=CITATION_BUTTON_TEXT,
  label=CITATION_BUTTON_LABEL,
- lines=6,
+ lines=14,
  elem_id="citation-button",
  language="yaml",
  )
assessment-results/agent_development_kit.json CHANGED
@@ -8,7 +8,7 @@
  "last_updated": "2024-06-07T12:00:00Z",
  "active_maintenance": true,
  "independently_verified": true,
- "report_url": "https://github.com/981526092/LibVulnWatch/raw/main/report/google_adk-python_v1.4.2.html",
+ "report_url": "https://981526092.github.io/LibVulnWatch/google_adk-python_v1.4.2.html",
  "repository_url": "https://github.com/google/adk-python",
  "github_stars": 3800,
  "license": "MIT",
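The `report_url` change above swaps a `github.com/.../raw/main/report/` link for its GitHub Pages equivalent. A minimal sketch of that mapping follows, assuming other report links share the same `<owner>/<repo>/raw/main/report/<file>` shape; the helper name `to_pages_url` is hypothetical and not part of the repository.

```python
from urllib.parse import urlparse


def to_pages_url(raw_url: str) -> str:
    """Map a github.com '/raw/main/report/<file>' URL to its GitHub Pages form.

    Hypothetical helper: the URL shape is inferred from the single
    report_url change shown in this commit.
    """
    parts = urlparse(raw_url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: <owner>/<repo>/raw/main/report/<file>
    if (
        parts.netloc != "github.com"
        or len(segments) != 6
        or segments[2:5] != ["raw", "main", "report"]
    ):
        raise ValueError(f"unexpected report URL shape: {raw_url}")
    owner, repo, filename = segments[0], segments[1], segments[5]
    return f"https://{owner}.github.io/{repo}/{filename}"
```

Rejecting URLs that do not match the expected shape keeps the rewrite from silently producing a broken link.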
src/about.py CHANGED
@@ -28,28 +28,32 @@ TITLE = """<h1 align="center" id="space-title">LibVulnWatch: Vulnerability Asses

  # What does your leaderboard evaluate?
  INTRODUCTION_TEXT = """
- ## Systematic Vulnerability Assessment and Leaderboard Tracking for Open-Source AI Libraries
+ ## LibVulnWatch – Continuous, Multi-Domain Risk Scoring for AI Libraries

- This leaderboard provides continuous vulnerability assessment for open-source AI libraries across five critical risk domains:
- - **License Validation**: Legal risks based on license type, compatibility, and requirements
- - **Security Assessment**: Vulnerability severity and patch responsiveness
- - **Maintenance Health**: Sustainability and governance practices
- - **Dependency Management**: Vulnerability inheritance and supply chain security
- - **Regulatory Compliance**: Compliance readiness for various frameworks
+ _As presented at the **ACL 2025 Student Research Workshop** and the **ICML 2025 Technical AI Governance (TAIG) workshop**_, LibVulnWatch provides an evidence-based, end-to-end pipeline that uncovers **hidden vulnerabilities** in open-source AI libraries across five governance-aligned domains:

- Lower scores indicate fewer vulnerabilities and lower risk. The Trust Score is an equal-weighted average of all five domains, providing a balanced assessment of overall library trustworthiness.
+ • **License Validation** – compatibility, provenance, obligations
+ • **Security Assessment** – CVEs, patch latency, exploit primitives
+ • **Maintenance Health** – bus-factor, release cadence, contributor diversity
+ • **Dependency Management** – transitive risk, SBOM completeness
+ • **Regulatory Compliance** – privacy/export controls, policy documentation
+
+ In the paper we apply the framework to **20 popular libraries**, achieving **88 % coverage of OpenSSF Scorecard checks** and surfacing **up to 19 previously-unreported risks per library**.
+ Lower scores indicate lower risk, and the **Trust Score** is the equal-weight average of the five domains.
  """

  # Which evaluations are you running? how can people reproduce what you have?
  LLM_BENCHMARKS_TEXT = """
- ## How LibVulnWatch Works
+ ## Methodology at a Glance
+
+ LibVulnWatch orchestrates a **graph of specialised agents** powered by large language models. Each agent contributes one evidence layer and writes structured findings to a shared memory:

- Our assessment methodology evaluates libraries through:
- 1. **Static Analysis**: Code review, license parsing, and documentation examination
- 2. **Dynamic Analysis**: Vulnerability scanning, dependency checking, and API testing
- 3. **Metadata Analysis**: Repository metrics, contributor patterns, and release cadence
+ 1️⃣ **Static agents** – licence parsing, secret scanning, call-graph reachability
+ 2️⃣ **Dynamic agents** – fuzzing harnesses, dependency-confusion probes, CVE replay
+ 3️⃣ **Metadata agents** – GitHub mining, release-cadence modelling, community health
+ 4️⃣ **Policy agents** – mapping evidence to NIST SSDF, EU AI Act, and related frameworks

- Each library receives a risk score (0-10) in each domain, with lower scores indicating lower risk.
+ The aggregator agent converts raw findings into 0–10 scores per domain, producing a reproducible JSON result that is **88 % compatible with OpenSSF Scorecard checks**. All artefacts (SBOMs, logs, annotated evidence) are archived and linked in the public report.
  """

  EVALUATION_QUEUE_TEXT = """
@@ -80,9 +84,18 @@ If your library shows as "FAILED" in the assessment queue, check that:
  """

  CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
- CITATION_BUTTON_TEXT = r"""@article{LibVulnWatch2025,
- title={LibVulnWatch: Systematic Vulnerability Assessment and Leaderboard Tracking for Open-Source AI Libraries},
- author={First Author and Second Author},
- journal={ICML 2025 Technical AI Governance Workshop},
- year={2025}
+ CITATION_BUTTON_TEXT = r"""@inproceedings{wu2025libvulnwatch,
+ title={LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source {AI} Libraries},
+ author={Zekun Wu and Seonglae Cho and Umar Mohammed and CRISTIAN ENRIQUE MUNOZ VILLALOBOS and Kleyton Da Costa and Xin Guan and Theo King and Ze Wang and Emre Kazim and Adriano Koshiyama},
+ booktitle={ACL 2025 Student Research Workshop},
+ year={2025},
+ url={https://openreview.net/forum?id=yQzYEAL0BT}
+ }
+
+ @inproceedings{anonymous2025libvulnwatch,
+ title={LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source {AI} Libraries},
+ author={Zekun Wu and Seonglae Cho and Umar Mohammed and CRISTIAN ENRIQUE MUNOZ VILLALOBOS and Kleyton Da Costa and Xin Guan and Theo King and Ze Wang and Emre Kazim and Adriano Koshiyama},
+ booktitle={ICML Workshop on Technical AI Governance (TAIG)},
+ year={2025},
+ url={https://openreview.net/forum?id=MHhrr8QHgR}
  }"""
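
The updated `INTRODUCTION_TEXT` describes the Trust Score as the equal-weight average of five domain scores, each on a 0 to 10 scale where lower means lower risk. A minimal sketch of that arithmetic follows. The domain keys and the `trust_score` function are hypothetical names chosen for illustration; the actual aggregation code is not shown in this commit.

```python
from statistics import mean

# Hypothetical keys for the five domains named in INTRODUCTION_TEXT.
DOMAINS = ("license", "security", "maintenance", "dependency", "regulatory")


def trust_score(domain_scores: dict[str, float]) -> float:
    """Equal-weight average of the five 0-10 domain risk scores (lower is better)."""
    missing = [d for d in DOMAINS if d not in domain_scores]
    if missing:
        raise KeyError(f"missing domain scores: {missing}")
    for domain in DOMAINS:
        if not 0 <= domain_scores[domain] <= 10:
            raise ValueError(f"{domain} score must lie in [0, 10]")
    # Each domain carries the same weight, so a plain mean suffices.
    return float(mean(domain_scores[d] for d in DOMAINS))
```

For example, domain scores of 2, 4, 6, 3 and 5 sum to 20, giving a Trust Score of 4.0.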