---
title: VulnBuster
emoji: 🛡️
colorFrom: red
colorTo: purple
sdk: docker
app_file: start.sh
pinned: true
tags:
- agent-demo-track
- security
- mcp
- vulnerability-scanner
- ai-agent
short_description: 'AI Security Agent: Multi-MCP Code Vulnerability Scanner'
license: mit
authors:
- name: zjkarina
  url: https://huggingface.co/zjkarina
- name: brtbrr
  url: https://huggingface.co/brtbrr
- name: RustemX
  url: https://huggingface.co/RustemX
- name: R0m9n
  url: https://huggingface.co/R0m9n
---

# 🛡️ VulnBuster

**An intelligent AI agent demonstrating automated code security auditing through orchestrated MCP services.**

VulnBuster showcases an agentic approach to vulnerability scanning by combining multiple security tools in a single, intelligent interface. The agent automatically analyzes code using various scanners, correlates findings, and provides AI-powered remediation suggestions.

## 🎯 Agentic Demo Features

- **🤖 Intelligent Agent Orchestration**: AI agent coordinates multiple MCP security scanners
- **🔄 Automated Workflow**: Upload code → Multi-tool analysis → AI-powered fixes
- **🧠 Context-Aware Analysis**: Agent understands scan results and provides meaningful insights  
- **⚡ Real-time Processing**: Live analysis with immediate feedback and suggestions
- **🎛️ Multi-Scanner Integration**: Bandit, Detect Secrets, Semgrep, Pip Audit, and Circle Test

## 🎥 Video Demo

[▶️ Watch VulnBuster Demo](https://youtu.be/kAy1c7rCmSw)

*Video demonstration showing the agentic workflow and real-world usage scenarios*

## 🚀 Quick Start

1. **Upload your code file** (Python, JavaScript, Java, Go, Ruby)
2. **Select scanners** or let the agent choose automatically
3. **Review security findings** with AI analysis
4. **Download fixed code** with automatic remediation

## 👤 Authors

- [zjkarina](https://huggingface.co/zjkarina)
- [brtbrr](https://huggingface.co/brtbrr)
- [RustemX](https://huggingface.co/RustemX)
- [R0m9n](https://huggingface.co/R0m9n)

## 🛠️ Integrated Security Tools

VulnBuster orchestrates five specialized MCP servers, each focusing on different aspects of code security. The AI agent intelligently coordinates these tools to provide comprehensive vulnerability analysis.

### 🔒 Bandit Security Scanner
**Repository**: [PyCQA/bandit](https://github.com/PyCQA/bandit)  
**Specialization**: Python-specific security analysis

Bandit is a security linter designed to find common security issues in Python code. Our MCP integration enables:

- **Static Code Analysis**: Detects hardcoded passwords, SQL injection patterns, shell injection risks
- **Security Profiles**: Specialized scans for Shell Injection, SQL Injection, Crypto vulnerabilities
- **Baseline Management**: Creates security baselines for tracking new vulnerabilities over time
- **Severity & Confidence Levels**: Configurable thresholds (low/medium/high) for precise reporting

**Agent Integration**: The agent automatically selects appropriate Bandit profiles based on code patterns and adjusts severity levels based on the development context.
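
As a rough illustration of the thresholds described above, the sketch below builds a Bandit command line and condenses its JSON report. It assumes the `bandit` CLI is installed and on `PATH`. The helper names (`build_bandit_command`, `summarize_report`, `run_bandit`) are hypothetical and are not taken from the VulnBuster source; the `-l`/`-i` flags and the JSON field names are Bandit's own.

```python
import json
import subprocess

# Bandit expresses minimum severity/confidence by repeating a flag:
# -l / -ll / -lll for low / medium / high severity, -i / -ii / -iii for confidence.
LEVELS = ("low", "medium", "high")


def build_bandit_command(target, severity="low", confidence="low"):
    """Build a Bandit command that emits JSON (hypothetical helper)."""
    # tuple.index raises ValueError for an unknown level name.
    sev_flag = "-" + "l" * (LEVELS.index(severity) + 1)
    conf_flag = "-" + "i" * (LEVELS.index(confidence) + 1)
    return ["bandit", "-r", target, "-f", "json", sev_flag, conf_flag]


def summarize_report(report_json):
    """Reduce a Bandit JSON report to (test_id, severity, file, line) tuples."""
    report = json.loads(report_json)
    return [
        (r["test_id"], r["issue_severity"], r["filename"], r["line_number"])
        for r in report.get("results", [])
    ]


def run_bandit(target, severity="low", confidence="low"):
    """Run Bandit and return the condensed findings."""
    # Bandit exits non-zero when it reports issues, so do not use check=True.
    proc = subprocess.run(
        build_bandit_command(target, severity, confidence),
        capture_output=True, text=True, check=False,
    )
    return summarize_report(proc.stdout)
```

For example, `run_bandit("app.py", severity="medium")` would restrict the report to medium- and high-severity findings.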

### 🔍 Detect Secrets Scanner  
**Repository**: [Yelp/detect-secrets](https://github.com/Yelp/detect-secrets)  
**Specialization**: Secret and credential detection

detect-secrets is a tool that finds secrets in a codebase and helps prevent them from being committed. Our enhanced MCP server provides:

- **Entropy-Based Detection**: Configurable base64 and hex entropy limits for secret identification
- **Plugin Architecture**: Multiple detection plugins for API keys, passwords, private keys, tokens
- **Smart Filtering**: Excludes false positives while maintaining high detection accuracy
- **Baseline Support**: Tracks known secrets to focus on new leaks
- **Word List Integration**: Custom dictionaries for domain-specific secret patterns

**Agent Integration**: The agent fine-tunes entropy thresholds based on code type and implements intelligent filtering to reduce false positives in legitimate base64/hex content.
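
To make the idea of an entropy limit concrete, here is a minimal sketch of Shannon entropy over a candidate string. It shows the general technique only and is not detect-secrets' exact implementation; the function names and the `limit=4.5` default are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(candidate, limit=4.5):
    """Flag strings whose entropy exceeds `limit` (illustrative threshold)."""
    return shannon_entropy(candidate) > limit
```

Raising the limit reduces false positives on ordinary base64 or hex content at the cost of missing shorter or less random secrets, which is the trade-off the agent is described as tuning.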

### 🛡️ Semgrep Scanner
**Website**: [semgrep.dev](https://semgrep.dev)  
**Specialization**: Advanced static analysis with custom rules

Semgrep is a static analysis tool that finds bugs and security vulnerabilities and enforces code standards. Our MCP implementation offers:

- **Multi-Language Support**: Python, JavaScript, Java, Go, Ruby, and 20+ other languages  
- **Rule-Based Analysis**: Extensive rule sets from the Semgrep community (p/default, p/security)
- **Pattern Matching**: Advanced syntax-aware pattern matching for complex vulnerability detection
- **Custom Rules**: Support for organization-specific security policies and coding standards
- **Performance**: Fast scanning with minimal false positives

**Agent Integration**: The agent automatically selects appropriate rule sets based on detected programming languages and adjusts analysis depth based on file types and project context.
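
The language-based rule selection described above could look roughly like the sketch below. The extension-to-ruleset mapping and the helper name are assumptions for illustration, not VulnBuster's actual configuration; the language pack names should be checked against the Semgrep registry before use. Unknown file types fall back to `p/default`.

```python
from pathlib import Path

# Hypothetical mapping from file extension to Semgrep registry rule packs.
RULESETS_BY_EXTENSION = {
    ".py": ["p/python"],
    ".js": ["p/javascript"],
    ".java": ["p/java"],
    ".go": ["p/golang"],
    ".rb": ["p/ruby"],
}


def build_semgrep_command(path):
    """Build a `semgrep scan` command with rule packs chosen by file extension."""
    extension = Path(path).suffix.lower()
    configs = RULESETS_BY_EXTENSION.get(extension, ["p/default"])
    command = ["semgrep", "scan", "--json"]
    for config in configs:
        # --config may be repeated to combine several rule packs.
        command += ["--config", config]
    command.append(path)
    return command
```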

### 📦 Pip Audit Scanner
**Repository**: [pypa/pip-audit](https://github.com/pypa/pip-audit/tree/main)  
**Specialization**: Python dependency vulnerability scanning

pip-audit is the Python Packaging Authority (PyPA) tool for auditing Python environments against known vulnerabilities. Features include:

- **Advisory Database**: Checks packages against the Python Packaging Advisory Database, queried through the PyPI JSON API
- **Requirements Analysis**: Processes requirements.txt, Pipfile.lock, and installed packages
- **Vulnerability Fixing**: Suggests specific version upgrades to resolve security issues
- **Supply Chain Security**: Identifies compromised or malicious packages in dependency trees
- **Integration Support**: Works with virtual environments, Docker containers, and CI/CD pipelines

**Agent Integration**: The agent correlates dependency vulnerabilities with code usage patterns, prioritizing fixes based on actual code paths and exposure risk.
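
As a sketch of how upgrade suggestions might be derived, the code below runs `pip-audit` against a requirements file and collects fix versions from its JSON report. It assumes `pip-audit` is installed and that the report uses the `dependencies` / `vulns` / `fix_versions` layout of recent releases; the helper names are hypothetical.

```python
import json
import subprocess


def suggest_upgrades(report_json):
    """Map each vulnerable package to its advisories and available fix versions."""
    report = json.loads(report_json)
    # Recent pip-audit releases wrap results in {"dependencies": [...]};
    # a bare list is tolerated here as a fallback for older output.
    dependencies = report["dependencies"] if isinstance(report, dict) else report
    suggestions = {}
    for dep in dependencies:
        vulns = dep.get("vulns", [])
        if not vulns:
            continue
        fixes = [v for vuln in vulns for v in vuln.get("fix_versions", [])]
        suggestions[dep["name"]] = {
            "installed": dep["version"],
            "advisories": [vuln["id"] for vuln in vulns],
            "fix_versions": list(dict.fromkeys(fixes)),  # de-duplicate, keep order
        }
    return suggestions


def audit_requirements(requirements_path):
    """Run pip-audit on a requirements file and return upgrade suggestions."""
    # pip-audit exits non-zero when vulnerabilities are found, so check=False.
    proc = subprocess.run(
        ["pip-audit", "-r", requirements_path, "-f", "json"],
        capture_output=True, text=True, check=False,
    )
    return suggest_upgrades(proc.stdout)
```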

### 📋 Circle Test Scanner
**Platform**: [White Circle AI](https://huggingface.co/whitecircle-ai)  
**Specialization**: AI safety and policy compliance

Powered by White Circle's advanced AI safety platform, this scanner focuses on security policy compliance:

- **12 Security Policies**: Comprehensive checks covering SPDX licensing, credential exposure, deprecated APIs
- **Code Quality Gates**: Detects TODO/FIXME tags, debug statements, and development artifacts in production code  
- **Path Security**: Validates file operations, prevents path traversal vulnerabilities
- **Cryptographic Standards**: Enforces modern cryptographic practices, detects weak algorithms (MD5, etc.)
- **Container Security**: Checks file permissions, environment variable handling
- **Supply Chain Policies**: Validates dependency pinning, production environment separation

**Agent Integration**: The agent uses Circle Test as a final compliance layer, ensuring that all code changes meet organizational security standards and regulatory requirements.
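
To give a feel for the kinds of checks listed above, here is a plain regex approximation of a few of them. This is not how Circle Test works internally (its implementation is not described here); the rule names and patterns are invented for illustration.

```python
import re

# Hypothetical line-level rules loosely mirroring the policies listed above.
LINE_PATTERNS = {
    "todo-or-fixme": re.compile(r"#.*\b(TODO|FIXME)\b"),
    "weak-hash-md5": re.compile(r"\bhashlib\.md5\b"),
    "debugger-call": re.compile(r"\bpdb\.set_trace\(\)|\bbreakpoint\(\)"),
    "world-writable-mode": re.compile(r"\b0o777\b"),
}


def check_source(source):
    """Return (line_number, rule_name) pairs for each violation found."""
    violations = []
    # File-level rule: an SPDX identifier should appear somewhere in the file.
    if "SPDX-License-Identifier:" not in source:
        violations.append((1, "missing-spdx-identifier"))
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in LINE_PATTERNS.items():
            if pattern.search(line):
                violations.append((lineno, name))
    return violations
```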

## 🎛️ Agent Orchestration Workflow

```mermaid
graph TB
    A[Code Upload] --> B[VulnBuster AI Agent]
    B --> C[Language Detection]
    C --> D[Tool Selection & Configuration]
    
    D --> E[🔒 Bandit<br/>Python Security]
    D --> F[🔍 Detect Secrets<br/>Credential Scan]  
    D --> G[🛡️ Semgrep<br/>Multi-Language Analysis]
    D --> H[📦 Pip Audit<br/>Dependency Check]
    D --> I[📋 Circle Test<br/>Policy Compliance]
    
    E --> J[AI Correlation Engine]
    F --> J
    G --> J
    H --> J  
    I --> J
    
    J --> K[Vulnerability Prioritization]
    K --> L[Automated Fix Generation]
    L --> M[Remediated Code Output]
```

## 🎛️ Agent Architecture

```mermaid
graph TB
    A[User Input] --> B[VulnBuster Agent]
    B --> C[MCP Scanner 1]
    B --> D[MCP Scanner 2]
    B --> E[MCP Scanner N]
    C --> F[AI Analysis Engine]
    D --> F
    E --> F
    F --> G[Remediation Suggestions]
    F --> H[Fixed Code Output]
```

The agent intelligently:
1. **Analyzes** incoming code
2. **Selects** appropriate scanners
3. **Coordinates** parallel scanning
4. **Correlates** findings across tools
5. **Generates** fix recommendations
6. **Produces** remediated code
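
The coordination and correlation steps above can be sketched with `asyncio`. This is a simplified model under stated assumptions: each scanner is a hypothetical async callable that takes the source code and returns a list of finding dicts with a `line` key. The real agent talks to MCP servers instead.

```python
import asyncio
from collections import defaultdict


async def run_scanners(code, scanners):
    """Run all scanners concurrently and tag each finding with its tool name."""
    names = list(scanners)
    outcomes = await asyncio.gather(
        *(scanners[name](code) for name in names),
        return_exceptions=True,
    )
    findings = []
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            continue  # one failing scanner should not abort the whole run
        findings.extend({"tool": name, **finding} for finding in outcome)
    return findings


def correlate(findings):
    """Group findings by line so overlapping reports from several tools line up."""
    by_line = defaultdict(list)
    for finding in findings:
        by_line[finding["line"]].append(finding)
    return dict(sorted(by_line.items()))
```

Lines reported by more than one tool are natural candidates for higher priority, since independent scanners agree that something is wrong there.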

## 📊 Advanced Usage Examples

### Example 1: Multi-Layer Python Security Analysis
```python
# Vulnerable Python code
import subprocess
import pickle
import sqlite3

# Multiple security issues for demonstration
API_KEY = "sk_live_51H1h2K3L4M5N6O7P8Q9R0S1T2U3V4W5X6Y7Z8"  # Detect Secrets
password = "admin123"  # Bandit B105

def execute_command(user_input):
    subprocess.call(f"ls {user_input}", shell=True)  # Bandit B602

def load_data(data):
    return pickle.loads(data)  # Bandit B301

def query_db(user_id):
    conn = sqlite3.connect('users.db')
    query = f"SELECT * FROM users WHERE id = {user_id}"  # Semgrep: SQL injection
    return conn.execute(query).fetchall()

# TODO: Fix authentication system  # Circle Test Policy #3
```

**Agent Analysis Results**:
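- **Bandit**: Flags B105 (hardcoded password, low severity), B301 (`pickle` deserialization, medium severity), and B602 (`shell=True` subprocess call, high severity)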
- **Detect Secrets**: 1 API key detected with high entropy
- **Semgrep**: SQL injection vulnerability identified
- **Circle Test**: TODO comment flagged, production code quality violation
- **Agent Remediation**: Generates secure alternatives with proper input validation
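
One possible remediation of the code in Example 1 is sketched below. It is an illustration of the fixes each finding calls for, not the agent's actual output. The environment variable name `API_KEY`, the `db_path` parameter, and the switch from pickle to JSON are assumptions made for the example.

```python
import json
import os
import sqlite3
import subprocess
from contextlib import closing

# Secrets are read from the environment instead of being hardcoded.
API_KEY = os.environ.get("API_KEY", "")


def execute_command(user_input):
    # Argument list with no shell: user input cannot inject extra commands.
    # "--" stops `ls` from treating the input as an option.
    result = subprocess.run(
        ["ls", "--", user_input], capture_output=True, text=True, check=False
    )
    return result.stdout


def load_data(data):
    # JSON parsing cannot execute code, unlike pickle.loads on untrusted input.
    return json.loads(data)


def query_db(user_id, db_path="users.db"):
    # Parameterized query: the driver binds user_id as data, not as SQL.
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(
            "SELECT * FROM users WHERE id = ?", (user_id,)
        ).fetchall()
```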

### Example 2: JavaScript/Node.js Security Scan
```javascript
// Vulnerable Node.js code
const express = require('express');
const fs = require('fs');

const app = express();
const API_SECRET = 'abc123def456';  // Detect Secrets

app.get('/file/:filename', (req, res) => {
    // Path traversal vulnerability - Semgrep detection
    const filepath = `/uploads/${req.params.filename}`;
    fs.readFile(filepath, (err, data) => {
        if (err) throw err;
        res.send(data);
    });
});
```

**Agent Response**:
- **Semgrep**: Path traversal vulnerability in file handler
- **Detect Secrets**: Hardcoded API secret detection
- **Circle Test**: Missing input validation policies
- **Agent Fix**: Implements path sanitization and secure secret management

### Example 3: Dependency Vulnerability Assessment
```txt
# requirements.txt with vulnerable packages
Django==2.0.0           # Known CVE vulnerabilities
requests==2.18.4        # Outdated version
Pillow>=5.0.0,<6.0.0   # Version range instead of pinned
pycrypto==2.6.1        # Deprecated cryptographic library
```

**Comprehensive Analysis**:
- **Pip Audit**: 4 vulnerable packages identified with specific CVE numbers
- **Circle Test**: Policy violations for unpinned dependencies and deprecated crypto
- **Agent Resolution**: Suggests exact secure versions and modern alternatives
- **Supply Chain Risk**: Analyzes dependency trees for transitive vulnerabilities
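
The unpinned-dependency check mentioned above can be approximated with a small parser. This is a simplified sketch, not a full PEP 508 parser. It treats a requirement as pinned only when it uses a single `==` specifier with no wildcard. The function name is hypothetical.

```python
import re

# name, optional [extras], then exactly one "==version" with no wildcard.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s,;*]+$"
)


def unpinned_requirements(text):
    """Return requirement specifiers that are not pinned to an exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        if not PINNED.match(requirement):
            unpinned.append(requirement)
    return unpinned
```

Applied to the `requirements.txt` above, this reports only the `Pillow` range; the other three entries are pinned, even though they pin vulnerable versions.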

### Example 4: Enterprise Policy Compliance Check
```python
#!/usr/bin/env python3
# Missing SPDX license identifier - Circle Test Policy #1

import os
import hashlib

def authenticate_user(username, password):
    # MD5 usage flagged by Circle Test Policy #13
    password_hash = hashlib.md5(password.encode()).hexdigest()
    
    # Hardcoded production URL - Circle Test Policy #11
    auth_server = "https://prod-auth.company.com/api/login"
    
    # TODO: Implement proper session management - Policy #3
    return True

# Debug code left in production - Circle Test Policy #14
import pdb; pdb.set_trace()
```

**Policy Compliance Results**:
- **Circle Test**: 5 policy violations detected (Policies #1, #3, #11, #13, #14)
- **Bandit**: MD5 usage and hardcoded values flagged
- **Agent Remediation**: Implements SPDX headers, modern crypto, environment variables, removes debug code
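
A remediated version of Example 4 might look like the sketch below. It is one reasonable set of fixes rather than the agent's actual output. PBKDF2 is swapped in for MD5 as a standard-library option, and the environment variable name `AUTH_SERVER_URL` and the iteration count are assumptions.

```python
# SPDX-License-Identifier: MIT
import hashlib
import hmac
import os

# The endpoint comes from configuration instead of a hardcoded production URL.
AUTH_SERVER = os.environ.get("AUTH_SERVER_URL", "")


def hash_password(password, salt=None, iterations=600_000):
    """Derive a salted PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=600_000):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The debugger call and TODO marker are simply removed; session management remains out of scope for this sketch.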

## 🚀 Real-World Impact

VulnBuster's agent-driven approach provides:

- **95% Faster Analysis**: Parallel scanning reduces analysis time from hours to minutes
- **Cross-Tool Correlation**: Identifies vulnerability chains missed by individual tools  
- **Context-Aware Fixes**: Generates fixes that maintain code functionality while improving security
- **Compliance Automation**: Ensures adherence to security policies across development lifecycle
- **Learning System**: Agent improves recommendations based on codebase patterns and fix acceptance rates

## 🌐 MCP Integration

Connect VulnBuster to your IDE using MCP:

```json
{
  "mcpServers": {
    "vulnbuster": {
      "command": "npx",
      "args": [
        "-y", 
        "mcp-remote", 
        "https://agents-mcp-hackathon-vulnbuster.hf.space/gradio_api/mcp/sse",
        "--transport", 
        "sse-only"
      ]
    }
  }
}
```

## 🔍 Comprehensive Vulnerability Coverage

VulnBuster's multi-scanner approach provides comprehensive security coverage across different layers:

### 🔒 Code-Level Vulnerabilities (Bandit + Semgrep)
- **Injection Attacks**: SQL injection, command injection, code injection via `eval()`/`exec()`
- **Cryptographic Issues**: Weak algorithms (MD5, SHA1), hardcoded encryption keys
- **Unsafe Functions**: Use of `pickle`, `marshal`, `yaml.load()` without safe parameters
- **Path Traversal**: Unsafe file operations, directory traversal vulnerabilities
- **XML External Entities (XXE)**: Insecure XML parsing configurations
- **Deserialization**: Unsafe object deserialization patterns

### 🔍 Secret & Credential Leaks (Detect Secrets)
- **API Keys**: AWS, Google Cloud, Azure access keys and tokens
- **Authentication Tokens**: JWT tokens, OAuth tokens, session cookies
- **Database Credentials**: Passwords, connection strings, database URLs
- **Private Keys**: SSH keys, SSL certificates, PGP keys
- **High-Entropy Strings**: Base64/hex encoded secrets with configurable thresholds
- **Custom Patterns**: Domain-specific secrets using word lists and regex patterns

### 📦 Supply Chain Vulnerabilities (Pip Audit)
- **Known CVEs**: Direct dependencies with published security advisories
- **Transitive Dependencies**: Vulnerabilities in dependencies of dependencies
- **Malicious Packages**: Typosquatting and compromised package detection
- **Version Pinning**: Outdated packages with available security updates
- **License Compliance**: Incompatible or problematic package licenses

### 📋 Policy & Compliance Violations (Circle Test)
- **License Compliance**: Missing or non-approved SPDX license identifiers
- **Code Quality**: TODO/FIXME comments in production code
- **Development Artifacts**: Debug statements, test code in production
- **Insecure Communication**: HTTP URLs without proper validation
- **Data Exposure**: Logging sensitive information without masking
- **Deprecated APIs**: Usage of functions marked as deprecated
- **File System Security**: Overly permissive file permissions (0o777)
- **Environment Security**: Runtime environment variable modifications

### 🛡️ Multi-Language Support (Semgrep)
| Language | Vulnerability Types | Coverage |
|----------|-------------------|----------|
| **Python** | Injection, Crypto, Deserialization | Comprehensive |
| **JavaScript/Node.js** | XSS, Prototype pollution, Path traversal | Full |
| **Java** | Injection, XXE, Deserialization | Extensive |
| **Go** | Race conditions, Crypto, Input validation | Growing |
| **Ruby** | Injection, Mass assignment, Crypto | Good |
| **PHP** | Injection, File inclusion, Crypto | Basic |

### 🎯 Risk Prioritization Matrix

The agent automatically prioritizes vulnerabilities based on:

| Severity | Exploitability | Business Impact | Examples |
|----------|---------------|-----------------|----------|
| **Critical** | Remote + High | Data breach | SQL injection in auth system |
| **High** | Remote + Medium | Service disruption | Command injection in API |
| **Medium** | Local + High | Information leak | Hardcoded credentials |
| **Low** | Local + Low | Code quality | TODO comments, deprecated APIs |
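
The matrix above can be turned into a simple sort key. The sketch assumes each finding is a dict with a `severity` string and an optional boolean `remote` flag; both field names are hypothetical.

```python
# Lower rank sorts first; unknown severities go last.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(findings):
    """Order findings by severity, then remotely exploitable before local."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_RANK.get(f["severity"].lower(), len(SEVERITY_RANK)),
            not f.get("remote", False),  # False sorts before True
        ),
    )
```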

### 🔄 Continuous Monitoring Capabilities

- **Baseline Tracking**: Monitors new vulnerabilities against established security baselines
- **Regression Detection**: Identifies when previously fixed issues reappear
- **Trend Analysis**: Tracks vulnerability patterns and improvement metrics
- **Policy Evolution**: Adapts to new security standards and organizational requirements
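
Baseline tracking and regression detection reduce to set operations over stable finding fingerprints. The sketch below is a minimal model, assuming findings are dicts with `tool`, `rule`, `file`, and `snippet` keys (hypothetical field names). It fingerprints the code snippet rather than the line number so that moving code does not create "new" findings.

```python
import hashlib


def fingerprint(finding):
    """Stable identifier for a finding, independent of its line number."""
    key = "|".join(
        [finding["tool"], finding["rule"], finding["file"], finding["snippet"].strip()]
    )
    return hashlib.sha256(key.encode()).hexdigest()


def compare_to_baseline(baseline, current, fixed=frozenset()):
    """Classify current findings against a baseline and previously fixed issues."""
    base = {fingerprint(f) for f in baseline}
    cur = {fingerprint(f) for f in current}
    fixed = set(fixed)
    return {
        "new": cur - base - fixed,    # never seen before
        "resolved": base - cur,       # in the baseline, gone now
        "regressions": cur & fixed,   # marked fixed earlier, back again
    }
```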

## 🛡️ Local Development

```bash
# Clone and run
git clone https://huggingface.co/spaces/Agents-MCP-Hackathon/VulnBuster
cd VulnBuster

# Setup environment
echo "NEBIUS_API_KEY=your_api_key_here" > .env

# Build and run
docker build -t vulnbuster .
docker run -p 7860:7860 --env-file .env vulnbuster
```

## 🏗️ Technical Architecture

- **Frontend**: Gradio web interface with file upload and real-time results
- **Backend**: FastAPI with async processing for concurrent scanner execution
- **Agent Framework**: Agno with Nebius LLM for intelligent analysis and correlation
- **MCP Servers**: 5 specialized security scanners with standardized interfaces
- **Containerization**: Single Docker image with all dependencies and services
- **Communication**: HTTP/SSE for MCP protocol, JSON for data exchange

**Tags:** `agent-demo-track`

**Note**: This tool provides static analysis and should be used as part of a comprehensive security strategy. The AI agent assists with remediation but human review is recommended for production code.