metadata
title: Doc Mcp
emoji: 📃
colorFrom: yellow
colorTo: pink
sdk: gradio
sdk_version: 5.33.0
python_version: 3.13
app_file: app.py
pinned: false
license: mit
short_description: 'RAG on documentations for your agent '

Doc-MCP 📚

Transform GitHub documentation repositories into accessible MCP (Model Context Protocol) servers for AI agents

Hackathon Track: mcp-server-track

🎯 What is Doc-MCP?

Doc-MCP ingests markdown documentation from GitHub repositories and creates MCP servers that provide easy access to documentation context for AI agents. Just point it at any GitHub repo with markdown docs, and get an intelligent Q&A interface powered by vector search.

🛠️ Available MCP Tools

📋 Documentation Query Tools

get_available_docs_repo

List all available ingested repositories

  • Returns: Array of repository names that have been processed and are available for querying
  • Usage: Get a list of documentation repositories before making queries

make_query

Search documentation with AI-powered semantic search

  • Parameters:
    • repo (string): Repository name to search in
    • mode (string): Search strategy - "default", "text_search", or "hybrid"
    • query (string): Natural language question about the documentation
  • Returns: AI-generated response with source citations and metadata
  • Usage: Ask questions about specific documentation repositories
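
As a rough illustration of the parameters above, the sketch below builds the JSON-RPC `tools/call` message an MCP client would send to invoke make_query. The helper name `build_make_query_request`, the request id, and the example repository name are assumptions for illustration; only the tool name, its three parameters, and the mode values come from the list above. In practice the repo value would be one of the names returned by get_available_docs_repo.

```python
import json

# Mode values taken from the make_query parameter list above.
VALID_MODES = {"default", "text_search", "hybrid"}


def build_make_query_request(repo: str, query: str, mode: str = "default",
                             request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for make_query (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "make_query",
            "arguments": {"repo": repo, "mode": mode, "query": query},
        },
    }


# The repository name below is an assumed example, not a guaranteed entry.
request = build_make_query_request(
    "gradio-app/gradio", "How do I launch an MCP server?", mode="hybrid"
)
print(json.dumps(request, indent=2))
```

Rejecting an unknown mode on the client side avoids a round trip to the server for a request that cannot succeed.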

📁 GitHub File Operations Tools

list_repository_files

Scan and list files in a GitHub repository

  • Parameters:
    • repo_url (string): GitHub repository URL or owner/repo format
    • branch (string, optional): Branch name (default: "main")
    • extensions (string, optional): Comma-separated file extensions (default: ".md,.mdx")
  • Returns: JSON with file list and repository metadata
  • Usage: Discover available documentation files before ingestion
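
The repo_url and extensions parameters both accept loosely formatted strings, so it can help to see how such inputs might be normalized. The sketch below is one possible client-side interpretation, not the server's actual parsing logic; the helper names `normalize_repo`, `parse_extensions`, and `filter_paths` are hypothetical.

```python
from urllib.parse import urlparse


def normalize_repo(repo_url: str) -> str:
    """Reduce a GitHub URL or owner/repo string to 'owner/repo' (hypothetical helper)."""
    text = repo_url.strip()
    path = urlparse(text).path if "://" in text else text
    parts = [p for p in path.split("/") if p]
    # Tolerate a scheme-less "github.com/owner/repo" form.
    if parts and parts[0].lower() in {"github.com", "www.github.com"}:
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError(f"expected a GitHub URL or owner/repo, got {repo_url!r}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{owner}/{repo}"


def parse_extensions(extensions: str = ".md,.mdx") -> tuple[str, ...]:
    """Split the comma-separated extensions string into a tuple of suffixes."""
    return tuple(ext.strip().lower() for ext in extensions.split(",") if ext.strip())


def filter_paths(paths: list[str], extensions: str = ".md,.mdx") -> list[str]:
    """Keep only the paths that end in one of the requested extensions."""
    suffixes = parse_extensions(extensions)
    return [p for p in paths if p.lower().endswith(suffixes)]


print(normalize_repo("https://github.com/gradio-app/gradio"))
print(filter_paths(["README.md", "app.py", "docs/intro.mdx"]))
```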

get_single_file

Retrieve content of a specific file from GitHub

  • Parameters:
    • repo_url (string): GitHub repository URL or owner/repo format
    • file_path (string): Path to the specific file in the repository
    • branch (string, optional): Branch name (default: "main")
  • Returns: JSON with file content, metadata, and GitHub URLs
  • Usage: Fetch individual documentation files for processing or review

get_multiple_files

Retrieve multiple files from GitHub in one request

  • Parameters:
    • repo_url (string): GitHub repository URL or owner/repo format
    • file_paths_str (string): Comma-separated list of file paths
    • branch (string, optional): Branch name (default: "main")
  • Returns: JSON with all file contents, success/failure counts, and metadata
  • Usage: Batch fetch multiple documentation files efficiently
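
Because file_paths_str is a single comma-separated string rather than a list, a client has to join its paths before calling get_multiple_files. The sketch below shows one way to do that safely; `join_file_paths` and `split_file_paths` are hypothetical helpers, and the rule that a path may not contain a comma is an assumption that follows from the delimiter choice.

```python
def join_file_paths(paths: list[str]) -> str:
    """Join paths into the comma-separated form expected by file_paths_str."""
    cleaned: list[str] = []
    for raw in paths:
        path = raw.strip()
        if not path:
            continue  # skip empty entries
        if "," in path:
            raise ValueError(f"path contains a comma and cannot be encoded: {path!r}")
        if path not in cleaned:
            cleaned.append(path)  # drop duplicates, keep first-seen order
    return ",".join(cleaned)


def split_file_paths(file_paths_str: str) -> list[str]:
    """Inverse of join_file_paths: split on commas and trim whitespace."""
    return [p.strip() for p in file_paths_str.split(",") if p.strip()]


file_paths_str = join_file_paths(["README.md", " docs/intro.md ", "README.md"])
print(file_paths_str)
```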

✨ Key Features

  • GitHub Integration: Fetch markdown files directly from any GitHub repository
  • Vector Search: Uses MongoDB Atlas with Nebius AI embeddings for semantic search
  • MCP Server: Exposes documentation as MCP endpoints for AI agents
  • Smart Q&A: Ask questions about documentation with source citations
  • Repository Management: Track multiple repositories and their statistics

🎯 MCP Server Configuration

Add this configuration to your MCP client (Cursor, Windsurf, Cline):

{
  "mcpServers": {
    "doc-mcp": {
      "url": "https://agents-mcp-hackathon-doc-mcp.hf.space/gradio_api/mcp/sse"
    }
  }
}
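
If your client already has other MCP servers configured, the entry above needs to be merged into the existing `mcpServers` object rather than replacing it. The following is a minimal sketch of that merge, assuming the config is plain JSON with the shape shown above; the function name `add_mcp_server` and the `other-server` entry are made up for the example, and the location of the config file depends on the client.

```python
import json

# URL copied from the configuration snippet above.
DOC_MCP_URL = "https://agents-mcp-hackathon-doc-mcp.hf.space/gradio_api/mcp/sse"


def add_mcp_server(config: dict, name: str, url: str) -> dict:
    """Return a copy of config with one server entry added or updated."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"url": url}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing client configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"url": "http://localhost:9000/sse"}}}
updated = add_mcp_server(existing, "doc-mcp", DOC_MCP_URL)
print(json.dumps(updated, indent=2))
```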

🚀 Quick Start

  1. Set Up the Environment:

    # Clone and install
    git clone https://github.com/yourusername/doc-mcp.git
    cd doc-mcp
    uv sync

    # Configure environment
    cp .env.example .env
    # Add your GITHUB_API_KEY, NEBIUS_API_KEY and MONGODB_URI

  2. Run the App:

    python main.py
    # Open http://localhost:7860

  3. Ingest Documentation:

    • Enter a GitHub repo URL (e.g., gradio-app/gradio)
    • Select markdown files to process
    • Load files and generate vector embeddings
  4. Query Documentation:

    • Select your repository
    • Ask questions about the documentation
    • Get answers with source citations
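
A missing credential is a common reason for the app to fail at startup, so a quick check of the three variables from the environment step can save time. This is a minimal sketch, assuming the variables have already been loaded into the process environment (a `.env` file is not read automatically by Python itself); `missing_env_vars` is a hypothetical helper, not part of the project.

```python
import os

# Variable names taken from the environment setup step above.
REQUIRED_VARS = ("GITHUB_API_KEY", "NEBIUS_API_KEY", "MONGODB_URI")


def missing_env_vars(env=None) -> list[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```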

Workflow

  • Input GitHub URL
  • Scan for markdown files
  • Select files to process
  • Generate embeddings and store them in the vector database
  • Ask questions
  • Search similar content
  • Generate contextual answers
  • Show sources and citations
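
The retrieval half of this workflow (chunk, embed, store, search, cite) can be sketched in a few lines. The example below is a toy stand-in only: it uses word counts instead of Nebius AI embeddings, an in-memory list instead of MongoDB Atlas, and stops before the answer-generation step. All function names and the sample documents are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(files: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Split each file into paragraph chunks and embed every chunk."""
    index = []
    for path, content in files.items():
        for chunk in (c.strip() for c in content.split("\n\n")):
            if chunk:
                index.append((path, chunk, embed(chunk)))
    return index


def search(index, question: str, top_k: int = 2) -> list[tuple[str, str]]:
    """Return up to top_k (source path, chunk) pairs ranked by similarity."""
    q = embed(question)
    scored = [(cosine(q, vec), path, chunk) for path, chunk, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(path, chunk) for score, path, chunk in scored[:top_k] if score > 0]


# Invented sample documents standing in for fetched markdown files.
docs = {
    "docs/install.md": "Install the package with pip.\n\nPython 3.10 or newer is required.",
    "docs/usage.md": "Launch the app with the launch method.\n\nSet share to true for a public link.",
}
index = build_index(docs)
results = search(index, "How do I install the package?")
for path, chunk in results:
    print(f"{path}: {chunk}")
```

Keeping the source path next to each chunk is what makes the final "show sources and citations" step possible: every retrieved passage can be traced back to the file it came from.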

🛠️ Technology Stack

  • Interface: Gradio
  • Vector Store: MongoDB Atlas with vector search
  • Embeddings: Nebius AI (BAAI/bge-en-icl)
  • LLM: Nebius LLM (Llama-3.3-70B-Instruct)
  • Document Processing: LlamaIndex

📹 Demo Video

Doc-MCP Demo - GitHub Documentation RAG System

Click the image above to watch the full demo on YouTube


Transform your documentation into intelligent, accessible knowledge for AI agents! 🚀