GraphRAG API

This README provides a detailed guide on the api.py file, which serves as the API interface for the GraphRAG (Graph Retrieval-Augmented Generation) system. GraphRAG is a powerful tool that combines graph-based knowledge representation with retrieval-augmented generation techniques to provide context-aware responses to queries.

Table of Contents

  1. Overview
  2. Setup
  3. API Endpoints
  4. Data Models
  5. Core Functionality
  6. Usage Examples
  7. Configuration
  8. Troubleshooting

Overview

The api.py file implements a FastAPI-based server that provides various endpoints for interacting with the GraphRAG system. It supports different types of queries, including direct chat, GraphRAG-specific queries, DuckDuckGo searches, and a combined full-model search.

Key features:

  • Multiple query types (local and global searches)
  • Context caching for improved performance
  • Background tasks for long-running operations
  • Customizable settings through environment variables and config files
  • Integration with external services (e.g., Ollama for LLM interactions)

Setup

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Set up environment variables: Create a .env file in the indexing directory with the following variables:

    LLM_API_BASE=<your_llm_api_base_url>
    LLM_MODEL=<your_llm_model>
    LLM_PROVIDER=<llm_provider>
    EMBEDDINGS_API_BASE=<your_embeddings_api_base_url>
    EMBEDDINGS_MODEL=<your_embeddings_model>
    EMBEDDINGS_PROVIDER=<embeddings_provider>
    INPUT_DIR=./indexing/output
    ROOT_DIR=indexing
    API_PORT=8012
    
  3. Run the API server:

    python api.py --host 0.0.0.0 --port 8012
    

API Endpoints

/v1/chat/completions (POST)

Main endpoint for chat completions. The model field in the request selects how the query is handled:

  • direct-chat: Direct interaction with the LLM
  • graphrag-local-search:latest: Local search using GraphRAG
  • graphrag-global-search:latest: Global search using GraphRAG
  • duckduckgo-search:latest: Web search using DuckDuckGo
  • full-model:latest: Combined search using all available models

/v1/prompt_tune (POST)

Initiates the prompt tuning process in the background.

/v1/prompt_tune_status (GET)

Retrieves the status and logs of the prompt tuning process.

/v1/index (POST)

Starts the indexing process for GraphRAG in the background.

/v1/index_status (GET)

Retrieves the status and logs of the indexing process.

/health (GET)

Health check endpoint.

/v1/models (GET)

Lists available models.
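
For example, the health check and model listing can be exercised with plain GET requests (the port follows the Setup section; the exact response shapes depend on api.py):

import requests

# Verify the server is up (assumes the default port from the Setup section).
health = requests.get("http://localhost:8012/health")
print(health.json())

# List the models accepted by /v1/chat/completions.
models = requests.get("http://localhost:8012/v1/models")
print(models.json())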

Data Models

The API uses several Pydantic models for request and response handling (see the sketch after this list):

  • Message: Represents a chat message with role and content.
  • QueryOptions: Options for GraphRAG queries, including query type, preset, and community level.
  • ChatCompletionRequest: Request model for chat completions.
  • ChatCompletionResponse: Response model for chat completions.
  • PromptTuneRequest: Request model for prompt tuning.
  • IndexingRequest: Request model for indexing.
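
As a rough illustration only (the field names are inferred from the usage examples below, not taken verbatim from api.py), the request-side models might look like this:

from typing import List, Optional
from pydantic import BaseModel

class Message(BaseModel):
    role: str      # e.g. "user", "assistant", "system"
    content: str   # the message text

class QueryOptions(BaseModel):
    query_type: str                        # "local-search" or "global-search"
    selected_folder: Optional[str] = None  # indexed output folder to query
    community_level: int = 2               # depth of community analysis
    response_type: str = "Multiple Paragraphs"
    preset: Optional[str] = None           # optional query preset

class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[Message]
    query_options: Optional[QueryOptions] = None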

Core Functionality

Context Loading

The load_context function loads necessary data for GraphRAG queries, including entities, relationships, reports, text units, and covariates.
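
A minimal sketch of what this loading can look like, assuming the indexing step writes the standard GraphRAG parquet artifacts into INPUT_DIR (the file names are illustrative and depend on the indexing pipeline version):

import os
import pandas as pd

def load_context(input_dir: str = os.getenv("INPUT_DIR", "./indexing/output")):
    """Load the parquet artifacts produced by indexing into dataframes."""
    def read(name):
        return pd.read_parquet(os.path.join(input_dir, name))

    entities = read("create_final_entities.parquet")
    relationships = read("create_final_relationships.parquet")
    reports = read("create_final_community_reports.parquet")
    text_units = read("create_final_text_units.parquet")
    # Covariates are loaded the same way when claim extraction was enabled.
    return entities, relationships, reports, text_units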

Search Engine Setup

The setup_search_engines function initializes both the local and global search engines using the loaded context data.

Query Execution

Different query types are handled by separate functions (see the dispatch sketch after this list):

  • run_direct_chat: Sends queries directly to the LLM.
  • run_graphrag_query: Executes GraphRAG queries (local or global).
  • run_duckduckgo_search: Performs web searches using DuckDuckGo.
  • run_full_model_search: Combines results from all search types.
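
The routing from the requested model name to one of these handlers can be pictured as a simple dispatch table. This is a hypothetical sketch, not the exact code in api.py; the handler stubs stand in for the functions listed above:

# Hypothetical dispatch from the request's "model" field to a handler.
async def run_direct_chat(request): ...        # defined in api.py
async def run_graphrag_query(request): ...     # defined in api.py
async def run_duckduckgo_search(request): ...  # defined in api.py
async def run_full_model_search(request): ...  # defined in api.py

MODEL_HANDLERS = {
    "direct-chat": run_direct_chat,
    "graphrag-local-search:latest": run_graphrag_query,
    "graphrag-global-search:latest": run_graphrag_query,
    "duckduckgo-search:latest": run_duckduckgo_search,
    "full-model:latest": run_full_model_search,
}

async def handle_chat_completion(request):
    handler = MODEL_HANDLERS.get(request.model)
    if handler is None:
        raise ValueError(f"Unknown model: {request.model}")
    return await handler(request)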

Background Tasks

Long-running tasks like prompt tuning and indexing are executed as background tasks to prevent blocking the API.
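
With FastAPI this is typically done through BackgroundTasks. The sketch below is a simplified assumption (in particular the in-memory status dict), not the exact implementation in api.py:

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
indexing_status = {"state": "idle", "logs": []}  # assumed in-memory status store

def run_indexing_job(root: str):
    """Long-running indexing work executed outside the request/response cycle."""
    indexing_status["state"] = "running"
    # ... invoke the GraphRAG indexing pipeline here ...
    indexing_status["state"] = "complete"

@app.post("/v1/index")
async def start_indexing(background_tasks: BackgroundTasks):
    # Schedule the job and return immediately so the API is not blocked.
    background_tasks.add_task(run_indexing_job, root="./indexing")
    return {"status": "indexing started"}

@app.get("/v1/index_status")
async def index_status():
    return indexing_status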

Usage Examples

Sending a GraphRAG Query

import requests

url = "http://localhost:8012/v1/chat/completions"
payload = {
    "model": "graphrag-local-search:latest",
    "messages": [{"role": "user", "content": "What is GraphRAG?"}],
    "query_options": {
        "query_type": "local-search",
        "selected_folder": "your_indexed_folder",
        "community_level": 2,
        "response_type": "Multiple Paragraphs"
    }
}
response = requests.post(url, json=payload)
print(response.json())

Starting Indexing Process

import requests

url = "http://localhost:8012/v1/index"
payload = {
    "llm_model": "your_llm_model",
    "embed_model": "your_embed_model",
    "root": "./indexing",
    "verbose": True,
    "emit": ["parquet", "csv"]
}
response = requests.post(url, json=payload)
print(response.json())
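
Checking Indexing Status

While prompt tuning or indexing runs in the background, the status endpoints can be polled; the exact fields in the response depend on api.py:

import requests

response = requests.get("http://localhost:8012/v1/index_status")
print(response.json())  # e.g. current state and recent log lines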

Configuration

The API can be configured through:

  1. Environment variables
  2. A config.yaml file (path specified by the GRAPHRAG_CONFIG environment variable)
  3. Command-line arguments when starting the server
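
A minimal sketch of how these sources might be combined (the precedence order and helper name are assumptions, not the exact logic in api.py):

import os
import yaml

def load_settings() -> dict:
    """Merge file-based settings with environment-variable overrides."""
    settings = {}
    config_path = os.getenv("GRAPHRAG_CONFIG", "config.yaml")
    if os.path.exists(config_path):
        with open(config_path) as f:
            settings.update(yaml.safe_load(f) or {})
    # Environment variables override values from the YAML file.
    for key in ("LLM_MODEL", "EMBEDDINGS_MODEL", "API_PORT"):
        if key in os.environ:
            settings[key.lower()] = os.environ[key]
    return settings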

Key configuration options:

  • llm_model: The language model to use
  • embedding_model: The embedding model for vector representations
  • community_level: Depth of community analysis in GraphRAG
  • token_limit: Maximum tokens for context
  • api_key: API key for LLM service
  • api_base: Base URL for LLM API
  • api_type: Type of API (e.g., "openai")

Troubleshooting

  1. If you encounter connection errors with Ollama, ensure the service is running and accessible.
  2. For "context loading failed" errors, check that the indexed data is present in the specified output folder.
  3. If prompt tuning or indexing processes fail, review the logs using the respective status endpoints.
  4. For performance issues, consider adjusting the community_level and token_limit settings.

For more detailed information on GraphRAG's indexing and querying processes, refer to the official GraphRAG documentation.