Commit 5268185, committed by LPX (1 parent: 4feb2ac)
update: README

README.md CHANGED
@@ -17,4 +17,173 @@ models:
license: mit
---

## Functions Available for LLM Calls via MCP

This document outlines the functions that LLMs can invoke programmatically through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/app_mcp.py`.

## 1. `predict_with_ensemble`

### Description
This function analyzes an uploaded image and predicts whether it is AI-generated or real, using an ensemble of deepfake detection models together with forensic analysis techniques. It also uses intelligent agents for context inference, weight management, and anomaly detection.

### API Names
- `predict`
- `augment_then_predict` (applies image augmentation before prediction)

### Parameters
- `img` (PIL Image): The input image to analyze, either uploaded by the user or captured via webcam.
- `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that sets the score a model must reach to label an image "AI" or "REAL". If neither score meets this threshold, the label is "UNCERTAIN".
- `augment_methods` (list of str): Augmentation methods to apply to the image before prediction. Possible values: "rotate", "add_noise", "sharpen". If empty, no augmentation is applied.
- `rotate_degrees` (float): The maximum rotation in degrees (default: 0). Used only if "rotate" is included in `augment_methods`.
- `noise_level` (float): The level of noise to add to the image (default: 0). Used only if "add_noise" is included in `augment_methods`.
- `sharpen_strength` (float): The strength of the sharpening effect (default: 0). Used only if "sharpen" is included in `augment_methods`.
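
The thresholding rule described for `confidence_threshold` can be sketched as follows. This is a minimal illustration, not the code in `app_mcp.py`: the function name `label_from_scores` is hypothetical, "meets" is assumed to mean greater than or equal, and the AI score is assumed to be checked first.

```python
def label_from_scores(ai_score: float, real_score: float,
                      confidence_threshold: float = 0.7) -> str:
    """Map a model's two scores to "AI", "REAL", or "UNCERTAIN"."""
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    # Assumption: the AI score is checked first if both reach the threshold.
    if ai_score >= confidence_threshold:
        return "AI"
    if real_score >= confidence_threshold:
        return "REAL"
    return "UNCERTAIN"


print(label_from_scores(0.91, 0.09))  # AI
print(label_from_scores(0.55, 0.45))  # UNCERTAIN
```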

### Returns
- `img_pil` (PIL Image): The processed image (original or augmented).
- `cleaned_forensics_images` (list of PIL Image): A list of images generated by various forensic analysis techniques (ELA, gradient, minmax, bitplane). These include:
  - The original (or augmented) image
  - ELA analysis (multiple passes)
  - Gradient processing (multiple variations)
  - MinMax processing (multiple variations)
  - Bit Plane extraction
- `table_rows` (list of lists): The model predictions as rows suitable for display in a Gradio Dataframe. Each inner list contains: Model Name, Contributor, AI Score, Real Score, and Label.
- `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
- `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.

## 2. `wavelet_blocking_noise_estimation`

### Description
Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.

### API Name
- `tool_waveletnoise`

### Parameters
- `image` (PIL Image): The input image to analyze.
- `block_size` (int): The size of the blocks for wavelet analysis (default: 8, range: 1-32).

### Returns
- `output_image` (PIL Image): An image visualizing the noise patterns.
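
To make the idea of block-wise wavelet noise estimation concrete, here is a rough pure-Python sketch. It is an assumption about the general technique, not the tool's actual implementation: it takes the diagonal detail coefficients of a single-level Haar transform in each block and applies the common median-absolute-deviation estimate (median of |HH| divided by 0.6745). The function names and the nested-list image representation are hypothetical.

```python
from statistics import median


def haar_diagonal(block):
    """Diagonal (HH) Haar detail coefficients of a 2D block of gray values."""
    coeffs = []
    for y in range(0, len(block) - 1, 2):
        for x in range(0, len(block[0]) - 1, 2):
            a, b = block[y][x], block[y][x + 1]
            c, d = block[y + 1][x], block[y + 1][x + 1]
            coeffs.append((a - b - c + d) / 2.0)
    return coeffs


def block_noise_map(gray, block_size=8):
    """Estimate one noise level per block using the median absolute deviation."""
    height, width = len(gray), len(gray[0])
    noise = []
    for top in range(0, height - block_size + 1, block_size):
        row = []
        for left in range(0, width - block_size + 1, block_size):
            block = [r[left:left + block_size] for r in gray[top:top + block_size]]
            hh = haar_diagonal(block)
            sigma = median(abs(v) for v in hh) / 0.6745 if hh else 0.0
            row.append(sigma)
        noise.append(row)
    return noise


# Left block is flat; right block is a fine checkerboard (high-frequency "noise").
gray = [[100] * 8 + [0 if (x + y) % 2 == 0 else 10 for x in range(8)]
        for y in range(8)]
noise = block_noise_map(gray, block_size=8)
print(noise[0][0], noise[0][1] > noise[0][0])
```

In this toy input the flat block scores zero and the checkerboard block scores high, which mirrors how regions with unusual noise would stand out in the visualization.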

## 3. `bit_plane_extractor`

### Description
Extracts and visualizes individual bit planes from different color channels. This forensic tool helps identify hidden patterns and artifacts in image data that may indicate manipulation. Different bit planes can reveal inconsistencies in image processing or editing.

### API Name
- `tool_bitplane`

### Parameters
- `image` (PIL Image): The input image to analyze.
- `channel` (str): The color channel to extract the bit plane from. Possible values: "Luminance", "Red", "Green", "Blue", "RGB Norm" (default: "Luminance").
- `bit_plane` (int): The bit plane index to extract (0-7, default: 0).
- `filter_type` (str): A filter to apply to the extracted bit plane. Possible values: "Disabled", "Median", "Gaussian" (default: "Disabled").

### Returns
- `output_image` (PIL Image): An image visualizing the extracted bit plane.
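
Bit plane extraction itself is a small operation: shift each 8-bit value right by the plane index and keep the lowest bit. The sketch below shows that step on a single channel stored as nested lists. The function name is hypothetical, and channel selection ("Luminance", "RGB Norm") and the optional filters are left out.

```python
def extract_bit_plane(channel, bit_plane=0):
    """Return a binary image (0 or 255) showing one bit of each 8-bit value."""
    if not 0 <= bit_plane <= 7:
        raise ValueError("bit_plane must be between 0 and 7")
    return [[255 * ((value >> bit_plane) & 1) for value in row] for row in channel]


pixels = [[0, 1, 2, 3],
          [128, 129, 254, 255]]
print(extract_bit_plane(pixels, 0))  # least significant bit
print(extract_bit_plane(pixels, 7))  # most significant bit
```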

## 4. `ELA`

### Description
Performs Error Level Analysis to detect re-saved JPEG images, which can indicate tampering. ELA highlights areas of an image that have different compression levels.

### API Name
- `tool_ela`

### Parameters
- `img` (PIL Image): Input image to analyze.
- `quality` (int): JPEG compression quality (1-100, default: 75).
- `scale` (int): Output multiplicative gain (1-100, default: 50).
- `contrast` (int): Output tonality compression (0-100, default: 20).
- `linear` (bool): Whether to use linear difference (default: False).
- `grayscale` (bool): Whether to output grayscale image (default: False).

### Returns
- `processed_ela_image` (PIL Image): The processed ELA image.
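
The core of Error Level Analysis can be sketched with Pillow: re-save the image as JPEG at a known quality and amplify the pixel-wise difference. This is a simplified illustration and not the `tool_ela` implementation. The function name is hypothetical, `scale` is treated here as a plain multiplier (an assumption), and the `contrast`, `linear`, and `grayscale` options are omitted.

```python
import io

from PIL import Image, ImageChops


def error_level_analysis(img, quality=75, scale=50):
    """Recompress `img` as JPEG and return the amplified difference image."""
    original = img.convert("RGB")
    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")
    diff = ImageChops.difference(original, resaved)
    # Amplify the (usually faint) differences; values are clipped to 0-255.
    return diff.point(lambda value: min(255, value * scale))


sample = Image.new("RGB", (16, 16), (120, 60, 200))
ela = error_level_analysis(sample)
print(ela.size, ela.mode)
```

Regions that were pasted in or edited after the last save tend to recompress differently from their surroundings, so they show up brighter in the difference image.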

## 5. `gradient_processing`

### Description
Applies gradient filters to an image to enhance edges and transitions, which can reveal inconsistencies due to manipulation.

### API Name
- `tool_gradient_processing`

### Parameters
- `image` (PIL Image): The input image to analyze.
- `intensity` (int): Intensity of the gradient effect (0-100, default: 90).
- `blue_mode` (str): Mode for the blue channel. Possible values: "Abs", "None", "Flat", "Norm" (default: "Abs").
- `invert` (bool): Whether to invert the gradients (default: False).
- `equalize` (bool): Whether to equalize the histogram (default: False).

### Returns
- `gradient_image` (PIL Image): The image with gradient processing applied.
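
As a rough picture of what a gradient filter computes, the sketch below estimates the gradient magnitude of a grayscale image with central differences. It is only an illustration of the general idea: the function name and the nested-list input are hypothetical, and the tool's `intensity`, `blue_mode`, `invert`, and `equalize` options are not modeled.

```python
def gradient_magnitude(gray):
    """Approximate |gradient| per pixel with central differences (edges clamped)."""
    height, width = len(gray), len(gray[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            dx = gray[y][min(x + 1, width - 1)] - gray[y][max(x - 1, 0)]
            dy = gray[min(y + 1, height - 1)][x] - gray[max(y - 1, 0)][x]
            row.append((dx * dx + dy * dy) ** 0.5 / 2.0)
        result.append(row)
    return result


# A vertical step edge: the response is concentrated around the transition.
step = [[0, 0, 100, 100]] * 3
print(gradient_magnitude(step)[1])  # [0.0, 50.0, 50.0, 0.0]
```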

## 6. `minmax_process`

### Description
Analyzes local pixel value deviations to detect subtle changes in image data, often indicative of digital forgeries.

### API Name
- `tool_minmax_processing`

### Parameters
- `image` (PIL Image): The input image to analyze.
- `channel` (int): The color channel to process. Possible values: 0 (Grayscale), 1 (Blue), 2 (Green), 3 (Red), 4 (RGB Norm) (default: 4).
- `radius` (int): The radius for local pixel analysis (0-10, default: 2).

### Returns
- `minmax_image` (PIL Image): The image with minmax processing applied.
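
One plausible reading of "local pixel value deviations" is to flag pixels that are the strict minimum or maximum within a window of the given `radius`. The sketch below implements that reading on a single channel. It is an assumption about the technique rather than the tool's code; the function name and the +1/-1 encoding are hypothetical, and with this sketch a radius of 0 marks nothing because the window has no neighbours.

```python
def minmax_mask(gray, radius=2):
    """Mark pixels that are the strict maximum (+1) or minimum (-1) of their window."""
    height, width = len(gray), len(gray[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbours = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
                if (j, i) != (y, x)
            ]
            if not neighbours:
                continue
            if gray[y][x] > max(neighbours):
                mask[y][x] = 1
            elif gray[y][x] < min(neighbours):
                mask[y][x] = -1
    return mask


image = [[5, 5, 5],
         [5, 9, 5],
         [5, 5, 1]]
print(minmax_mask(image, radius=1))  # [[0, 0, 0], [0, 1, 0], [0, 0, -1]]
```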

---

# Behind the Scenes: Image Prediction Flow

When you upload an image for analysis and click the "Predict" or "Augment & Predict" button, the following steps occur:

### 1. Image Pre-processing and Agent Initialization

* **Image Conversion**: The input image is first converted to a PIL (Pillow) Image object if it is not one already, for example when it arrives as a NumPy array.
* **Agent Setup**: Several intelligent agents are initialized to assist in the process:
    * `EnsembleMonitorAgent`: Monitors the performance of individual models.
    * `ModelWeightManager`: Manages and adjusts the weights of different models.
    * `WeightOptimizationAgent`: Optimizes model weights based on performance.
    * `SystemHealthAgent`: Monitors the system's resource usage (e.g., memory, GPU).
    * `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
    * `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
* **System Health Monitoring**: The `SystemHealthAgent` performs an initial check of system resources.
* **Image Augmentation (Optional)**: If you select augmentation methods (rotate, add noise, sharpen), the image is augmented accordingly. Otherwise, the original image is used.

### 2. Initial Model Predictions

* **Individual Model Inference**: The augmented (or original) image is passed through each of the registered deepfake detection models (`model_1` through `model_7`).
* **Performance Monitoring**: For each model, the `EnsembleMonitorAgent` tracks its prediction label, confidence score, and inference time.
* **Result Collection**: The raw prediction results (AI Score, Real Score, predicted Label) from each model are stored.
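
The loop described in this step might look roughly like the sketch below. Everything here is illustrative: the models are stand-in callables that return an `(ai_score, real_score)` pair, `run_models` is a hypothetical helper, and the timing field is only an approximation of what `EnsembleMonitorAgent` is said to track.

```python
import time


def run_models(models, image, confidence_threshold=0.7):
    """Run each model and collect its scores, label, and inference time."""
    results = []
    for name, predict in models.items():
        start = time.perf_counter()
        ai_score, real_score = predict(image)
        elapsed = time.perf_counter() - start
        if ai_score >= confidence_threshold:
            label = "AI"
        elif real_score >= confidence_threshold:
            label = "REAL"
        else:
            label = "UNCERTAIN"
        results.append({
            "Model": name,
            "AI Score": ai_score,
            "Real Score": real_score,
            "Label": label,
            "Inference Time (s)": elapsed,
        })
    return results


# Stand-in models: each callable returns (ai_score, real_score).
models = {
    "model_1": lambda image: (0.92, 0.08),
    "model_2": lambda image: (0.40, 0.60),
}
for row in run_models(models, image=None):
    print(row["Model"], row["Label"])
```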

### 3. Smart Agent Processing and Weighted Consensus

* **Contextual Intelligence**: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
* **Dynamic Weight Adjustment**: The `ModelWeightManager` adjusts the influence (weights) of each individual model's prediction. This adjustment takes into account the initial model predictions, their confidence scores, and the detected context tags.
* **Weighted Consensus Calculation**: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision.
* **Performance Analysis (for Optimization)**: The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
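
A weighted consensus of this kind can be sketched as a weighted vote. The exact rule used by the application is not stated here, so the version below is an assumption: each model's weight is added to the label it predicted, and a label wins only if its share of the total weight exceeds a hypothetical `min_share` parameter.

```python
def weighted_consensus(predictions, weights, min_share=0.5):
    """Combine per-model labels into one label using model weights.

    predictions: {model_name: "AI" | "REAL" | "UNCERTAIN"}
    weights:     {model_name: non-negative float}
    """
    totals = {"AI": 0.0, "REAL": 0.0}
    weight_sum = 0.0
    for name, label in predictions.items():
        weight = weights.get(name, 0.0)
        weight_sum += weight
        if label in totals:
            totals[label] += weight
    if weight_sum == 0:
        return "UNCERTAIN"
    # A label must hold more than `min_share` of the total weight to win.
    for label in ("AI", "REAL"):
        if totals[label] / weight_sum > min_share:
            return label
    return "UNCERTAIN"


predictions = {"model_1": "AI", "model_2": "AI", "model_3": "REAL"}
print(weighted_consensus(predictions, {"model_1": 1.0, "model_2": 1.0, "model_3": 1.0}))
print(weighted_consensus(predictions, {"model_1": 1.0, "model_2": 1.0, "model_3": 5.0}))
```

With equal weights the two "AI" votes win; raising the weight of `model_3` (for example because a context tag makes it more relevant) flips the result to "REAL".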

### 4. Forensic Processing

* **Multiple Forensic Techniques**: The original image is subjected to various forensic analysis techniques to reveal hidden artifacts that might indicate manipulation:
    * **Gradient Processing**: Highlights edges and transitions in the image.
    * **MinMax Processing**: Reveals deviations in local pixel values.
    * **ELA (Error Level Analysis)**: Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
* **Forensic Anomaly Detection**: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.

### 5. Data Logging and Output Generation

* **Inference Data Logging**: All relevant data from the current prediction, including the original image, inference parameters, individual model predictions, ensemble output, forensic images, and agent monitoring data, is logged to a Hugging Face dataset for continuous improvement and analysis.
* **Output Preparation**: The results are formatted for display in the Gradio interface:
    * The processed image (augmented or original) is prepared.
    * The forensic analysis images are collected for display in a gallery.
    * A table summarizing each model's prediction (Model, Contributor, AI Score, Real Score, Label) is generated.
    * The raw JSON output of model results is prepared for debugging.
    * The final consensus label is prepared with appropriate styling.
* **Data Type Conversion**: Numerical values (such as AI Score and Real Score) are converted to standard Python floats to ensure proper JSON serialization.
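
The data type conversion matters because `json.dumps` rejects numeric types it does not recognize. A minimal sketch of such a conversion is shown below. The helper `to_builtin` is hypothetical, and `Fraction` is used only as a standard-library stand-in for a non-builtin numeric type such as a model framework's scalar.

```python
import json
from fractions import Fraction
from numbers import Real


def to_builtin(value):
    """Recursively convert real-number-like values to plain Python floats."""
    if isinstance(value, dict):
        return {key: to_builtin(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_builtin(item) for item in value]
    if isinstance(value, Real) and not isinstance(value, (bool, int)):
        return float(value)
    return value


# Fraction stands in here for a numeric type that json.dumps cannot serialize.
row = {"Model": "model_1", "AI Score": Fraction(3, 4), "Real Score": Fraction(1, 4)}
print(json.dumps(to_builtin(row)))
```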

Finally, all these prepared outputs are returned to the Gradio interface for you to view.