refactor: update augmentation methods in full_prediction function and revise README for clarity on parameters and new functionality
README.md
CHANGED
@@ -19,24 +19,22 @@ license: mit
|
|
19 |
|
20 |
## Functions Available for LLM Calls via MCP
|
21 |
|
22 |
-
This document outlines the functions available for programmatic invocation by LLMs through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/
|
23 |
|
24 |
-
## 1. `
|
25 |
|
26 |
### Description
|
27 |
This function processes an uploaded image to predict whether it is AI-generated or real, utilizing an ensemble of deepfake detection models and advanced forensic analysis techniques. It also incorporates intelligent agents for context inference, weight management, and anomaly detection.
|
28 |
|
29 |
### API Names
|
30 |
- `predict`
|
31 |
-
- `augment_then_predict` (This API name triggers image augmentation before prediction)
|
32 |
|
33 |
### Parameters
|
34 |
-
- `img` (
|
35 |
- `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that determines the confidence level required for a model to label an image as "AI" or "REAL". If neither score meets this threshold, the label will be "UNCERTAIN".
|
36 |
-
- `
|
37 |
-
- `
|
38 |
-
- `
|
39 |
-
- `sharpen_strength` (float): The strength of the sharpening effect to apply (default: 0), if "sharpen" is included in `augment_methods`.
|
40 |
|
41 |
### Returns
|
42 |
- `img_pil` (PIL Image): The processed image (original or augmented).
|
@@ -50,7 +48,7 @@ This function processes an uploaded image to predict whether it is AI-generated
|
|
50 |
- `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
|
51 |
- `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.
|
52 |
|
53 |
-
## 2. `
|
54 |
|
55 |
### Description
|
56 |
Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.
|
@@ -133,17 +131,49 @@ Analyzes local pixel value deviations to detect subtle changes in image data, of
|
|
133 |
- `radius` (int): The radius for local pixel analysis (0-10, default: 2).
|
134 |
|
135 |
### Returns
|
136 |
-
- `minmax_image` (PIL Image): The image with minmax processing applied.
|
137 |
|
138 |
---
|
139 |
|
140 |
# Behind the Scenes: Image Prediction Flow
|
141 |
|
142 |
-
When you upload an image for analysis and click the "Predict"
|
143 |
|
144 |
### 1. Image Pre-processing and Agent Initialization
|
145 |
|
146 |
-
* **Image Conversion**: The input image is first ensured to be a PIL (Pillow) Image object. If it's a NumPy array, it's converted.
|
147 |
* **Agent Setup**: Several intelligent agents are initialized to assist in the process:
|
148 |
* `EnsembleMonitorAgent`: Monitors the performance of individual models.
|
149 |
* `ModelWeightManager`: Manages and adjusts the weights of different models.
|
@@ -152,7 +182,7 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
|
|
152 |
* `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
|
153 |
* `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
|
154 |
* **System Health Monitoring**: The `SystemHealthAgent` performs an initial check of system resources.
|
155 |
-
* **Image Augmentation (Optional)**: If
|
156 |
|
157 |
### 2. Initial Model Predictions
|
158 |
|
@@ -163,7 +193,7 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
|
|
163 |
### 3. Smart Agent Processing and Weighted Consensus
|
164 |
|
165 |
* **Contextual Intelligence**: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
|
166 |
-
* **Dynamic Weight Adjustment**: The `ModelWeightManager` adjusts the influence (weights) of each individual model's prediction. This adjustment takes into account the initial model predictions, their confidence scores, and the detected context tags.
|
167 |
* **Weighted Consensus Calculation**: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision.
|
168 |
* **Performance Analysis (for Optimization)**: The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
|
169 |
|
@@ -173,6 +203,8 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
|
|
173 |
* **Gradient Processing**: Highlights edges and transitions in the image.
|
174 |
* **MinMax Processing**: Reveals deviations in local pixel values.
|
175 |
* **ELA (Error Level Analysis)**: Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
|
|
|
|
|
176 |
* **Forensic Anomaly Detection**: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.
|
177 |
|
178 |
### 5. Data Logging and Output Generation
|
@@ -204,6 +236,7 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
|
|
204 |
| [x] Set up data logging to Hugging Face | ✅ Completed | Medium | Continuous improvement pipeline |
|
205 |
| [x] Create system health monitoring | ✅ Completed | Medium | Resource usage tracking |
|
206 |
| [x] Implement contextual intelligence analysis | ✅ Completed | Medium | Context tag inference system |
|
|
|
207 |
| [ ] Implement real-time model performance monitoring | In Progress | High | Add live metrics dashboard |
|
208 |
| [ ] Add support for video deepfake detection | Pending | Medium | Extend current image-based system |
|
209 |
| [ ] Optimize forensic analysis processing speed | In Progress | High | Current ELA processing is slow |
|
@@ -262,7 +295,6 @@ Here's the updated table with an additional column providing **instructions on h
|
|
262 |
- **Priority**: High (Critical), Medium (Important), Low (Nice to have)
|
263 |
- **Status**: In-Progress, ✅ Completed, Blocked
|
264 |
|
265 |
-
|
266 |
---
|
267 |
|
268 |
### **Hybrid Input Table for AI Content Detection (Planned)**
|
|
|
19 |
|
20 |
## Functions Available for LLM Calls via MCP
|
21 |
|
22 |
+
This document outlines the functions available for programmatic invocation by LLMs through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/app.py`.
|
23 |
|
24 |
+
## 1. `full_prediction`
|
25 |
|
26 |
### Description
|
27 |
This function processes an uploaded image to predict whether it is AI-generated or real, utilizing an ensemble of deepfake detection models and advanced forensic analysis techniques. It also incorporates intelligent agents for context inference, weight management, and anomaly detection.
|
28 |
|
29 |
### API Names
|
30 |
- `predict`
|
|
|
31 |
|
32 |
### Parameters
|
33 |
+
- `img` (str): The input image to be analyzed, provided as a file path.
|
34 |
- `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that determines the confidence level required for a model to label an image as "AI" or "REAL". If neither score meets this threshold, the label will be "UNCERTAIN".
|
35 |
+
- `rotate_degrees` (float): The maximum degree by which to rotate the image (default: 0). If greater than 0, "rotate" augmentation is applied.
|
36 |
+
- `noise_level` (float): The level of noise to add to the image (default: 0). If greater than 0, "add_noise" augmentation is applied.
|
37 |
+
- `sharpen_strength` (float): The strength of the sharpening effect to apply (default: 0). If greater than 0, "sharpen" augmentation is applied.
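
As a rough illustration of the parameter rules above, the sketch below derives the list of augmentation methods from the three numeric arguments. The helper name `select_augment_methods` is hypothetical and is not taken from `app.py`; it only mirrors the "greater than 0" rule stated in this README.

```python
def select_augment_methods(rotate_degrees=0.0, noise_level=0.0, sharpen_strength=0.0):
    """Return the augmentation names implied by the non-zero parameters.

    Hypothetical helper: it mirrors the README rule that an augmentation
    is applied only when its parameter is greater than 0.
    """
    methods = []
    if rotate_degrees > 0:
        methods.append("rotate")
    if noise_level > 0:
        methods.append("add_noise")
    if sharpen_strength > 0:
        methods.append("sharpen")
    return methods


# Only rotation and sharpening are requested here; noise stays at its default of 0.
print(select_augment_methods(rotate_degrees=15, sharpen_strength=50))
```

With all three parameters left at their defaults, the helper returns an empty list, which corresponds to using the original image unchanged.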
|
|
|
38 |
|
39 |
### Returns
|
40 |
- `img_pil` (PIL Image): The processed image (original or augmented).
|
|
|
48 |
- `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
|
49 |
- `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.
|
50 |
|
51 |
+
## 2. `noise_estimation`
|
52 |
|
53 |
### Description
|
54 |
Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.
|
|
|
131 |
- `radius` (int): The radius for local pixel analysis (0-10, default: 2).
|
132 |
|
133 |
### Returns
|
134 |
+
- `minmax_image` (PIL Image): The image with minmax processing applied.
|
135 |
+
|
136 |
+
## 7. `augment_image_interface`
|
137 |
+
|
138 |
+
### Description
|
139 |
+
Applies various augmentation techniques to an image.
|
140 |
+
|
141 |
+
### API Name
|
142 |
+
- `augment_image`
|
143 |
+
|
144 |
+
### Parameters
|
145 |
+
- `img` (PIL Image): The input image to augment.
|
146 |
+
- `augment_methods` (list of str): A list of augmentation methods to apply. Possible values: "rotate", "add_noise", "sharpen".
|
147 |
+
- `rotate_degrees` (float): The degrees to rotate the image (0-360).
|
148 |
+
- `noise_level` (float): The level of noise to add (0-100).
|
149 |
+
- `sharpen_strength` (float): The strength of the sharpening effect (0-200).
|
150 |
+
|
151 |
+
### Returns
|
152 |
+
- `augmented_img` (PIL Image): The augmented image.
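
A caller may want to check arguments against the documented ranges before invoking `augment_image`. The following is a minimal sketch under that assumption; `validate_augment_params`, `ALLOWED_METHODS`, and `PARAM_RANGES` are hypothetical names, and the README does not say whether the server itself performs this validation.

```python
# Method names and ranges copied from the parameter list above.
ALLOWED_METHODS = {"rotate", "add_noise", "sharpen"}
PARAM_RANGES = {
    "rotate_degrees": (0.0, 360.0),
    "noise_level": (0.0, 100.0),
    "sharpen_strength": (0.0, 200.0),
}


def validate_augment_params(augment_methods, **params):
    """Raise ValueError if a method name or parameter value is out of bounds.

    Hypothetical client-side check; returns True when everything is valid.
    """
    unknown = [m for m in augment_methods if m not in ALLOWED_METHODS]
    if unknown:
        raise ValueError(f"Unknown augmentation methods: {unknown}")
    for name, value in params.items():
        if name not in PARAM_RANGES:
            raise ValueError(f"Unknown parameter: {name}")
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return True


validate_augment_params(["rotate", "sharpen"], rotate_degrees=90, sharpen_strength=120)
```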
|
153 |
+
|
154 |
+
## 8. `community_forensics_preview`
|
155 |
+
|
156 |
+
### Description
|
157 |
+
Provides a quick and simple prediction using our strongest model.
|
158 |
+
|
159 |
+
### API Name
|
160 |
+
- `quick_predict`
|
161 |
+
|
162 |
+
### Parameters
|
163 |
+
- `img` (str): The input image to analyze, provided as a file path.
|
164 |
+
|
165 |
+
### Returns
|
166 |
+
- (HTML): An HTML output from the loaded Gradio Space.
|
167 |
|
168 |
---
|
169 |
|
170 |
# Behind the Scenes: Image Prediction Flow
|
171 |
|
172 |
+
When you upload an image for analysis and click the "Predict" button, the following steps occur:
|
173 |
|
174 |
### 1. Image Pre-processing and Agent Initialization
|
175 |
|
176 |
+
* **Image Conversion**: The input is first normalized to a PIL (Pillow) Image object: a file path is loaded from disk, and a NumPy array is converted. The image is then converted to RGB if it is in a different mode.
|
177 |
* **Agent Setup**: Several intelligent agents are initialized to assist in the process:
|
178 |
* `EnsembleMonitorAgent`: Monitors the performance of individual models.
|
179 |
* `ModelWeightManager`: Manages and adjusts the weights of different models.
|
|
|
182 |
* `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
|
183 |
* `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
|
184 |
* **System Health Monitoring**: The `SystemHealthAgent` performs an initial check of system resources.
|
185 |
+
* **Image Augmentation (Optional)**: If `rotate_degrees`, `noise_level`, or `sharpen_strength` is greater than 0, the corresponding augmentation ("rotate", "add_noise", or "sharpen") is applied internally. Otherwise, the original image is used.
|
186 |
|
187 |
### 2. Initial Model Predictions
|
188 |
|
|
|
193 |
### 3. Smart Agent Processing and Weighted Consensus
|
194 |
|
195 |
* **Contextual Intelligence**: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
|
196 |
+
* **Dynamic Weight Adjustment**: The `ModelWeightManager` adjusts the influence (weights) of each individual model's prediction. This adjustment takes into account the initial model predictions, their confidence scores, and the detected context tags. Note that `simple_prediction` (Community Forensics model) is given a significantly higher base weight.
|
197 |
* **Weighted Consensus Calculation**: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision.
|
198 |
* **Performance Analysis (for Optimization)**: The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
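
The weighted consensus step above can be sketched as a weighted vote. This is an illustration only: the exact aggregation rule used in `app.py` is not shown in this diff, so the function names, the `(ai_score, real_score)` input shape, and the tie-breaking behaviour below are all assumptions.

```python
def label_from_scores(ai_score, real_score, confidence_threshold=0.7):
    """Label one model's output; the higher score wins if it meets the threshold."""
    if ai_score >= real_score and ai_score >= confidence_threshold:
        return "AI"
    if real_score > ai_score and real_score >= confidence_threshold:
        return "REAL"
    return "UNCERTAIN"


def weighted_consensus(predictions, weights, confidence_threshold=0.7):
    """Combine per-model labels by summing each model's weight per label.

    predictions: {model_id: (ai_score, real_score)} (assumed shape)
    weights:     {model_id: weight}; missing models default to 1.0
    A tie for the highest total is reported as "UNCERTAIN" (assumption).
    """
    totals = {"AI": 0.0, "REAL": 0.0, "UNCERTAIN": 0.0}
    for model_id, (ai_score, real_score) in predictions.items():
        label = label_from_scores(ai_score, real_score, confidence_threshold)
        totals[label] += weights.get(model_id, 1.0)
    best = max(totals.values())
    winners = [label for label, total in totals.items() if total == best]
    return winners[0] if len(winners) == 1 else "UNCERTAIN"


# Hypothetical model IDs and scores, for illustration only.
predictions = {"model_a": (0.9, 0.1), "model_b": (0.2, 0.8), "model_c": (0.55, 0.45)}
weights = {"model_a": 2.0, "model_b": 1.0, "model_c": 0.5}
print(weighted_consensus(predictions, weights))
```

In this example `model_a` votes "AI" with weight 2.0, `model_b` votes "REAL" with weight 1.0, and `model_c` falls below the threshold, so the consensus is "AI".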
|
199 |
|
|
|
203 |
* **Gradient Processing**: Highlights edges and transitions in the image.
|
204 |
* **MinMax Processing**: Reveals deviations in local pixel values.
|
205 |
* **ELA (Error Level Analysis)**: Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
|
206 |
+
* **Bit Plane Extraction**: Extracts and visualizes individual bit planes.
|
207 |
+
* **Wavelet-Based Noise Analysis**: Analyzes noise patterns using wavelet decomposition.
|
208 |
* **Forensic Anomaly Detection**: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.
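
To make the Bit Plane Extraction step listed above more concrete, here is a minimal pure-Python sketch. It assumes an 8-bit grayscale image represented as nested lists of integers; the project most likely operates on PIL images or NumPy arrays, so `extract_bit_plane` is a hypothetical stand-in rather than the function used in `app.py`.

```python
def extract_bit_plane(pixels, bit):
    """Return a binary image (0 or 255) showing one bit of every pixel.

    pixels: rows of 8-bit integer values (assumed representation)
    bit:    0 for the least significant plane, 7 for the most significant
    """
    if not 0 <= bit <= 7:
        raise ValueError("bit must be between 0 and 7")
    # Shift the chosen bit into position 0, mask it, and scale to 0/255 for display.
    return [[((value >> bit) & 1) * 255 for value in row] for row in pixels]


pixels = [[0, 1, 2], [128, 255, 3]]
print(extract_bit_plane(pixels, 0))  # least significant bit plane
print(extract_bit_plane(pixels, 7))  # most significant bit plane
```

Low-order planes tend to look like noise in natural photographs, which is why unusual structure there can be worth a closer look.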
|
209 |
|
210 |
### 5. Data Logging and Output Generation
|
|
|
236 |
| [x] Set up data logging to Hugging Face | ✅ Completed | Medium | Continuous improvement pipeline |
|
237 |
| [x] Create system health monitoring | ✅ Completed | Medium | Resource usage tracking |
|
238 |
| [x] Implement contextual intelligence analysis | ✅ Completed | Medium | Context tag inference system |
|
239 |
+
| [x] Expose `augment_image` as a Gradio interface | ✅ Completed | Medium | New "Image Augmentation" tab added |
|
240 |
| [ ] Implement real-time model performance monitoring | In Progress | High | Add live metrics dashboard |
|
241 |
| [ ] Add support for video deepfake detection | Pending | Medium | Extend current image-based system |
|
242 |
| [ ] Optimize forensic analysis processing speed | In Progress | High | Current ELA processing is slow |
|
|
|
295 |
- **Priority**: High (Critical), Medium (Important), Low (Nice to have)
|
296 |
- **Status**: In-Progress, ✅ Completed, Blocked
|
297 |
|
|
|
298 |
---
|
299 |
|
300 |
### **Hybrid Input Table for AI Content Detection (Planned)**
|
app.py
CHANGED
@@ -364,7 +364,7 @@ def full_prediction(img, confidence_threshold, rotate_degrees, noise_level, shar
|
|
364 |
consensus_html = f"<b><span style='color:{'red' if final_prediction_label == 'AI' else ('green' if final_prediction_label == 'REAL' else 'orange')}'>{final_prediction_label}</span></b>"
|
365 |
inference_params = {
|
366 |
"confidence_threshold": confidence_threshold,
|
367 |
-
"augment_methods":
|
368 |
"rotate_degrees": rotate_degrees,
|
369 |
"noise_level": noise_level,
|
370 |
"sharpen_strength": sharpen_strength,
|
|
|
364 |
consensus_html = f"<b><span style='color:{'red' if final_prediction_label == 'AI' else ('green' if final_prediction_label == 'REAL' else 'orange')}'>{final_prediction_label}</span></b>"
|
365 |
inference_params = {
|
366 |
"confidence_threshold": confidence_threshold,
|
367 |
+
"augment_methods": ["rotate", "add_noise", "sharpen"],
|
368 |
"rotate_degrees": rotate_degrees,
|
369 |
"noise_level": noise_level,
|
370 |
"sharpen_strength": sharpen_strength,
|