LPX55 committed on
Commit
e82c297
·
1 Parent(s): c3c2134

refactor: update augmentation methods in full_prediction function and revise README for clarity on parameters and new functionality

Files changed (2)
  1. README.md +47 -15
  2. app.py +1 -1
README.md CHANGED
@@ -19,24 +19,22 @@ license: mit
19
 
20
  ## Functions Available for LLM Calls via MCP
21
 
22
- This document outlines the functions available for programmatic invocation by LLMs through the MCP (Multi-Cloud Platform) server, as defined in `mcp-deepfake-forensics/app_mcp.py`.
23
 
24
- ## 1. `predict_with_ensemble`
25
 
26
  ### Description
27
  This function processes an uploaded image to predict whether it is AI-generated or real, utilizing an ensemble of deepfake detection models and advanced forensic analysis techniques. It also incorporates intelligent agents for context inference, weight management, and anomaly detection.
28
 
29
  ### API Names
30
  - `predict`
31
- - `augment_then_predict` (This API name triggers image augmentation before prediction)
32
 
33
  ### Parameters
34
- - `img` (PIL Image): The input image to be analyzed. This can be uploaded by the user or captured via webcam.
35
  - `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that determines the confidence level required for a model to label an image as "AI" or "REAL". If neither score meets this threshold, the label will be "UNCERTAIN".
36
- - `augment_methods` (list of str): A list of augmentation methods to apply to the image before prediction. Possible values include: "rotate", "add_noise", "sharpen". If empty, no augmentation is applied.
37
- - `rotate_degrees` (float): The maximum degree by which to rotate the image (default: 0), if "rotate" is included in `augment_methods`.
38
- - `noise_level` (float): The level of noise to add to the image (default: 0), if "add_noise" is included in `augment_methods`.
39
- - `sharpen_strength` (float): The strength of the sharpening effect to apply (default: 0), if "sharpen" is included in `augment_methods`.
40
 
41
  ### Returns
42
  - `img_pil` (PIL Image): The processed image (original or augmented).
@@ -50,7 +48,7 @@ This function processes an uploaded image to predict whether it is AI-generated
50
  - `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
51
  - `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.
52
 
53
- ## 2. `wavelet_blocking_noise_estimation`
54
 
55
  ### Description
56
  Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.
@@ -133,17 +131,49 @@ Analyzes local pixel value deviations to detect subtle changes in image data, of
133
  - `radius` (int): The radius for local pixel analysis (0-10, default: 2).
134
 
135
  ### Returns
136
- - `minmax_image` (PIL Image): The image with minmax processing applied.
137
 
138
  ---
139
 
140
  # Behind the Scenes: Image Prediction Flow
141
 
142
- When you upload an image for analysis and click the "Predict" or "Augment & Predict" button, the following steps occur:
143
 
144
  ### 1. Image Pre-processing and Agent Initialization
145
 
146
- * **Image Conversion**: The input image is first ensured to be a PIL (Pillow) Image object. If it's a NumPy array, it's converted.
147
  * **Agent Setup**: Several intelligent agents are initialized to assist in the process:
148
  * `EnsembleMonitorAgent`: Monitors the performance of individual models.
149
  * `ModelWeightManager`: Manages and adjusts the weights of different models.
@@ -152,7 +182,7 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
152
  * `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
153
  * `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
154
  * **System Health Monitoring**: The `SystemHealthAgent` performs an initial check of system resources.
155
- * **Image Augmentation (Optional)**: If you select augmentation methods (rotate, add noise, sharpen), the image is augmented accordingly. Otherwise, the original image is used.
156
 
157
  ### 2. Initial Model Predictions
158
 
@@ -163,7 +193,7 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
163
  ### 3. Smart Agent Processing and Weighted Consensus
164
 
165
  * **Contextual Intelligence**: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
166
- * **Dynamic Weight Adjustment**: The `ModelWeightManager` adjusts the influence (weights) of each individual model's prediction. This adjustment takes into account the initial model predictions, their confidence scores, and the detected context tags.
167
  * **Weighted Consensus Calculation**: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision.
168
  * **Performance Analysis (for Optimization)**: The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
169
 
@@ -173,6 +203,8 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
173
  * **Gradient Processing**: Highlights edges and transitions in the image.
174
  * **MinMax Processing**: Reveals deviations in local pixel values.
175
  * **ELA (Error Level Analysis)**: Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
 
 
176
  * **Forensic Anomaly Detection**: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.
177
 
178
  ### 5. Data Logging and Output Generation
@@ -204,6 +236,7 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
204
  | [x] Set up data logging to Hugging Face | ✅ Completed | Medium | Continuous improvement pipeline |
205
  | [x] Create system health monitoring | ✅ Completed | Medium | Resource usage tracking |
206
  | [x] Implement contextual intelligence analysis | ✅ Completed | Medium | Context tag inference system |
 
207
  | [ ] Implement real-time model performance monitoring | 🔷 In Progress | High | Add live metrics dashboard |
208
  | [ ] Add support for video deepfake detection | Pending | Medium | Extend current image-based system |
209
  | [ ] Optimize forensic analysis processing speed | 🔷 In Progress | High | Current ELA processing is slow |
@@ -262,7 +295,6 @@ Here's the updated table with an additional column providing **instructions on h
262
  - **Priority**: High (Critical), Medium (Important), Low (Nice to have)
263
  - **Status**: 🔷 In-Progress, ✅ Completed, 🔻 Blocked
264
 
265
-
266
  ---
267
 
268
  ### **Hybrid Input Table for AI Content Detection (Planned)**
 
19
 
20
  ## Functions Available for LLM Calls via MCP
21
 
22
+ This document outlines the functions available for programmatic invocation by LLMs through the MCP (Multi-Cloud Platform) server, as defined in `mcp-deepfake-forensics/app.py`.
23
 
24
+ ## 1. `full_prediction`
25
 
26
  ### Description
27
  This function processes an uploaded image to predict whether it is AI-generated or real, utilizing an ensemble of deepfake detection models and advanced forensic analysis techniques. It also incorporates intelligent agents for context inference, weight management, and anomaly detection.
28
 
29
  ### API Names
30
  - `predict`
 
31
 
32
  ### Parameters
33
+ - `img` (str): The input image to be analyzed, provided as a file path.
34
  - `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that determines the confidence level required for a model to label an image as "AI" or "REAL". If neither score meets this threshold, the label will be "UNCERTAIN".
35
+ - `rotate_degrees` (float): The maximum degree by which to rotate the image (default: 0). If greater than 0, "rotate" augmentation is applied.
36
+ - `noise_level` (float): The level of noise to add to the image (default: 0). If greater than 0, "add_noise" augmentation is applied.
37
+ - `sharpen_strength` (float): The strength of the sharpening effect to apply (default: 0). If greater than 0, "sharpen" augmentation is applied.
 
38
 
39
  ### Returns
40
  - `img_pil` (PIL Image): The processed image (original or augmented).
 
48
  - `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
49
  - `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.
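
The `confidence_threshold` rule described under Parameters can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the helper name `label_from_scores` and its two score arguments are hypothetical, and comparing only the larger of the two scores against the threshold is an assumption about how ties and low thresholds are handled.

```python
def label_from_scores(ai_score: float, real_score: float,
                      confidence_threshold: float = 0.7) -> str:
    """Map one model's AI/REAL scores to a label (hypothetical helper)."""
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    # Assumption: only the larger score is compared against the threshold.
    if ai_score >= real_score and ai_score >= confidence_threshold:
        return "AI"
    if real_score > ai_score and real_score >= confidence_threshold:
        return "REAL"
    # Neither score meets the threshold.
    return "UNCERTAIN"


print(label_from_scores(0.91, 0.09))
print(label_from_scores(0.60, 0.40))
```

With the default threshold of 0.7, a model reporting 0.60 for "AI" and 0.40 for "REAL" would be labeled "UNCERTAIN" under this sketch.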
50
 
51
+ ## 2. `noise_estimation`
52
 
53
  ### Description
54
  Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.
 
131
  - `radius` (int): The radius for local pixel analysis (0-10, default: 2).
132
 
133
  ### Returns
134
+ - `minmax_image` (PIL Image): The image with minmax processing applied.
135
+
136
+ ## 7. `augment_image_interface`
137
+
138
+ ### Description
139
+ Applies various augmentation techniques to an image.
140
+
141
+ ### API Name
142
+ - `augment_image`
143
+
144
+ ### Parameters
145
+ - `img` (PIL Image): The input image to augment.
146
+ - `augment_methods` (list of str): A list of augmentation methods to apply. Possible values: "rotate", "add_noise", "sharpen".
147
+ - `rotate_degrees` (float): The degrees to rotate the image (0-360).
148
+ - `noise_level` (float): The level of noise to add (0-100).
149
+ - `sharpen_strength` (float): The strength of the sharpening effect (0-200).
150
+
151
+ ### Returns
152
+ - `augmented_img` (PIL Image): The augmented image.
153
+
154
+ ## 8. `community_forensics_preview`
155
+
156
+ ### Description
157
+ Provides a quick and simple prediction using our strongest model.
158
+
159
+ ### API Name
160
+ - `quick_predict`
161
+
162
+ ### Parameters
163
+ - `img` (str): The input image to analyze, provided as a file path.
164
+
165
+ ### Returns
166
+ - (HTML): An HTML output from the loaded Gradio Space.
167
 
168
  ---
169
 
170
  # Behind the Scenes: Image Prediction Flow
171
 
172
+ When you upload an image for analysis and click the "Predict" button, the following steps occur:
173
 
174
  ### 1. Image Pre-processing and Agent Initialization
175
 
176
+ * **Image Conversion**: The input image is first ensured to be a PIL (Pillow) Image object. If it's a file path, it's loaded and converted to PIL. If it's a NumPy array, it's converted. The image is then ensured to be in RGB format.
177
  * **Agent Setup**: Several intelligent agents are initialized to assist in the process:
178
  * `EnsembleMonitorAgent`: Monitors the performance of individual models.
179
  * `ModelWeightManager`: Manages and adjusts the weights of different models.
 
182
  * `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
183
  * `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
184
  * **System Health Monitoring**: The `SystemHealthAgent` performs an initial check of system resources.
185
+ * **Image Augmentation (Optional)**: If `rotate_degrees`, `noise_level`, or `sharpen_strength` are provided, the image is augmented accordingly using "rotate", "add_noise", and "sharpen" methods internally. Otherwise, the original image is used.
186
 
187
  ### 2. Initial Model Predictions
188
 
 
193
  ### 3. Smart Agent Processing and Weighted Consensus
194
 
195
  * **Contextual Intelligence**: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
196
+ * **Dynamic Weight Adjustment**: The `ModelWeightManager` adjusts the influence (weights) of each individual model's prediction. This adjustment takes into account the initial model predictions, their confidence scores, and the detected context tags. Note that `simple_prediction` (Community Forensics model) is given a significantly higher base weight.
197
  * **Weighted Consensus Calculation**: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision.
198
  * **Performance Analysis (for Optimization)**: The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
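
The weighted consensus step above can be illustrated with a small sketch. It assumes the simplest possible rule (sum the weights behind each label and pick the larger total) and is not taken from `app.py`; the function name `weighted_consensus`, the model ids `model_a` and `model_b`, and all weight values are hypothetical. Only `simple_prediction` and the fact that it carries a higher base weight come from the README.

```python
from typing import Dict


def weighted_consensus(predictions: Dict[str, str],
                       weights: Dict[str, float]) -> str:
    """Combine per-model labels into one label using per-model weights."""
    totals = {"AI": 0.0, "REAL": 0.0}
    for model_id, label in predictions.items():
        # Models that answered "UNCERTAIN" contribute no weight to either side.
        if label in totals:
            totals[label] += weights.get(model_id, 0.0)
    if totals["AI"] > totals["REAL"]:
        return "AI"
    if totals["REAL"] > totals["AI"]:
        return "REAL"
    # Equal totals (including the all-zero case) give no clear winner.
    return "UNCERTAIN"


# Illustrative values only: two models vote "AI", but the more heavily
# weighted simple_prediction model votes "REAL".
predictions = {"model_a": "AI", "model_b": "AI", "simple_prediction": "REAL"}
weights = {"model_a": 1.0, "model_b": 1.0, "simple_prediction": 2.5}
print(weighted_consensus(predictions, weights))
```

In this example the single higher-weight model outvotes the two lower-weight ones, which is the practical effect of giving one model a significantly higher base weight.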
199
 
 
203
  * **Gradient Processing**: Highlights edges and transitions in the image.
204
  * **MinMax Processing**: Reveals deviations in local pixel values.
205
  * **ELA (Error Level Analysis)**: Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
206
+ * **Bit Plane Extraction**: Extracts and visualizes individual bit planes.
207
+ * **Wavelet-Based Noise Analysis**: Analyzes noise patterns using wavelet decomposition.
208
  * **Forensic Anomaly Detection**: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.
209
 
210
  ### 5. Data Logging and Output Generation
 
236
  | [x] Set up data logging to Hugging Face | ✅ Completed | Medium | Continuous improvement pipeline |
237
  | [x] Create system health monitoring | ✅ Completed | Medium | Resource usage tracking |
238
  | [x] Implement contextual intelligence analysis | ✅ Completed | Medium | Context tag inference system |
239
+ | [x] Expose `augment_image` as a Gradio interface | ✅ Completed | Medium | New "Image Augmentation" tab added |
240
  | [ ] Implement real-time model performance monitoring | 🔷 In Progress | High | Add live metrics dashboard |
241
  | [ ] Add support for video deepfake detection | Pending | Medium | Extend current image-based system |
242
  | [ ] Optimize forensic analysis processing speed | 🔷 In Progress | High | Current ELA processing is slow |
 
295
  - **Priority**: High (Critical), Medium (Important), Low (Nice to have)
296
  - **Status**: 🔷 In-Progress, ✅ Completed, 🔻 Blocked
297
 
 
298
  ---
299
 
300
  ### **Hybrid Input Table for AI Content Detection (Planned)**
app.py CHANGED
@@ -364,7 +364,7 @@ def full_prediction(img, confidence_threshold, rotate_degrees, noise_level, shar
364
  consensus_html = f"<b><span style='color:{'red' if final_prediction_label == 'AI' else ('green' if final_prediction_label == 'REAL' else 'orange')}'>{final_prediction_label}</span></b>"
365
  inference_params = {
366
  "confidence_threshold": confidence_threshold,
367
- "augment_methods": augment_methods,
368
  "rotate_degrees": rotate_degrees,
369
  "noise_level": noise_level,
370
  "sharpen_strength": sharpen_strength,
 
364
  consensus_html = f"<b><span style='color:{'red' if final_prediction_label == 'AI' else ('green' if final_prediction_label == 'REAL' else 'orange')}'>{final_prediction_label}</span></b>"
365
  inference_params = {
366
  "confidence_threshold": confidence_threshold,
367
+ "augment_methods": ["rotate", "add_noise", "sharpen"],
368
  "rotate_degrees": rotate_degrees,
369
  "noise_level": noise_level,
370
  "sharpen_strength": sharpen_strength,
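
The updated README states that each augmentation is applied only when its parameter is greater than 0, while the `inference_params` entry in this hunk records all three method names regardless of the values passed. If the logged list is meant to reflect what actually ran, it could be derived from the parameters instead. The sketch below shows one way to do that; `active_augment_methods` is a hypothetical helper that does not exist in `app.py`, and the "greater than 0" rule is taken from the README wording rather than from the function body, which this diff does not show.

```python
from typing import List


def active_augment_methods(rotate_degrees: float = 0.0,
                           noise_level: float = 0.0,
                           sharpen_strength: float = 0.0) -> List[str]:
    """Return the augmentation names whose parameter is greater than 0."""
    candidates = [
        ("rotate", rotate_degrees),
        ("add_noise", noise_level),
        ("sharpen", sharpen_strength),
    ]
    # Treat None the same as 0 so unset inputs do not raise.
    return [name for name, value in candidates if value is not None and value > 0]


# Hypothetical logging payload built from the derived list.
inference_params = {
    "confidence_threshold": 0.7,
    "augment_methods": active_augment_methods(rotate_degrees=15, noise_level=0),
    "rotate_degrees": 15,
    "noise_level": 0,
    "sharpen_strength": 0,
}
print(inference_params["augment_methods"])
```

Logging the derived list keeps the stored record consistent with the per-parameter behaviour the README documents, at the cost of one extra helper.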