LPX55 committed on
Commit b88f9dc · 1 Parent(s): bc355a9

docs: expand README with roadmap, feature status, and AI content detection tools; update requirements for transformers

Files changed (4)
  1. README.md +116 -1
  2. forensics/__init__.py +2 -2
  3. forensics/exif.py +10 -10
  4. requirements.txt +1 -1
README.md CHANGED
@@ -186,4 +186,119 @@ When you upload an image for analysis and click the "Predict" or "Augment & Pred
  * The final consensus label is prepared with appropriate styling.
  * **Data Type Conversion**: Numerical values (like AI Score, Real Score) are converted to standard Python floats to ensure proper JSON serialization.

- Finally, all these prepared outputs are returned to the Gradio interface for you to view.
+ ---
+
+ ## Roadmap & Features
+
+ ### In Progress & Pending Tasks
+
+ | Task | Status | Priority | Notes |
+ |------|--------|----------|-------|
+ | [x] Set up basic ensemble model architecture | ✅ Completed | High | Core framework established |
+ | [x] Implement initial forensic analysis tools | ✅ Completed | High | ELA, Gradient, MinMax processing |
+ | [x] Create intelligent agent system | ✅ Completed | High | All monitoring agents implemented |
+ | [x] Refactor Gradio interface for MCP | ✅ Completed | Medium | User-friendly web interface |
+ | [x] Integrate multiple deepfake detection models | ✅ Completed | High | 7 models successfully integrated |
+ | [x] Implement weighted consensus algorithm | ✅ Completed | High | Dynamic weight adjustment working |
+ | [x] Add image augmentation capabilities | ✅ Completed | Medium | Rotation, noise, sharpening features |
+ | [x] Set up data logging to Hugging Face | ✅ Completed | Medium | Continuous improvement pipeline |
+ | [x] Create system health monitoring | ✅ Completed | Medium | Resource usage tracking |
+ | [x] Implement contextual intelligence analysis | ✅ Completed | Medium | Context tag inference system |
+ | [ ] Implement real-time model performance monitoring | 🔷 In Progress | High | Add live metrics dashboard |
+ | [ ] Add support for video deepfake detection | Pending | Medium | Extend current image-based system |
+ | [ ] Optimize forensic analysis processing speed | 🔷 In Progress | High | Current ELA processing is slow |
+ | [ ] Implement batch processing for multiple images | 🔷 In Progress | Medium | Improve throughput for bulk analysis |
+ | [ ] Add model confidence threshold configuration | Pending | Low | Allow users to adjust sensitivity |
+ | [ ] Create test suite | Pending | High | Unit tests for all agents and models |
+ | [ ] Implement model versioning and rollback | Pending | Medium | Track model performance over time |
+ | [ ] Add export functionality for analysis reports | Pending | Low | PDF/CSV export options |
+ | [ ] Optimize memory usage for large images | 🔷 In Progress | High | Handle 4K+ resolution images |
+ | [ ] Add support for additional forensic techniques | 🔷 In Progress | Medium | Consider adding noise analysis |
+ | [ ] Implement user authentication system | Pending | Low | For enterprise deployment |
+ | [ ] Create API documentation | 🔷 In Progress | Medium | OpenAPI/Swagger specs |
+ | [ ] Add model ensemble validation metrics | Pending | High | Cross-validation for weight optimization |
+ | [ ] Implement caching for repeated analyses | Pending | Medium | Reduce redundant processing |
+ | [ ] Add support for custom model integration | Pending | Low | Plugin architecture for new models |
+
+ ### Legend
+ - **Priority**: High (Critical), Medium (Important), Low (Nice to have)
+ - **Status**: Pending, 🔷 In Progress, ✅ Completed, 🔻 Blocked
+
+ ---
+
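The "Implement weighted consensus algorithm" row above can be illustrated with a minimal sketch. The model names, scores, and weights below are hypothetical, and the real system adjusts weights dynamically, which is not modelled here:

```python
def weighted_consensus(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-model AI-likelihood scores (0.0 = real, 1.0 = AI)."""
    total_weight = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_weight

# Hypothetical per-model scores and trust weights.
scores = {"model_a": 0.9, "model_b": 0.6, "model_c": 0.3}
weights = {"model_a": 2.0, "model_b": 1.0, "model_c": 1.0}
print(round(weighted_consensus(scores, weights), 3))  # -> 0.675
```

A model that has recently performed well gets a larger weight, so its vote moves the consensus more than the others.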
+ ## Digital Forensics Implementation
+
+ The table below lists the top 20 forensic tools and algorithms for AI content detection, with a column of **instructions on how to use each tool with vision LLMs** (e.g., CLIP, Vision Transformers, or CNNs).
+
+ ---
+
+ ### **Top 20 Tools for AI Content Detection (with Vision LLM Integration Guidance)**
+
+ | Status | Rank | Tool/Algorithm | Reason | **Agent Guidance / Instructions** |
+ |--------|------|----------------|--------|-----------------------------------|
+ | ✅ | 1 | Noise Separation | Detect synthetic noise patterns absent in natural images. | Train the LLM on noise-separated image patches to recognize AI-specific noise textures (e.g., overly smooth or missing thermal noise). |
+ | 🔷 | 2 | EXIF Full Dump | AI-generated images lack valid metadata (e.g., camera model, geolocation). | Input the image *and its metadata as text* to a **multimodal LLM** (e.g., image + metadata caption). Flag inconsistencies (e.g., missing GPS, invalid timestamps). |
+ | ✅ | 3 | Error Level Analysis (ELA) | Reveals compression artifacts unique to AI-generated images. | Preprocess images via ELA before input to the LLM. Train the model to detect high-error regions indicative of synthetic content. |
+ | 🔷 | 4 | JPEG Ghost Maps | Identifies compression history anomalies. | Use ghost maps as a separate input channel (e.g., overlay ELA results on the RGB image) to train the LLM on synthetic vs. natural compression traces. |
+ | 🔷 | 5 | Copy-Move Forgery | AI models often clone/reuse elements. | Train the LLM to detect duplicated regions via frequency analysis or gradient-based saliency maps (e.g., using a Siamese network to compare image segments). |
+ | ✅ | 6 | Channel Histograms | Skewed color distributions in AI-generated images. | Feed the **histogram plots** as additional input (e.g., as a grayscale image) to highlight unnatural color profiles in the LLM. |
+ | 🔷 | 7 | Pixel Statistics | Unnatural RGB value deviations in AI-generated images. | Train the LLM on datasets with metadata tags indicating mean/max/min RGB values, using these stats as part of the training signal. |
+ | 🔷 | 8 | JPEG Quality Estimation | AI-generated content may have atypical JPEG quality settings. | Preprocess the image to expose JPEG quality artifacts (e.g., blockiness) and train the LLM to identify these patterns via loss functions tuned to compression. |
+ | 🔷 | 9 | Resampling Detection | AI tools may upscale/rotate images, leaving subpixel-level artifacts. | Use **frequency analysis** modules in the LLM (e.g., Fourier-transformed images) to detect Moiré patterns or grid distortions from resampling. |
+ | ✅ | 10 | PCA Projection | Highlights synthetic color distributions. | Apply PCA to reduce color dimensions and input the 2D/3D projection to the LLM as a simplified feature space. |
+ | ✅ | 11 | Bit Planes Values | Detect synthetic noise patterns absent in natural images. | Analyze individual bit planes (e.g., bit plane 1–8) and feed the binary images to the LLM to train on AI-specific bit-plane anomalies. |
+ | 🔷 | 12 | Median Filtering Traces | AI pre/post-processing steps mimic median filtering. | Train the LLM on synthetically filtered images to recognize AI-applied diffusion artifacts. |
+ | ✅ | 13 | Wavelet Threshold | Identifies AI-generated texture inconsistencies. | Use wavelet-decomposed images as input channels to the LLM to isolate synthetic textures vs. natural textures. |
+ | ✅ | 14 | Frequency Split | AI may generate unnatural gradients or sharpness. | Separate high/low frequencies and train the LLM to detect missing high-frequency content in AI-generated regions (e.g., over-smoothed edges). |
+ | 🔷 | 15 | PRNU Identification | Absence of sensor-specific noise in AI-generated images. | Train the LLM on PRNU-noise databases to detect the absence or mismatch of sensor-specific noise in unlabeled images. |
+ | 🔷 | 16 | EXIF Tampering Detection | AI may falsify metadata. | Flag images with inconsistent Exif hashes (e.g., mismatched EXIF/visual content) and use metadata tags as training labels. |
+ | 🔷 | 17 | Composite Splicing | AI-generated images often stitch elements with inconsistencies. | Use **edge-aware models** (e.g., CRFL-like architectures) to detect lighting/shadow mismatches in spliced regions. |
+ | 🔷 | 18 | RGB/HSV Plots | AI-generated images have unnatural color distributions. | Input RGB/HSV channel plots as 1D signals to the LLM's classifier head, along with the original image. |
+ | 🔷 | 19 | Dead/Hot Pixel Analysis | Absence of sensor-level imperfections in AI-generated images. | Use pre-trained sensor noise databases to train the LLM to flag images missing dead/hot pixels. |
+ | 🔷 | 20 | File Digest (Hashing) | Compare to known AI-generated image hashes for rapid detection. | Use hash values as binary tags in a training dataset (e.g., "hash matches known AI model" → label as synthetic). |
+
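Row 11's bit-plane decomposition can be sketched in a few lines. This is an illustrative NumPy version; the repo's actual `bit_plane_extractor` in `forensics/bitplane.py` may differ:

```python
import numpy as np

def bit_planes(gray: np.ndarray) -> list:
    """Split an 8-bit grayscale image into its 8 binary bit planes (LSB first)."""
    return [((gray >> b) & 1).astype(np.uint8) for b in range(8)]

# A small synthetic grayscale image covering every 8-bit value.
gray = np.arange(256, dtype=np.uint8).reshape(16, 16)
planes = bit_planes(gray)
print(len(planes), planes[0].shape)  # -> 8 (16, 16)
```

The low-order planes of a camera photo look like noise; in AI-generated images they are often unnaturally structured, which is the anomaly the table suggests training on.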
+ ### Legend
+ - **Status**: 🔷 In Progress, ✅ Completed, 🔻 Blocked
+
+ ---
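Row 3's ELA preprocessing can be sketched with Pillow and NumPy. This is an illustrative stand-alone version; the repo's `forensics/ela.py` implementation and the `quality` setting used here are not guaranteed to match:

```python
import io

import numpy as np
from PIL import Image, ImageChops

def ela_map(image: Image.Image, quality: int = 90) -> np.ndarray:
    """Recompress as JPEG and return the per-pixel absolute error (the ELA map)."""
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    recompressed = Image.open(buf)
    return np.asarray(ImageChops.difference(image.convert("RGB"), recompressed))

# A flat gray image recompresses almost losslessly, so its ELA map is near zero;
# edited or AI-generated regions tend to stand out as high-error areas.
err = ela_map(Image.new("RGB", (64, 64), (128, 128, 128)))
print(err.shape)  # -> (64, 64, 3)
```

The resulting error map can be fed to the model as an extra input channel, as described in the table.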
+
+ ### **Hybrid Input Table for AI Content Detection (Planned)**
+
+ | **Strategy #** | **Description** | **Input Components** | **Agent Guidance / Instructions** |
+ |----------------|-----------------|----------------------|-----------------------------------|
+ | 1 | Combine ELA (Error Level Analysis) with RGB images for texture discrimination. | ELA-processed image + original RGB image (stacked along the channel axis). | Use a **multi-input CNN** to process ELA maps and RGB images in parallel, or concatenate them into a 6-channel input (3 RGB + 3 ELA). |
+ | 2 | Use metadata (Exif) and visual content as a **multimodal pair**. | Visual image + Exif metadata (as text caption). | Feed the image and metadata text into a **multimodal LLM** (e.g., CLIP or MMBT). Use a cross-attention module to align metadata with visual features. |
+ | 3 | Add **histogram plots** as a 1D auxiliary input for color distribution analysis. | Image (3D input) + histogram plots (1D vector or 2D grayscale image). | Train a **dual-stream model** (CNN for image + LSTM/Transformer for histogram data) to learn the relationship between visual and statistical features. |
+ | 4 | Combine **frequency split images** (high/low) with RGB for texture detection. | High-frequency image + low-frequency image + RGB image (as 3+3+3 input channels). | Use a **frequency-aware CNN** to process each frequency band with separate filters, then merge features for classification. |
+ | 5 | Train a model on **bit planes values** alongside the original image. | Bit plane images (binary black-and-white layers) + original RGB image. | Stack or concatenate bit plane images with RGB channels before inputting to the LLM. For example, combine 3 bit planes with 3 RGB channels. |
+ | 6 | Use **PRNU noise maps** and visual features to detect synthetic content. | PRNU-noise map (grayscale) + RGB image (3D input). | Train a **Siamese network** to compare PRNU maps with real-world noise databases. If PRNU is absent or mismatched, flag the image as synthetic. |
+ | 7 | Stack **hex-editor-derived metadata** (e.g., file header signatures) as a channel. | Hex-derived binary patterns (encoded as 1D or 2D data) + RGB image. | Use a **transformer with 1D hex embeddings** as a metadata input, cross-attending with a ViT (Vision Transformer) for RGB analysis. |
+ | 8 | Add **dead/hot pixel detection maps** as a mask to highlight sensor artifacts. | Dead/hot pixel mask (binary 2D map) + RGB image. | Concatenate the mask with the RGB image as a 4th channel. Train a U-Net-style model to detect synthetic regions where the mask lacks sensor patterns. |
+ | 9 | Use **PCA-reduced color projections** as a simplified input for LLMs. | PCA-transformed color embeddings (2D/3D projection) + original image. | Train a **transformer** to learn how PCA-projected color distributions differ between natural and synthetic images. |
+ | 10 | Integrate **wavelet-decomposed subbands** with RGB for texture discrimination. | Wavelet subbands (LL, LH, HL, HH) + RGB image (stacked as a 7-channel input). | Design a **wavelet-aware CNN** to process each subband separately before global pooling and classification. |
+
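Strategy 1's early fusion (3 RGB + 3 ELA channels) can be sketched in NumPy. This is a minimal illustration; `early_fusion_input` is a hypothetical helper, and a real pipeline would feed the result to a framework such as PyTorch:

```python
import numpy as np

def early_fusion_input(rgb: np.ndarray, ela: np.ndarray) -> np.ndarray:
    """Stack an RGB image and its ELA map into one 6-channel float input."""
    assert rgb.shape == ela.shape, "RGB and ELA must share height, width, channels"
    # Scale both to [0, 1] so the network sees a consistent range.
    return np.concatenate([rgb, ela], axis=-1).astype(np.float32) / 255.0

rgb = np.zeros((64, 64, 3), dtype=np.uint8)   # stand-in for a decoded image
ela = np.full((64, 64, 3), 255, dtype=np.uint8)  # stand-in for its ELA map
fused = early_fusion_input(rgb, ela)
print(fused.shape)  # -> (64, 64, 6)
```

The same pattern generalizes to the other strategies: any per-pixel forensic map with matching spatial size can be appended as extra channels.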
+ ---
+
+ ### **Key Integration Tips for Hybrid Inputs**
+ 1. **Multimodal Models**
+    - Use models like **CLIP**, **BLIP**, or **MMBT** to align metadata (text) with visual features (images).
+    - For example: combine a **ViT** (for image processing) with a **Transformer** (for Exif metadata or histograms).
+
+ 2. **Feature Fusion Techniques**
+    - **Early fusion**: Concatenate inputs (e.g., ELA + RGB) before the first layer.
+    - **Late fusion**: Process inputs separately and merge features before final classification.
+    - **Cross-modal attention**: Use cross-attention to align metadata with visual features (e.g., Exif text and PRNU noise maps).
+
+ 3. **Preprocessing for Hybrid Inputs**
+    - Normalize metadata and image data to the same scale (e.g., 0–1).
+    - Convert 1D histogram data into 2D images (e.g., heatmap-like plots) for consistent input formats.
+
+ 4. **Loss Functions for Hybrid Tasks**
+    - Use **multi-task loss** (e.g., classification + regression) if metadata is involved.
+    - For consistency checks (e.g., metadata vs. visual content), use **triplet loss** or **contrastive loss**.
+
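Tip 3's "convert 1D histogram data into 2D images" step can be sketched as follows. This is an illustrative bar-plot rasterizer; the bin count and output size are arbitrary choices, not project values:

```python
import numpy as np

def histogram_as_image(values: np.ndarray, bins: int = 64, height: int = 64) -> np.ndarray:
    """Render a 1D pixel-value histogram as a 2D binary bar plot."""
    hist, _ = np.histogram(values, bins=bins, range=(0, 256))
    hist = hist / max(hist.max(), 1)              # normalize bar heights to [0, 1]
    bar_heights = (hist * (height - 1)).astype(int)
    img = np.zeros((height, bins), dtype=np.float32)
    for col, h in enumerate(bar_heights):
        img[height - 1 - h:, col] = 1.0           # draw each bar bottom-up
    return img

pixels = np.random.default_rng(0).integers(0, 256, size=10_000)
plot = histogram_as_image(pixels)
print(plot.shape)  # -> (64, 64)
```

The rendered plot has the same 2D layout as an image, so it can share an input pipeline with the visual stream instead of needing a separate 1D branch.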
+ ---
+
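Strategy 4's frequency split can be sketched with an FFT low-pass mask. This is an illustrative version; `radius` is an arbitrary cutoff, not a value from the project:

```python
import numpy as np

def frequency_split(gray: np.ndarray, radius: int = 8):
    """Split a grayscale image into low- and high-frequency components via an FFT mask."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    # Circular low-pass mask around the spectrum center.
    low_mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high

gray = np.random.default_rng(0).random((32, 32))
low, high = frequency_split(gray)
print(np.allclose(low + high, gray))  # -> True
```

Because the two masks partition the spectrum, the components sum back to the original image, and each can be fed to the model as a separate band.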
forensics/__init__.py CHANGED
@@ -1,6 +1,6 @@
  from .bitplane import bit_plane_extractor
  from .ela import ELA
- from .exif import exif_full_dump
+ # from .exif import exif_full_dump
  from .gradient import gradient_processing
  from .minmax import minmax_process
  from .wavelet import wavelet_blocking_noise_estimation
@@ -8,7 +8,7 @@ from .wavelet import wavelet_blocking_noise_estimation
  __all__ = [
      'bit_plane_extractor',
      'ELA',
-     'exif_full_dump',
+     # 'exif_full_dump',
      'gradient_processing',
      'minmax_process',
      'wavelet_blocking_noise_estimation'
forensics/exif.py CHANGED
@@ -1,11 +1,11 @@
- import tempfile
- import exiftool
- from PIL import Image
+ # import tempfile
+ # import exiftool
+ # from PIL import Image

- def exif_full_dump(image: Image.Image) -> dict:
-     """Extract all EXIF metadata from an image using exiftool."""
-     with tempfile.NamedTemporaryFile(suffix='.jpg', delete=True) as tmp:
-         image.save(tmp.name)
-         with exiftool.ExifTool() as et:
-             metadata = et.get_metadata(tmp.name)
-         return metadata
+ # def exif_full_dump(image: Image.Image) -> dict:
+ #     """Extract all EXIF metadata from an image using exiftool."""
+ #     with tempfile.NamedTemporaryFile(suffix='.jpg', delete=True) as tmp:
+ #         image.save(tmp.name)
+ #         with exiftool.ExifTool() as et:
+ #             metadata = et.get_metadata(tmp.name)
+ #         return metadata
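With the exiftool-based dump disabled, a dependency-free fallback using Pillow's built-in EXIF reader could look like this. It is a sketch with a hypothetical name, `exif_pil_dump`, and Pillow exposes only a subset of the tags exiftool reports:

```python
from PIL import Image
from PIL.ExifTags import TAGS

def exif_pil_dump(image: Image.Image) -> dict:
    """Return the EXIF tags Pillow can read, keyed by human-readable tag name."""
    return {TAGS.get(tag_id, str(tag_id)): value
            for tag_id, value in image.getexif().items()}

# A freshly created image carries no EXIF metadata.
print(exif_pil_dump(Image.new("RGB", (8, 8))))  # -> {}
```

This avoids the external `exiftool` binary entirely, at the cost of the richer maker-note and tool-specific tags exiftool can extract.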
requirements.txt CHANGED
@@ -1,7 +1,7 @@
  --index-url https://download.pytorch.org/whl/nightly/cpu

  # Core ML/AI libraries
- transformers>=4.48.2
+ transformers
  torch
  torchvision
  torchaudio