mcp-deepfake-forensics

LPX55 committed on Jun 11

Commit

4ab021f

2 Parent(s): e1eac06 0d5ecbc

Merge branch 'main' of https://huggingface.co/spaces/LPX55/mcp-deepfake-forensics

Files changed (10)

.gitattributes +5 -0
README.md +51 -2
app.py +1 -1
preview/.gitkeep +0 -0
preview/1.png +3 -0
preview/127.0.0.1_7860__.png +3 -0
preview/2.png +3 -0
preview/3.png +3 -0
preview/4.png +3 -0
preview/graph.png +0 -0

.gitattributes CHANGED

@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+preview/1.png filter=lfs diff=lfs merge=lfs -text
+preview/127.0.0.1_7860__.png filter=lfs diff=lfs merge=lfs -text
+preview/2.png filter=lfs diff=lfs merge=lfs -text
+preview/3.png filter=lfs diff=lfs merge=lfs -text
+preview/4.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -15,8 +15,59 @@ models:
 - cmckinle/sdxl-flux-detector
 - Organika/sdxl-detector
 license: mit
 ---
 ## Functions Available for LLM Calls via MCP
 This document outlines the functions available for programmatic invocation by LLMs through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/app.py`.
@@ -336,8 +387,6 @@ Here's the updated table with an additional column providing **instructions on h
    - Use **multi-task loss** (e.g., classification + regression) if metadata is involved.
    - For consistency checks (e.g., metadata vs. visual content), use **triplet loss** or **contrastive loss**.
----
 ---
 ### **Overview of Multi-Model Consensus Methods in ML**
 | **Method**               | **Category**               | **Description**                                  | **Key Advantages**                                | **Key Limitations**                                          | **Weaknesses**                          | **Strengths**                                                                 |

 - cmckinle/sdxl-flux-detector
 - Organika/sdxl-detector
 license: mit
+tags:
+  - mcp-server-track
+  - ai-agents
+  - leaderboards
+  - incentivized-contests
+  - Agents-MCP-Hackathon
 ---
+# The Detection Dilemma: The Degentic Games
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/_1wlvHrYhfKyn-7lMQhsN.png)
+The cat-and-mouse game between digital forgery and detection reached a tipping point early last year, after years of escalating concern and anxiety. The most ambitious, expensive, and resource-intensive detection model launched with genuinely impressive results. Impressive… for an embarrassing two to three weeks.
+Then came the knockout punches: new SOTA models emerging every few weeks in every imaginable domain (image, audio, video, music). Generated images have now reached a level of realism at which an untrained eye cannot tell whether they are real or fake. [TO-DO: Add Citation to the study]
+And let's be honest: we saw this coming. When has humanity ever resisted accelerating technology that promises... *interesting* applications? As the ancients wisely tweeted: 🔞 drives innovation.
+It's time for a reset. Quit crying and get ready. Didn't you hear? The long-awaited Degentic Games are starting soon, and your model sucks.
+## Re-Thinking Detection
+### 1. **Shift away from the belief that more data leads to better results. Rather, focus on insight-driven and "quality over quantity" datasets in training.**
+* **Move Away from Terabyte-Scale Datasets**: Focus on **quality over quantity** by curating a smaller, highly diverse, and **labeled dataset** emphasizing edge cases and the latest AI generations.
+* **Active Learning**: Implement active learning techniques to iteratively select the most informative samples for human labeling, reducing dataset size while maintaining effectiveness.
+### 2. **Efficient Model Architectures**
+* **Adopt Lightweight, State-of-the-Art Models**: Explore models designed for efficiency like MobileNet, EfficientNet, or recent advancements in vision transformers (ViTs) tailored for forensic analysis.
+* **Transfer Learning with Fine-Tuning**: Fine-tune pre-trained models on your curated dataset to retain their general knowledge while adapting them to the specific task of AI image detection.
+### 3. **Multi-Modal and Hybrid Approaches**
+* **Combine Image Forensics with Metadata Analysis**: Integrate insights from image processing with metadata (e.g., EXIF, XMP) for a more robust detection framework.
+* **Incorporate Knowledge Graphs for AI Model Identification**: If feasible, build or utilize knowledge graphs mapping known AI models to their generation signatures for targeted detection.
+### 4. **Continuous Learning and Update Mechanism**
+* **Online Learning or Incremental Training**: Implement a system that can incrementally update the model with new, strategically selected samples, adapting to new AI generation techniques.
+* **Community-Driven Updates**: Establish a feedback loop with users/community to report undetected AI images, fueling model updates.
+### 5. **Evaluation and Validation**
+* **Robust Validation Protocols**: Regularly test against unseen, diverse datasets including novel AI generations not present during training.
+* **Benchmark Against State-of-the-Art**: Periodically compare performance with newly published detection models or techniques.
+### Core Roadmap
+- [x] Project Introduction
+- [ ] Agents Released into Wild
+- [ ] Whitepaper / Arxiv Release
+- [ ] Public Participation
 ## Functions Available for LLM Calls via MCP
 This document outlines the functions available for programmatic invocation by LLMs through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/app.py`.
    - Use **multi-task loss** (e.g., classification + regression) if metadata is involved.
    - For consistency checks (e.g., metadata vs. visual content), use **triplet loss** or **contrastive loss**.
 ---
 ### **Overview of Multi-Model Consensus Methods in ML**
 | **Method**               | **Category**               | **Description**                                  | **Key Advantages**                                | **Key Limitations**                                          | **Weaknesses**                          | **Strengths**                                                                 |
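
The "Active Learning" bullet added to the README in this commit describes iteratively picking the most informative samples for human labeling. The README does not say which selection strategy is intended; the sketch below assumes plain uncertainty sampling, where the samples whose detector score is closest to 0.5 are sent for labeling first. The function name `select_for_labeling`, the sample IDs, and the scores are all hypothetical and not part of this repository.

```python
from typing import Dict, List


def select_for_labeling(fake_probs: Dict[str, float], budget: int) -> List[str]:
    """Return the `budget` sample IDs whose scores are closest to 0.5.

    `fake_probs` maps a sample ID to the detector's estimated probability
    that the image is AI-generated (an assumed output format).
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Distance from 0.5 measures confidence: smaller means more uncertain.
    # The sample ID is a secondary key so ties are broken deterministically.
    ranked = sorted(fake_probs, key=lambda sid: (abs(fake_probs[sid] - 0.5), sid))
    return ranked[:budget]


# Hypothetical detector scores for four unlabeled images.
scores = {"img_a": 0.98, "img_b": 0.52, "img_c": 0.07, "img_d": 0.41}
picked = select_for_labeling(scores, budget=2)
print(picked)
```

With these made-up scores, the two least certain images (`img_b`, then `img_d`) are selected, while the confidently scored ones are left out of the labeling queue.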

app.py CHANGED

@@ -521,7 +521,7 @@ detection_model_eval_playground = gr.Interface(
         gr.JSON(label="Raw Model Results", visible=False),
         gr.Markdown(label="Consensus", value="")
     ],
-    title="Multi-Model Ensemble + Agentic Coordinated Deepfake Detection",
     description="The detection of AI-generated images has entered a critical inflection point. While existing solutions struggle with outdated datasets and inflated claims, our approach prioritizes agility, community collaboration, and an offensive approach to deepfake detection.",
     api_name="predict",
     live=True  # Enable streaming

         gr.JSON(label="Raw Model Results", visible=False),
         gr.Markdown(label="Consensus", value="")
     ],
+    title="Multi-Model Ensemble + Agentic Coordinated Deepfake Detection (Paper in Progress)",
     description="The detection of AI-generated images has entered a critical inflection point. While existing solutions struggle with outdated datasets and inflated claims, our approach prioritizes agility, community collaboration, and an offensive approach to deepfake detection.",
     api_name="predict",
     live=True  # Enable streaming
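
The interface above returns a "Raw Model Results" JSON output and a "Consensus" Markdown output, but this diff does not show how `app.py` derives the consensus. As a rough illustration only, the sketch below assumes the simplest option, a majority vote over per-model labels. The function name `majority_consensus`, the label strings `"AI"` and `"REAL"`, the `"UNCERTAIN"` fallback, and the third model name are hypothetical; the first two model names are taken from the README front matter.

```python
from collections import Counter
from typing import Dict


def majority_consensus(labels: Dict[str, str]) -> str:
    """Return the label most models agree on.

    Falls back to "UNCERTAIN" when there is no input or the top two
    labels are tied (an assumed convention, not taken from app.py).
    """
    counts = Counter(labels.values()).most_common()
    if not counts:
        return "UNCERTAIN"
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "UNCERTAIN"
    return counts[0][0]


# Hypothetical per-model verdicts for a single image.
results = {
    "cmckinle/sdxl-flux-detector": "AI",
    "Organika/sdxl-detector": "AI",
    "hypothetical/third-detector": "REAL",
}
consensus = majority_consensus(results)
print(f"Consensus: {consensus}")
```

A plain vote treats every model as equally reliable; weighting votes by per-model confidence or validation accuracy would be a natural refinement, at the cost of needing those numbers.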