LPX55 committed (verified)
Commit 14bff6c · 1 Parent(s): 7cf9278

Update README.md

Files changed (1): README.md (+6 −16)

README.md CHANGED
@@ -23,8 +23,8 @@ Vision Transformer (ViT) model fine-tuned for detecting AI-generated images in f
  - **Finetuned from:** timm/vit_small_patch16_384.augreg_in21k_ft_in1k
 
  ### Model Sources
- - **Repository:** [GitHub link to code]
- - **Paper:** [Link to relevant paper or cite arXiv:2411.04125]
+ - **Repository:** [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
+ - **Paper:** [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)
 
  ## Uses
  ### Direct Use
@@ -33,18 +33,12 @@ Detect AI-generated images in:
  - Digital forensic investigations
  - Media authenticity verification
 
- ### Out-of-Scope Use
- - Detecting videos or text content
- - Identifying generative model architectures (use Transformers-based detectors instead)
-
  ## Bias, Risks, and Limitations
  - **Performance variance:** Accuracy drops 15-20% on diffusion-generated images vs GAN-generated
  - **Geometric artifacts:** Struggles with rotated/flipped synthetic images
  - **Data bias:** Trained primarily on LAION and COCO derivatives ([source][2411.04125v1.pdf])
+ - **ADDED BY UPLOADER:** The model is already out of date; it fails to detect images from newer generation models.
 
- ### Recommendations
- - Combine with error-level analysis for improved robustness
- - Update model quarterly to address new generator architectures
 
  ## How to Use
  ```python
@@ -60,8 +54,8 @@ predicted_class = outputs.logits.argmax(-1)
 
  ## Training Details
  ### Training Data
- - 50,000 images from 15+ generators (matching [2411.04125v1.pdf] Table 3 coverage)
- - Balanced real/fake split (25k real from COCO, 25k synthetic from Stable Diffusion variants)
+ - 2.7M images from 15+ generators and 4,600+ models
+ - Over 1.15 TB of images
 
  ### Training Hyperparameters
  - **Framework:** PyTorch 2.0
@@ -81,11 +75,7 @@ predicted_class = outputs.logits.argmax(-1)
  | AUC-ROC | 0.992 |
  | FP Rate | 2.1% |
 
- ## Technical Specifications
- ### Model Architecture
- - ViT-Small with 16x16 patch embeddings
- - 384x384 input resolution
- - 12 transformer layers
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/g-dLzxLBw1RAuiplvFCxh.png)
 
  ## Citation
  **BibTeX:**
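The diff elides the body of the README's "How to Use" snippet, showing only the opening ```python fence and the closing `predicted_class = outputs.logits.argmax(-1)` line. As a hedged sketch of that usage pattern: the Hub repo id of the fine-tuned checkpoint is not visible in the diff, so this example builds a tiny randomly initialized ViT classifier with the same 16x16-patch, 384x384-input geometry instead of downloading weights; for real use you would call `ViTForImageClassification.from_pretrained("<hub-repo-id>")` with the actual model id.

```python
# Sketch of the elided "How to Use" snippet. A tiny random ViT stands in for
# the fine-tuned checkpoint so this runs offline; swap in
# ViTForImageClassification.from_pretrained("<hub-repo-id>") for real use.
import numpy as np
import torch
from PIL import Image
from transformers import ViTConfig, ViTForImageClassification, ViTImageProcessor

config = ViTConfig(
    image_size=384, patch_size=16,        # matches vit_small_patch16_384 geometry
    hidden_size=64, num_hidden_layers=2,  # shrunk so the sketch runs instantly
    num_attention_heads=2, intermediate_size=128,
    num_labels=2, id2label={0: "real", 1: "ai-generated"},
)
model = ViTForImageClassification(config).eval()
processor = ViTImageProcessor(size={"height": 384, "width": 384})

# Any RGB image works here; a random array avoids needing a file on disk.
image = Image.fromarray(np.random.randint(0, 255, (512, 512, 3), dtype=np.uint8))
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

predicted_class = outputs.logits.argmax(-1)  # same call as in the diff context
label = model.config.id2label[predicted_class.item()]
print(label)
```

With the real checkpoint, `id2label` comes from the uploaded config rather than being supplied by hand as above.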