---
license: mit
pipeline_tag: image-classification
tags:
- image-classification
- timm
- transformers
- detection
- deepfake
- forensics
- deepfake_detection
- community
- opensight
base_model:
- timm/vit_small_patch16_384.augreg_in21k_ft_in1k
library_name: transformers
---
# Trained on 2.7M samples across 4,803 generators (see Training Data)
**Uploaded for community validation as part of OpenSight**, an upcoming open-source framework for adaptive deepfake detection.

**The Project OpenSight HF Space is coming soon, with an eval playground and, eventually, a leaderboard. Preview:**

## Model Details
### Model Description
Vision Transformer (ViT) model for detecting AI-generated images in forensic applications, trained on the largest dataset to date for this task.
- **Developed by:** Jeongsoo Park and Andrew Owens, University of Michigan
- **Model type:** Vision Transformer (ViT-Small)
- **License:** MIT (compatible with the CreativeML OpenRAIL-M license referenced in the paper, arXiv:2411.04125)
- **Finetuned from:** timm/vit_small_patch16_384.augreg_in21k_ft_in1k
- **Adapted for HF** inference compatibility by AI Without Borders.

**An HF Space showcasing various ways to run ultra-fast inference will be open-sourced shortly. Follow us for updates; we will be releasing several projects in the coming weeks.**
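
The base checkpoint name, `vit_small_patch16_384`, follows the timm naming convention of 16x16 patches on a 384x384 input. As a rough sketch of what that implies for sequence length, the helper below counts the tokens the transformer sees per image. The function name `vit_token_count` is hypothetical, and the sketch assumes this fine-tuned model keeps the base checkpoint's resolution and class token, which the card does not state explicitly.

```python
def vit_token_count(image_size: int, patch_size: int, class_token: bool = True) -> int:
    """Number of tokens a ViT processes for one square image.

    Hypothetical helper for illustration; not part of timm or transformers.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side * patches_per_side
    # A standard ViT prepends one learnable class token to the patch sequence.
    return num_patches + (1 if class_token else 0)


# Assumed values, read off the base checkpoint name: 384 px input, 16 px patches.
print(vit_token_count(384, 16))  # 24 * 24 patches + 1 class token = 577
```

Inputs therefore most likely need to be resized to 384x384 before inference, although the exact preprocessing should be taken from the repository rather than from this sketch.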
### Links
- **Repository:** [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
- **Paper:** [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)
## Training Details
### Training Data
- 2.7M images from 15+ generators and 4,600+ models
- Over 1.15 TB of images
### Training Hyperparameters
- **Framework:** PyTorch 2.0
- **Precision:** bf16 mixed
- **Optimizer:** AdamW (lr=5e-5)
- **Epochs:** 10
- **Batch Size:** 32
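
To give a sense of scale, the hyperparameters above can be collected into a small config and used to estimate the number of optimizer steps. This is a back-of-the-envelope sketch, not the authors' training script: `TrainConfig` and `ASSUMED_NUM_SAMPLES` are hypothetical names, and the estimate assumes that all of the roughly 2.7M images are seen once per epoch, that 32 is the global batch size, and that no gradient accumulation is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in this model card (hypothetical container)."""
    optimizer: str = "AdamW"
    learning_rate: float = 5e-5
    epochs: int = 10
    batch_size: int = 32
    precision: str = "bf16-mixed"


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    # Integer ceiling division: the last partial batch still costs one step.
    return (num_samples + batch_size - 1) // batch_size


def total_steps(cfg: TrainConfig, num_samples: int) -> int:
    return cfg.epochs * steps_per_epoch(num_samples, cfg.batch_size)


# Assumption: the rounded "2.7M" figure from the card, not an exact count.
ASSUMED_NUM_SAMPLES = 2_700_000

cfg = TrainConfig()
print(steps_per_epoch(ASSUMED_NUM_SAMPLES, cfg.batch_size))  # 84375
print(total_steps(cfg, ASSUMED_NUM_SAMPLES))                 # 843750
```

If the batch size is per device or the effective training set differs from the rounded figure, the step counts scale accordingly.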
## Evaluation
### Unverified Testing Results
- These results are unverified only because we currently lack the resources to evaluate on a dataset over 1.4 TB in size.

| Metric   | Value |
|----------|-------|
| Accuracy | 97.2% |
| F1 Score | 0.968 |
| AUC-ROC  | 0.992 |
| FP Rate  | 2.1%  |
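
For anyone re-running the evaluation, the metrics in the table above follow standard definitions, sketched below in plain Python. The sketch assumes "fake" is the positive class and that accuracy, F1, and FP rate are computed at a single fixed decision threshold; the card states neither. All names are hypothetical, and the counts and scores are toy values for illustration, not the model's results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # fake images flagged as fake
    fp: int  # real images flagged as fake
    tn: int  # real images passed as real
    fn: int  # fake images passed as real


def accuracy(c: ConfusionCounts) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def f1_score(c: ConfusionCounts) -> float:
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def false_positive_rate(c: ConfusionCounts) -> float:
    denom = c.fp + c.tn
    return c.fp / denom if denom else 0.0


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), so it only
    suits small samples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
toy = ConfusionCounts(tp=8, fp=1, tn=9, fn=2)
print(accuracy(toy))             # 0.85
print(false_positive_rate(toy))  # 0.1
print(auc_roc([0.9, 0.4, 0.5, 0.1], [1, 1, 0, 0]))  # 0.75
```

On a full-size evaluation set, a sort-based AUC implementation would be preferable to the pairwise loop used here.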

## Re-sampled and refined dataset
- **Coming soon™**
## Citation
**BibTeX:**
```bibtex
@misc{park2024communityforensics,
title={Community Forensics: Using Thousands of Generators to Train Fake Image Detectors},
author={Jeongsoo Park and Andrew Owens},
year={2024},
eprint={2411.04125},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.04125},
}
```