# Distil-DIRE
Distil-DIRE is a lightweight version of DIRE that can be used for real-time applications. Instead of computing the DIRE image directly, Distil-DIRE learns to reconstruct the features of the corresponding DIRE image, as produced by an ImageNet-pretrained classifier, using only the one-step noise from DDIM inversion. ([Paper Link](https://arxiv.org/abs/2406.00856))
![overview](distil.png)
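
For context, the original DIRE representation is the absolute per-pixel difference between an image and its reconstruction (DDIM inversion followed by DDIM sampling). The snippet below is a minimal NumPy sketch of that final step only, assuming the reconstruction has already been computed. The function name `dire_image` and the toy arrays are illustrative and are not part of this repository.

```python
import numpy as np


def dire_image(x0: np.ndarray, recon: np.ndarray) -> np.ndarray:
    """Return the absolute per-pixel reconstruction error |x0 - recon|."""
    if x0.shape != recon.shape:
        raise ValueError("image and reconstruction must have the same shape")
    return np.abs(x0.astype(np.float32) - recon.astype(np.float32))


# Toy 2x2 "image" and a hypothetical reconstruction of it.
x0 = np.array([[0.2, 0.8], [0.5, 0.1]], dtype=np.float32)
recon = np.array([[0.25, 0.6], [0.5, 0.3]], dtype=np.float32)
dire = dire_image(x0, recon)
```

Distil-DIRE avoids the full reconstruction at inference time, which is what makes it cheaper than computing this difference directly.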

### Pretrained DistilDIRE Checkpoints
| Dataset | Model | Link |
| --- | --- | --- |
| CelebA-HQ | 224x224 | [link](./celebahq-distil-dire-34e.pth) |
| ImageNet | 224x224 | [link](./imagenet-distil-dire-11e.pth) |




## Pretrained ADM diffusion model
We use an ImageNet-pretrained unconditional ADM diffusion model for feature reconstruction. You can download the pretrained model from the following link:
https://openaipublic.blob.core.windows.net/diffusion/jul-2021/256x256_diffusion_uncond.pt

Alternatively, download it with the following command:
```bash
wget https://openaipublic.blob.core.windows.net/diffusion/jul-2021/256x256_diffusion_uncond.pt -O models/256x256-adm.pt
```

## Data Preparation
Before training the model on your own dataset, you need to prepare the dataset in the following format:
```bash
mydataset/train|val|test
└── images
    ├── fakes
    │   └── img1.png...
    └── reals
        └── rimg1.png...
```
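
Before running the preprocessing step, it can help to confirm that the layout is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_dirs` is hypothetical, and it assumes that `train|val|test` above means one subdirectory per split, each containing `images/fakes` and `images/reals`.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")   # assumed split names, taken from the tree above
CLASSES = ("fakes", "reals")


def missing_dirs(root, splits=SPLITS):
    """Return the expected class directories that do not exist under root."""
    root = Path(root)
    missing = []
    for split in splits:
        for cls in CLASSES:
            class_dir = root / split / "images" / cls
            if not class_dir.is_dir():
                missing.append(class_dir)
    return missing


if __name__ == "__main__":
    # "datasets/mydataset" is a placeholder path.
    for path in missing_dirs("datasets/mydataset"):
        print(f"missing: {path}")
```

An empty result means every split has both class directories. Pass a shorter `splits` tuple if your dataset does not use all three splits.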

After preparing the dataset, compute the epsilons and DIRE images for it with the following script:
```bash
bash compute_dire_eps.sh
```

After running the script, you will have the following directory structure:
```bash
mydataset/train|val|test
├── images
│   ├── fakes
│   │   └── img1.png...
│   └── reals
│       └── rimg1.png...
├── eps
│   ├── fakes
│   │   └── img1.pt...
│   └── reals
│       └── rimg1.pt...
└── dire
    ├── fakes
    │   └── img1.png...
    └── reals
        └── rimg1.png...
```
For the eps and DIRE computation we set the number of DDIM steps to 20. Use the same number of steps at inference.
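
To check that preprocessing covered every image, you can compare the three directories. This is a rough sketch under stated assumptions: the helper `unmatched_images` is hypothetical, inputs are assumed to be `.png` files, and the eps and DIRE outputs are assumed to reuse the image's file name (with a `.pt` extension for eps), as the tree above suggests.

```python
from pathlib import Path


def unmatched_images(split_dir):
    """List images that lack an eps (.pt) or a dire (.png) counterpart."""
    split_dir = Path(split_dir)
    problems = []
    for cls in ("fakes", "reals"):
        for img in sorted((split_dir / "images" / cls).glob("*.png")):
            eps = split_dir / "eps" / cls / (img.stem + ".pt")
            dire = split_dir / "dire" / cls / img.name
            if not eps.is_file() or not dire.is_file():
                problems.append(img)
    return problems


if __name__ == "__main__":
    # "datasets/mydataset/train" is a placeholder path.
    for img in unmatched_images("datasets/mydataset/train"):
        print(f"no eps/dire output for: {img}")
```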

### Train
To train Distil-DIRE, make sure there is a `datasets` directory in the project root and that your dataset is inside it.
```
torchrun --standalone --nproc_per_node 8 -m train --batch 128 --exp_name truemedia-distil-dire-mydataset --datasets mydataset --datasets_test mytestset --epoch 100 --lr 1e-4
```

#### Fine-tuning
You can also fine-tune the model on your own dataset. For fine-tuning, you need to provide the path to the pretrained model. 
```bash
torchrun --standalone --nproc_per_node 8 -m train --batch 128 --exp_name truemedia-distil-dire-mydataset --datasets mydataset --datasets_test mytestset --epoch 100 --lr 1e-4 --pretrained_weights YOUR_PRETRAINED_MODEL_PATH
```
 

### Test
For testing the model, you can use the following script:
```bash
python3 -m test --test True --datasets mydataset --pretrained_weights YOUR_PRETRAINED_MODEL_PATH
```


### With Docker
```
export DOCKER_REGISTRY="YOUR_NAME" # Put your Docker Hub username here
export DATE=`date +%Y%m%d` # Get the current date

# Build the Docker image for development
docker build -t "$DOCKER_REGISTRY/distil-dire:dev-$DATE" -f Dockerfile .

# Push your Docker image to Docker Hub
docker login
docker push "$DOCKER_REGISTRY/distil-dire:dev-$DATE"
```


#### Development environment
```
export WORKDIR="YOUR_WORKDIR" # Put your working directory here
docker run --gpus=all --name=truemedia_gpu_all_distildire -v "$WORKDIR:/workspace/" -ti "$DOCKER_REGISTRY/distil-dire:dev-$DATE"

# Work inside the container (/workspace)
```

### Note
* This repo relies on an unconditional ADM diffusion model (256x256) and a ResNet-50 classifier, both trained on the ImageNet-1k dataset.
* Minimum requirements: 1 GPU, 10GB VRAM


## Licenses
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 license.
You are free to read, share, and modify this code as long as you keep the original author attribution and non-commercial license.
Please see [this site](https://creativecommons.org/licenses/by-nc/4.0/) for detailed legal terms.

## Acknowledgments
Our code is based on [DIRE](https://github.com/ZhendongWang6/DIRE), [guided-diffusion](https://github.com/openai/guided-diffusion) and [CNNDetection](https://github.com/peterwang512/CNNDetection). We thank the authors for sharing their code and models.

## Citation
If you find this work useful for your research, please cite our paper:
```
@misc{lim2024distildire,
      title={DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection},
      author={Yewon Lim and Changyeon Lee and Aerin Kim and Oren Etzioni},
      year={2024},
      eprint={2406.00856},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```