benjaminStreltzin committed on
Commit 9a456c1 · 2 Parent(s): 267cf9d 22af25c
Files changed (2)
  1. README.md +99 -3
  2. requirements.txt +8 -0
README.md CHANGED
@@ -1,3 +1,99 @@
1
- ---
2
- license: unknown
3
- ---
1
+ ---
2
+ license: unknown
3
+ language:
4
+ - en
5
+ metrics:
6
+ - accuracy
7
+ - precision
8
+ - f1
9
+ - recall
10
+ tags:
11
+ - art
12
+ base_model: google/vit-base-patch16-224
13
+ datasets:
14
+ - DataScienceProject/Art_Images_Ai_And_Real_
15
+ pipeline_tag: image-classification
16
+ library_name: transformers
17
+ ---
18
+
19
+ ### Model Card for Model ID
20
+ This model classifies art images as either 'real' or 'fake' (AI-generated) using a Vision Transformer (ViT).
21
+
22
+ Our goal is to identify the source of an image with at least 85% accuracy and at least 80% recall.
23
+
24
+ ### Model Description
25
+
26
+ This model leverages the Vision Transformer (ViT) architecture, which applies self-attention mechanisms to process images.
27
+ The model classifies images into two categories: 'real' and 'fake' (AI-generated).
28
+ Self-attention lets it capture the patterns and features that distinguish the two categories without relying on Convolutional Neural Networks (CNNs).
29
+
30
+ ### Direct Use
31
+
32
+ This model can be used to classify images as 'real art' or 'fake art' based on visual features learned by the Vision Transformer.
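
As a rough illustration of the setup described above, the sketch below builds a ViT-Base/16 classifier with a two-class head using the `transformers` library. The label order (`0 = real`, `1 = fake`), the helper names `build_vit_classifier` and `classify`, and the 224x224 input size are assumptions, not taken from the training code. Loading the fine-tuned weights from this repository is not shown, because the card does not document the checkpoint file.

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

# Assumed label order; the card does not state which index is which.
ID2LABEL = {0: "real", 1: "fake"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_vit_classifier(pretrained: bool = False) -> ViTForImageClassification:
    """Return a ViT-Base/16 model with a two-class head."""
    if pretrained:
        # Downloads the base checkpoint named in the card metadata. Its
        # 1000-class head is replaced by a freshly initialised 2-class head,
        # which is why ignore_mismatched_sizes is needed.
        return ViTForImageClassification.from_pretrained(
            "google/vit-base-patch16-224",
            num_labels=2,
            id2label=ID2LABEL,
            label2id=LABEL2ID,
            ignore_mismatched_sizes=True,
        )
    # Offline fallback: same architecture (ViTConfig defaults describe
    # ViT-Base/16 at 224x224) but random, untrained weights.
    config = ViTConfig(num_labels=2, id2label=ID2LABEL, label2id=LABEL2ID)
    return ViTForImageClassification(config)


def classify(model, pixel_values: torch.Tensor) -> list[str]:
    """Map a batch of shape (N, 3, 224, 224) to label names."""
    model.eval()
    with torch.no_grad():
        logits = model(pixel_values=pixel_values).logits
    return [model.config.id2label[i] for i in logits.argmax(dim=-1).tolist()]
```

With `pretrained=False` the predictions are meaningless; that mode only checks that the shapes and labels line up before any training or weight loading happens.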
33
+
34
+
35
+ ### Out-of-Scope Use
36
+
37
+ The model may not perform well on images outside the scope of art or where the visual characteristics are drastically different from those in the training dataset.
38
+
39
+
40
+ ### Recommendations
41
+
42
+ Run the training code on a PC with an NVIDIA GPU better than an RTX 3060 and a CPU with at least 6 cores, or use Google Colab.
43
+
44
+
45
+ ## How to Get Started with the Model
46
+
47
+ Prepare Data: Organize your images into appropriate folders and run the code.
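
The card does not say how the folders should be laid out. One common convention, assumed here, is one sub-folder per class. The sketch below collects `(path, label)` pairs from such a layout; the folder names `real` and `fake` and the helper `list_labeled_images` are hypothetical.

```python
from pathlib import Path

# File extensions treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root, class_names=("real", "fake")):
    """Return (path, label_index) pairs, assuming one sub-folder per class."""
    samples = []
    for label, name in enumerate(class_names):
        class_dir = Path(root) / name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {class_dir}")
        # Sort so the sample order is reproducible across runs.
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples
```

Calling `list_labeled_images("data")` would then expect `data/real/` and `data/fake/` to exist and would skip any non-image files inside them.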
48
+
49
+ ## Model Architecture
50
+
51
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66d6f1b3b50e35e1709bfdf7/RhONF2ZsQi_aVqyyk17yK.png)
52
+
53
+ ## Training Details
54
+
55
+ - Dataset: DataScienceProject/Art_Images_Ai_And_Real_
56
+
57
+ - Preprocessing: Images are resized, converted to RGB, transformed into tensors, and stored in a custom torch dataset.
58
+
59
+
60
+ #### Training Hyperparameters
61
+
62
+ optimizer = optim.Adam(model.parameters(), lr=0.001)
63
+ num_epochs = 10
64
+ criterion = nn.CrossEntropyLoss()
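
To show how these three settings fit together, here is a minimal training-loop sketch. Only the optimizer, learning rate, epoch count, and loss come from the card; the `train` helper, the batch size, and the toy linear model with random data are stand-ins so the loop runs without the real dataset.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset


def train(model, loader, num_epochs=10, lr=0.001):
    """Train with Adam and cross-entropy; return the mean loss per epoch."""
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    history = []
    model.train()
    for _ in range(num_epochs):
        running, seen = 0.0, 0
        for inputs, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()
            running += loss.item() * inputs.size(0)
            seen += inputs.size(0)
        history.append(running / seen)
    return history


# Stand-in data and model (not the ViT or the art dataset).
torch.manual_seed(0)
features = torch.randn(64, 8)
labels = (features[:, 0] > 0).long()
loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)
toy_model = nn.Linear(8, 2)
losses = train(toy_model, loader)
```

With a `transformers` ViT model the forward pass returns an output object rather than a tensor, so the loss line would use `model(inputs).logits` instead of `model(inputs)`.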
65
+
66
+ ## Evaluation
67
+
68
+ The model takes 15-20 minutes to run on our dataset with the following hardware: CPU: i9-13900, RAM: 32 GB, GPU: RTX 3080.
69
+ Your mileage may vary.
70
+
71
+ ### Testing Data, Factors & Metrics
72
+
73
+ - precision
74
+ - recall
75
+ - f1
76
+ - confusion_matrix
77
+ - accuracy
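
For reference, the metrics listed above can all be derived from the four confusion-matrix counts. The dependency-free sketch below shows the arithmetic; the project itself may well use scikit-learn (it is in `requirements.txt`), and treating label `1` as the positive class is an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, F1 and the 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Rows are true classes, columns are predictions: [[tn, fp], [fn, tp]].
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 1, 0, 1, 0])
print(metrics["confusion_matrix"])  # [[3, 1], [1, 3]]
```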
78
+
79
+
80
+ ### Results
81
+
82
+ - test accuracy = 0.92
83
+
84
+ - precision = 0.893
85
+
86
+ - recall = 0.957
87
+
88
+ - f1 = 0.924
89
+
90
+
91
+
92
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66d6f1b3b50e35e1709bfdf7/UYTV1X3AqFM50EFojMbn9.png)
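
As a quick consistency check on the reported numbers, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two values:

```python
precision = 0.893
recall = 0.957

# F1 = 2PR / (P + R)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.924
```

The recomputed value matches the reported F1 to three decimal places.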
93
+
94
+
95
+
96
+ #### Summary
97
+
98
+ This model is by far the best of the approaches we tried (CNN, ResNet, CNN + ELA).
99
+
requirements.txt ADDED
@@ -0,0 +1,8 @@
1
+ torch
2
+ torchvision
3
+ transformers
4
+ Pillow
5
+ pandas
6
+ scikit-learn
7
+ matplotlib
8
+ seaborn