DataScienceProject
/

Vit

Image Classification


benjaminStreltzin committed on Sep 12, 2024

Commit

9a456c1

·

2 Parent(s): 267cf9d 22af25c

update

Files changed (2)

README.md +99 -3
requirements.txt +8 -0

README.md CHANGED

@@ -1,3 +1,99 @@
----
-license: unknown
----

+---
+license: unknown
+language:
+- en
+metrics:
+- accuracy
+- precision
+- f1
+- recall
+tags:
+- art
+base_model: google/vit-base-patch16-224
+datasets:
+- DataScienceProject/Art_Images_Ai_And_Real_
+pipeline_tag: image-classification
+library_name: transformers
+---
+### Model Card for DataScienceProject/Vit
+This model classifies images as either 'real' or 'fake' (AI-generated) using a Vision Transformer (ViT).
+Our goal is to classify the source of an image with at least 85% accuracy and at least 80% recall.
+### Model Description
+This model uses the Vision Transformer (ViT) architecture, which applies self-attention to sequences of image patches.
+It classifies images into two categories: 'real' and 'fake' (AI-generated).
+It captures intricate patterns and features that help distinguish the two categories without relying on Convolutional Neural Networks (CNNs).
+### Direct Use
+This model can be used to classify images as 'real art' or 'fake art' based on visual features learned by the Vision Transformer.
+### Out-of-Scope Use
+The model may not perform well on images outside the scope of art or where the visual characteristics are drastically different from those in the training dataset.
+### Recommendations
+Run the training code on a PC with an NVIDIA GPU better than an RTX 3060 and a CPU with at least 6 cores, or use Google Colab.
+## How to Get Started with the Model
+Prepare Data: Organize your images into appropriate folders and run the code.
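As a rough illustration of the data preparation step, the sketch below collects (path, label) pairs from a folder tree. The layout (`real/` and `fake/` subfolders), the label mapping, and the helper name `list_labeled_images` are assumptions made for illustration; the model card does not say how the folders should be organized.

```python
from pathlib import Path

# Hypothetical layout (an assumption, not specified by the model card):
#   data_root/real/*.jpg   -> label 0
#   data_root/fake/*.jpg   -> label 1
CLASS_TO_LABEL = {"real": 0, "fake": 1}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(data_root):
    """Return a list of (image_path, label) pairs found under data_root."""
    samples = []
    for class_name, label in CLASS_TO_LABEL.items():
        class_dir = Path(data_root) / class_name
        if not class_dir.is_dir():
            continue  # skip classes that have no folder
        for path in sorted(class_dir.iterdir()):
            # Keep only files with a known image extension.
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples
```

Calling `list_labeled_images("data")` on such a tree would give a list that a dataset class can consume directly.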
+## Model Architecture
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/66d6f1b3b50e35e1709bfdf7/RhONF2ZsQi_aVqyyk17yK.png)
+## Training Details
+- Dataset: DataScienceProject/Art_Images_Ai_And_Real_
+- Preprocessing: Images are resized, converted to RGB, transformed into tensors, and stored in a custom torch dataset.
+#### Training Hyperparameters
+- Optimizer: `optimizer = optim.Adam(model.parameters(), lr=0.001)`
+- Epochs: `num_epochs = 10`
+- Loss: `criterion = nn.CrossEntropyLoss()`
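To show how these three settings fit together, here is a minimal training-loop sketch. Only the optimizer, learning rate, epoch count, and loss come from the card; the `train` helper, the tiny linear stand-in model, and the random data are assumptions chosen so the example runs quickly without the real dataset.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def train(model, batches, num_epochs=10, lr=0.001):
    """Train on a re-iterable collection of (inputs, labels) batches.

    Returns the mean loss of each epoch.
    """
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    epoch_losses = []
    model.train()
    for _ in range(num_epochs):
        total, count = 0.0, 0
        for inputs, labels in batches:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()
            total += loss.item()
            count += 1
        epoch_losses.append(total / max(count, 1))
    return epoch_losses


# Stand-ins for the real model and data (assumptions): a linear classifier
# with two output classes, trained on random features.
torch.manual_seed(0)
toy_model = nn.Linear(8, 2)
toy_batches = [(torch.randn(4, 8), torch.randint(0, 2, (4,))) for _ in range(3)]
losses = train(toy_model, toy_batches)
```

With a `transformers` ViT classifier the forward pass returns an output object, so the loss would be computed on `model(inputs).logits` instead of `model(inputs)`.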
+## Evaluation
+The model takes 15-20 minutes to run on our dataset with the following PC hardware: CPU: i9-13900, RAM: 32 GB, GPU: RTX 3080.
+Your mileage may vary.
+### Testing Data, Factors & Metrics
+- precision
+- recall
+- f1
+- confusion_matrix
+- accuracy
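For reference, all five metrics can be derived from the four confusion-matrix counts. The sketch below does this in plain Python; the helper name `binary_metrics` and the choice of label `1` as the positive class are assumptions, since the card does not say which class is treated as positive.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and derived metrics for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        # Rows are the true class, columns the predicted class.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only; not taken from the model's test set.
example = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

The same values can be obtained with `sklearn.metrics` functions such as `precision_score`, `recall_score`, `f1_score`, `accuracy_score`, and `confusion_matrix`.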
+### Results
+- test accuracy = 0.92
+- precision = 0.893
+- recall = 0.957
+- f1 = 0.924
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/66d6f1b3b50e35e1709bfdf7/UYTV1X3AqFM50EFojMbn9.png)
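As a quick consistency check on the reported numbers, F1 is the harmonic mean of precision $P$ and recall $R$, and the reported values agree to three decimals:

$$
F_1 = \frac{2 P R}{P + R} = \frac{2 \times 0.893 \times 0.957}{0.893 + 0.957} = \frac{1.709202}{1.850} \approx 0.924
$$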
+#### Summary
+This model is by far the best of the approaches we tried (CNN, ResNet, CNN + ELA).

requirements.txt ADDED

	@@ -0,0 +1,8 @@

+torch
+torchvision
+transformers
+Pillow
+pandas
+scikit-learn
+matplotlib
+seaborn