Rudra Rahul Chothe committed on
Commit e6ab28d · verified · 1 Parent(s): 0acafe1

Update README.md

Files changed (1)
  1. README.md +122 -117
README.md CHANGED
@@ -1,117 +1,122 @@
1
- ---
2
- language: en
3
- license: mit
4
- tags:
5
- - image-search
6
- - machine-learning
7
- ---
8
-
9
- ## Image Similarity Search Engine
10
- A deep learning-based image similarity search engine that uses EfficientNetB0 for feature extraction and FAISS for fast similarity search. The application provides a web interface built with Streamlit for easy interaction.
11
-
12
- Features
13
- - Deep Feature Extraction: Uses EfficientNetB0 (pre-trained on ImageNet) to extract meaningful features from images
14
- - Fast Similarity Search: Implements FAISS for efficient nearest-neighbor search
15
- - Interactive Web Interface: User-friendly interface built with Streamlit
16
- - Real-time Processing: Shows progress and time estimates during feature extraction
17
- - Scalable Architecture: Designed to handle large image datasets efficiently
18
-
19
- ## Installation
20
- ## Prerequisites
21
-
22
- Python 3.8 or higher
23
- pip package manager
24
-
25
- ## Setup
26
-
27
- 1. Clone the repository:
28
- ```
29
- git clone https://github.com/yourusername/image-similarity-search.git
30
- cd image-similarity-search
31
- ```
32
- 2. Create and activate a virtual environment:
33
- ```
34
- python -m venv venv
35
- source venv/bin/activate # On Windows use: venv\Scripts\activate
36
- ```
37
- 3. Install required packages:
38
- ```
39
- pip install -r requirements.txt
40
- ```
41
-
42
- ## Project Structure
43
- ```
44
- image-similarity-search/
45
- ├── data/
46
- │   ├── images/ # Directory for train dataset images
47
- │   ├── sample-test-images/ # Directory for test dataset images
48
- │   └── embeddings.pkl # Pre-computed image embeddings
49
- ├── src/
50
- │   ├── feature_extractor.py # EfficientNetB0 feature extraction
51
- │   ├── preprocessing.py # Image preprocessing and embedding computation
52
- │   ├── similarity_search.py # FAISS-based similarity search
53
- │   └── main.py # Streamlit web interface
54
- ├── requirements.txt
55
- ├── README.md
56
- └── .gitignore
57
- ```
58
- ## Usage
59
-
60
- 1. **Prepare Your Dataset:**
61
- Get training image dataset from drive:
62
- ```
63
- https://drive.google.com/file/d/1U2PljA7NE57jcSSzPs21ZurdIPXdYZtN/view?usp=drive_link
64
- ```
65
- Place your image dataset in the data/images directory
66
- Supported formats: JPG, JPEG, PNG
67
-
68
- 2. **Generate Embeddings:**
69
- ```
70
- python -m src.preprocessing
71
- ```
72
-
73
- **This will**:
74
- - Process all images in the dataset
75
- - Show progress and time estimates
76
- - Save embeddings to data/embeddings.pkl
77
-
78
- 3. **Run the Web Interface:**
79
- ```
80
- streamlit run src/main.py
81
- ```
82
-
83
- 4. Using the Interface:
84
-
85
- - Upload a query image using the file uploader
86
- - Click "Search Similar Images"
87
- - View the most similar images from your dataset
88
-
89
-
90
-
91
- ## Technical Details
92
- **Feature Extraction**
93
- - Uses EfficientNetB0 without top layers
94
- - Input image size: 224x224 pixels
95
- - Output feature dimension: 1280
96
-
97
- **Similarity Search**
98
- - Uses FAISS IndexFlatL2 for L2 distance-based search
99
- - Returns top-k most similar images (default k=5)
100
-
101
- **Web Interface**
102
- - Responsive design with Streamlit
103
- - Displays original and similar images with similarity scores
104
- - Progress tracking during processing
105
-
106
- **Dependencies**
107
- - TensorFlow 2.x
108
- - FAISS-cpu (or FAISS-gpu for GPU support)
109
- - Streamlit
110
- - Pillow
111
- - NumPy
112
- - tqdm
113
-
114
- **Performance**
115
- - Feature extraction: ~1 second per image on CPU
116
- - Similarity search: Near real-time for datasets up to 100k images
117
- - Memory usage depends on dataset size (approximately 5KB per image embedding)
1
+ ---
2
+ language: en
3
+ license: mit
4
+ tags:
5
+ - image-search
6
+ - machine-learning
7
+ title: Image Similarity Search Engine
8
+ sdk: streamlit
9
+ emoji: 💻
10
+ colorFrom: blue
11
+ colorTo: pink
12
+ ---
13
+
14
+ ## Image Similarity Search Engine
15
+ A deep learning-based image similarity search engine that uses EfficientNetB0 for feature extraction and FAISS for fast similarity search. The application provides a web interface built with Streamlit for easy interaction.
16
+
17
+ Features
18
+ - Deep Feature Extraction: Uses EfficientNetB0 (pre-trained on ImageNet) to extract meaningful features from images
19
+ - Fast Similarity Search: Implements FAISS for efficient nearest-neighbor search
20
+ - Interactive Web Interface: User-friendly interface built with Streamlit
21
+ - Real-time Processing: Shows progress and time estimates during feature extraction
22
+ - Scalable Architecture: Designed to handle large image datasets efficiently
23
+
24
+ ## Installation
25
+ ## Prerequisites
26
+
27
+ Python 3.8 or higher
28
+ pip package manager
29
+
30
+ ## Setup
31
+
32
+ 1. Clone the repository:
33
+ ```
34
+ git clone https://github.com/yourusername/image-similarity-search.git
35
+ cd image-similarity-search
36
+ ```
37
+ 2. Create and activate a virtual environment:
38
+ ```
39
+ python -m venv venv
40
+ source venv/bin/activate # On Windows use: venv\Scripts\activate
41
+ ```
42
+ 3. Install required packages:
43
+ ```
44
+ pip install -r requirements.txt
45
+ ```
46
+
47
+ ## Project Structure
48
+ ```
49
+ image-similarity-search/
50
+ ├── data/
51
+ │   ├── images/ # Directory for train dataset images
52
+ │   ├── sample-test-images/ # Directory for test dataset images
53
+ │   └── embeddings.pkl # Pre-computed image embeddings
54
+ ├── src/
55
+ │   ├── feature_extractor.py # EfficientNetB0 feature extraction
56
+ │   ├── preprocessing.py # Image preprocessing and embedding computation
57
+ │   ├── similarity_search.py # FAISS-based similarity search
58
+ │   └── main.py # Streamlit web interface
59
+ ├── requirements.txt
60
+ ├── README.md
61
+ └── .gitignore
62
+ ```
63
+ ## Usage
64
+
65
+ 1. **Prepare Your Dataset:**
66
+ Get training image dataset from drive:
67
+ ```
68
+ https://drive.google.com/file/d/1U2PljA7NE57jcSSzPs21ZurdIPXdYZtN/view?usp=drive_link
69
+ ```
70
+ Place your image dataset in the data/images directory
71
+ Supported formats: JPG, JPEG, PNG
72
+
73
+ 2. **Generate Embeddings:**
74
+ ```
75
+ python -m src.preprocessing
76
+ ```
77
+
78
+ **This will**:
79
+ - Process all images in the dataset
80
+ - Show progress and time estimates
81
+ - Save embeddings to data/embeddings.pkl
82
+
83
+ 3. **Run the Web Interface:**
84
+ ```
85
+ streamlit run src/main.py
86
+ ```
87
+
88
+ 4. Using the Interface:
89
+
90
+ - Upload a query image using the file uploader
91
+ - Click "Search Similar Images"
92
+ - View the most similar images from your dataset
93
+
94
+
95
+
96
+ ## Technical Details
97
+ **Feature Extraction**
98
+ - Uses EfficientNetB0 without top layers
99
+ - Input image size: 224x224 pixels
100
+ - Output feature dimension: 1280
101
+
102
+ **Similarity Search**
103
+ - Uses FAISS IndexFlatL2 for L2 distance-based search
104
+ - Returns top-k most similar images (default k=5)
105
+
106
+ **Web Interface**
107
+ - Responsive design with Streamlit
108
+ - Displays original and similar images with similarity scores
109
+ - Progress tracking during processing
110
+
111
+ **Dependencies**
112
+ - TensorFlow 2.x
113
+ - FAISS-cpu (or FAISS-gpu for GPU support)
114
+ - Streamlit
115
+ - Pillow
116
+ - NumPy
117
+ - tqdm
118
+
119
+ **Performance**
120
+ - Feature extraction: ~1 second per image on CPU
121
+ - Similarity search: Near real-time for datasets up to 100k images
122
+ - Memory usage depends on dataset size (approximately 5KB per image embedding)
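
Note on the Technical Details and Performance figures in the README: the sketch below is not taken from the repository. It is a minimal pure-Python stand-in for the kind of exact search a flat L2 index performs (FAISS's `IndexFlatL2` compares the query against every stored vector and reports squared L2 distances), plus the arithmetic behind the "approximately 5KB per image embedding" figure. The function names `embedding_bytes`, `squared_l2`, and `search_top_k` are hypothetical, and the 4-bytes-per-value figure assumes embeddings are held as float32, which the README does not state.

```python
import heapq
import random

EMBEDDING_DIM = 1280     # output feature dimension stated in the README
BYTES_PER_FLOAT32 = 4    # assumption: embeddings are stored as float32


def embedding_bytes(dim=EMBEDDING_DIM):
    """Raw size of one embedding vector in bytes (no container overhead)."""
    return dim * BYTES_PER_FLOAT32


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_top_k(query, embeddings, k=5):
    """Exhaustive nearest-neighbour search.

    Returns up to k (distance, index) pairs, closest first.
    k=5 mirrors the default mentioned in the README.
    """
    scored = ((squared_l2(query, vec), idx) for idx, vec in enumerate(embeddings))
    return heapq.nsmallest(k, scored)


# Small demonstration on random vectors standing in for real embeddings.
random.seed(0)
dataset = [[random.random() for _ in range(EMBEDDING_DIM)] for _ in range(100)]
results = search_top_k(dataset[42], dataset, k=5)

print(embedding_bytes())   # 1280 * 4 = 5120 bytes, i.e. 5 KiB per image
print(results[0])          # the query itself: distance 0.0 at index 42
```

Under the same float32 assumption, 100,000 embeddings occupy about 512,000,000 bytes of raw vector data. An exhaustive scan like this one grows linearly with dataset size, which is why the "near real-time up to 100k images" claim depends on FAISS's optimized implementation rather than on a Python loop.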