NyashaK committed on
Commit c703dbb · 1 Parent(s): ba5c215

Publish app on gradio

Files changed (3)
  1. README.md +53 -14
  2. app.py +424 -0
  3. requirements.txt +4 -0
README.md CHANGED
@@ -1,14 +1,53 @@
- ---
- title: DocOCR2JSON
- emoji: 🐠
- colorFrom: gray
- colorTo: gray
- sdk: gradio
- sdk_version: 5.31.0
- app_file: app.py
- pinned: false
- license: mit
- short_description: Extract text from images/docs to JSON via OCR
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Zim Docs OCR-to-JSON Extractor
+
+ ## Overview
+
+ The **Zim Docs OCR-to-JSON Extractor** is a web application built with Gradio. You upload a scanned document (PDF) or an image (PNG, JPG, etc.), and the app sends it to a vision AI model that performs Optical Character Recognition (OCR) and returns the extracted information as structured JSON. It is intended for digitizing and organizing data from documents such as **driver's licenses, passports, national ID cards, invoices, and receipts.**
+
+ ## Requirements
+
+ To use this application, you'll need:
+
+ * Python 3.10+ (required by Gradio 5, the SDK version this Space was configured with)
+ * Gradio
+ * Gradio-PDF (`gradio_pdf`)
+ * Requests
+ * PyMuPDF (`fitz`)
+ * An API key from [OpenRouter.ai](https://openrouter.ai/) (or any other service compatible with the OpenAI chat completions API format; using a different service requires changing the endpoint URL hardcoded in `app.py`).
+     * Set this key as an environment variable named `API_KEY`. The Python script reads it with `os.getenv("API_KEY")`. On Hugging Face Spaces, set it as a "Secret".
+
+ ## Running the Application
+
+ **On Hugging Face Spaces:**
+
+ This application is designed for deployment on Hugging Face Spaces.
+ 1. Ensure the `requirements.txt` file in your Hugging Face Space repository lists all necessary dependencies (`gradio`, `gradio_pdf`, `requests`, `PyMuPDF`).
+ 2. Configure `API_KEY` as a "Secret" in your Hugging Face Space settings. The application retrieves it using `os.getenv("API_KEY")`.
+ 3. Once deployed, access the application via the URL provided by your Hugging Face Space.
+     * **Live Demo:** You can try out a live demo of this application at: [Demo](https://huggingface.co/spaces/NyashaK/DocOCR2JSON)
+
+ **For Local Development/Testing (Optional):**
+
+ If you wish to run the application on your local machine:
+ 1. Install the dependencies listed under "Requirements" in your local Python environment (e.g., by running `pip install gradio gradio_pdf requests PyMuPDF`).
+ 2. Set the `API_KEY` environment variable on your local system.
+ 3. Run the application using the command:
+     ```bash
+     python app.py
+     ```
+     Replace `app.py` with the actual name of your Python file. The app will typically be available at `http://127.0.0.1:7860`.
+
+ ## How to Use
+
+ 1. **Access the Application:** Open the URL of your Hugging Face Space where the application is deployed (see the Live Demo link above), or your local URL if running it locally.
+ 2. **Upload Your Document:**
+     * Drag and drop a supported file (PDF, PNG, JPG, etc.) into the designated upload area.
+     * Alternatively, click on the upload area to open your file browser and select the document.
+ 3. **View Preview:**
+     * Once you've uploaded a file, the "Document Preview" tab will attempt to display the image or the first page of your PDF.
+ 4. **Check Extracted Data:**
+     * The application will automatically process your document. For PDFs, only the first page is sent to the model.
+     * Switch to the "Extracted Data (JSON)" tab to view the structured information extracted by the AI model.
+     * If any errors occur during processing (e.g., unsupported file type, API issue), an error message will be displayed in the JSON output area.
+
+
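The README above says the script reads the key with `os.getenv("API_KEY")`. The committed `app.py` does not check whether that call returned a value, so a missing key would only surface later as an API error. A minimal sketch of a fail-fast check follows; `load_api_key` is a hypothetical helper introduced here for illustration and is not part of this commit.

```python
import os


def load_api_key(env=None, name="API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper: the committed app.py calls os.getenv("API_KEY")
    directly and does not validate the result.
    """
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Set it locally or add it as a Secret in the Space settings."
        )
    return key
```

Calling `load_api_key()` once at startup would turn a missing Secret into an immediate, readable error instead of a failed request at upload time.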
app.py ADDED
@@ -0,0 +1,424 @@
+ import gradio as gr
+ from gradio_pdf import PDF
+ import base64
+ import requests
+ import json
+ import re
+ import fitz
+ import os
+
+
+ API_KEY = os.getenv("API_KEY")
+ IMAGE_MODEL = "opengvlab/internvl3-14b:free"
+
+
+ def extract_json_from_code_block(text):
+     if not isinstance(text, str):
+         return {"error": "Invalid input: text must be a string."}
+     try:
+         # Standard Markdown code block
+         match = re.search(r"```json\s*(\{.*?\})\s*```", text, re.DOTALL)
+         if match:
+             json_str = match.group(1)
+         else:
+             json_match = re.search(r"^\s*(\{.*?\})\s*$", text, re.DOTALL)
+             if json_match:
+                 json_str = json_match.group(1)
+             else:
+                 first_brace = text.find('{')
+                 last_brace = text.rfind('}')
+                 if first_brace != -1 and last_brace != -1 and last_brace > first_brace:
+                     json_str = text[first_brace:last_brace + 1]
+                 else:
+                     return {"error": "No JSON block or discernible JSON object found in response."}
+
+         # Attempt to fix common issues like trailing commas before parsing
+         json_str_fixed = re.sub(r',\s*([\}\]])', r'\1', json_str)
+
+         return json.loads(json_str_fixed)
+     except json.JSONDecodeError as e:
+         return {"error": f"Invalid JSON in model response: {str(e)}", "problematic_snippet (approx)": json_str_fixed,
+                 "raw_output": text}
+     except Exception as e:
+         return {"error": f"An unexpected error occurred during JSON extraction: {str(e)}", "raw_output": text}
+
+
+ def convert_pdf_to_image(pdf_path, page_number=0):
+     try:
+         if not os.path.exists(pdf_path):
+             print(f"Error: PDF file not found at {pdf_path}")
+             return None
+         doc = fitz.open(pdf_path)
+         if doc.page_count == 0:
+             doc.close()
+             print(f"Warning: PDF '{os.path.basename(pdf_path)}' has no pages.")
+             return None
+         if page_number >= doc.page_count:
+             print(f"Warning: Requested page {page_number + 1} out of bounds. Using last page ({doc.page_count}).")
+             page_number = doc.page_count - 1  # clamp only after logging the requested page
+
+         page = doc.load_page(page_number)
+         pix = page.get_pixmap(dpi=200)
+
+         base_name = os.path.splitext(os.path.basename(pdf_path))[0]
+         safe_base_name = re.sub(r'[^\w\-_]', '_', base_name)
+         temp_image_path = f"temp_page_{safe_base_name}_{page_number}.png"
+
+         pix.save(temp_image_path)
+         doc.close()
+         return temp_image_path
+     except Exception as e:
+         print(f"Error converting PDF '{os.path.basename(pdf_path)}' to image: {e}")
+         return None
+
+
+ def process_document_with_vision_model(image_path):
+     if image_path is None:
+         return {"error": "No image provided for vision model processing (image_path is None)."}
+     if not os.path.exists(image_path):
+         return {"error": f"Image file does not exist at path: {image_path}"}
+
+     try:
+         with open(image_path, "rb") as f:
+             encoded_image = base64.b64encode(f.read()).decode("utf-8")
+
+         data_url = f"data:image/png;base64,{encoded_image}"
+
+         prompt = f"""You are a highly capable AI assistant specialized in document analysis and data extraction.
+ Your mission is to meticulously examine the provided image, identify the type of document, and extract all pertinent information into a structured JSON format.
+ Your entire response must be a **single, valid JSON object**. Do not include any introductory or concluding text outside of this JSON.
+ (Your detailed prompt structure here - ensure it's the same as your working version)
+ """
+         payload = {
+             "model": IMAGE_MODEL,
+             "messages": [{"role": "user", "content": [{"type": "text", "text": prompt},
+                                                       {"type": "image_url", "image_url": {"url": data_url}}]}],
+             "max_tokens": 4096
+         }
+         headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
+         response = requests.post("https://openrouter.ai/api/v1/chat/completions", headers=headers, json=payload,
+                                  timeout=120)  # Added timeout
+         response.raise_for_status()
+         result = response.json()
+
+         if "choices" in result and len(result["choices"]) > 0 and "message" in result["choices"][0] and "content" in \
+                 result["choices"][0]["message"]:
+             model_raw_output = result["choices"][0]["message"]["content"]
+             return extract_json_from_code_block(model_raw_output)
+         else:
+             print(f"Unexpected API response format: {json.dumps(result, indent=2)}")
+             return {"error": "Unexpected API response format from vision model.", "raw_api_response": result}
+
+     except requests.exceptions.Timeout:
+         print("Network Error: Request to OpenRouter API timed out.")
+         return {"error": "Network Error: Request to OpenRouter API timed out."}
+     except requests.exceptions.RequestException as e:
+         print(f"Network Error: {str(e)}")
+         return {"error": f"Network Error: {str(e)}"}
+     except Exception as e:
+         print(
+             f"General Error in vision model processing for {os.path.basename(image_path if image_path else 'No Image Path')}: {str(e)}")
+         return {"error": f"General Error in vision model processing: {str(e)}"}
+
+
+ # --- Custom CSS for a Modern Dark UI ---
+ inspired_dark_css = """
+ /* Overall App Container */
+ .gradio-container {
+     font-family: 'Inter', sans-serif;
+     background-color: var(--neutral-950, #0c0c0f); /* Very dark background */
+     padding: 0; /* Remove default padding if using full-width sections */
+ }
+
+ /* Main Title Area */
+ #app-header {
+     background-color: var(--neutral-900, #121218);
+     padding: 20px 30px;
+     border-bottom: 1px solid var(--neutral-800, #2a2a38);
+     margin-bottom: 0px; /* Spacing after header */
+ }
+ #app-title {
+     text-align: center;
+     color: var(--primary-400, #A78BFA);
+     margin-bottom: 2px;
+     font-size: 28px !important;
+     font-weight: 600;
+ }
+ #app-subtitle {
+     text-align: center;
+     color: var(--neutral-400, #888898);
+     margin-top: 0px;
+     font-size: 16px !important;
+     font-weight: 400;
+ }
+
+ /* Main content row styling */
+ #main-content-row {
+     padding: 20px 30px; /* Add padding around the main content columns */
+     gap: 30px; /* Space between columns */
+ }
+
+ /* "Node" or "Block" Styling for Columns/Sections */
+ .input-block, .output-block-column {
+     background-color: var(--neutral-900, #121218); /* Slightly lighter than page bg */
+     border-radius: 12px;
+     padding: 25px;
+     border: 1px solid var(--neutral-800, #2a2a38);
+     box-shadow: 0 4px 12px rgba(0,0,0, 0.2); /* Subtle shadow for depth */
+     height: 100%; /* Make blocks in a row take same height if desired */
+ }
+ .input-block h4, .output-block-column h4 { /* Section Headers */
+     color: var(--neutral-200, #e0e0e0);
+     margin-top: 0;
+     margin-bottom: 20px;
+     font-size: 18px;
+     border-bottom: 1px solid var(--neutral-700, #3a3a48);
+     padding-bottom: 10px;
+ }
+
+ /* File Input Area */
+ .file-input-box > div[data-testid="block-label"] { display: none; } /* Hide default label if custom header is used */
+ .file-input-box .upload-box { /* Target Gradio's file input */
+     border: 2px dashed var(--primary-600, #7C3AED);
+     background-color: var(--neutral-800, #1a1a22);
+     border-radius: 8px;
+     padding: 30px;
+     color: var(--neutral-300, #c0c0c0);
+ }
+ .file-input-box .upload-box:hover {
+     background-color: var(--neutral-700, #22222a);
+     border-color: var(--primary-500, #8B5CF6);
+ }
+ .input-block .input-guidance p { /* Styling for help text */
+     font-size: 0.85em;
+     color: var(--neutral-400, #888898);
+     text-align: center;
+     margin-top: 15px;
+ }
+
+
+ /* Output Tabs Styling */
+ .output-block-column .gr-tabs { margin-top: -10px; } /* Adjust if needed */
+ .output-block-column .gr-tabs .tab-nav button { /* Tab buttons */
+     background-color: transparent !important;
+     color: var(--neutral-400, #888898) !important;
+     border-radius: 6px 6px 0 0 !important;
+     padding: 10px 18px !important;
+     border-bottom: 2px solid transparent !important;
+ }
+ .output-block-column .gr-tabs .tab-nav button.selected { /* Selected tab button */
+     color: var(--primary-400, #A78BFA) !important;
+     border-bottom: 2px solid var(--primary-400, #A78BFA) !important;
+     background-color: var(--neutral-800, #1a1a22) !important; /* Slight bg for selected tab */
+ }
+ .tab-item-content { /* Content area within each tab */
+     background-color: var(--neutral-850, #16161c); /* Slightly different from block bg for depth */
+     padding: 20px;
+     border-radius: 0 0 8px 8px;
+     min-height: 400px; /* Ensure tabs have some content height */
+     border: 1px solid var(--neutral-750, #30303c);
+     border-top: none;
+ }
+
+ /* Preview Output (PDF/Image) Styling within Tab */
+ .preview-output-container { /* Specific container for PDF/Image */
+     display: flex;
+     align-items: center;
+     justify-content: center;
+     width: 100%;
+     height: 100%; /* Takes height from .tab-item-content */
+ }
+ .preview-output-container img, .preview-output-container iframe {
+     max-width: 100%;
+     max-height: 500px; /* Max height for preview */
+     object-fit: contain;
+     border-radius: 4px;
+     background-color: var(--neutral-100, #f0f0f0); /* Light bg for image/pdf itself for visibility */
+ }
+
+ /* JSON Output Styling within Tab */
+ .json-output-container .gr-json, .json-output-container .gr-code {
+     background-color: var(--neutral-900, #0e0e12) !important; /* Darker for code/json */
+     border: 1px solid var(--neutral-700, #3a3a48) !important;
+     color: var(--neutral-200, #e0e0e0) !important;
+     padding: 15px !important;
+     border-radius: 6px !important;
+     height: 100% !important;
+     font-size: 0.9em !important;
+ }
+ /* Attempt to make JSON content more readable */
+ .json-output-container .gr-json span { color: inherit !important; }
+ .json-output-container .gr-json .str { color: #90EE90 !important; } /* LightGreen strings */
+ .json-output-container .gr-json .num { color: #ADD8E6 !important; } /* LightBlue numbers */
+ .json-output-container .gr-json .bool { color: #FFB6C1 !important; } /* LightPink booleans */
+ .json-output-container .gr-json .null { color: #D3D3D3 !important; } /* LightGray nulls */
+ .json-output-container .gr-json .key { color: #FFD700 !important; } /* Gold keys */
+
+ footer{display:none !important}
+ """
+
+
+ app_theme = gr.themes.Monochrome(
+     primary_hue=gr.themes.Color(
+         c50='#F5F3FF', c100='#EDE9FE', c200='#DDD6FE', c300='#C4B5FD', c400='#A78BFA',
+         c500='#8B5CF6', c600='#7C3AED', c700='#6D28D9', c800='#5B21B6', c900='#4C1D95',
+         c950='#3B0B7D'
+     ),
+     secondary_hue="purple",
+     neutral_hue="slate",
+     radius_size=gr.themes.sizes.radius_md,
+     font=[gr.themes.GoogleFont("Inter"), "system-ui", "sans-serif"],
+     font_mono=[gr.themes.GoogleFont("Fira Code"), "monospace"]
+ ).set()  # this first theme is replaced by the reassignment of app_theme below
+
+ app_theme = gr.themes.Monochrome(
+     primary_hue=gr.themes.Color("#F5F3FF", "#EDE9FE", "#DDD6FE", "#C4B5FD", "#A78BFA", "#8B5CF6", "#7C3AED", "#6D28D9",
+                                 "#5B21B6", "#4C1D95", "#3B0B7D"),
+     secondary_hue=gr.themes.Color("#F5F3FF", "#EDE9FE", "#DDD6FE", "#C4B5FD", "#A78BFA", "#8B5CF6", "#7C3AED",
+                                   "#6D28D9", "#5B21B6", "#4C1D95", "#3B0B7D"),  # Align with primary
+     neutral_hue=gr.themes.colors.slate,
+     radius_size=gr.themes.sizes.radius_md,
+     font=[gr.themes.GoogleFont("Inter"), "system-ui", "sans-serif"],
+     font_mono=[gr.themes.GoogleFont("Fira Code"), "monospace"],
+ )
+
+ with gr.Blocks(
+         theme=app_theme,
+         css=inspired_dark_css,
+         title="Zimbabwean Document AI Extractor"
+ ) as app:
+     with gr.Column(elem_id="app-header", scale=0):
+         gr.Markdown("<h1 id='app-title'>Zim Docs Optical Character Recognition (OCR)-JSON</h1>", elem_id="title_md")
+         gr.Markdown("<h3 id='app-subtitle'>Effortlessly convert scanned documents and images into ready-to-use JSON data. </h3>",
+                     elem_id="subtitle_md")
+
+     with gr.Row(elem_id="main-content-row", equal_height=True):
+         with gr.Column(scale=1, min_width=400, elem_classes=["input-block"]):
+             gr.Markdown("<h4>📂 OCR → JSON</h4>")
+             file_input = gr.File(
+                 label="Drag & Drop or Click to Upload (PDF, PNG, JPG)",
+                 file_types=[".pdf", ".png", ".jpg", ".jpeg", ".bmp", ".gif"],
+                 type="filepath",
+                 elem_classes=["file-input-box"]
+             )
+
+             with gr.Group(elem_classes=["input-guidance"]):
+                 gr.Markdown(
+                     """
+                     <p>Supported: PDF, PNG, JPG, JPEG, BMP, GIF.<br>
+                     For optimal results, ensure the document image is clear and well-lit.</p>
+                     """
+                 )
+
+
+         with gr.Column(scale=2, min_width=600, elem_classes=["output-block-column"]):
+             gr.Markdown("<h4>Extraction Results</h4>")
+             with gr.Tabs(elem_id="output_tabs"):
+                 with gr.TabItem("📄 Document Preview", elem_id="preview_tab", elem_classes=["tab-item-content"]):
+                     with gr.Group(elem_classes=["preview-output-container"]):
+                         pdf_output = PDF(visible=False, show_label=False, elem_classes=["preview-output-item"])
+                         image_output = gr.Image(visible=False, show_label=False, show_share_button=False,
+                                                 show_download_button=True, elem_classes=["preview-output-item"])
+                         no_preview_message = gr.Markdown("Upload a document to see a preview.", visible=True,
+                                                          elem_id="no_preview_msg")
+
+                 with gr.TabItem("Extracted Data (JSON)", elem_id="json_tab", elem_classes=["tab-item-content"]):
+                     with gr.Group(elem_classes=["json-output-container"]):
+                         json_output = gr.JSON(visible=False, show_label=False, elem_classes=["json-output-item"])
+                         no_json_message = gr.Markdown("Analysis results will appear here.", visible=True,
+                                                       elem_id="no_json_msg")
+
+
+     def update_outputs_and_previews(file_path_str):
+         pdf_val, pdf_vis_update = None, gr.update(visible=False)
+         img_val, img_vis_update = None, gr.update(visible=False)
+         json_val, json_vis_update = {"status": "Awaiting document..."}, gr.update(visible=False)
+         no_preview_msg_update = gr.update(visible=True, value="Upload a document to see a preview.")
+         no_json_msg_update = gr.update(visible=True, value="Analysis results will appear here.")
+
+         if file_path_str is None:
+             json_val = {"status": "No document provided. Please upload a file."}
+             return pdf_val, pdf_vis_update, img_val, img_vis_update, json_val, json_vis_update, no_preview_msg_update, no_json_msg_update
+
+
+         temp_image_to_process = None
+         pdf_display_path = None
+         image_display_path = None
+         delete_temp_file = False
+
+         current_file_path = file_path_str
+
+         if current_file_path.lower().endswith('.pdf'):
+             pdf_display_path = current_file_path
+             temp_image_to_process = convert_pdf_to_image(current_file_path)
+             if temp_image_to_process is None:
+                 error_msg = {"error": f"Failed to convert PDF: {os.path.basename(current_file_path)}."}
+                 print(error_msg["error"])
+                 pdf_val, pdf_vis_update = pdf_display_path, gr.update(visible=True)
+                 img_val, img_vis_update = None, gr.update(visible=False)
+                 json_val, json_vis_update = error_msg, gr.update(visible=True)
+                 no_preview_msg_update = gr.update(visible=False)
+                 no_json_msg_update = gr.update(visible=False)
+                 return pdf_val, pdf_vis_update, img_val, img_vis_update, json_val, json_vis_update, no_preview_msg_update, no_json_msg_update
+             delete_temp_file = True
+             pdf_val, pdf_vis_update = pdf_display_path, gr.update(visible=True)
+             no_preview_msg_update = gr.update(visible=False)
+
+         elif current_file_path.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.gif')):
+             image_display_path = current_file_path
+             temp_image_to_process = current_file_path
+             img_val, img_vis_update = image_display_path, gr.update(visible=True)
+             no_preview_msg_update = gr.update(visible=False)
+         else:
+             error_msg = {"error": "Unsupported file format. Please upload PDF, PNG, JPG, JPEG, BMP, or GIF."}
+             print(error_msg["error"])
+             json_val, json_vis_update = error_msg, gr.update(visible=True)
+             no_json_msg_update = gr.update(visible=False)
+             return pdf_val, pdf_vis_update, img_val, img_vis_update, json_val, json_vis_update, no_preview_msg_update, no_json_msg_update
+
+         if temp_image_to_process is None:
+             error_msg = {"error": "Internal error: No image available for processing after file check."}
+             print(error_msg["error"])
+             json_val, json_vis_update = error_msg, gr.update(visible=True)
+             no_json_msg_update = gr.update(visible=False)
+             return pdf_val, pdf_vis_update, img_val, img_vis_update, json_val, json_vis_update, no_preview_msg_update, no_json_msg_update
+
+         extracted_json_result = process_document_with_vision_model(temp_image_to_process)
+         json_val, json_vis_update = extracted_json_result, gr.update(visible=True)
+         no_json_msg_update = gr.update(visible=False)
+
+         if delete_temp_file and temp_image_to_process and os.path.exists(
+                 temp_image_to_process) and temp_image_to_process != current_file_path:
+             try:
+                 os.remove(temp_image_to_process)
+                 print(f"Temporary image '{temp_image_to_process}' deleted.")
+             except Exception as e:
+                 print(f"Error deleting temporary image '{temp_image_to_process}': {e}")
+
+
+         if pdf_display_path:
+             img_vis_update = gr.update(visible=False)
+             img_val = None
+         elif image_display_path:
+             pdf_vis_update = gr.update(visible=False)
+             pdf_val = None
+
+         return pdf_val, pdf_vis_update, img_val, img_vis_update, json_val, json_vis_update, no_preview_msg_update, no_json_msg_update
+
+
+     all_outputs = [  # each component is listed twice: once for its value, once for its visibility update
+         pdf_output, pdf_output,
+         image_output, image_output,
+         json_output, json_output,
+         no_preview_message,
+         no_json_message
+     ]
+
+     file_input.change(
+         update_outputs_and_previews,
+         inputs=[file_input],
+         outputs=all_outputs
+     )
+
+ if __name__ == "__main__":
+     app.launch(show_error=True, show_api=False, debug=True)
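In `process_document_with_vision_model`, the data URL is always built as `data:image/png;base64,...`, while the upload widget also accepts `.jpg`, `.jpeg`, `.bmp` and `.gif` files. Whether the API tolerates a mismatched label is not established by this commit. A minimal sketch of deriving the MIME type from the file extension follows; `build_image_data_url` and its `default_mime` parameter are hypothetical names used only for illustration.

```python
import base64
import mimetypes


def build_image_data_url(image_path, default_mime="image/png"):
    """Encode an image file as a data URL whose MIME type matches its extension.

    Hypothetical helper: the committed app.py hardcodes image/png instead.
    """
    # guess_type looks only at the file name, not at the file contents.
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        mime = default_mime
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

PDF uploads would be unaffected, because `convert_pdf_to_image` always writes a `.png` file before the image is encoded.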
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ gradio
+ gradio-pdf
+ PyMuPDF
+ requests
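The `requirements.txt` above lists package names without versions, so a fresh install may resolve to different releases over time. One way to record what a working environment actually uses is sketched below with the standard library's `importlib.metadata`. The names `REQUIRED`, `installed_versions` and `as_pinned_lines` are hypothetical and not part of this commit.

```python
from importlib import metadata

# Distribution names as written in requirements.txt.
REQUIRED = ["gradio", "gradio-pdf", "PyMuPDF", "requests"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_pinned_lines(versions):
    """Render installed packages as `name==version` lines, skipping missing ones."""
    return [f"{name}=={ver}" for name, ver in versions.items() if ver is not None]
```

Running `print("\n".join(as_pinned_lines(installed_versions(REQUIRED))))` in an environment where the app works prints lines that could be used as a pinned `requirements.txt`.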