diff --git "a/topic_extraction.log" "b/topic_extraction.log"
--- "a/topic_extraction.log"
+++ "b/topic_extraction.log"
@@ -1,7719 +0,0 @@
-2025-02-28 15:26:46,260 [INFO] __main__ - Processing PDF: /home/user/app/input_output/AQA-7357-SP-2017.PDF
-2025-02-28 15:26:46,367 [ERROR] __main__ - Gemini subtopic extraction error: 'NoneType' object has no attribute 'models'
-2025-02-28 15:26:46,368 [INFO] __main__ - Gemini returned subtopics: {}
-2025-02-28 15:26:46,368 [INFO] __main__ - Loaded 1035984 bytes from local file '/home/user/app/input_output/AQA-7357-SP-2017.PDF'
-2025-02-28 15:26:46,374 [WARNING] __main__ - No subtopics found. Processing entire PDF as fallback.
-2025-02-28 15:26:46,375 [INFO] __main__ - Processing pages (0-based): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
-2025-02-28 15:26:47,737 [INFO] __main__ - GPU memory cleaned up.
-2025-02-28 15:27:16,530 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf
-2025-02-28 15:27:16,579 [ERROR] __main__ - Gemini subtopic extraction error: 'NoneType' object has no attribute 'models'
-2025-02-28 15:27:16,579 [INFO] __main__ - Gemini returned subtopics: {}
-2025-02-28 15:27:16,585 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf'
-2025-02-28 15:27:16,588 [WARNING] __main__ - No subtopics found. Processing entire PDF as fallback.
-2025-02-28 15:27:16,588 [INFO] __main__ - Processing pages (0-based): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]
-2025-02-28 15:27:17,564 [INFO] __main__ - GPU memory cleaned up.
-2025-02-28 15:31:39,420 [INFO] __main__ - Processing PDF: /home/user/app/input_output/AQA-7357-SP-2017.PDF
-2025-02-28 15:31:41,827 [INFO] __main__ - Gemini returned subtopics: {'A: Proof': [12, 12], 'B: Algebra and functions': [13, 13], 'C: Coordinate geometry in the ( x , y ) plane': [14, 14], 'D: Sequences and series': [15, 15], 'E: Trigonometry': [16, 16], 'F: Exponentials and logarithms': [17, 17], 'G: Differentiation': [18, 18], 'H: Integration': [19, 19], 'I: Numerical methods': [20, 20], 'J: Vectors': [20, 20], 'K: Statistical sampling': [21, 21], 'L: Data presentation and interpretation': [21, 21], 'M: Probability': [22, 22], 'N: Statistical distributions': [23, 22], 'O: Statistical hypothesis testing': [23, 23], 'P: Quantities and units in mechanics': [24, 23], 'Q: Kinematics': [24, 23], 'R: Forces and Newton’s laws': [24, 24], 'S: Moments': [25, 25]}
-2025-02-28 15:31:41,828 [INFO] __main__ - Loaded 1035984 bytes from local file '/home/user/app/input_output/AQA-7357-SP-2017.PDF'
-2025-02-28 15:31:41,986 [INFO] __main__ - Occurrences of subtopic 'A: Proof': [2, 6, 7, 11]
-2025-02-28 15:31:42,138 [INFO] __main__ - Occurrences of subtopic 'B: Algebra and functions': [2, 6, 7, 12]
-2025-02-28 15:31:42,297 [INFO] __main__ - Occurrences of subtopic 'C: Coordinate geometry in the ( x , y ) plane': [2, 6, 13]
-2025-02-28 15:31:42,445 [INFO] __main__ - Occurrences of subtopic 'D: Sequences and series': [2, 6, 7, 14]
-2025-02-28 15:31:42,594 [INFO] __main__ - Occurrences of subtopic 'E: Trigonometry': [2, 6, 7, 15]
-2025-02-28 15:31:42,788 [INFO] __main__ - Occurrences of subtopic 'F: Exponentials and logarithms': [2, 6, 7, 16]
-2025-02-28 15:31:42,936 [INFO] __main__ - Occurrences of subtopic 'G: Differentiation': [2, 6, 7, 17]
-2025-02-28 15:31:43,087 [INFO] __main__ - Occurrences of subtopic 'H: Integration': [2, 6, 7, 18]
-2025-02-28 15:31:43,238 [INFO] __main__ - Occurrences of subtopic 'I: Numerical methods': [2, 6, 7, 19]
-2025-02-28 15:31:43,558 [INFO] __main__ - GPU memory cleaned up.
-2025-02-28 15:32:03,728 [INFO] __main__ - Processing PDF: /home/user/app/input_output/AQA-7357-SP-2017.PDF
-2025-02-28 15:32:06,129 [INFO] __main__ - Gemini returned subtopics: {'A: Proof': [12, 12], 'B: Algebra and functions': [13, 13], 'C: Coordinate geometry in the ( x , y ) plane': [14, 14], 'D: Sequences and series': [15, 15], 'E: Trigonometry': [16, 16], 'F: Exponentials and logarithms': [17, 17], 'G: Differentiation': [18, 18], 'H: Integration': [19, 19], 'I: Numerical methods': [20, 19], 'J: Vectors': [20, 20], 'K: Statistical sampling': [21, 20], 'L: Data presentation and interpretation': [21, 21], 'M: Probability': [22, 22], 'N: Statistical distributions': [23, 22], 'O: Statistical hypothesis testing': [23, 23], 'P: Quantities and units in mechanics': [24, 23], 'Q: Kinematics': [24, 23], 'R: Forces and Newton’s laws': [24, 24], 'S: Moments': [25, 25]}
-2025-02-28 15:32:06,129 [INFO] __main__ - Loaded 1035984 bytes from local file '/home/user/app/input_output/AQA-7357-SP-2017.PDF'
-2025-02-28 15:32:06,284 [INFO] __main__ - Occurrences of subtopic 'A: Proof': [2, 6, 7, 11]
-2025-02-28 15:32:06,432 [INFO] __main__ - Occurrences of subtopic 'B: Algebra and functions': [2, 6, 7, 12]
-2025-02-28 15:32:06,583 [INFO] __main__ - Occurrences of subtopic 'C: Coordinate geometry in the ( x , y ) plane': [2, 6, 13]
-2025-02-28 15:32:06,732 [INFO] __main__ - Occurrences of subtopic 'D: Sequences and series': [2, 6, 7, 14]
-2025-02-28 15:32:06,882 [INFO] __main__ - Occurrences of subtopic 'E: Trigonometry': [2, 6, 7, 15]
-2025-02-28 15:32:07,034 [INFO] __main__ - Occurrences of subtopic 'F: Exponentials and logarithms': [2, 6, 7, 16]
-2025-02-28 15:32:07,184 [INFO] __main__ - Occurrences of subtopic 'G: Differentiation': [2, 6, 7, 17]
-2025-02-28 15:32:07,332 [INFO] __main__ - Occurrences of subtopic 'H: Integration': [2, 6, 7, 18]
-2025-02-28 15:32:07,333 [WARNING] __main__ - Skipping subtopic 'I: Numerical methods' => start > end [20, 19]
-2025-02-28 15:32:07,481 [INFO] __main__ - Occurrences of subtopic 'J: Vectors': [2, 6, 7, 19]
-2025-02-28 15:32:07,481 [WARNING] __main__ - Skipping subtopic 'K: Statistical sampling' => start > end [21, 20]
-2025-02-28 15:32:07,632 [INFO] __main__ - Occurrences of subtopic 'L: Data presentation and interpretation': [2, 6, 8, 20]
-2025-02-28 15:32:07,780 [INFO] __main__ - Occurrences of subtopic 'M: Probability': [2, 6, 8, 21]
-2025-02-28 15:32:07,781 [WARNING] __main__ - Skipping subtopic 'N: Statistical distributions' => start > end [23, 22]
-2025-02-28 15:32:07,971 [INFO] __main__ - Occurrences of subtopic 'O: Statistical hypothesis testing': [2, 6, 8, 22]
-2025-02-28 15:32:07,971 [WARNING] __main__ - Skipping subtopic 'P: Quantities and units in mechanics' => start > end [24, 23]
-2025-02-28 15:32:07,971 [WARNING] __main__ - Skipping subtopic 'Q: Kinematics' => start > end [24, 23]
-2025-02-28 15:32:08,132 [INFO] __main__ - Occurrences of subtopic 'R: Forces and Newton’s laws': [2, 6, 7, 23]
-2025-02-28 15:32:08,285 [INFO] __main__ - Occurrences of subtopic 'S: Moments': [2, 6, 7, 24]
-2025-02-28 15:32:08,285 [INFO] __main__ - Processing pages (0-based): [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
-2025-02-28 15:32:12,066 [INFO] __main__ - GPU memory cleaned up.
-2025-02-28 15:33:29,098 [INFO] __main__ - Processing PDF: /home/user/app/input_output/pearson-as-mathematics-specification.pdf -2025-02-28 15:33:29,851 [INFO] __main__ - Gemini returned subtopics: {'Paper 1: Pure Mathematics': [11, 19], 'Paper 2: Statistics and Mechanics': [20, 27]} -2025-02-28 15:33:29,851 [INFO] __main__ - Loaded 1039025 bytes from local file '/home/user/app/input_output/pearson-as-mathematics-specification.pdf' -2025-02-28 15:33:30,023 [INFO] __main__ - Occurrences of subtopic 'Paper 1: Pure Mathematics': [4, 8, 14, 30] -2025-02-28 15:33:30,184 [INFO] __main__ - Occurrences of subtopic 'Paper 2: Statistics and Mechanics': [4, 9, 13, 23, 30] -2025-02-28 15:33:30,184 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30] -2025-02-28 15:33:33,670 [INFO] __main__ - GPU memory cleaned up. -2025-02-28 15:34:00,961 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf -2025-02-28 15:34:01,772 [INFO] __main__ - Gemini returned subtopics: {'AS Unit 1': [11, 17], 'AS Unit 2': [18, 23], 'A2 Unit 3': [24, 30], 'A2 Unit 4': [31, 35]} -2025-02-28 15:34:01,773 [INFO] __main__ - Loaded 1510568 bytes from local file '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf' -2025-02-28 15:34:01,961 [INFO] __main__ - Occurrences of subtopic 'AS Unit 1': [3, 4, 38, 40] -2025-02-28 15:34:02,141 [INFO] __main__ - Occurrences of subtopic 'AS Unit 2': [3, 4, 38, 40] -2025-02-28 15:34:02,320 [INFO] __main__ - Occurrences of subtopic 'A2 Unit 3': [3, 5, 38, 40] -2025-02-28 15:34:02,497 [INFO] __main__ - Occurrences of subtopic 'A2 Unit 4': [3, 5, 38, 40] -2025-02-28 15:34:02,497 [INFO] __main__ - Processing pages (0-based): [38, 39, 40, 41, 42, 43, 44] -2025-02-28 15:34:03,341 [INFO] __main__ - GPU memory cleaned up. 
-2025-02-28 15:34:26,135 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf
-2025-02-28 15:34:27,000 [INFO] __main__ - Gemini returned subtopics: {'AS Unit 1': [11, 17], 'AS Unit 2': [18, 23], 'A2 Unit 3': [24, 30], 'A2 Unit 4': [31, 35]}
-2025-02-28 15:34:27,001 [INFO] __main__ - Loaded 1510568 bytes from local file '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf'
-2025-02-28 15:34:27,187 [INFO] __main__ - Occurrences of subtopic 'AS Unit 1': [3, 4, 38, 40]
-2025-02-28 15:34:27,374 [INFO] __main__ - Occurrences of subtopic 'AS Unit 2': [3, 4, 38, 40]
-2025-02-28 15:34:27,564 [INFO] __main__ - Occurrences of subtopic 'A2 Unit 3': [3, 5, 38, 40]
-2025-02-28 15:34:27,747 [INFO] __main__ - Occurrences of subtopic 'A2 Unit 4': [3, 5, 38, 40]
-2025-02-28 15:34:27,748 [INFO] __main__ - Processing pages (0-based): [38, 39, 40, 41, 42, 43, 44]
-2025-02-28 15:34:33,618 [INFO] __main__ - GPU memory cleaned up.
-2025-02-28 15:36:35,445 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf
-2025-02-28 15:36:36,512 [INFO] __main__ - Gemini returned subtopics: {'2.1 AS Unit 1': [11, 17], '2.2 AS Unit 2': [18, 23], '2.3 A2 Unit 3': [24, 30], '2.4 A2 Unit 4': [31, 35]}
-2025-02-28 15:36:36,513 [INFO] __main__ - Loaded 1510568 bytes from local file '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf'
-2025-02-28 15:36:36,820 [INFO] __main__ - Occurrences of subtopic '2.1 AS Unit 1': [3]
-2025-02-28 15:36:36,820 [WARNING] __main__ - No suitable occurrence for '2.1 AS Unit 1'. Using page 3.
-2025-02-28 15:36:37,023 [INFO] __main__ - Occurrences of subtopic '2.2 AS Unit 2': [3]
-2025-02-28 15:36:37,024 [WARNING] __main__ - No suitable occurrence for '2.2 AS Unit 2'. Using page 3.
-2025-02-28 15:36:37,207 [INFO] __main__ - Occurrences of subtopic '2.3 A2 Unit 3': [3]
-2025-02-28 15:36:37,207 [WARNING] __main__ - No suitable occurrence for '2.3 A2 Unit 3'. Using page 3.
-2025-02-28 15:36:37,389 [INFO] __main__ - Occurrences of subtopic '2.4 A2 Unit 4': [3]
-2025-02-28 15:36:37,389 [WARNING] __main__ - No suitable occurrence for '2.4 A2 Unit 4'. Using page 3.
-2025-02-28 15:36:37,390 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
-2025-02-28 15:36:38,518 [INFO] __main__ - GPU memory cleaned up.
-2025-03-02 14:46:32,526 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf
-2025-03-02 14:46:33,659 [INFO] __main__ - Gemini returned subtopics: {'Content of A Level in Economics (H460)': [5, 5], 'Content of Component 1: Microeconomics (H460/01)': [6, 16], 'Content of Component 2: Macroeconomics (H460/02)': [17, 27], 'Content of Component 3: Themes in economics (H460/03)': [28, 28]}
-2025-03-02 14:46:33,667 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf'
-2025-03-02 14:46:33,882 [INFO] __main__ - Occurrences of subtopic 'Content of A Level in Economics (H460)': [2, 10]
-2025-03-02 14:46:34,025 [INFO] __main__ - Occurrences of subtopic 'Content of Component 1: Microeconomics (H460/01)': [2, 11]
-2025-03-02 14:46:34,172 [INFO] __main__ - Occurrences of subtopic 'Content of Component 2: Macroeconomics (H460/02)': [2, 22]
-2025-03-02 14:46:34,318 [INFO] __main__ - Occurrences of subtopic 'Content of Component 3: Themes in economics (H460/03)': [2, 33]
-2025-03-02 14:46:34,319 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33]
-2025-03-02 14:49:00,397 [INFO] __main__ - GPU memory cleaned up.
-2025-03-02 14:49:46,666 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf -2025-03-02 14:49:47,756 [INFO] __main__ - Gemini returned subtopics: {'Content of A Level in Economics (H460)': [5, 5], 'Content of Component 1: Microeconomics (H460/01)': [6, 16], 'Content of Component 2: Macroeconomics (H460/02)': [17, 27], 'Content of Component 3: Themes in economics (H460/03)': [28, 28]} -2025-03-02 14:49:47,762 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf' -2025-03-02 14:49:47,909 [INFO] __main__ - Occurrences of subtopic 'Content of A Level in Economics (H460)': [2, 10] -2025-03-02 14:49:48,062 [INFO] __main__ - Occurrences of subtopic 'Content of Component 1: Microeconomics (H460/01)': [2, 11] -2025-03-02 14:49:48,290 [INFO] __main__ - Occurrences of subtopic 'Content of Component 2: Macroeconomics (H460/02)': [2, 22] -2025-03-02 14:49:48,475 [INFO] __main__ - Occurrences of subtopic 'Content of Component 3: Themes in economics (H460/03)': [2, 33] -2025-03-02 14:49:48,475 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33] -2025-03-02 14:50:20,455 [INFO] __main__ - doc_analyze complete. Extracting images. 
-2025-03-02 14:50:23,117 [INFO] __main__ - Uploaded to S3: topic_extraction/img_1.jpg -2025-03-02 14:50:23,774 [INFO] __main__ - Uploaded to S3: topic_extraction/img_2.jpg -2025-03-02 14:50:24,317 [INFO] __main__ - Uploaded to S3: topic_extraction/img_3.jpg -2025-03-02 14:50:24,869 [INFO] __main__ - Uploaded to S3: topic_extraction/img_4.jpg -2025-03-02 14:50:25,434 [INFO] __main__ - Uploaded to S3: topic_extraction/img_5.jpg -2025-03-02 14:50:26,016 [INFO] __main__ - Uploaded to S3: topic_extraction/img_6.jpg -2025-03-02 14:50:26,595 [INFO] __main__ - Uploaded to S3: topic_extraction/img_7.jpg -2025-03-02 14:50:27,181 [INFO] __main__ - Uploaded to S3: topic_extraction/img_8.jpg -2025-03-02 14:50:27,789 [INFO] __main__ - Uploaded to S3: topic_extraction/img_9.jpg -2025-03-02 14:50:28,428 [INFO] __main__ - Uploaded to S3: topic_extraction/img_10.jpg -2025-03-02 14:50:28,805 [INFO] __main__ - Uploaded to S3: topic_extraction/img_11.jpg -2025-03-02 14:50:29,410 [INFO] __main__ - Uploaded to S3: topic_extraction/img_12.jpg -2025-03-02 14:50:30,093 [INFO] __main__ - Uploaded to S3: topic_extraction/img_13.jpg -2025-03-02 14:50:30,713 [INFO] __main__ - Uploaded to S3: topic_extraction/img_14.jpg -2025-03-02 14:50:31,163 [INFO] __main__ - Uploaded to S3: topic_extraction/img_15.jpg -2025-03-02 14:50:31,781 [INFO] __main__ - Uploaded to S3: topic_extraction/img_16.jpg -2025-03-02 14:50:32,235 [INFO] __main__ - Uploaded to S3: topic_extraction/img_17.jpg -2025-03-02 14:50:32,803 [INFO] __main__ - Uploaded to S3: topic_extraction/img_18.jpg -2025-03-02 14:50:33,387 [INFO] __main__ - Uploaded to S3: topic_extraction/img_19.jpg -2025-03-02 14:50:34,032 [INFO] __main__ - Uploaded to S3: topic_extraction/img_20.jpg -2025-03-02 14:50:34,618 [INFO] __main__ - Uploaded to S3: topic_extraction/img_21.jpg -2025-03-02 14:50:35,143 [INFO] __main__ - Uploaded to S3: topic_extraction/img_22.jpg -2025-03-02 14:50:35,616 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-02 14:51:05,394 [INFO] __main__ - Processing table image: topic_extraction/img_1.jpg, columns=three -2025-03-02 14:51:09,026 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row0_col0.jpg -2025-03-02 14:51:09,310 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row0_col1.jpg -2025-03-02 14:51:09,522 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row1_col0.jpg -2025-03-02 14:51:09,737 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row2_col0.jpg -2025-03-02 14:51:09,738 [INFO] __main__ - Processing table image: topic_extraction/img_2.jpg, columns=three -2025-03-02 14:51:13,984 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row0_col0.jpg -2025-03-02 14:51:14,273 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row0_col1.jpg -2025-03-02 14:51:14,486 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row1_col0.jpg -2025-03-02 14:51:14,695 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row2_col0.jpg -2025-03-02 14:51:14,888 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row2_col1.jpg -2025-03-02 14:51:15,097 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row3_col0.jpg -2025-03-02 14:51:15,097 [INFO] __main__ - Processing table image: topic_extraction/img_3.jpg, columns=three -2025-03-02 14:51:18,329 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_3_row0_col0.jpg -2025-03-02 14:51:18,633 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_3_row0_col1.jpg -2025-03-02 14:51:18,633 [INFO] __main__ - Processing table image: topic_extraction/img_4.jpg, columns=three -2025-03-02 14:51:22,434 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_4_row0_col0.jpg -2025-03-02 14:51:22,707 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_4_row0_col1.jpg -2025-03-02 14:51:22,708 [INFO] __main__ - Processing table image: topic_extraction/img_5.jpg, columns=three -2025-03-02 14:51:26,211 
[INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row0_col0.jpg -2025-03-02 14:51:26,504 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row0_col1.jpg -2025-03-02 14:51:26,710 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row1_col0.jpg -2025-03-02 14:51:26,710 [INFO] __main__ - Processing table image: topic_extraction/img_6.jpg, columns=three -2025-03-02 14:51:30,866 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_6_row0_col0.jpg -2025-03-02 14:51:31,155 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_6_row0_col1.jpg -2025-03-02 14:51:31,156 [INFO] __main__ - Processing table image: topic_extraction/img_7.jpg, columns=three -2025-03-02 14:51:34,974 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_7_row0_col0.jpg -2025-03-02 14:51:35,285 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_7_row0_col1.jpg -2025-03-02 14:51:35,286 [INFO] __main__ - Processing table image: topic_extraction/img_8.jpg, columns=three -2025-03-02 14:51:39,778 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_8_row0_col0.jpg -2025-03-02 14:51:40,072 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_8_row0_col1.jpg -2025-03-02 14:51:40,072 [INFO] __main__ - Processing table image: topic_extraction/img_9.jpg, columns=three -2025-03-02 14:51:44,011 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row0_col0.jpg -2025-03-02 14:51:44,302 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row0_col1.jpg -2025-03-02 14:51:44,531 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row1_col0.jpg -2025-03-02 14:51:44,775 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row2_col0.jpg -2025-03-02 14:51:44,961 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row2_col1.jpg -2025-03-02 14:51:45,180 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row3_col0.jpg -2025-03-02 14:51:45,181 [INFO] __main__ - Processing 
table image: topic_extraction/img_10.jpg, columns=three -2025-03-02 14:51:49,515 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col0.jpg -2025-03-02 14:51:49,817 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col1.jpg -2025-03-02 14:51:50,041 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col2.jpg -2025-03-02 14:51:50,258 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row1_col0.jpg -2025-03-02 14:51:50,504 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row1_col1.jpg -2025-03-02 14:51:50,693 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row2_col0.jpg -2025-03-02 14:51:50,910 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row2_col1.jpg -2025-03-02 14:51:51,124 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row3_col0.jpg -2025-03-02 14:51:51,347 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row3_col1.jpg -2025-03-02 14:51:51,347 [INFO] __main__ - Processing table image: topic_extraction/img_11.jpg, columns=three -2025-03-02 14:51:53,781 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row0_col0.jpg -2025-03-02 14:51:54,064 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row0_col1.jpg -2025-03-02 14:51:54,272 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row1_col0.jpg -2025-03-02 14:51:54,272 [INFO] __main__ - Processing table image: topic_extraction/img_12.jpg, columns=three -2025-03-02 14:51:57,509 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_12_row0_col0.jpg -2025-03-02 14:51:57,808 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_12_row0_col1.jpg -2025-03-02 14:51:57,808 [INFO] __main__ - Processing table image: topic_extraction/img_13.jpg, columns=three -2025-03-02 14:52:02,191 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_13_row0_col0.jpg -2025-03-02 14:52:02,483 [INFO] __main__ - Uploaded to S3: 
topic_extraction/cells/img_13_row0_col1.jpg -2025-03-02 14:52:02,688 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_13_row1_col0.jpg -2025-03-02 14:52:02,689 [INFO] __main__ - Processing table image: topic_extraction/img_14.jpg, columns=three -2025-03-02 14:52:06,735 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row0_col0.jpg -2025-03-02 14:52:07,026 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row0_col1.jpg -2025-03-02 14:52:07,272 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row1_col0.jpg -2025-03-02 14:52:07,462 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row2_col0.jpg -2025-03-02 14:52:07,680 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row2_col1.jpg -2025-03-02 14:52:07,680 [INFO] __main__ - Processing table image: topic_extraction/img_15.jpg, columns=three -2025-03-02 14:52:11,136 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row0_col0.jpg -2025-03-02 14:52:11,398 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row0_col1.jpg -2025-03-02 14:52:11,588 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row1_col0.jpg -2025-03-02 14:52:11,803 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row2_col0.jpg -2025-03-02 14:52:12,023 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row3_col0.jpg -2025-03-02 14:52:12,023 [INFO] __main__ - Processing table image: topic_extraction/img_16.jpg, columns=three -2025-03-02 14:52:16,429 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_16_row0_col0.jpg -2025-03-02 14:52:16,715 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_16_row0_col1.jpg -2025-03-02 14:52:16,715 [INFO] __main__ - Processing table image: topic_extraction/img_17.jpg, columns=three -2025-03-02 14:52:19,989 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_17_row0_col0.jpg -2025-03-02 14:52:20,300 [INFO] __main__ - Uploaded to S3: 
topic_extraction/cells/img_17_row0_col1.jpg -2025-03-02 14:52:20,512 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_17_row1_col0.jpg -2025-03-02 14:52:20,513 [INFO] __main__ - Processing table image: topic_extraction/img_18.jpg, columns=three -2025-03-02 14:52:24,235 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row0_col0.jpg -2025-03-02 14:52:24,536 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row0_col1.jpg -2025-03-02 14:52:24,744 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row1_col0.jpg -2025-03-02 14:52:24,744 [INFO] __main__ - Processing table image: topic_extraction/img_19.jpg, columns=three -2025-03-02 14:52:28,508 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row0_col0.jpg -2025-03-02 14:52:28,795 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row0_col1.jpg -2025-03-02 14:52:29,017 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row1_col0.jpg -2025-03-02 14:52:29,018 [INFO] __main__ - Processing table image: topic_extraction/img_20.jpg, columns=three -2025-03-02 14:52:33,326 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_20_row0_col0.jpg -2025-03-02 14:52:33,636 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_20_row0_col1.jpg -2025-03-02 14:52:33,636 [INFO] __main__ - Processing table image: topic_extraction/img_21.jpg, columns=three -2025-03-02 14:52:37,439 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_21_row0_col0.jpg -2025-03-02 14:52:37,439 [INFO] __main__ - Processing table image: topic_extraction/img_22.jpg, columns=three -2025-03-02 14:52:40,149 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row0_col0.jpg -2025-03-02 14:52:40,422 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row0_col1.jpg -2025-03-02 14:52:40,633 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row1_col0.jpg -2025-03-02 14:52:40,815 [INFO] __main__ - Uploaded to S3: 
topic_extraction/final_output.md -2025-03-02 14:52:40,815 [INFO] __main__ - Final markdown uploaded to S3 at topic_extraction/final_output.md -2025-03-02 14:52:41,142 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 14:52:41,149 [INFO] __main__ - Processing completed successfully. -2025-03-02 14:58:58,095 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf -2025-03-02 14:58:59,186 [INFO] __main__ - Gemini returned subtopics: {'Content of A Level in Economics (H460)': [5, 5], 'Content of Component 1: Microeconomics (H460/01)': [6, 16], 'Content of Component 2: Macroeconomics (H460/02)': [17, 27], 'Content of Component 3: Themes in economics (H460/03)': [28, 28]} -2025-03-02 14:58:59,197 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf' -2025-03-02 14:58:59,457 [INFO] __main__ - Occurrences of subtopic 'Content of A Level in Economics (H460)': [2, 10] -2025-03-02 14:58:59,709 [INFO] __main__ - Occurrences of subtopic 'Content of Component 1: Microeconomics (H460/01)': [2, 11] -2025-03-02 14:58:59,957 [INFO] __main__ - Occurrences of subtopic 'Content of Component 2: Macroeconomics (H460/02)': [2, 22] -2025-03-02 14:59:00,181 [INFO] __main__ - Occurrences of subtopic 'Content of Component 3: Themes in economics (H460/03)': [2, 33] -2025-03-02 14:59:00,181 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33] -2025-03-02 14:59:33,938 [INFO] __main__ - doc_analyze complete. Extracting images. 
-2025-03-02 14:59:36,525 [INFO] __main__ - Uploaded to S3: topic_extraction/img_1.jpg -2025-03-02 14:59:37,279 [INFO] __main__ - Uploaded to S3: topic_extraction/img_2.jpg -2025-03-02 14:59:37,821 [INFO] __main__ - Uploaded to S3: topic_extraction/img_3.jpg -2025-03-02 14:59:38,429 [INFO] __main__ - Uploaded to S3: topic_extraction/img_4.jpg -2025-03-02 14:59:38,965 [INFO] __main__ - Uploaded to S3: topic_extraction/img_5.jpg -2025-03-02 14:59:39,595 [INFO] __main__ - Uploaded to S3: topic_extraction/img_6.jpg -2025-03-02 14:59:40,173 [INFO] __main__ - Uploaded to S3: topic_extraction/img_7.jpg -2025-03-02 14:59:40,919 [INFO] __main__ - Uploaded to S3: topic_extraction/img_8.jpg -2025-03-02 14:59:41,487 [INFO] __main__ - Uploaded to S3: topic_extraction/img_9.jpg -2025-03-02 14:59:42,093 [INFO] __main__ - Uploaded to S3: topic_extraction/img_10.jpg -2025-03-02 14:59:42,494 [INFO] __main__ - Uploaded to S3: topic_extraction/img_11.jpg -2025-03-02 14:59:43,010 [INFO] __main__ - Uploaded to S3: topic_extraction/img_12.jpg -2025-03-02 14:59:43,608 [INFO] __main__ - Uploaded to S3: topic_extraction/img_13.jpg -2025-03-02 14:59:44,159 [INFO] __main__ - Uploaded to S3: topic_extraction/img_14.jpg -2025-03-02 14:59:44,653 [INFO] __main__ - Uploaded to S3: topic_extraction/img_15.jpg -2025-03-02 14:59:45,249 [INFO] __main__ - Uploaded to S3: topic_extraction/img_16.jpg -2025-03-02 14:59:45,706 [INFO] __main__ - Uploaded to S3: topic_extraction/img_17.jpg -2025-03-02 14:59:46,272 [INFO] __main__ - Uploaded to S3: topic_extraction/img_18.jpg -2025-03-02 14:59:46,931 [INFO] __main__ - Uploaded to S3: topic_extraction/img_19.jpg -2025-03-02 14:59:47,541 [INFO] __main__ - Uploaded to S3: topic_extraction/img_20.jpg -2025-03-02 14:59:48,107 [INFO] __main__ - Uploaded to S3: topic_extraction/img_21.jpg -2025-03-02 14:59:48,523 [INFO] __main__ - Uploaded to S3: topic_extraction/img_22.jpg -2025-03-02 14:59:49,068 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-02 15:00:11,610 [INFO] __main__ - Processing table image: topic_extraction/img_1.jpg, columns=three -2025-03-02 15:00:15,216 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row0_col0.jpg -2025-03-02 15:00:15,512 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row0_col1.jpg -2025-03-02 15:00:15,709 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row1_col0.jpg -2025-03-02 15:00:15,946 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row2_col0.jpg -2025-03-02 15:00:15,946 [INFO] __main__ - Processing table image: topic_extraction/img_2.jpg, columns=three -2025-03-02 15:00:20,225 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row0_col0.jpg -2025-03-02 15:00:20,509 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row0_col1.jpg -2025-03-02 15:00:20,720 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row1_col0.jpg -2025-03-02 15:00:20,948 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row2_col0.jpg -2025-03-02 15:00:21,181 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row2_col1.jpg -2025-03-02 15:00:21,429 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row3_col0.jpg -2025-03-02 15:00:21,429 [INFO] __main__ - Processing table image: topic_extraction/img_3.jpg, columns=three -2025-03-02 15:00:24,720 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_3_row0_col0.jpg -2025-03-02 15:00:25,026 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_3_row0_col1.jpg -2025-03-02 15:00:25,026 [INFO] __main__ - Processing table image: topic_extraction/img_4.jpg, columns=three -2025-03-02 15:00:28,916 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_4_row0_col0.jpg -2025-03-02 15:00:29,206 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_4_row0_col1.jpg -2025-03-02 15:00:29,206 [INFO] __main__ - Processing table image: topic_extraction/img_5.jpg, columns=three -2025-03-02 15:00:32,671 
[INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row0_col0.jpg -2025-03-02 15:00:32,950 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row0_col1.jpg -2025-03-02 15:00:33,187 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row1_col0.jpg -2025-03-02 15:00:33,188 [INFO] __main__ - Processing table image: topic_extraction/img_6.jpg, columns=three -2025-03-02 15:00:37,317 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_6_row0_col0.jpg -2025-03-02 15:00:37,615 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_6_row0_col1.jpg -2025-03-02 15:00:37,616 [INFO] __main__ - Processing table image: topic_extraction/img_7.jpg, columns=three -2025-03-02 15:00:41,547 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_7_row0_col0.jpg -2025-03-02 15:00:41,827 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_7_row0_col1.jpg -2025-03-02 15:00:41,827 [INFO] __main__ - Processing table image: topic_extraction/img_8.jpg, columns=three -2025-03-02 15:00:45,889 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_8_row0_col0.jpg -2025-03-02 15:00:46,174 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_8_row0_col1.jpg -2025-03-02 15:00:46,175 [INFO] __main__ - Processing table image: topic_extraction/img_9.jpg, columns=three -2025-03-02 15:00:50,348 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row0_col0.jpg -2025-03-02 15:00:50,645 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row0_col1.jpg -2025-03-02 15:00:50,867 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row1_col0.jpg -2025-03-02 15:00:51,082 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row2_col0.jpg -2025-03-02 15:00:51,323 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row2_col1.jpg -2025-03-02 15:00:51,538 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row3_col0.jpg -2025-03-02 15:00:51,538 [INFO] __main__ - Processing 
table image: topic_extraction/img_10.jpg, columns=three -2025-03-02 15:00:55,761 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col0.jpg -2025-03-02 15:00:56,051 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col1.jpg -2025-03-02 15:00:56,295 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col2.jpg -2025-03-02 15:00:56,505 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row1_col0.jpg -2025-03-02 15:00:56,718 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row1_col1.jpg -2025-03-02 15:00:56,925 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row2_col0.jpg -2025-03-02 15:00:57,127 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row2_col1.jpg -2025-03-02 15:00:57,358 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row3_col0.jpg -2025-03-02 15:00:57,583 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row3_col1.jpg -2025-03-02 15:00:57,583 [INFO] __main__ - Processing table image: topic_extraction/img_11.jpg, columns=three -2025-03-02 15:01:00,162 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row0_col0.jpg -2025-03-02 15:01:00,435 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row0_col1.jpg -2025-03-02 15:01:00,648 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row1_col0.jpg -2025-03-02 15:01:00,648 [INFO] __main__ - Processing table image: topic_extraction/img_12.jpg, columns=three -2025-03-02 15:01:03,879 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_12_row0_col0.jpg -2025-03-02 15:01:04,133 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_12_row0_col1.jpg -2025-03-02 15:01:04,133 [INFO] __main__ - Processing table image: topic_extraction/img_13.jpg, columns=three -2025-03-02 15:01:08,128 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_13_row0_col0.jpg -2025-03-02 15:01:08,421 [INFO] __main__ - Uploaded to S3: 
topic_extraction/cells/img_13_row0_col1.jpg -2025-03-02 15:01:08,637 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_13_row1_col0.jpg -2025-03-02 15:01:08,637 [INFO] __main__ - Processing table image: topic_extraction/img_14.jpg, columns=three -2025-03-02 15:01:12,721 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row0_col0.jpg -2025-03-02 15:01:13,014 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row0_col1.jpg -2025-03-02 15:01:13,229 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row1_col0.jpg -2025-03-02 15:01:13,484 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row2_col0.jpg -2025-03-02 15:01:13,725 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row2_col1.jpg -2025-03-02 15:01:13,726 [INFO] __main__ - Processing table image: topic_extraction/img_15.jpg, columns=three -2025-03-02 15:01:16,869 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row0_col0.jpg -2025-03-02 15:01:17,179 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row0_col1.jpg -2025-03-02 15:01:17,395 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row1_col0.jpg -2025-03-02 15:01:17,610 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row2_col0.jpg -2025-03-02 15:01:17,827 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row3_col0.jpg -2025-03-02 15:01:17,827 [INFO] __main__ - Processing table image: topic_extraction/img_16.jpg, columns=three -2025-03-02 15:01:22,073 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_16_row0_col0.jpg -2025-03-02 15:01:22,375 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_16_row0_col1.jpg -2025-03-02 15:01:22,376 [INFO] __main__ - Processing table image: topic_extraction/img_17.jpg, columns=three -2025-03-02 15:01:25,546 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_17_row0_col0.jpg -2025-03-02 15:01:25,841 [INFO] __main__ - Uploaded to S3: 
topic_extraction/cells/img_17_row0_col1.jpg -2025-03-02 15:01:26,080 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_17_row1_col0.jpg -2025-03-02 15:01:26,081 [INFO] __main__ - Processing table image: topic_extraction/img_18.jpg, columns=three -2025-03-02 15:01:30,055 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row0_col0.jpg -2025-03-02 15:01:30,330 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row0_col1.jpg -2025-03-02 15:01:30,557 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row1_col0.jpg -2025-03-02 15:01:30,557 [INFO] __main__ - Processing table image: topic_extraction/img_19.jpg, columns=three -2025-03-02 15:01:34,337 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row0_col0.jpg -2025-03-02 15:01:34,610 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row0_col1.jpg -2025-03-02 15:01:34,814 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row1_col0.jpg -2025-03-02 15:01:34,814 [INFO] __main__ - Processing table image: topic_extraction/img_20.jpg, columns=three -2025-03-02 15:01:39,093 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_20_row0_col0.jpg -2025-03-02 15:01:39,389 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_20_row0_col1.jpg -2025-03-02 15:01:39,389 [INFO] __main__ - Processing table image: topic_extraction/img_21.jpg, columns=three -2025-03-02 15:01:43,178 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_21_row0_col0.jpg -2025-03-02 15:01:43,179 [INFO] __main__ - Processing table image: topic_extraction/img_22.jpg, columns=three -2025-03-02 15:01:45,849 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row0_col0.jpg -2025-03-02 15:01:46,146 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row0_col1.jpg -2025-03-02 15:01:46,351 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row1_col0.jpg -2025-03-02 15:01:46,529 [INFO] __main__ - Uploaded to S3: 
topic_extraction/final_output.md -2025-03-02 15:01:46,529 [INFO] __main__ - Final markdown uploaded to S3 at topic_extraction/final_output.md -2025-03-02 15:01:46,530 [INFO] __main__ - Final markdown saved locally at /home/user/app/outputs/final_output_local.md -2025-03-02 15:01:46,842 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:01:46,849 [INFO] __main__ - Processing completed successfully. -2025-03-02 15:06:41,893 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf -2025-03-02 15:06:42,974 [INFO] __main__ - Gemini returned subtopics: {'Content of A Level in Economics (H460)': [5, 5], 'Content of Component 1: Microeconomics (H460/01)': [6, 16], 'Content of Component 2: Macroeconomics (H460/02)': [17, 27], 'Content of Component 3: Themes in economics (H460/03)': [28, 28]} -2025-03-02 15:06:42,980 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf' -2025-03-02 15:06:43,152 [INFO] __main__ - Occurrences of subtopic 'Content of A Level in Economics (H460)': [2, 10] -2025-03-02 15:06:43,298 [INFO] __main__ - Occurrences of subtopic 'Content of Component 1: Microeconomics (H460/01)': [2, 11] -2025-03-02 15:06:43,444 [INFO] __main__ - Occurrences of subtopic 'Content of Component 2: Macroeconomics (H460/02)': [2, 22] -2025-03-02 15:06:43,590 [INFO] __main__ - Occurrences of subtopic 'Content of Component 3: Themes in economics (H460/03)': [2, 33] -2025-03-02 15:06:43,591 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33] -2025-03-02 15:07:14,912 [INFO] __main__ - doc_analyze complete. Extracting images. 
-2025-03-02 15:07:17,601 [INFO] __main__ - Uploaded to S3: topic_extraction/img_1.jpg -2025-03-02 15:07:18,324 [INFO] __main__ - Uploaded to S3: topic_extraction/img_2.jpg -2025-03-02 15:07:18,895 [INFO] __main__ - Uploaded to S3: topic_extraction/img_3.jpg -2025-03-02 15:07:19,484 [INFO] __main__ - Uploaded to S3: topic_extraction/img_4.jpg -2025-03-02 15:07:20,030 [INFO] __main__ - Uploaded to S3: topic_extraction/img_5.jpg -2025-03-02 15:07:20,631 [INFO] __main__ - Uploaded to S3: topic_extraction/img_6.jpg -2025-03-02 15:07:21,213 [INFO] __main__ - Uploaded to S3: topic_extraction/img_7.jpg -2025-03-02 15:07:21,821 [INFO] __main__ - Uploaded to S3: topic_extraction/img_8.jpg -2025-03-02 15:07:22,408 [INFO] __main__ - Uploaded to S3: topic_extraction/img_9.jpg -2025-03-02 15:07:23,030 [INFO] __main__ - Uploaded to S3: topic_extraction/img_10.jpg -2025-03-02 15:07:23,487 [INFO] __main__ - Uploaded to S3: topic_extraction/img_11.jpg -2025-03-02 15:07:24,101 [INFO] __main__ - Uploaded to S3: topic_extraction/img_12.jpg -2025-03-02 15:07:24,728 [INFO] __main__ - Uploaded to S3: topic_extraction/img_13.jpg -2025-03-02 15:07:25,314 [INFO] __main__ - Uploaded to S3: topic_extraction/img_14.jpg -2025-03-02 15:07:25,789 [INFO] __main__ - Uploaded to S3: topic_extraction/img_15.jpg -2025-03-02 15:07:26,385 [INFO] __main__ - Uploaded to S3: topic_extraction/img_16.jpg -2025-03-02 15:07:26,857 [INFO] __main__ - Uploaded to S3: topic_extraction/img_17.jpg -2025-03-02 15:07:27,419 [INFO] __main__ - Uploaded to S3: topic_extraction/img_18.jpg -2025-03-02 15:07:28,000 [INFO] __main__ - Uploaded to S3: topic_extraction/img_19.jpg -2025-03-02 15:07:28,589 [INFO] __main__ - Uploaded to S3: topic_extraction/img_20.jpg -2025-03-02 15:07:29,168 [INFO] __main__ - Uploaded to S3: topic_extraction/img_21.jpg -2025-03-02 15:07:29,673 [INFO] __main__ - Uploaded to S3: topic_extraction/img_22.jpg -2025-03-02 15:07:30,148 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-02 15:07:52,573 [INFO] __main__ - Processing table image: topic_extraction/img_1.jpg, columns=three -2025-03-02 15:07:55,799 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row0_col0.jpg -2025-03-02 15:07:56,077 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row0_col1.jpg -2025-03-02 15:07:56,284 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row1_col0.jpg -2025-03-02 15:07:56,470 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_1_row2_col0.jpg -2025-03-02 15:07:56,470 [INFO] __main__ - Processing table image: topic_extraction/img_2.jpg, columns=three -2025-03-02 15:08:00,547 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row0_col0.jpg -2025-03-02 15:08:00,838 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row0_col1.jpg -2025-03-02 15:08:01,053 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row1_col0.jpg -2025-03-02 15:08:01,255 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row2_col0.jpg -2025-03-02 15:08:01,480 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row2_col1.jpg -2025-03-02 15:08:01,704 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_2_row3_col0.jpg -2025-03-02 15:08:01,704 [INFO] __main__ - Processing table image: topic_extraction/img_3.jpg, columns=three -2025-03-02 15:08:04,907 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_3_row0_col0.jpg -2025-03-02 15:08:05,215 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_3_row0_col1.jpg -2025-03-02 15:08:05,216 [INFO] __main__ - Processing table image: topic_extraction/img_4.jpg, columns=three -2025-03-02 15:08:08,972 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_4_row0_col0.jpg -2025-03-02 15:08:09,297 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_4_row0_col1.jpg -2025-03-02 15:08:09,297 [INFO] __main__ - Processing table image: topic_extraction/img_5.jpg, columns=three -2025-03-02 15:08:12,704 
[INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row0_col0.jpg -2025-03-02 15:08:12,991 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row0_col1.jpg -2025-03-02 15:08:13,203 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_5_row1_col0.jpg -2025-03-02 15:08:13,204 [INFO] __main__ - Processing table image: topic_extraction/img_6.jpg, columns=three -2025-03-02 15:08:17,226 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_6_row0_col0.jpg -2025-03-02 15:08:17,513 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_6_row0_col1.jpg -2025-03-02 15:08:17,514 [INFO] __main__ - Processing table image: topic_extraction/img_7.jpg, columns=three -2025-03-02 15:08:21,513 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_7_row0_col0.jpg -2025-03-02 15:08:21,822 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_7_row0_col1.jpg -2025-03-02 15:08:21,822 [INFO] __main__ - Processing table image: topic_extraction/img_8.jpg, columns=three -2025-03-02 15:08:26,051 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_8_row0_col0.jpg -2025-03-02 15:08:26,349 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_8_row0_col1.jpg -2025-03-02 15:08:26,350 [INFO] __main__ - Processing table image: topic_extraction/img_9.jpg, columns=three -2025-03-02 15:08:30,459 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row0_col0.jpg -2025-03-02 15:08:30,758 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row0_col1.jpg -2025-03-02 15:08:30,974 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row1_col0.jpg -2025-03-02 15:08:31,214 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row2_col0.jpg -2025-03-02 15:08:31,436 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row2_col1.jpg -2025-03-02 15:08:31,657 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_9_row3_col0.jpg -2025-03-02 15:08:31,658 [INFO] __main__ - Processing 
table image: topic_extraction/img_10.jpg, columns=three -2025-03-02 15:08:35,933 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col0.jpg -2025-03-02 15:08:36,209 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col1.jpg -2025-03-02 15:08:36,396 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row0_col2.jpg -2025-03-02 15:08:36,585 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row1_col0.jpg -2025-03-02 15:08:36,801 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row1_col1.jpg -2025-03-02 15:08:37,025 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row2_col0.jpg -2025-03-02 15:08:37,237 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row2_col1.jpg -2025-03-02 15:08:37,454 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row3_col0.jpg -2025-03-02 15:08:37,673 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_10_row3_col1.jpg -2025-03-02 15:08:37,673 [INFO] __main__ - Processing table image: topic_extraction/img_11.jpg, columns=three -2025-03-02 15:08:40,182 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row0_col0.jpg -2025-03-02 15:08:40,477 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row0_col1.jpg -2025-03-02 15:08:40,700 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_11_row1_col0.jpg -2025-03-02 15:08:40,701 [INFO] __main__ - Processing table image: topic_extraction/img_12.jpg, columns=three -2025-03-02 15:08:43,907 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_12_row0_col0.jpg -2025-03-02 15:08:44,178 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_12_row0_col1.jpg -2025-03-02 15:08:44,178 [INFO] __main__ - Processing table image: topic_extraction/img_13.jpg, columns=three -2025-03-02 15:08:48,524 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_13_row0_col0.jpg -2025-03-02 15:08:48,870 [INFO] __main__ - Uploaded to S3: 
topic_extraction/cells/img_13_row0_col1.jpg -2025-03-02 15:08:49,079 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_13_row1_col0.jpg -2025-03-02 15:08:49,080 [INFO] __main__ - Processing table image: topic_extraction/img_14.jpg, columns=three -2025-03-02 15:08:53,612 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row0_col0.jpg -2025-03-02 15:08:53,897 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row0_col1.jpg -2025-03-02 15:08:54,116 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row1_col0.jpg -2025-03-02 15:08:54,352 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row2_col0.jpg -2025-03-02 15:08:54,578 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_14_row2_col1.jpg -2025-03-02 15:08:54,579 [INFO] __main__ - Processing table image: topic_extraction/img_15.jpg, columns=three -2025-03-02 15:08:58,339 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row0_col0.jpg -2025-03-02 15:08:58,628 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row0_col1.jpg -2025-03-02 15:08:58,843 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row1_col0.jpg -2025-03-02 15:08:59,052 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row2_col0.jpg -2025-03-02 15:08:59,262 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_15_row3_col0.jpg -2025-03-02 15:08:59,262 [INFO] __main__ - Processing table image: topic_extraction/img_16.jpg, columns=three -2025-03-02 15:09:04,508 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_16_row0_col0.jpg -2025-03-02 15:09:04,817 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_16_row0_col1.jpg -2025-03-02 15:09:04,818 [INFO] __main__ - Processing table image: topic_extraction/img_17.jpg, columns=three -2025-03-02 15:09:08,615 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_17_row0_col0.jpg -2025-03-02 15:09:08,905 [INFO] __main__ - Uploaded to S3: 
topic_extraction/cells/img_17_row0_col1.jpg -2025-03-02 15:09:09,111 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_17_row1_col0.jpg -2025-03-02 15:09:09,111 [INFO] __main__ - Processing table image: topic_extraction/img_18.jpg, columns=three -2025-03-02 15:09:12,917 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row0_col0.jpg -2025-03-02 15:09:13,215 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row0_col1.jpg -2025-03-02 15:09:13,425 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_18_row1_col0.jpg -2025-03-02 15:09:13,426 [INFO] __main__ - Processing table image: topic_extraction/img_19.jpg, columns=three -2025-03-02 15:09:17,250 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row0_col0.jpg -2025-03-02 15:09:17,561 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row0_col1.jpg -2025-03-02 15:09:17,745 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_19_row1_col0.jpg -2025-03-02 15:09:17,745 [INFO] __main__ - Processing table image: topic_extraction/img_20.jpg, columns=three -2025-03-02 15:09:22,139 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_20_row0_col0.jpg -2025-03-02 15:09:22,430 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_20_row0_col1.jpg -2025-03-02 15:09:22,431 [INFO] __main__ - Processing table image: topic_extraction/img_21.jpg, columns=three -2025-03-02 15:09:26,166 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_21_row0_col0.jpg -2025-03-02 15:09:26,166 [INFO] __main__ - Processing table image: topic_extraction/img_22.jpg, columns=three -2025-03-02 15:09:29,061 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row0_col0.jpg -2025-03-02 15:09:29,365 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row0_col1.jpg -2025-03-02 15:09:29,588 [INFO] __main__ - Uploaded to S3: topic_extraction/cells/img_22_row1_col0.jpg -2025-03-02 15:09:29,768 [INFO] __main__ - Uploaded to S3: 
topic_extraction/final_output.md -2025-03-02 15:09:29,768 [INFO] __main__ - Final markdown uploaded to S3 at topic_extraction/final_output.md -2025-03-02 15:09:29,768 [INFO] __main__ - Final markdown saved locally at /home/user/app/outputs/final_output_local.md -2025-03-02 15:09:30,106 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:09:30,113 [INFO] __main__ - Processing completed successfully. -2025-03-02 15:30:49,001 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:30:49,001 [INFO] __main__ - Processing PDF: input_output/a-level-pearson-mathematics-specification.pdf -2025-03-02 15:30:49,764 [INFO] __main__ - Gemini returned subtopics: {'Paper 1: Pure Mathematics 1': [13, 20], 'Paper 2: Pure Mathematics 2': [21, 27], 'Paper 3: Statistics & Mechanics': [28, 34]} -2025-03-02 15:30:49,764 [INFO] __main__ - Loaded 1135473 bytes from local file 'input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-02 15:30:50,022 [INFO] __main__ - Occurrences of subtopic 'Paper 1: Pure Mathematics 1': [8, 44] -2025-03-02 15:30:50,207 [INFO] __main__ - Occurrences of subtopic 'Paper 2: Pure Mathematics 2': [8, 44] -2025-03-02 15:30:50,354 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics & Mechanics': [] -2025-03-02 15:30:50,354 [WARNING] __main__ - No suitable occurrence for 'Paper 3: Statistics & Mechanics'. Using page 0. -2025-03-02 15:30:50,354 [INFO] __main__ - Processing pages (0-based): [27, 28, 29, 30, 31, 32, 33, 44, 45, 46, 47, 48, 49, 50, 51] -2025-03-02 15:31:22,342 [INFO] __main__ - doc_analyze complete. Extracting images. -2025-03-02 15:31:25,822 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-02 15:31:35,415 [INFO] __main__ - Processing table image => img_1.jpg, columns=three -2025-03-02 15:31:39,569 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-02 15:31:42,660 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-02 15:31:45,999 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-02 15:31:49,327 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-02 15:31:50,413 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-02 15:31:52,854 [INFO] __main__ - Final JSON saved locally at /home/user/app/outputs/final_output.json -2025-03-02 15:31:52,854 [INFO] __main__ - Final JSON saved locally at /home/user/app/outputs/final_output_local.json -2025-03-02 15:31:53,147 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:31:53,150 [INFO] __main__ - Processing completed successfully. -2025-03-02 15:33:39,987 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:33:39,987 [INFO] __main__ - Processing PDF: input_output/a-level-pearson-mathematics-specification.pdf -2025-03-02 15:33:40,750 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-02 15:33:40,750 [INFO] __main__ - Loaded 1135473 bytes from local file 'input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-02 15:33:40,900 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [2, 3, 4, 14] -2025-03-02 15:33:41,049 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics and Mechanics': [3, 4, 9, 13, 33, 44] -2025-03-02 15:33:41,049 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-02 15:33:45,815 [INFO] __main__ - GPU memory cleaned up. 
-2025-03-02 15:34:54,783 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:34:54,783 [INFO] __main__ - Processing PDF: input_output/a-level-pearson-mathematics-specification.pdf -2025-03-02 15:34:55,467 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-02 15:34:55,468 [INFO] __main__ - Loaded 1135473 bytes from local file 'input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-02 15:34:55,725 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [2, 3, 4, 14] -2025-03-02 15:34:55,880 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics and Mechanics': [3, 4, 9, 13, 33, 44] -2025-03-02 15:34:55,880 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-02 15:35:50,447 [INFO] __main__ - doc_analyze complete. Extracting images. -2025-03-02 15:35:59,077 [INFO] __main__ - Classifying images to detect tables. -2025-03-02 15:36:05,947 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:36:34,852 [INFO] __main__ - Running in test mode: using local image writer. 
-2025-03-02 15:36:34,852 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf -2025-03-02 15:36:35,931 [INFO] __main__ - Gemini returned subtopics: {'Content of A Level in Economics (H460)': [5, 5], 'Content of Component 1: Microeconomics (H460/01)': [6, 16], 'Content of Component 2: Macroeconomics (H460/02)': [17, 27], 'Content of Component 3: Themes in economics (H460/03)': [28, 28]} -2025-03-02 15:36:35,936 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf' -2025-03-02 15:36:36,086 [INFO] __main__ - Occurrences of subtopic 'Content of A Level in Economics (H460)': [2, 10] -2025-03-02 15:36:36,245 [INFO] __main__ - Occurrences of subtopic 'Content of Component 1: Microeconomics (H460/01)': [2, 11] -2025-03-02 15:36:36,393 [INFO] __main__ - Occurrences of subtopic 'Content of Component 2: Macroeconomics (H460/02)': [2, 22] -2025-03-02 15:36:36,543 [INFO] __main__ - Occurrences of subtopic 'Content of Component 3: Themes in economics (H460/03)': [2, 33] -2025-03-02 15:36:36,543 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33] -2025-03-02 15:36:49,670 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:38:16,408 [INFO] __main__ - Running in test mode: using local image writer. 
-2025-03-02 15:38:16,409 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-02 15:38:17,055 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-02 15:38:17,056 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-02 15:38:17,212 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [2, 3, 4, 14] -2025-03-02 15:38:17,360 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics and Mechanics': [3, 4, 9, 13, 33, 44] -2025-03-02 15:38:17,360 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39] -2025-03-02 15:38:55,690 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:39:13,884 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:39:13,884 [INFO] __main__ - Processing PDF: input_output/a-level-pearson-mathematics-specification.pdf -2025-03-02 15:39:14,572 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-02 15:39:14,572 [INFO] __main__ - Loaded 1135473 bytes from local file 'input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-02 15:39:14,723 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [2, 3, 4, 14] -2025-03-02 15:39:14,870 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics and Mechanics': [3, 4, 9, 13, 33, 44] -2025-03-02 15:39:14,870 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-02 15:39:23,660 [INFO] __main__ - GPU memory cleaned up. 
-2025-03-02 15:40:12,360 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:40:12,361 [INFO] __main__ - Processing PDF: input_output/a-level-pearson-mathematics-specification.pdf -2025-03-02 15:40:13,030 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-02 15:40:13,030 [INFO] __main__ - Loaded 1135473 bytes from local file 'input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-02 15:40:13,180 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [2, 3, 4, 14] -2025-03-02 15:40:13,378 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics and Mechanics': [3, 4, 9, 13, 33, 44] -2025-03-02 15:40:13,379 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-02 15:41:08,554 [INFO] __main__ - doc_analyze complete. Extracting images. -2025-03-02 15:41:17,501 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-02 15:41:52,527 [INFO] __main__ - Processing table image => img_1.jpg, columns=three -2025-03-02 15:41:55,145 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-02 15:41:57,942 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-02 15:42:00,993 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-02 15:42:04,150 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-02 15:42:07,532 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-02 15:42:10,795 [INFO] __main__ - Processing table image => img_7.jpg, columns=three -2025-03-02 15:42:13,915 [INFO] __main__ - Processing table image => img_8.jpg, columns=three -2025-03-02 15:42:17,135 [INFO] __main__ - Processing table image => img_9.jpg, columns=three -2025-03-02 15:42:20,493 [INFO] __main__ - Processing table image => img_10.jpg, columns=three -2025-03-02 15:42:23,444 [INFO] __main__ - Processing table image => img_11.jpg, columns=two -2025-03-02 15:42:26,406 [INFO] __main__ - Processing table image => img_12.jpg, columns=three -2025-03-02 15:42:29,391 [INFO] __main__ - Processing table image => img_13.jpg, columns=three -2025-03-02 15:42:32,164 [INFO] __main__ - Processing table image => img_14.jpg, columns=three -2025-03-02 15:42:35,925 [INFO] __main__ - Processing table image => img_15.jpg, columns=three -2025-03-02 15:42:39,199 [INFO] __main__ - Processing table image => img_16.jpg, columns=three -2025-03-02 15:42:42,422 [INFO] __main__ - Processing table image => img_17.jpg, columns=three -2025-03-02 15:42:45,629 [INFO] __main__ - Processing table image => img_18.jpg, columns=three -2025-03-02 15:42:46,544 [INFO] __main__ - Processing table image => img_19.jpg, columns=three -2025-03-02 15:42:48,900 [INFO] __main__ - Processing table image => img_20.jpg, columns=three -2025-03-02 15:42:52,615 [INFO] __main__ - Processing table image => img_21.jpg, columns=three -2025-03-02 
15:42:55,988 [INFO] __main__ - Processing table image => img_22.jpg, columns=three -2025-03-02 15:42:59,472 [INFO] __main__ - Processing table image => img_23.jpg, columns=three -2025-03-02 15:43:02,264 [INFO] __main__ - Processing table image => img_24.jpg, columns=three -2025-03-02 15:43:05,548 [INFO] __main__ - Processing table image => img_25.jpg, columns=two -2025-03-02 15:43:08,426 [INFO] __main__ - Processing table image => img_26.jpg, columns=three -2025-03-02 15:43:11,359 [INFO] __main__ - Processing table image => img_27.jpg, columns=three -2025-03-02 15:43:13,895 [INFO] __main__ - Processing table image => img_28.jpg, columns=two -2025-03-02 15:43:16,763 [INFO] __main__ - Final JSON saved locally at /home/user/app/outputs/final_output.json -2025-03-02 15:43:16,763 [INFO] __main__ - Final JSON saved locally at /home/user/app/outputs/final_output_local.json -2025-03-02 15:43:17,057 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:43:17,064 [INFO] __main__ - Processing completed successfully. -2025-03-02 15:46:04,987 [INFO] __main__ - Running in test mode: using local image writer. 
-2025-03-02 15:46:04,988 [INFO] __main__ - Processing PDF: /home/user/app/input_output/ocr-specification-economics.pdf -2025-03-02 15:46:06,095 [INFO] __main__ - Gemini returned subtopics: {'Content of A Level in Economics (H460)': [5, 5], 'Content of Component 1: Microeconomics (H460/01)': [6, 16], 'Content of Component 2: Macroeconomics (H460/02)': [17, 27], 'Content of Component 3: Themes in economics (H460/03)': [28, 28]} -2025-03-02 15:46:06,100 [INFO] __main__ - Loaded 9752567 bytes from local file '/home/user/app/input_output/ocr-specification-economics.pdf' -2025-03-02 15:46:06,247 [INFO] __main__ - Occurrences of subtopic 'Content of A Level in Economics (H460)': [2, 10] -2025-03-02 15:46:06,393 [INFO] __main__ - Occurrences of subtopic 'Content of Component 1: Microeconomics (H460/01)': [2, 11] -2025-03-02 15:46:06,536 [INFO] __main__ - Occurrences of subtopic 'Content of Component 2: Macroeconomics (H460/02)': [2, 22] -2025-03-02 15:46:06,701 [INFO] __main__ - Occurrences of subtopic 'Content of Component 3: Themes in economics (H460/03)': [2, 33] -2025-03-02 15:46:06,701 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33] -2025-03-02 15:46:38,108 [INFO] __main__ - doc_analyze complete. Extracting images. -2025-03-02 15:46:46,013 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-02 15:47:09,314 [INFO] __main__ - Processing table image => img_1.jpg, columns=three -2025-03-02 15:47:11,825 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-02 15:47:15,360 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-02 15:47:18,359 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-02 15:47:21,998 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-02 15:47:24,942 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-02 15:47:28,526 [INFO] __main__ - Processing table image => img_7.jpg, columns=three -2025-03-02 15:47:31,741 [INFO] __main__ - Processing table image => img_8.jpg, columns=three -2025-03-02 15:47:35,515 [INFO] __main__ - Processing table image => img_9.jpg, columns=three -2025-03-02 15:47:39,076 [INFO] __main__ - Processing table image => img_10.jpg, columns=three -2025-03-02 15:47:42,855 [INFO] __main__ - Processing table image => img_11.jpg, columns=three -2025-03-02 15:47:45,035 [INFO] __main__ - Processing table image => img_12.jpg, columns=three -2025-03-02 15:47:48,119 [INFO] __main__ - Processing table image => img_13.jpg, columns=three -2025-03-02 15:47:51,846 [INFO] __main__ - Processing table image => img_14.jpg, columns=three -2025-03-02 15:47:55,356 [INFO] __main__ - Processing table image => img_15.jpg, columns=three -2025-03-02 15:47:58,009 [INFO] __main__ - Processing table image => img_16.jpg, columns=three -2025-03-02 15:48:02,296 [INFO] __main__ - Processing table image => img_17.jpg, columns=three -2025-03-02 15:48:05,441 [INFO] __main__ - Processing table image => img_18.jpg, columns=three -2025-03-02 15:48:08,620 [INFO] __main__ - Processing table image => img_19.jpg, columns=three -2025-03-02 15:48:12,128 [INFO] __main__ - Processing table image => img_20.jpg, columns=three -2025-03-02 15:48:15,972 [INFO] __main__ - Processing table image => img_21.jpg, columns=three -2025-03-02 
15:48:19,326 [INFO] __main__ - Processing table image => img_22.jpg, columns=three -2025-03-02 15:48:21,707 [INFO] __main__ - Final JSON saved locally at /home/user/app/output/final_output.json -2025-03-02 15:48:21,707 [INFO] __main__ - Final JSON saved locally at /home/user/app/output/final_output_local.json -2025-03-02 15:48:22,008 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:48:22,014 [INFO] __main__ - Processing completed successfully. -2025-03-02 15:51:36,990 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:51:36,990 [INFO] __main__ - Processing PDF: /home/user/app/input_output/aqa-Mathematics-specification.pdf -2025-03-02 15:51:38,323 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 11], '1 Proof': [11, 11], '2 Algebra and \nfunctions': [12, 15], '3 Coordinate \ngeometry in \nthe (x,y) plane': [16, 17], '4 Sequences \nand series': [17, 18], '5 Trigonometry': [19, 20]} -2025-03-02 15:51:38,323 [INFO] __main__ - Loaded 888895 bytes from local file '/home/user/app/input_output/aqa-Mathematics-specification.pdf' -2025-03-02 15:51:38,484 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [0] -2025-03-02 15:51:38,485 [WARNING] __main__ - No suitable occurrence for 'Paper 1 and Paper 2: Pure Mathematics'. Using page 0. -2025-03-02 15:51:38,612 [INFO] __main__ - Occurrences of subtopic '1 Proof': [0] -2025-03-02 15:51:38,612 [WARNING] __main__ - No suitable occurrence for '1 Proof'. Using page 0. -2025-03-02 15:51:38,737 [INFO] __main__ - Occurrences of subtopic '2 Algebra and -functions': [1, 2, 3, 4, 5] -2025-03-02 15:51:38,738 [WARNING] __main__ - No suitable occurrence for '2 Algebra and -functions'. Using page 5. -2025-03-02 15:51:38,873 [INFO] __main__ - Occurrences of subtopic '3 Coordinate -geometry in -the (x,y) plane': [5] -2025-03-02 15:51:38,874 [WARNING] __main__ - No suitable occurrence for '3 Coordinate -geometry in -the (x,y) plane'. 
Using page 5. -2025-03-02 15:51:39,032 [INFO] __main__ - Occurrences of subtopic '4 Sequences -and series': [6, 7] -2025-03-02 15:51:39,032 [WARNING] __main__ - No suitable occurrence for '4 Sequences -and series'. Using page 7. -2025-03-02 15:51:39,158 [INFO] __main__ - Occurrences of subtopic '5 Trigonometry': [8, 9] -2025-03-02 15:51:39,159 [WARNING] __main__ - No suitable occurrence for '5 Trigonometry'. Using page 9. -2025-03-02 15:51:39,159 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19] -2025-03-02 15:52:09,390 [INFO] __main__ - doc_analyze complete. Extracting images. -2025-03-02 15:52:13,176 [INFO] __main__ - Classifying images to detect tables. -2025-03-02 15:52:22,888 [INFO] __main__ - Processing table image => img_1.jpg, columns=two -2025-03-02 15:52:25,632 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-02 15:52:28,498 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-02 15:52:31,350 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-02 15:52:35,021 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-02 15:52:38,559 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-02 15:52:42,236 [INFO] __main__ - Processing table image => img_7.jpg, columns=three -2025-03-02 15:52:45,343 [INFO] __main__ - Processing table image => img_8.jpg, columns=three -2025-03-02 15:52:46,233 [INFO] __main__ - Processing table image => img_9.jpg, columns=three -2025-03-02 15:52:48,663 [INFO] __main__ - Final JSON saved locally at /home/user/app/outpu_m/final_output.json -2025-03-02 15:52:48,663 [INFO] __main__ - Final JSON saved locally at /home/user/app/outpu_m/final_output_local.json -2025-03-02 15:52:48,989 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:52:48,994 [INFO] __main__ - Processing completed successfully. 
-2025-03-02 15:55:31,322 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:55:31,323 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-as-a-economics-specification-from-2015.pdf -2025-03-02 15:55:32,172 [INFO] __main__ - Gemini returned subtopics: {'2.1 AS units': [7, 22], '2.2 A2 units': [23, 43]} -2025-03-02 15:55:32,174 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-as-a-economics-specification-from-2015.pdf' -2025-03-02 15:55:32,396 [INFO] __main__ - Occurrences of subtopic '2.1 AS units': [] -2025-03-02 15:55:32,396 [WARNING] __main__ - No suitable occurrence for '2.1 AS units'. Using page 0. -2025-03-02 15:55:32,610 [INFO] __main__ - Occurrences of subtopic '2.2 A2 units': [] -2025-03-02 15:55:32,610 [WARNING] __main__ - No suitable occurrence for '2.2 A2 units'. Using page 0. -2025-03-02 15:55:32,611 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-02 15:55:50,803 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:56:57,528 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:56:57,529 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf -2025-03-02 15:56:58,358 [INFO] __main__ - Gemini returned subtopics: {'2.1 AS Unit 1': [11, 17], '2.2 AS Unit 2': [18, 23], '2.3 A2 Unit 3': [24, 30], '2.4 A2 Unit 4': [31, 35]} -2025-03-02 15:56:58,359 [INFO] __main__ - Loaded 1510568 bytes from local file '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf' -2025-03-02 15:56:58,658 [INFO] __main__ - Occurrences of subtopic '2.1 AS Unit 1': [3] -2025-03-02 15:56:58,659 [WARNING] __main__ - No suitable occurrence for '2.1 AS Unit 1'. Using page 3. 
-2025-03-02 15:56:58,982 [INFO] __main__ - Occurrences of subtopic '2.2 AS Unit 2': [3] -2025-03-02 15:56:58,983 [WARNING] __main__ - No suitable occurrence for '2.2 AS Unit 2'. Using page 3. -2025-03-02 15:56:59,291 [INFO] __main__ - Occurrences of subtopic '2.3 A2 Unit 3': [3] -2025-03-02 15:56:59,291 [WARNING] __main__ - No suitable occurrence for '2.3 A2 Unit 3'. Using page 3. -2025-03-02 15:56:59,542 [INFO] __main__ - Occurrences of subtopic '2.4 A2 Unit 4': [3] -2025-03-02 15:56:59,542 [WARNING] __main__ - No suitable occurrence for '2.4 A2 Unit 4'. Using page 3. -2025-03-02 15:56:59,542 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34] -2025-03-02 15:57:41,369 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:57:41,370 [ERROR] __main__ - Processing failed: CUDA out of memory. Tried to allocate 756.00 MiB. GPU -2025-03-02 15:58:10,089 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:58:10,089 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf -2025-03-02 15:58:10,963 [INFO] __main__ - Gemini returned subtopics: {'2.1 AS Unit 1': [11, 17], '2.2 AS Unit 2': [18, 23], '2.3 A2 Unit 3': [24, 30], '2.4 A2 Unit 4': [31, 35]} -2025-03-02 15:58:10,964 [INFO] __main__ - Loaded 1510568 bytes from local file '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf' -2025-03-02 15:58:11,265 [INFO] __main__ - Occurrences of subtopic '2.1 AS Unit 1': [3] -2025-03-02 15:58:11,265 [WARNING] __main__ - No suitable occurrence for '2.1 AS Unit 1'. Using page 3. -2025-03-02 15:58:11,464 [INFO] __main__ - Occurrences of subtopic '2.2 AS Unit 2': [3] -2025-03-02 15:58:11,465 [WARNING] __main__ - No suitable occurrence for '2.2 AS Unit 2'. Using page 3. 
-2025-03-02 15:58:11,652 [INFO] __main__ - Occurrences of subtopic '2.3 A2 Unit 3': [3] -2025-03-02 15:58:11,652 [WARNING] __main__ - No suitable occurrence for '2.3 A2 Unit 3'. Using page 3. -2025-03-02 15:58:11,838 [INFO] __main__ - Occurrences of subtopic '2.4 A2 Unit 4': [3] -2025-03-02 15:58:11,838 [WARNING] __main__ - No suitable occurrence for '2.4 A2 Unit 4'. Using page 3. -2025-03-02 15:58:11,839 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34] -2025-03-02 15:58:29,224 [INFO] __main__ - GPU memory cleaned up. -2025-03-02 15:58:29,225 [ERROR] __main__ - Processing failed: CUDA out of memory. Tried to allocate 148.00 MiB. GPU -2025-03-02 15:59:09,686 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-02 15:59:09,687 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf -2025-03-02 15:59:10,528 [INFO] __main__ - Gemini returned subtopics: {'2.1 AS Unit 1': [11, 17], '2.2 AS Unit 2': [18, 23], '2.3 A2 Unit 3': [24, 30], '2.4 A2 Unit 4': [31, 35]} -2025-03-02 15:59:10,529 [INFO] __main__ - Loaded 1510568 bytes from local file '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf' -2025-03-02 15:59:10,803 [INFO] __main__ - Occurrences of subtopic '2.1 AS Unit 1': [3] -2025-03-02 15:59:10,804 [WARNING] __main__ - No suitable occurrence for '2.1 AS Unit 1'. Using page 3. -2025-03-02 15:59:11,115 [INFO] __main__ - Occurrences of subtopic '2.2 AS Unit 2': [3] -2025-03-02 15:59:11,116 [WARNING] __main__ - No suitable occurrence for '2.2 AS Unit 2'. Using page 3. -2025-03-02 15:59:11,424 [INFO] __main__ - Occurrences of subtopic '2.3 A2 Unit 3': [3] -2025-03-02 15:59:11,424 [WARNING] __main__ - No suitable occurrence for '2.3 A2 Unit 3'. Using page 3. 
-2025-03-02 15:59:11,730 [INFO] __main__ - Occurrences of subtopic '2.4 A2 Unit 4': [3] -2025-03-02 15:59:11,730 [WARNING] __main__ - No suitable occurrence for '2.4 A2 Unit 4'. Using page 3. -2025-03-02 15:59:11,730 [INFO] __main__ - Processing pages (0-based): [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34] -2025-03-02 15:59:48,050 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 13:26:57,282 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 13:26:58,095 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 13:26:58,096 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 13:26:58,447 [INFO] __main__ - Computed global offset: 4 -2025-03-03 13:26:58,447 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 13:28:51,814 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 13:28:53,673 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 13:28:54,260 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 13:28:54,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 13:28:55,347 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 13:28:55,931 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 13:28:56,425 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 13:28:57,012 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 13:28:57,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 13:28:58,025 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_10.jpg -2025-03-03 13:28:58,574 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 13:28:59,112 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 13:28:59,574 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 13:29:00,155 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 13:29:00,690 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 13:29:01,291 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 13:29:01,867 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 13:29:02,177 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 13:29:02,676 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 13:29:03,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 13:29:03,849 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 13:29:04,424 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 13:29:04,883 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 13:29:05,416 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 13:29:05,965 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 13:29:06,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 13:29:06,899 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 13:29:07,492 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 13:29:07,914 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 13:29:11,945 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 13:29:15,280 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 13:29:15,569 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 13:29:15,771 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 13:29:16,000 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 13:29:16,001 [ERROR] __main__ - Error processing table image /topic-extraction/img_1.jpg: No module named 'your_module' -2025-03-03 13:29:16,002 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 13:29:19,539 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 13:29:19,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 13:29:20,032 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 13:29:20,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 13:29:20,493 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 13:29:20,493 [ERROR] __main__ - Error processing table image /topic-extraction/img_2.jpg: No module named 'your_module' -2025-03-03 13:29:20,494 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 13:29:23,853 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 13:29:24,160 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 13:29:24,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 13:29:24,409 [ERROR] __main__ - Error processing 
table image /topic-extraction/img_3.jpg: No module named 'your_module' -2025-03-03 13:29:24,410 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 13:29:27,500 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 13:29:27,818 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 13:29:28,045 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 13:29:28,266 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 13:29:28,268 [ERROR] __main__ - Error processing table image /topic-extraction/img_4.jpg: No module named 'your_module' -2025-03-03 13:29:28,268 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 13:29:31,772 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 13:29:32,031 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 13:29:32,220 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 13:29:32,432 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 13:29:32,646 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 13:29:32,648 [ERROR] __main__ - Error processing table image /topic-extraction/img_5.jpg: No module named 'your_module' -2025-03-03 13:29:32,648 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 13:29:36,400 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 13:29:36,689 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 13:29:36,923 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 13:29:37,135 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 13:29:37,137 [ERROR] __main__ - Error processing table image /topic-extraction/img_6.jpg: No module named 'your_module' -2025-03-03 13:29:37,137 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 13:29:40,804 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 13:29:41,126 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 13:29:41,336 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 13:29:41,527 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 13:29:41,750 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 13:29:41,752 [ERROR] __main__ - Error processing table image /topic-extraction/img_7.jpg: No module named 'your_module' -2025-03-03 13:29:41,752 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 13:29:45,625 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 13:29:45,929 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 13:29:46,120 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 13:29:46,332 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 13:29:46,554 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 13:29:46,785 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 13:29:46,973 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg 
-2025-03-03 13:29:47,161 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 13:29:47,371 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 13:29:47,604 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 13:29:47,860 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 13:29:48,073 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 13:29:48,293 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 13:29:48,527 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 13:29:48,528 [ERROR] __main__ - Error processing table image /topic-extraction/img_8.jpg: No module named 'your_module' -2025-03-03 13:29:48,529 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 13:29:52,131 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 13:29:52,394 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 13:29:52,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 13:29:52,828 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 13:29:53,040 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 13:29:53,230 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 13:29:53,483 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 13:29:53,712 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 13:29:53,907 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 13:29:53,908 [ERROR] __main__ - Error processing table image /topic-extraction/img_9.jpg: No module named 'your_module' -2025-03-03 13:29:53,908 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 13:29:57,203 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 13:33:11,287 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 13:33:12,086 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 13:33:12,087 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 13:33:12,450 [INFO] __main__ - Computed global offset: 4 -2025-03-03 13:33:12,450 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 13:34:08,590 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 13:34:10,443 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 13:34:11,003 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 13:34:11,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 13:34:12,104 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 13:34:12,646 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 13:34:13,132 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 13:34:13,599 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 13:34:14,153 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 13:34:14,847 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 13:34:15,304 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 13:34:15,831 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 13:34:16,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 13:34:16,888 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 13:34:17,395 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 13:34:17,984 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 13:34:18,529 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 13:34:18,824 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 13:34:19,296 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 13:34:19,924 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 13:34:20,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 13:34:21,104 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 13:34:21,531 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 13:34:22,061 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 13:34:22,624 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 13:34:23,058 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 13:34:23,603 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 13:34:24,201 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 13:34:24,606 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 13:34:28,395 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 13:34:31,655 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 13:34:31,943 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 13:34:32,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 13:34:32,376 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 13:34:32,801 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 13:34:36,018 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 13:34:36,281 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 13:34:36,504 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 13:34:36,725 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 13:34:36,916 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 13:34:37,320 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 13:34:40,825 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 13:34:41,120 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 13:34:41,332 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 13:34:41,846 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 13:34:44,953 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 13:34:45,248 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 13:34:45,467 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 13:34:45,695 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 13:34:46,080 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 13:34:49,588 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 13:34:49,913 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 13:34:50,102 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 13:34:50,361 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 13:34:50,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 13:34:51,003 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 13:34:54,650 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 13:34:54,948 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 13:34:55,165 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 13:34:55,384 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 13:34:55,811 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 13:34:59,337 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 13:34:59,623 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 13:34:59,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 13:35:00,087 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 13:35:00,301 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 13:35:00,734 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 13:35:04,169 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 13:35:04,466 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 13:35:04,693 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 13:35:04,880 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 13:35:05,111 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 13:35:05,414 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 13:35:05,607 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 13:35:05,841 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 13:35:06,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 13:35:06,273 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 13:35:06,497 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 13:35:06,737 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 13:35:06,967 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 13:35:07,191 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 13:35:07,636 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 13:35:11,212 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 13:35:11,510 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 13:35:11,732 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 13:35:11,922 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 13:35:12,139 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 13:35:12,349 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 13:35:12,596 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 13:35:12,802 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 13:35:13,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 13:35:13,489 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 13:35:16,908 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 13:35:17,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 13:35:17,420 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 13:35:17,635 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 13:35:17,860 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 13:35:18,244 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 13:35:21,413 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 13:35:21,722 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg 
-2025-03-03 13:35:21,947 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 13:35:22,173 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 13:35:22,430 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 13:35:22,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 13:35:23,009 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 13:35:26,169 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 13:35:26,452 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 13:35:26,674 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 13:35:26,894 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 13:35:27,104 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 13:35:27,342 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 13:35:27,736 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 13:35:30,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 13:35:31,165 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 13:35:31,398 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 13:35:31,601 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 13:35:31,789 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 13:35:32,007 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 13:35:32,449 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 13:35:36,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 13:35:37,204 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 13:35:37,427 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 13:35:37,643 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 13:35:37,861 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 13:35:38,087 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 13:35:38,312 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 13:35:38,536 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 13:35:38,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 13:35:39,132 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 13:35:42,577 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 13:35:42,882 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 13:35:43,086 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 13:35:43,310 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 13:35:43,529 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 13:35:43,718 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 13:35:43,914 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 13:35:44,304 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 13:35:48,617 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 13:35:48,915 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 13:35:49,124 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 13:35:49,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 13:35:49,578 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 13:35:49,801 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 13:35:50,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 13:35:50,239 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 13:35:51,085 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 13:35:51,516 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 13:35:55,302 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 13:35:55,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c1.jpg -2025-03-03 13:35:55,831 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r1_c0.jpg -2025-03-03 13:35:56,034 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c0.jpg -2025-03-03 13:35:56,259 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c1.jpg -2025-03-03 13:35:56,467 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_17.jpg_r3_c0.jpg -2025-03-03 13:35:56,661 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r4_c0.jpg -2025-03-03 13:35:56,882 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r5_c0.jpg -2025-03-03 13:35:57,282 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 13:35:58,547 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 13:35:58,740 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c1.jpg -2025-03-03 13:35:58,934 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c0.jpg -2025-03-03 13:35:59,124 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c1.jpg -2025-03-03 13:35:59,543 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 13:36:02,472 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 13:36:02,792 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c1.jpg -2025-03-03 13:36:03,024 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c0.jpg -2025-03-03 13:36:03,265 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c1.jpg -2025-03-03 13:36:03,521 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c0.jpg -2025-03-03 13:36:03,745 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c1.jpg -2025-03-03 13:36:04,162 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 13:36:09,253 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 13:36:09,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c1.jpg -2025-03-03 13:36:09,791 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c0.jpg -2025-03-03 13:36:09,983 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c1.jpg -2025-03-03 13:36:10,370 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 13:36:14,808 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 13:36:15,103 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c1.jpg -2025-03-03 13:36:15,329 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c0.jpg -2025-03-03 13:36:15,570 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c1.jpg -2025-03-03 13:36:16,020 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 13:36:20,786 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 13:36:21,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c1.jpg -2025-03-03 13:36:21,295 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c0.jpg -2025-03-03 13:36:21,539 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c1.jpg -2025-03-03 13:36:21,753 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c0.jpg -2025-03-03 13:36:21,966 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c1.jpg -2025-03-03 13:36:22,203 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r3_c0.jpg -2025-03-03 13:36:22,623 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 13:36:25,781 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 13:36:26,112 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_23.jpg_r0_c1.jpg -2025-03-03 13:36:26,300 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c0.jpg -2025-03-03 13:36:26,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c1.jpg -2025-03-03 13:36:26,953 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 13:36:30,871 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 13:36:31,169 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c1.jpg -2025-03-03 13:36:31,360 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c0.jpg -2025-03-03 13:36:31,586 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c1.jpg -2025-03-03 13:36:31,839 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r2_c0.jpg -2025-03-03 13:36:32,233 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 13:36:35,440 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r0_c0.jpg -2025-03-03 13:36:35,737 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r1_c0.jpg -2025-03-03 13:36:35,958 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r2_c0.jpg -2025-03-03 13:36:36,186 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r3_c0.jpg -2025-03-03 13:36:36,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r4_c0.jpg -2025-03-03 13:36:36,615 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r5_c0.jpg -2025-03-03 13:36:37,031 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 13:36:40,174 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c0.jpg -2025-03-03 13:36:40,461 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c1.jpg -2025-03-03 13:36:40,681 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r1_c0.jpg -2025-03-03 13:36:40,872 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r2_c0.jpg -2025-03-03 13:36:41,298 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 13:36:44,242 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c0.jpg -2025-03-03 13:36:44,546 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c1.jpg -2025-03-03 13:36:44,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c0.jpg -2025-03-03 13:36:44,958 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c1.jpg -2025-03-03 13:36:45,189 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r2_c0.jpg -2025-03-03 13:36:45,401 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r3_c0.jpg -2025-03-03 13:36:45,632 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c0.jpg -2025-03-03 13:36:45,853 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c1.jpg -2025-03-03 13:36:46,263 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 13:36:49,648 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r0_c0.jpg -2025-03-03 13:36:49,911 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r1_c0.jpg -2025-03-03 13:36:50,133 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r2_c0.jpg -2025-03-03 13:36:50,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r3_c0.jpg -2025-03-03 13:36:50,760 [INFO] __main__ - Final subtopics JSON saved locally at 
/home/user/app/we/final_subtopics.json -2025-03-03 13:36:51,086 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 13:36:51,093 [INFO] __main__ - Processing completed successfully. -2025-03-03 14:05:17,866 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 14:05:18,700 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 14:05:18,702 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 14:05:19,046 [INFO] __main__ - Computed global offset: 4 -2025-03-03 14:05:19,047 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 14:05:52,370 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 14:10:28,391 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 14:10:29,161 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 14:10:29,162 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 14:10:29,484 [INFO] __main__ - Computed global offset: 4 -2025-03-03 14:10:29,484 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 14:11:29,432 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 14:11:31,185 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 14:11:31,702 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 14:11:32,212 [INFO] __main__ - Uploaded 
to S3: /topic-extraction/img_4.jpg -2025-03-03 14:11:32,763 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 14:11:33,406 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 14:11:33,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 14:11:34,395 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 14:11:34,878 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 14:11:35,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 14:11:35,838 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 14:11:36,385 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 14:11:36,971 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 14:11:37,669 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 14:11:38,314 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 14:11:38,926 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 14:11:39,484 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 14:11:39,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 14:11:40,381 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 14:11:40,979 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 14:11:41,538 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 14:11:42,104 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 14:11:42,640 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 14:11:43,153 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 14:11:43,663 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 14:11:44,172 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 14:11:44,677 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 14:11:45,255 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 14:11:45,680 [INFO] __main__ - Classifying images to detect tables. -2025-03-03 14:11:49,521 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 14:11:52,761 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 14:11:52,762 [ERROR] __main__ - Error processing table image /topic-extraction/img_1.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:11:52,762 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 14:11:56,000 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 14:11:56,000 [ERROR] __main__ - Error processing table image /topic-extraction/img_2.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:11:56,000 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 14:11:59,555 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 14:11:59,555 [ERROR] __main__ - Error processing table image /topic-extraction/img_3.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:11:59,556 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 14:12:02,696 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 14:12:02,697 [ERROR] __main__ - Error processing table image /topic-extraction/img_4.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:02,697 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 14:12:06,308 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 14:12:06,309 [ERROR] __main__ - 
Error processing table image /topic-extraction/img_5.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:06,309 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 14:12:10,140 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 14:12:10,141 [ERROR] __main__ - Error processing table image /topic-extraction/img_6.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:10,141 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 14:12:13,713 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 14:12:13,713 [ERROR] __main__ - Error processing table image /topic-extraction/img_7.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:13,713 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 14:12:17,306 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 14:12:17,306 [ERROR] __main__ - Error processing table image /topic-extraction/img_8.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:17,306 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 14:12:21,354 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 14:12:21,354 [ERROR] __main__ - Error processing table image /topic-extraction/img_9.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:21,355 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 14:12:24,668 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 14:12:24,668 [ERROR] __main__ - Error processing table image /topic-extraction/img_10.jpg: 'TableExtractor' object has no attribute 'crop_cell' 
-2025-03-03 14:12:24,668 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 14:12:27,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 14:12:27,914 [ERROR] __main__ - Error processing table image /topic-extraction/img_11.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:27,915 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 14:12:31,345 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 14:12:31,346 [ERROR] __main__ - Error processing table image /topic-extraction/img_12.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:31,346 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 14:12:34,536 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 14:12:34,536 [ERROR] __main__ - Error processing table image /topic-extraction/img_13.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:34,536 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 14:12:39,055 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 14:12:39,056 [ERROR] __main__ - Error processing table image /topic-extraction/img_14.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:39,056 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 14:12:43,762 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 14:12:43,763 [ERROR] __main__ - Error processing table image /topic-extraction/img_15.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:43,763 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, 
columns=three -2025-03-03 14:12:48,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 14:12:48,110 [ERROR] __main__ - Error processing table image /topic-extraction/img_16.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:48,110 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 14:12:52,283 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 14:12:52,283 [ERROR] __main__ - Error processing table image /topic-extraction/img_17.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:52,284 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 14:12:53,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 14:12:53,748 [ERROR] __main__ - Error processing table image /topic-extraction/img_18.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:53,749 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 14:12:57,191 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 14:12:57,191 [ERROR] __main__ - Error processing table image /topic-extraction/img_19.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:12:57,192 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 14:13:01,859 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 14:13:01,862 [ERROR] __main__ - Error processing table image /topic-extraction/img_20.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:01,862 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 14:13:05,555 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 14:13:05,556 [ERROR] __main__ - Error processing table image /topic-extraction/img_21.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:05,556 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 14:13:10,335 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 14:13:10,336 [ERROR] __main__ - Error processing table image /topic-extraction/img_22.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:10,336 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 14:13:13,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 14:13:13,658 [ERROR] __main__ - Error processing table image /topic-extraction/img_23.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:13,659 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 14:13:17,286 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 14:13:17,287 [ERROR] __main__ - Error processing table image /topic-extraction/img_24.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:17,287 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 14:13:20,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r0_c0.jpg -2025-03-03 14:13:20,933 [ERROR] __main__ - Error processing table image /topic-extraction/img_25.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:20,933 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 14:13:23,943 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c0.jpg -2025-03-03 14:13:23,943 [ERROR] __main__ - 
Error processing table image /topic-extraction/img_26.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:23,944 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 14:13:27,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c0.jpg -2025-03-03 14:13:27,633 [ERROR] __main__ - Error processing table image /topic-extraction/img_27.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:27,633 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 14:13:31,099 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r0_c0.jpg -2025-03-03 14:13:31,100 [ERROR] __main__ - Error processing table image /topic-extraction/img_28.jpg: 'TableExtractor' object has no attribute 'crop_cell' -2025-03-03 14:13:31,102 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/we/final_subtopics.json -2025-03-03 14:13:31,434 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 14:13:31,442 [INFO] __main__ - Processing completed successfully. -2025-03-03 14:35:18,303 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-03 14:35:18,304 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf -2025-03-03 14:35:18,304 [ERROR] __main__ - Could not open PDF: [Errno 2] No such file or directory: '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf' -2025-03-03 14:35:18,304 [ERROR] __main__ - No text from first pages => cannot extract subtopics. -2025-03-03 14:35:18,304 [INFO] __main__ - Gemini returned subtopics: {} -2025-03-03 14:35:18,576 [INFO] __main__ - GPU memory cleaned up. 
-2025-03-03 14:35:18,576 [ERROR] __main__ - Processing failed: [Errno 2] No such file or directory: '/home/user/app/input_output/wjec-gce-maths-spec-from-2017-e.pdf'
-2025-03-03 14:36:15,476 [INFO] __main__ - Running in test mode: using local image writer.
-2025-03-03 14:36:15,476 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf
-2025-03-03 14:36:16,171 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]}
-2025-03-03 14:36:16,172 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf'
-2025-03-03 14:36:16,326 [INFO] __main__ - Occurrences of subtopic 'Paper 1 and Paper 2: Pure Mathematics': [2, 3, 4, 14]
-2025-03-03 14:36:16,513 [INFO] __main__ - Occurrences of subtopic 'Paper 3: Statistics and Mechanics': [3, 4, 9, 13, 33, 44]
-2025-03-03 14:36:16,514 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]
-2025-03-03 14:37:10,283 [INFO] __main__ - doc_analyze complete. Extracting images.
-2025-03-03 14:37:18,985 [INFO] __main__ - Classifying images to detect tables.
-2025-03-03 14:37:47,492 [INFO] __main__ - Processing table image => img_1.jpg, columns=three
-2025-03-03 14:37:50,068 [INFO] __main__ - Processing table image => img_2.jpg, columns=three
-2025-03-03 14:37:52,978 [INFO] __main__ - Processing table image => img_3.jpg, columns=three
-2025-03-03 14:37:56,011 [INFO] __main__ - Processing table image => img_4.jpg, columns=three
-2025-03-03 14:37:58,952 [INFO] __main__ - Processing table image => img_5.jpg, columns=three
-2025-03-03 14:38:02,311 [INFO] __main__ - Processing table image => img_6.jpg, columns=three
-2025-03-03 14:38:05,509 [INFO] __main__ - Processing table image => img_7.jpg, columns=three
-2025-03-03 14:38:08,662 [INFO] __main__ - Processing table image => img_8.jpg, columns=three
-2025-03-03 14:38:11,711 [INFO] __main__ - Processing table image => img_9.jpg, columns=three
-2025-03-03 14:38:15,102 [INFO] __main__ - Processing table image => img_10.jpg, columns=three
-2025-03-03 14:38:18,262 [INFO] __main__ - Processing table image => img_11.jpg, columns=two
-2025-03-03 14:38:21,059 [INFO] __main__ - Processing table image => img_12.jpg, columns=three
-2025-03-03 14:38:23,975 [INFO] __main__ - Processing table image => img_13.jpg, columns=three
-2025-03-03 14:38:26,770 [INFO] __main__ - Processing table image => img_14.jpg, columns=three
-2025-03-03 14:38:30,620 [INFO] __main__ - Processing table image => img_15.jpg, columns=three
-2025-03-03 14:38:33,945 [INFO] __main__ - Processing table image => img_16.jpg, columns=three
-2025-03-03 14:38:37,404 [INFO] __main__ - Processing table image => img_17.jpg, columns=three
-2025-03-03 14:38:40,601 [INFO] __main__ - Processing table image => img_18.jpg, columns=three
-2025-03-03 14:38:41,459 [INFO] __main__ - Processing table image => img_19.jpg, columns=three
-2025-03-03 14:38:44,087 [INFO] __main__ - Processing table image => img_20.jpg, columns=three
-2025-03-03 14:38:47,631 [INFO] __main__ - Processing table image => img_21.jpg, columns=three
-2025-03-03 14:38:50,716 [INFO] __main__ - Processing table image => img_22.jpg, columns=three
-2025-03-03 14:38:54,125 [INFO] __main__ - Processing table image => img_23.jpg, columns=three
-2025-03-03 14:38:57,256 [INFO] __main__ - Processing table image => img_24.jpg, columns=three
-2025-03-03 14:39:00,533 [INFO] __main__ - Processing table image => img_25.jpg, columns=two
-2025-03-03 14:39:03,631 [INFO] __main__ - Processing table image => img_26.jpg, columns=three
-2025-03-03 14:39:06,279 [INFO] __main__ - Processing table image => img_27.jpg, columns=three
-2025-03-03 14:39:08,942 [INFO] __main__ - Processing table image => img_28.jpg, columns=two
-2025-03-03 14:39:11,877 [INFO] __main__ - Final JSON saved locally at /home/user/app/wje/final_output.json
-2025-03-03 14:39:11,877 [INFO] __main__ - Final JSON saved locally at /home/user/app/wje/final_output_local.json
-2025-03-03 14:39:12,200 [INFO] __main__ - GPU memory cleaned up.
-2025-03-03 14:39:12,207 [INFO] __main__ - Processing completed successfully.
-2025-03-03 14:40:32,170 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf
-2025-03-03 14:40:32,970 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]}
-2025-03-03 14:40:32,971 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf'
-2025-03-03 14:40:33,339 [INFO] __main__ - Computed global offset: 4
-2025-03-03 14:40:33,340 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]
-2025-03-03 14:41:29,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg
-2025-03-03 14:41:30,998 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg
-2025-03-03 14:41:31,535 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg
-2025-03-03 14:41:32,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg
-2025-03-03 14:41:32,706 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg
-2025-03-03 14:41:33,299 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg
-2025-03-03 14:41:33,822 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg
-2025-03-03 14:41:34,284 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg
-2025-03-03 14:41:34,858 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg
-2025-03-03 14:41:35,387 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg
-2025-03-03 14:41:35,909 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg
-2025-03-03 14:41:36,445 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg
-2025-03-03 14:41:36,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg
-2025-03-03 14:41:37,505 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg
-2025-03-03 14:41:38,061 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg
-2025-03-03 14:41:38,617 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg
-2025-03-03 14:41:39,193 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg
-2025-03-03 14:41:39,510 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg
-2025-03-03 14:41:39,998 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg
-2025-03-03 14:41:40,612 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg
-2025-03-03 14:41:41,206 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg
-2025-03-03 14:41:41,798 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg
-2025-03-03 14:41:42,242 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg
-2025-03-03 14:41:42,723 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg
-2025-03-03 14:41:43,337 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg
-2025-03-03 14:41:43,795 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg
-2025-03-03 14:41:44,246 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg
-2025-03-03 14:41:44,848 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg
-2025-03-03 14:41:45,258 [INFO] __main__ - Classifying images to detect tables.
-2025-03-03 14:41:49,188 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 14:41:52,437 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 14:41:52,752 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 14:41:52,968 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 14:41:53,195 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 14:41:53,696 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 14:41:57,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 14:41:57,453 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 14:41:57,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 14:41:57,910 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 14:41:58,166 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 14:41:58,667 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 14:42:02,140 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 14:42:02,430 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 14:42:02,663 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 14:42:03,164 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 14:42:06,264 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 14:42:06,558 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 14:42:06,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 14:42:06,996 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 14:42:07,497 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 14:42:11,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 14:42:11,312 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 14:42:11,503 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 14:42:11,721 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 14:42:11,982 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 14:42:12,484 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 14:42:16,389 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 14:42:16,683 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 14:42:16,900 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 14:42:17,133 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 14:42:17,634 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 14:42:21,186 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 14:42:21,499 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 14:42:21,717 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 14:42:22,011 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 14:42:22,248 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 14:42:22,750 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 14:42:26,069 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 14:42:26,374 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 14:42:26,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 14:42:26,848 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 14:42:27,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 14:42:27,295 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 14:42:27,507 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 14:42:27,736 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 14:42:27,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 14:42:28,170 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 14:42:28,377 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 14:42:28,613 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 14:42:28,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 14:42:29,068 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 14:42:29,569 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 14:42:33,545 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 14:42:33,843 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 14:42:34,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 14:42:34,267 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 14:42:34,488 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 14:42:34,706 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 14:42:34,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 14:42:35,159 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 14:42:35,367 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 14:42:35,867 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 14:42:39,999 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 14:42:40,351 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 14:42:40,572 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 14:42:40,781 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 14:42:40,969 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 14:42:41,470 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 14:42:44,724 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 14:42:45,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg 
-2025-03-03 14:42:45,266 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 14:42:45,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 14:42:45,715 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 14:42:45,925 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 14:42:46,426 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 14:42:50,046 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 14:42:50,340 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 14:42:50,548 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 14:42:50,760 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 14:42:50,979 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 14:42:51,207 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 14:42:51,708 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 14:42:55,058 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 14:42:55,385 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 14:42:55,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 14:42:55,838 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 14:42:56,071 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 14:42:56,293 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 14:42:56,794 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 14:43:01,227 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 14:43:01,523 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 14:43:01,761 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 14:43:01,984 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 14:43:02,209 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 14:43:02,433 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 14:43:02,677 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 14:43:02,871 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 14:43:03,091 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 14:43:03,592 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 14:43:07,715 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 14:43:07,982 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 14:43:08,169 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 14:43:08,360 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 14:43:08,623 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 14:43:08,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 14:43:09,088 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 14:43:09,589 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 14:43:13,265 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 14:43:13,580 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 14:43:13,802 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 14:43:14,017 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 14:43:14,240 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 14:43:14,444 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 14:43:14,664 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 14:43:14,882 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 14:43:15,692 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 14:43:16,193 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 14:43:20,159 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 14:43:20,446 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c1.jpg -2025-03-03 14:43:20,634 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r1_c0.jpg -2025-03-03 14:43:20,849 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c0.jpg -2025-03-03 14:43:21,057 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c1.jpg -2025-03-03 14:43:21,296 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_17.jpg_r3_c0.jpg -2025-03-03 14:43:21,521 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r4_c0.jpg -2025-03-03 14:43:21,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r5_c0.jpg -2025-03-03 14:43:22,232 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 14:43:23,394 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 14:43:23,649 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c1.jpg -2025-03-03 14:43:23,830 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c0.jpg -2025-03-03 14:43:24,014 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c1.jpg -2025-03-03 14:43:24,515 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 14:43:27,381 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 14:43:27,677 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c1.jpg -2025-03-03 14:43:27,862 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c0.jpg -2025-03-03 14:43:28,099 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c1.jpg -2025-03-03 14:43:28,327 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c0.jpg -2025-03-03 14:43:28,532 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c1.jpg -2025-03-03 14:43:29,033 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 14:43:33,082 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 14:43:33,375 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c1.jpg -2025-03-03 14:43:33,565 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c0.jpg -2025-03-03 14:43:33,779 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c1.jpg -2025-03-03 14:43:34,280 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 14:43:37,916 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 14:43:38,204 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c1.jpg -2025-03-03 14:43:38,389 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c0.jpg -2025-03-03 14:43:38,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c1.jpg -2025-03-03 14:43:39,103 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 14:43:43,240 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 14:43:43,525 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c1.jpg -2025-03-03 14:43:43,724 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c0.jpg -2025-03-03 14:43:43,948 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c1.jpg -2025-03-03 14:43:44,175 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c0.jpg -2025-03-03 14:43:44,382 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c1.jpg -2025-03-03 14:43:44,580 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r3_c0.jpg -2025-03-03 14:43:45,081 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 14:43:48,504 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 14:43:48,761 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_23.jpg_r0_c1.jpg -2025-03-03 14:43:48,988 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c0.jpg -2025-03-03 14:43:49,194 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c1.jpg -2025-03-03 14:43:49,695 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 14:43:53,228 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 14:43:53,517 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c1.jpg -2025-03-03 14:43:53,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c0.jpg -2025-03-03 14:43:53,961 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c1.jpg -2025-03-03 14:43:54,180 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r2_c0.jpg -2025-03-03 14:43:54,681 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 14:43:57,957 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r0_c0.jpg -2025-03-03 14:43:58,230 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r1_c0.jpg -2025-03-03 14:43:58,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r2_c0.jpg -2025-03-03 14:43:58,672 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r3_c0.jpg -2025-03-03 14:43:58,857 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r4_c0.jpg -2025-03-03 14:43:59,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r5_c0.jpg -2025-03-03 14:43:59,568 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 14:44:02,774 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c0.jpg -2025-03-03 14:44:03,053 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c1.jpg -2025-03-03 14:44:03,283 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r1_c0.jpg -2025-03-03 14:44:03,506 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r2_c0.jpg -2025-03-03 14:44:04,007 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 14:44:06,977 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c0.jpg -2025-03-03 14:44:07,259 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c1.jpg -2025-03-03 14:44:07,476 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c0.jpg -2025-03-03 14:44:07,696 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c1.jpg -2025-03-03 14:44:07,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r2_c0.jpg -2025-03-03 14:44:08,123 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r3_c0.jpg -2025-03-03 14:44:08,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c0.jpg -2025-03-03 14:44:08,561 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c1.jpg -2025-03-03 14:44:09,062 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 14:44:12,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r0_c0.jpg -2025-03-03 14:44:12,906 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r1_c0.jpg -2025-03-03 14:44:13,091 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r2_c0.jpg -2025-03-03 14:44:13,323 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r3_c0.jpg -2025-03-03 14:44:13,828 [INFO] __main__ - Final subtopics JSON saved locally at 
/home/user/app/we/final_subtopics.json -2025-03-03 14:44:14,194 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 14:44:14,201 [INFO] __main__ - Processing completed successfully. -2025-03-03 15:06:28,358 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:06:29,190 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:06:29,191 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:06:29,684 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:06:29,684 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:07:26,351 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:07:28,271 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:07:28,880 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:07:29,380 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:07:29,978 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:07:30,515 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:07:31,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:07:31,467 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:07:32,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:07:32,648 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:07:33,107 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:07:33,683 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:07:34,203 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:07:34,803 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:07:35,241 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 15:07:35,821 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:07:36,441 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:07:36,793 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:07:37,345 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:07:37,943 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:07:38,490 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:07:39,145 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:07:39,623 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 15:07:40,159 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:07:40,778 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:07:41,328 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:07:41,899 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:07:42,663 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:07:43,118 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 15:07:46,985 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:07:52,030 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 15:07:52,359 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 15:07:52,583 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 15:07:52,818 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 15:07:52,819 [ERROR] __main__ - Error processing table image /topic-extraction/img_1.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:07:52,819 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:07:56,276 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 15:07:56,585 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 15:07:56,773 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 15:07:57,004 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 15:07:57,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 15:07:57,234 [ERROR] __main__ - Error processing table image /topic-extraction/img_2.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:07:57,235 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:08:00,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 15:08:01,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 15:08:01,442 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 15:08:01,442 [ERROR] __main__ - Error processing table image /topic-extraction/img_3.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:01,443 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 15:08:04,900 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 15:08:05,196 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 15:08:05,413 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 15:08:05,624 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 15:08:05,624 [ERROR] __main__ - Error processing table image /topic-extraction/img_4.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:05,624 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:08:09,549 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 15:08:09,860 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 15:08:10,084 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 15:08:10,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 15:08:10,529 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 15:08:10,529 [ERROR] __main__ - Error processing table image /topic-extraction/img_5.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:10,529 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:08:14,256 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 15:08:14,539 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 15:08:14,764 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 15:08:14,973 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 15:08:14,973 [ERROR] __main__ - Error processing table image /topic-extraction/img_6.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:14,973 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:08:18,857 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 15:08:19,176 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 15:08:19,406 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 15:08:19,628 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 15:08:19,815 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 15:08:19,815 [ERROR] __main__ - Error processing table image /topic-extraction/img_7.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:19,815 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:08:23,570 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 15:08:23,858 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 15:08:24,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 15:08:24,345 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 15:08:24,563 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 15:08:24,779 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 15:08:25,006 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 15:08:25,260 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 15:08:25,479 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 15:08:25,695 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 15:08:25,909 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 15:08:26,095 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 15:08:26,331 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 15:08:26,541 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 15:08:26,541 [ERROR] __main__ - Error processing table image /topic-extraction/img_8.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:26,541 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:08:30,374 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 15:08:30,676 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 15:08:30,888 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 15:08:31,120 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 15:08:31,338 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 15:08:31,579 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 15:08:31,801 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 15:08:32,004 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 15:08:32,213 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 15:08:32,214 [ERROR] __main__ - Error processing table image /topic-extraction/img_9.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:32,214 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:08:35,917 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 15:08:36,212 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 15:08:36,464 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 15:08:36,689 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 15:08:36,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 15:08:36,934 [ERROR] __main__ - Error processing table image /topic-extraction/img_10.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:36,934 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:08:40,244 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 15:08:40,533 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg -2025-03-03 15:08:40,756 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 15:08:40,996 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 15:08:41,209 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 15:08:41,440 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 15:08:41,440 [ERROR] __main__ - Error processing table image /topic-extraction/img_11.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:41,440 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:08:44,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 15:08:45,033 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 15:08:45,244 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 15:08:45,427 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 15:08:45,678 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 15:08:45,919 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 15:08:45,919 [ERROR] __main__ - Error processing table image /topic-extraction/img_12.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:45,919 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:08:49,575 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 15:08:49,833 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 15:08:50,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 15:08:50,268 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 15:08:50,492 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 15:08:50,681 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 15:08:50,681 [ERROR] __main__ - Error processing table image /topic-extraction/img_13.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:50,682 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:08:54,968 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 15:08:55,283 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 15:08:55,502 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 15:08:55,718 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 15:08:55,913 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 15:08:56,148 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 15:08:56,395 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 15:08:56,628 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 15:08:56,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 15:08:56,847 [ERROR] __main__ - Error processing table image /topic-extraction/img_14.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:08:56,847 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:09:00,646 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 15:09:00,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 15:09:01,168 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 15:09:01,395 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 15:09:01,594 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 15:09:01,818 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 15:09:02,037 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 15:09:02,037 [ERROR] __main__ - Error processing table image /topic-extraction/img_15.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:02,037 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:09:06,565 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 15:09:06,824 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 15:09:07,056 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 15:09:07,270 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 15:09:07,488 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 15:09:07,703 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 15:09:07,891 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 15:09:08,117 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 15:09:08,993 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 15:09:08,993 [ERROR] __main__ - Error processing table image /topic-extraction/img_16.jpg: local variable 'recognized_main_topic' referenced before 
assignment -2025-03-03 15:09:08,994 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:09:12,825 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 15:09:13,119 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c1.jpg -2025-03-03 15:09:13,357 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r1_c0.jpg -2025-03-03 15:09:13,563 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c0.jpg -2025-03-03 15:09:13,882 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c1.jpg -2025-03-03 15:09:14,081 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r3_c0.jpg -2025-03-03 15:09:14,306 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r4_c0.jpg -2025-03-03 15:09:14,501 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r5_c0.jpg -2025-03-03 15:09:14,502 [ERROR] __main__ - Error processing table image /topic-extraction/img_17.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:14,502 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 15:09:15,797 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 15:09:15,990 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c1.jpg -2025-03-03 15:09:16,192 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c0.jpg -2025-03-03 15:09:16,383 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c1.jpg -2025-03-03 15:09:16,383 [ERROR] __main__ - Error processing table image /topic-extraction/img_18.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:16,384 [INFO] __main__ - Processing table image: 
/topic-extraction/img_19.jpg, columns=three -2025-03-03 15:09:19,388 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 15:09:19,717 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c1.jpg -2025-03-03 15:09:19,935 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c0.jpg -2025-03-03 15:09:20,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c1.jpg -2025-03-03 15:09:20,371 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c0.jpg -2025-03-03 15:09:20,584 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c1.jpg -2025-03-03 15:09:20,584 [ERROR] __main__ - Error processing table image /topic-extraction/img_19.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:20,585 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 15:09:25,259 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 15:09:25,550 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c1.jpg -2025-03-03 15:09:25,773 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c0.jpg -2025-03-03 15:09:25,987 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c1.jpg -2025-03-03 15:09:25,988 [ERROR] __main__ - Error processing table image /topic-extraction/img_20.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:25,988 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 15:09:29,922 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 15:09:30,236 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c1.jpg -2025-03-03 15:09:30,470 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c0.jpg -2025-03-03 15:09:30,696 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c1.jpg -2025-03-03 15:09:30,697 [ERROR] __main__ - Error processing table image /topic-extraction/img_21.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:30,697 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 15:09:34,601 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 15:09:34,919 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c1.jpg -2025-03-03 15:09:35,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c0.jpg -2025-03-03 15:09:35,391 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c1.jpg -2025-03-03 15:09:35,635 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c0.jpg -2025-03-03 15:09:35,880 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c1.jpg -2025-03-03 15:09:36,103 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r3_c0.jpg -2025-03-03 15:09:36,103 [ERROR] __main__ - Error processing table image /topic-extraction/img_22.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:36,103 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 15:09:39,775 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 15:09:40,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c1.jpg -2025-03-03 15:09:40,311 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c0.jpg -2025-03-03 15:09:40,556 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_23.jpg_r1_c1.jpg -2025-03-03 15:09:40,557 [ERROR] __main__ - Error processing table image /topic-extraction/img_23.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:40,557 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 15:09:44,517 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 15:09:44,816 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c1.jpg -2025-03-03 15:09:45,025 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c0.jpg -2025-03-03 15:09:45,216 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c1.jpg -2025-03-03 15:09:45,474 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r2_c0.jpg -2025-03-03 15:09:45,475 [ERROR] __main__ - Error processing table image /topic-extraction/img_24.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:09:45,475 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 15:09:48,747 [INFO] __main__ - GPU memory cleaned up. 
-2025-03-03 15:11:02,660 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:11:03,435 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:11:03,436 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:11:03,810 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:11:03,810 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:12:00,303 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:12:02,229 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:12:02,838 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:12:03,387 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:12:03,934 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:12:04,493 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:12:04,974 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:12:05,462 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:12:05,948 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:12:06,494 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:12:06,959 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:12:07,507 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:12:08,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:12:08,622 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:12:09,113 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/img_15.jpg -2025-03-03 15:12:09,687 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:12:10,253 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:12:10,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:12:11,029 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:12:11,646 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:12:12,199 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:12:12,773 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:12:13,230 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 15:12:13,744 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:12:14,334 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:12:14,814 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:12:15,247 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:12:15,865 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:12:16,255 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 15:12:20,255 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:12:23,921 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 15:12:24,217 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 15:12:24,450 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 15:12:24,703 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 15:12:24,703 [ERROR] __main__ - Error processing table image /topic-extraction/img_1.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:24,703 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:12:28,056 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 15:12:28,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 15:12:28,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 15:12:28,802 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 15:12:29,021 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 15:12:29,021 [ERROR] __main__ - Error processing table image /topic-extraction/img_2.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:29,021 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:12:32,674 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 15:12:32,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 15:12:33,184 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 15:12:33,184 [ERROR] __main__ - Error processing table image /topic-extraction/img_3.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:33,184 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 15:12:36,359 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 15:12:36,659 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 15:12:36,883 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 15:12:37,114 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 15:12:37,114 [ERROR] __main__ - Error processing table image /topic-extraction/img_4.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:37,115 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:12:40,889 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 15:12:41,192 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 15:12:41,416 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 15:12:41,662 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 15:12:41,866 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 15:12:41,867 [ERROR] __main__ - Error processing table image /topic-extraction/img_5.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:41,867 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:12:46,953 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 15:12:47,267 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 15:12:47,502 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 15:12:47,770 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 15:12:47,771 [ERROR] __main__ - Error processing table image /topic-extraction/img_6.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:47,772 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:12:52,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 15:12:53,003 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 15:12:53,198 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 15:12:53,431 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 15:12:53,655 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 15:12:53,655 [ERROR] __main__ - Error processing table image /topic-extraction/img_7.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:12:53,655 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:12:57,299 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 15:12:57,597 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 15:12:57,822 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 15:12:58,055 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 15:12:58,290 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 15:12:58,518 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 15:12:58,753 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 15:12:58,987 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 15:12:59,218 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 15:12:59,434 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 15:12:59,632 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 15:12:59,877 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 15:13:00,135 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 15:13:00,374 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 15:13:00,374 [ERROR] __main__ - Error processing table image /topic-extraction/img_8.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:00,375 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:13:04,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 15:13:04,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 15:13:04,896 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 15:13:05,132 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 15:13:05,358 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 15:13:05,590 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 15:13:05,817 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 15:13:06,048 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 15:13:06,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 15:13:06,275 [ERROR] __main__ - Error processing table image /topic-extraction/img_9.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:06,275 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:13:09,774 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 15:13:10,098 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 15:13:10,328 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 15:13:10,570 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 15:13:10,816 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 15:13:10,817 [ERROR] __main__ - Error processing table image /topic-extraction/img_10.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:10,818 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:13:14,111 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 15:13:14,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg -2025-03-03 15:13:14,656 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 15:13:14,910 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 15:13:15,126 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 15:13:15,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 15:13:15,344 [ERROR] __main__ - Error processing table image /topic-extraction/img_11.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:15,344 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:13:18,945 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 15:13:19,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 15:13:19,469 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 15:13:19,698 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 15:13:19,919 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 15:13:20,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 15:13:20,141 [ERROR] __main__ - Error processing table image /topic-extraction/img_12.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:20,141 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:13:23,436 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 15:13:23,732 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 15:13:23,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 15:13:24,187 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 15:13:24,424 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 15:13:24,698 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 15:13:24,699 [ERROR] __main__ - Error processing table image /topic-extraction/img_13.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:24,699 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:13:29,158 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 15:13:29,447 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 15:13:29,679 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 15:13:29,915 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 15:13:30,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 15:13:30,365 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 15:13:30,597 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 15:13:30,815 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 15:13:31,054 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 15:13:31,055 [ERROR] __main__ - Error processing table image /topic-extraction/img_14.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:31,055 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:13:34,757 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 15:13:35,068 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 15:13:35,311 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 15:13:35,553 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 15:13:35,772 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 15:13:35,967 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 15:13:36,164 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 15:13:36,164 [ERROR] __main__ - Error processing table image /topic-extraction/img_15.jpg: local variable 'recognized_main_topic' referenced before assignment -2025-03-03 15:13:36,165 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:13:40,188 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 15:13:40,459 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 15:13:40,671 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 15:13:40,881 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 15:13:41,105 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 15:13:41,348 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 15:13:41,563 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 15:13:41,786 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 15:13:42,769 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 15:13:42,770 [ERROR] __main__ - Error processing table image /topic-extraction/img_16.jpg: local variable 'recognized_main_topic' referenced before 
assignment -2025-03-03 15:13:42,770 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:13:46,506 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 15:17:44,033 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:17:44,943 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:17:44,944 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:17:45,443 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:17:45,444 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:17:46,698 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 15:19:23,940 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:19:24,695 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:19:24,696 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:19:25,034 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:19:25,035 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:20:21,942 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:20:23,866 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:20:24,397 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:20:24,943 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:20:25,502 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:20:26,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:20:26,555 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:20:27,020 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:20:27,549 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:20:28,121 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:20:28,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:20:29,233 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:20:29,708 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:20:30,309 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:20:30,792 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 15:20:31,381 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:20:32,012 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:20:32,303 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:20:32,775 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:20:33,407 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:20:34,029 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:20:34,610 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:20:35,055 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 15:20:35,568 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:20:36,120 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:20:36,549 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:20:37,025 
[INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:20:37,655 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:20:38,043 [INFO] __main__ - Classifying images to detect tables. -2025-03-03 15:20:42,180 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:20:45,639 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 15:20:45,932 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 15:20:46,117 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 15:20:46,338 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 15:20:46,338 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:20:50,116 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 15:20:50,422 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 15:20:50,660 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 15:20:50,871 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 15:20:51,056 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 15:20:51,057 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:20:54,596 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 15:20:54,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 15:20:55,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 15:20:55,054 [INFO] __main__ - Processing table image: 
/topic-extraction/img_4.jpg, columns=three -2025-03-03 15:20:58,584 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 15:20:58,873 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 15:20:59,086 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 15:20:59,317 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 15:20:59,318 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:21:02,998 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 15:21:03,281 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 15:21:03,502 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 15:21:03,709 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 15:21:03,943 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 15:21:03,944 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:21:08,925 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 15:21:09,224 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 15:21:09,448 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 15:21:09,681 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 15:21:09,682 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:21:13,431 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 15:21:13,739 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 15:21:13,972 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 15:21:14,183 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 15:21:14,399 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 15:21:14,400 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:21:18,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 15:21:18,533 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 15:21:18,720 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 15:21:18,946 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 15:21:19,154 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 15:21:19,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 15:21:19,588 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 15:21:19,821 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 15:21:20,049 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 15:21:20,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 15:21:20,509 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 15:21:20,739 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 15:21:20,959 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 15:21:21,179 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 15:21:21,179 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:21:25,070 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 15:21:25,356 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 15:21:25,544 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 15:21:25,732 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 15:21:25,948 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 15:21:26,140 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 15:21:26,348 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 15:21:26,564 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 15:21:26,791 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 15:21:26,791 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:21:30,393 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 15:21:30,695 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 15:21:30,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 15:21:31,189 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 15:21:31,375 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 15:21:31,375 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:21:34,741 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 15:21:35,030 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg -2025-03-03 15:21:35,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 15:21:35,473 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 15:21:35,698 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 15:21:35,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 15:21:35,898 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:21:39,541 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 15:21:39,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 15:21:40,073 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 15:21:40,285 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 15:21:40,512 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 15:21:40,767 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 15:21:40,767 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:21:44,049 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 15:21:44,342 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 15:21:44,563 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 15:21:44,750 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 15:21:44,999 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 15:21:45,216 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 15:21:45,217 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:21:50,026 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 15:21:50,324 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 15:21:50,539 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 15:21:50,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 15:21:50,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 15:21:51,203 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 15:21:51,459 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 15:21:51,671 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 15:21:51,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 15:21:51,898 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:21:56,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 15:21:56,332 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 15:21:56,521 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 15:21:56,768 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 15:21:57,016 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 15:21:57,229 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 15:21:57,413 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 15:21:57,414 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:22:03,240 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 15:22:03,526 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 15:22:03,838 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 15:22:04,072 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 15:22:04,303 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 15:22:04,510 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 15:22:04,737 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 15:22:04,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 15:22:05,842 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 15:22:05,843 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:22:06,029 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:22:07,171 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:22:07,172 [INFO] __main__ - Loaded 1135473 bytes from local file 
'/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:22:07,700 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:22:07,701 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:22:10,835 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 15:22:11,166 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c1.jpg -2025-03-03 15:22:11,383 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r1_c0.jpg -2025-03-03 15:22:11,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c0.jpg -2025-03-03 15:22:11,852 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c1.jpg -2025-03-03 15:22:12,098 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r3_c0.jpg -2025-03-03 15:22:12,327 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r4_c0.jpg -2025-03-03 15:22:12,518 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r5_c0.jpg -2025-03-03 15:22:12,519 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 15:22:14,316 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 15:22:14,503 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c1.jpg -2025-03-03 15:22:14,698 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c0.jpg -2025-03-03 15:22:14,894 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c1.jpg -2025-03-03 15:22:14,895 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 15:22:18,689 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 15:22:19,052 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c1.jpg -2025-03-03 15:22:19,269 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c0.jpg -2025-03-03 15:22:19,533 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c1.jpg -2025-03-03 15:22:19,730 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c0.jpg -2025-03-03 15:22:19,924 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c1.jpg -2025-03-03 15:22:19,925 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 15:22:25,628 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 15:22:25,959 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c1.jpg -2025-03-03 15:22:26,188 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c0.jpg -2025-03-03 15:22:26,423 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c1.jpg -2025-03-03 15:22:26,424 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 15:22:31,077 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 15:22:31,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c1.jpg -2025-03-03 15:22:31,603 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c0.jpg -2025-03-03 15:22:31,835 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c1.jpg -2025-03-03 15:22:31,836 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 15:22:36,950 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 15:22:37,234 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c1.jpg -2025-03-03 15:22:37,450 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c0.jpg -2025-03-03 15:22:37,664 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c1.jpg -2025-03-03 15:22:37,882 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c0.jpg -2025-03-03 15:22:38,078 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c1.jpg -2025-03-03 15:22:38,308 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r3_c0.jpg -2025-03-03 15:22:38,309 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 15:22:43,854 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 15:22:44,170 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c1.jpg -2025-03-03 15:22:44,367 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c0.jpg -2025-03-03 15:22:44,601 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c1.jpg -2025-03-03 15:22:44,602 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 15:22:50,835 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 15:22:51,129 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c1.jpg -2025-03-03 15:22:51,352 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c0.jpg -2025-03-03 15:22:51,568 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c1.jpg -2025-03-03 15:22:51,799 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r2_c0.jpg -2025-03-03 15:22:51,800 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, 
columns=two -2025-03-03 15:22:57,515 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r0_c0.jpg -2025-03-03 15:22:57,832 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r1_c0.jpg -2025-03-03 15:22:58,035 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r2_c0.jpg -2025-03-03 15:22:58,281 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r3_c0.jpg -2025-03-03 15:22:58,507 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r4_c0.jpg -2025-03-03 15:22:58,694 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r5_c0.jpg -2025-03-03 15:22:58,694 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 15:23:02,977 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c0.jpg -2025-03-03 15:23:03,247 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c1.jpg -2025-03-03 15:23:03,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r1_c0.jpg -2025-03-03 15:23:03,727 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r2_c0.jpg -2025-03-03 15:23:03,728 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 15:23:07,668 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c0.jpg -2025-03-03 15:23:07,990 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c1.jpg -2025-03-03 15:23:08,181 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c0.jpg -2025-03-03 15:23:08,440 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c1.jpg -2025-03-03 15:23:08,675 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r2_c0.jpg -2025-03-03 15:23:08,902 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_27.jpg_r3_c0.jpg -2025-03-03 15:23:09,119 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c0.jpg -2025-03-03 15:23:09,308 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c1.jpg -2025-03-03 15:23:09,309 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 15:23:13,622 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r0_c0.jpg -2025-03-03 15:23:13,931 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r1_c0.jpg -2025-03-03 15:23:14,164 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r2_c0.jpg -2025-03-03 15:23:14,394 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r3_c0.jpg -2025-03-03 15:23:14,399 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/final_subtopics.json -2025-03-03 15:23:14,792 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 15:23:14,800 [INFO] __main__ - Processing completed successfully. 
-2025-03-03 15:23:16,376 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:23:18,490 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:23:19,071 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:23:19,624 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:23:20,184 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:23:20,749 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:23:21,303 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:23:21,747 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:23:22,263 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:23:22,793 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:23:23,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:23:23,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:23:24,292 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:23:24,919 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:23:25,357 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 15:23:25,966 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:23:26,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:23:26,834 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:23:27,323 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:23:27,870 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:23:28,392 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:23:28,972 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:23:29,421 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_23.jpg -2025-03-03 15:23:30,016 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:23:30,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:23:31,146 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:23:31,597 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:23:32,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:23:32,634 [INFO] __main__ - Classifying images to detect tables. -2025-03-03 15:23:36,443 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:23:40,026 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 15:23:40,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 15:23:40,527 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 15:23:40,744 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 15:23:40,744 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:23:44,104 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 15:23:44,391 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 15:23:44,640 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 15:23:44,885 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 15:23:45,134 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 15:23:45,134 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:23:48,906 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg 
-2025-03-03 15:23:49,222 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 15:23:49,474 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 15:23:49,475 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 15:23:52,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 15:23:52,949 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 15:23:53,167 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 15:23:53,382 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 15:23:53,382 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:23:57,190 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 15:23:57,451 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 15:23:57,676 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 15:23:57,889 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 15:23:58,128 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 15:23:58,129 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:24:01,879 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 15:24:02,174 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 15:24:02,391 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 15:24:02,624 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 15:24:02,625 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:24:06,022 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 15:24:06,307 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 15:24:06,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 15:24:06,746 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 15:24:06,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 15:24:06,963 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:24:10,802 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 15:24:11,123 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 15:24:11,313 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 15:24:11,526 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 15:24:11,743 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 15:24:11,929 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 15:24:12,151 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 15:24:12,370 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 15:24:12,555 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 15:24:12,745 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 15:24:12,972 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 15:24:13,156 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 15:24:13,390 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 15:24:13,623 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 15:24:13,623 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:24:17,854 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 15:24:18,114 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 15:24:18,319 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 15:24:18,550 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 15:24:18,738 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 15:24:18,982 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 15:24:19,223 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 15:24:19,479 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 15:24:19,686 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 15:24:19,686 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:24:23,403 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 15:24:23,661 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 15:24:23,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 15:24:24,033 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 15:24:24,275 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 15:24:24,276 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:24:27,485 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 15:24:27,743 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg -2025-03-03 15:24:27,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 15:24:28,194 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 15:24:28,430 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 15:24:28,650 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 15:24:28,650 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:24:31,948 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 15:24:32,241 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 15:24:32,469 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 15:24:32,704 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 15:24:32,890 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 15:24:33,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 15:24:33,110 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:24:36,659 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 15:24:36,949 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 15:24:37,171 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 15:24:37,383 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 15:24:37,590 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 15:24:37,799 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 15:24:37,799 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:24:42,224 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 15:24:42,511 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 15:24:42,696 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 15:24:42,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 15:24:43,154 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 15:24:43,401 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 15:24:43,589 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 15:24:43,784 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 15:24:44,012 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 15:24:44,013 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:24:47,657 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 15:24:47,954 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 15:24:48,182 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 15:24:48,370 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 15:24:48,582 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 15:24:48,813 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 15:24:49,044 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 15:24:49,044 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:24:53,044 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 15:24:53,303 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 15:24:53,541 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 15:24:53,764 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 15:24:53,980 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 15:24:54,195 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 15:24:54,420 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 15:24:54,647 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 15:24:55,449 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 15:24:55,450 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:24:59,334 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 15:24:59,611 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c1.jpg -2025-03-03 15:24:59,820 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r1_c0.jpg -2025-03-03 15:25:00,017 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c0.jpg -2025-03-03 15:25:00,225 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c1.jpg -2025-03-03 15:25:00,426 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r3_c0.jpg -2025-03-03 15:25:00,629 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r4_c0.jpg -2025-03-03 15:25:00,813 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r5_c0.jpg -2025-03-03 15:25:00,813 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 15:25:02,260 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 15:25:02,437 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c1.jpg -2025-03-03 15:25:02,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c0.jpg -2025-03-03 15:25:02,794 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c1.jpg -2025-03-03 15:25:02,795 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 15:25:05,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 15:25:06,016 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c1.jpg -2025-03-03 15:25:06,220 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c0.jpg -2025-03-03 15:25:06,407 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c1.jpg -2025-03-03 15:25:06,614 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c0.jpg -2025-03-03 15:25:06,823 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c1.jpg -2025-03-03 15:25:06,824 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 15:25:10,984 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 15:25:11,271 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c1.jpg -2025-03-03 15:25:11,476 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c0.jpg -2025-03-03 15:25:11,689 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c1.jpg -2025-03-03 15:25:11,690 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 15:25:15,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 15:25:16,151 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c1.jpg -2025-03-03 15:25:16,361 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c0.jpg -2025-03-03 15:25:16,580 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c1.jpg -2025-03-03 15:25:16,581 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 15:25:20,753 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 15:25:21,035 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c1.jpg -2025-03-03 15:25:21,255 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c0.jpg -2025-03-03 15:25:21,463 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c1.jpg -2025-03-03 15:25:21,683 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_22.jpg_r2_c0.jpg -2025-03-03 15:25:21,905 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c1.jpg -2025-03-03 15:25:22,123 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r3_c0.jpg -2025-03-03 15:25:22,123 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 15:25:25,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 15:25:25,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c1.jpg -2025-03-03 15:25:25,875 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c0.jpg -2025-03-03 15:25:26,081 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c1.jpg -2025-03-03 15:25:26,082 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 15:25:29,857 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 15:25:30,146 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c1.jpg -2025-03-03 15:25:30,363 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c0.jpg -2025-03-03 15:25:30,549 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c1.jpg -2025-03-03 15:25:30,775 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r2_c0.jpg -2025-03-03 15:25:30,776 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 15:25:34,058 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r0_c0.jpg -2025-03-03 15:25:34,360 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r1_c0.jpg -2025-03-03 15:25:34,543 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r2_c0.jpg -2025-03-03 15:25:34,781 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r3_c0.jpg -2025-03-03 15:25:34,989 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r4_c0.jpg -2025-03-03 15:25:35,200 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r5_c0.jpg -2025-03-03 15:25:35,200 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 15:25:38,544 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c0.jpg -2025-03-03 15:25:38,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c1.jpg -2025-03-03 15:25:39,043 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r1_c0.jpg -2025-03-03 15:25:39,275 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r2_c0.jpg -2025-03-03 15:25:39,275 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 15:25:42,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c0.jpg -2025-03-03 15:25:42,588 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c1.jpg -2025-03-03 15:25:42,837 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c0.jpg -2025-03-03 15:25:43,076 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c1.jpg -2025-03-03 15:25:43,300 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r2_c0.jpg -2025-03-03 15:25:43,545 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r3_c0.jpg -2025-03-03 15:25:43,758 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c0.jpg -2025-03-03 15:25:43,941 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c1.jpg -2025-03-03 15:25:43,941 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two 
-2025-03-03 15:25:47,406 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r0_c0.jpg -2025-03-03 15:25:47,702 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r1_c0.jpg -2025-03-03 15:25:47,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r2_c0.jpg -2025-03-03 15:25:48,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r3_c0.jpg -2025-03-03 15:25:48,150 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/we/we_ars/final_subtopics.json -2025-03-03 15:25:48,440 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 15:25:48,447 [INFO] __main__ - Processing completed successfully. -2025-03-03 15:35:13,179 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:35:13,975 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:35:13,975 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:35:14,274 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:35:14,275 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:36:08,757 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:36:10,548 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:36:11,124 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:36:11,652 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:36:12,292 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:36:12,807 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:36:13,299 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:36:13,756 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:36:14,268 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:36:14,851 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:36:15,306 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:36:15,825 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:36:16,252 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:36:16,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:36:17,409 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 15:36:17,982 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:36:18,557 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:36:18,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:36:19,385 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:36:19,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:36:20,618 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:36:21,229 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:36:21,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 15:36:22,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:36:22,746 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:36:23,199 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:36:23,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:36:24,228 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:36:24,610 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 15:36:28,771 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:36:32,613 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c0.jpg -2025-03-03 15:36:33,042 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r0_c1.jpg -2025-03-03 15:36:33,360 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c0.jpg -2025-03-03 15:36:33,578 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_1.jpg_r1_c1.jpg -2025-03-03 15:36:33,578 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:36:36,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c0.jpg -2025-03-03 15:36:37,258 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r0_c1.jpg -2025-03-03 15:36:37,495 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r1_c0.jpg -2025-03-03 15:36:37,741 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r2_c0.jpg -2025-03-03 15:36:37,966 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_2.jpg_r3_c0.jpg -2025-03-03 15:36:37,966 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:36:42,132 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c0.jpg -2025-03-03 15:36:42,430 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r0_c1.jpg -2025-03-03 15:36:42,677 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_3.jpg_r1_c0.jpg -2025-03-03 15:36:42,678 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 15:36:45,924 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r0_c0.jpg -2025-03-03 15:36:46,226 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_4.jpg_r0_c1.jpg -2025-03-03 15:36:46,454 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c0.jpg -2025-03-03 15:36:46,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_4.jpg_r1_c1.jpg -2025-03-03 15:36:46,654 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:36:50,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c0.jpg -2025-03-03 15:36:50,480 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r0_c1.jpg -2025-03-03 15:36:50,684 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c0.jpg -2025-03-03 15:36:50,903 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r1_c1.jpg -2025-03-03 15:36:51,121 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_5.jpg_r2_c0.jpg -2025-03-03 15:36:51,121 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:36:54,777 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c0.jpg -2025-03-03 15:36:55,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r0_c1.jpg -2025-03-03 15:36:55,316 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c0.jpg -2025-03-03 15:36:55,563 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_6.jpg_r1_c1.jpg -2025-03-03 15:36:55,564 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:36:59,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c0.jpg -2025-03-03 15:36:59,360 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r0_c1.jpg -2025-03-03 15:36:59,579 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r1_c0.jpg -2025-03-03 15:36:59,812 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c0.jpg -2025-03-03 15:37:00,022 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_7.jpg_r2_c1.jpg -2025-03-03 15:37:00,022 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:37:03,516 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c0.jpg -2025-03-03 15:37:03,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c1.jpg -2025-03-03 15:37:04,081 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r0_c2.jpg -2025-03-03 15:37:04,297 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c0.jpg -2025-03-03 15:37:04,519 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c1.jpg -2025-03-03 15:37:04,713 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r1_c2.jpg -2025-03-03 15:37:04,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c0.jpg -2025-03-03 15:37:05,173 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r2_c1.jpg -2025-03-03 15:37:05,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c0.jpg -2025-03-03 15:37:05,606 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r3_c1.jpg -2025-03-03 15:37:05,813 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c0.jpg -2025-03-03 15:37:06,035 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r4_c1.jpg -2025-03-03 15:37:06,230 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c0.jpg -2025-03-03 15:37:06,465 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_8.jpg_r5_c1.jpg -2025-03-03 15:37:06,466 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:37:10,452 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c0.jpg -2025-03-03 15:37:10,769 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c1.jpg -2025-03-03 15:37:10,994 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r0_c2.jpg -2025-03-03 15:37:11,186 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c0.jpg -2025-03-03 15:37:11,386 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r1_c1.jpg -2025-03-03 15:37:11,614 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c0.jpg -2025-03-03 15:37:11,854 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r2_c1.jpg -2025-03-03 15:37:12,079 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c0.jpg -2025-03-03 15:37:12,325 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_9.jpg_r3_c1.jpg -2025-03-03 15:37:12,326 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:37:15,839 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c0.jpg -2025-03-03 15:37:16,156 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r0_c1.jpg -2025-03-03 15:37:16,364 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r1_c0.jpg -2025-03-03 15:37:16,599 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r2_c0.jpg -2025-03-03 15:37:16,836 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_10.jpg_r3_c0.jpg -2025-03-03 15:37:16,836 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:37:20,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r0_c0.jpg -2025-03-03 15:37:20,533 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r1_c0.jpg 
-2025-03-03 15:37:20,755 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r2_c0.jpg -2025-03-03 15:37:20,986 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r3_c0.jpg -2025-03-03 15:37:21,212 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r4_c0.jpg -2025-03-03 15:37:21,460 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_11.jpg_r5_c0.jpg -2025-03-03 15:37:21,461 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:37:24,854 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c0.jpg -2025-03-03 15:37:25,123 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r0_c1.jpg -2025-03-03 15:37:25,321 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c0.jpg -2025-03-03 15:37:25,564 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r1_c1.jpg -2025-03-03 15:37:25,784 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c0.jpg -2025-03-03 15:37:26,011 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_12.jpg_r2_c1.jpg -2025-03-03 15:37:26,011 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:37:29,369 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c0.jpg -2025-03-03 15:37:29,639 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r0_c1.jpg -2025-03-03 15:37:29,833 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c0.jpg -2025-03-03 15:37:30,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r1_c1.jpg -2025-03-03 15:37:30,300 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_13.jpg_r2_c0.jpg -2025-03-03 15:37:30,521 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_13.jpg_r3_c0.jpg -2025-03-03 15:37:30,522 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:37:34,845 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c0.jpg -2025-03-03 15:37:35,143 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r0_c1.jpg -2025-03-03 15:37:35,369 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c0.jpg -2025-03-03 15:37:35,584 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r1_c1.jpg -2025-03-03 15:37:35,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r2_c0.jpg -2025-03-03 15:37:36,047 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r3_c0.jpg -2025-03-03 15:37:36,263 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c0.jpg -2025-03-03 15:37:36,499 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r4_c1.jpg -2025-03-03 15:37:36,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_14.jpg_r5_c0.jpg -2025-03-03 15:37:36,731 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:37:40,443 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c0.jpg -2025-03-03 15:37:40,739 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r0_c1.jpg -2025-03-03 15:37:40,976 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c0.jpg -2025-03-03 15:37:41,215 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r1_c1.jpg -2025-03-03 15:37:41,432 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r2_c0.jpg -2025-03-03 15:37:41,680 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r3_c0.jpg -2025-03-03 15:37:41,875 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_15.jpg_r4_c0.jpg -2025-03-03 15:37:41,875 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:37:47,207 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c0.jpg -2025-03-03 15:37:47,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r0_c1.jpg -2025-03-03 15:37:47,709 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c0.jpg -2025-03-03 15:37:47,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r1_c1.jpg -2025-03-03 15:37:48,157 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r2_c0.jpg -2025-03-03 15:37:48,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c0.jpg -2025-03-03 15:37:48,606 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r3_c1.jpg -2025-03-03 15:37:48,843 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r4_c0.jpg -2025-03-03 15:37:49,659 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_16.jpg_r5_c0.jpg -2025-03-03 15:37:49,660 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:37:54,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c0.jpg -2025-03-03 15:37:54,789 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r0_c1.jpg -2025-03-03 15:37:55,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r1_c0.jpg -2025-03-03 15:37:55,227 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c0.jpg -2025-03-03 15:37:55,435 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r2_c1.jpg -2025-03-03 15:37:55,621 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_17.jpg_r3_c0.jpg -2025-03-03 15:37:55,841 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r4_c0.jpg -2025-03-03 15:37:56,031 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_17.jpg_r5_c0.jpg -2025-03-03 15:37:56,032 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 15:37:57,656 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c0.jpg -2025-03-03 15:37:57,843 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r0_c1.jpg -2025-03-03 15:37:58,025 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c0.jpg -2025-03-03 15:37:58,205 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_18.jpg_r1_c1.jpg -2025-03-03 15:37:58,206 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 15:38:01,173 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c0.jpg -2025-03-03 15:38:01,468 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r0_c1.jpg -2025-03-03 15:38:01,717 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c0.jpg -2025-03-03 15:38:01,957 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r1_c1.jpg -2025-03-03 15:38:02,169 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c0.jpg -2025-03-03 15:38:02,390 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_19.jpg_r2_c1.jpg -2025-03-03 15:38:02,391 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 15:38:06,804 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c0.jpg -2025-03-03 15:38:07,126 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r0_c1.jpg -2025-03-03 15:38:07,371 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c0.jpg -2025-03-03 15:38:07,619 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_20.jpg_r1_c1.jpg -2025-03-03 15:38:07,620 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 15:38:11,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c0.jpg -2025-03-03 15:38:11,521 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r0_c1.jpg -2025-03-03 15:38:11,778 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c0.jpg -2025-03-03 15:38:11,997 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_21.jpg_r1_c1.jpg -2025-03-03 15:38:11,997 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 15:38:15,768 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c0.jpg -2025-03-03 15:38:16,060 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r0_c1.jpg -2025-03-03 15:38:16,291 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c0.jpg -2025-03-03 15:38:16,510 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r1_c1.jpg -2025-03-03 15:38:16,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c0.jpg -2025-03-03 15:38:16,954 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r2_c1.jpg -2025-03-03 15:38:17,167 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_22.jpg_r3_c0.jpg -2025-03-03 15:38:17,167 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 15:38:20,451 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r0_c0.jpg -2025-03-03 15:38:20,752 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/table_s3_img_23.jpg_r0_c1.jpg -2025-03-03 15:38:20,977 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c0.jpg -2025-03-03 15:38:21,189 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_23.jpg_r1_c1.jpg -2025-03-03 15:38:21,189 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 15:38:24,928 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c0.jpg -2025-03-03 15:38:25,223 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r0_c1.jpg -2025-03-03 15:38:25,467 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c0.jpg -2025-03-03 15:38:25,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r1_c1.jpg -2025-03-03 15:38:25,891 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_24.jpg_r2_c0.jpg -2025-03-03 15:38:25,892 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 15:38:29,433 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r0_c0.jpg -2025-03-03 15:38:29,727 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r1_c0.jpg -2025-03-03 15:38:29,962 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r2_c0.jpg -2025-03-03 15:38:30,193 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r3_c0.jpg -2025-03-03 15:38:30,418 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r4_c0.jpg -2025-03-03 15:38:30,663 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_25.jpg_r5_c0.jpg -2025-03-03 15:38:30,663 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 15:38:33,787 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c0.jpg -2025-03-03 15:38:34,093 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r0_c1.jpg -2025-03-03 15:38:34,340 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r1_c0.jpg -2025-03-03 15:38:34,527 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_26.jpg_r2_c0.jpg -2025-03-03 15:38:34,528 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 15:38:37,830 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c0.jpg -2025-03-03 15:38:38,135 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r0_c1.jpg -2025-03-03 15:38:38,348 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c0.jpg -2025-03-03 15:38:38,557 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r1_c1.jpg -2025-03-03 15:38:38,779 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r2_c0.jpg -2025-03-03 15:38:38,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r3_c0.jpg -2025-03-03 15:38:39,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c0.jpg -2025-03-03 15:38:39,412 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_27.jpg_r4_c1.jpg -2025-03-03 15:38:39,412 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 15:38:42,913 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r0_c0.jpg -2025-03-03 15:38:43,203 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r1_c0.jpg -2025-03-03 15:38:43,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r2_c0.jpg -2025-03-03 15:38:43,632 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/table_s3_img_28.jpg_r3_c0.jpg -2025-03-03 15:38:43,635 [INFO] __main__ - Final subtopics JSON saved locally at 
/home/user/app/we/we_ars/final_subtopics.json -2025-03-03 15:38:43,937 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 15:38:43,944 [INFO] __main__ - Processing completed successfully. -2025-03-03 15:42:34,738 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:42:35,614 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:42:35,614 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:42:35,901 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:42:35,901 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:43:32,916 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:43:34,890 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:43:35,454 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:43:35,979 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:43:36,599 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:43:37,168 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:43:37,654 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:43:38,122 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:43:38,625 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:43:39,217 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:43:39,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:43:40,219 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:43:40,688 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:43:41,333 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:43:41,810 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 15:43:42,373 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:43:42,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:43:43,215 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:43:43,761 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:43:44,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:43:44,916 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:43:45,497 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:43:46,030 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 15:43:46,531 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:43:47,102 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:43:47,527 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:43:47,978 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:43:48,600 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:43:48,979 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 15:43:53,041 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:43:55,635 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:43:58,390 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:44:01,626 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 15:44:04,265 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:44:07,390 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:44:10,407 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:44:13,406 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:44:16,584 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:44:19,774 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:44:22,710 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:44:25,348 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:44:28,209 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:44:30,871 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:44:34,586 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:44:37,728 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:44:41,186 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:44:44,225 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 
15:44:45,215 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 15:44:47,826 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 15:44:51,247 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 15:44:54,347 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 15:44:58,189 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 15:45:01,536 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 15:45:04,841 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 15:45:07,739 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 15:45:10,526 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 15:45:13,345 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 15:45:16,242 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/final_subtopics.json -2025-03-03 15:45:16,577 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 15:45:16,586 [INFO] __main__ - Processing completed successfully. 
-2025-03-03 15:57:41,431 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 15:57:42,258 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 15:57:42,259 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 15:57:42,742 [INFO] __main__ - Computed global offset: 4 -2025-03-03 15:57:42,743 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 15:58:39,849 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 15:58:41,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 15:58:42,153 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 15:58:42,662 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 15:58:43,172 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 15:58:43,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 15:58:44,204 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 15:58:44,670 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 15:58:45,137 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 15:58:45,759 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 15:58:46,209 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 15:58:46,777 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 15:58:47,219 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 15:58:47,782 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 15:58:48,283 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/img_15.jpg -2025-03-03 15:58:48,810 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 15:58:49,338 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 15:58:49,627 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 15:58:50,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 15:58:50,715 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 15:58:51,276 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 15:58:51,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 15:58:52,367 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 15:58:52,816 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 15:58:53,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 15:58:53,749 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 15:58:54,300 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 15:58:54,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 15:58:55,224 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 15:58:59,109 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 15:59:02,125 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 15:59:04,863 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 15:59:07,805 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 15:59:10,471 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 15:59:13,381 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 15:59:16,869 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 15:59:20,177 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 15:59:23,119 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 15:59:26,242 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 15:59:29,179 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 15:59:32,005 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 15:59:34,843 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 15:59:37,550 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 15:59:41,218 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 15:59:44,176 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 15:59:48,175 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 15:59:51,345 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 
15:59:52,204 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 15:59:54,653 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 15:59:58,328 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 16:00:01,476 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 16:00:04,927 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 16:00:07,802 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 16:00:10,839 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 16:00:13,583 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 16:00:16,261 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 16:00:18,834 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 16:00:21,585 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/final_subtopics.json -2025-03-03 16:00:21,894 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:00:21,904 [INFO] __main__ - Processing completed successfully. 
-2025-03-03 16:10:28,834 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 16:10:29,650 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 16:10:29,652 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 16:10:30,160 [INFO] __main__ - Computed global offset: 4 -2025-03-03 16:10:30,161 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 16:11:28,663 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 16:11:30,405 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 16:11:30,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 16:11:31,457 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 16:11:32,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 16:11:32,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 16:11:33,092 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 16:11:33,567 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 16:11:34,048 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 16:11:34,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 16:11:35,064 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 16:11:35,615 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 16:11:36,193 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 16:11:36,888 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 16:11:37,467 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/img_15.jpg -2025-03-03 16:11:38,018 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 16:11:38,628 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 16:11:38,959 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 16:11:39,505 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 16:11:40,087 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 16:11:40,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 16:11:41,250 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 16:11:41,747 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 16:11:42,227 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 16:11:42,777 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 16:11:43,219 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 16:11:43,659 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 16:11:44,281 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 16:11:44,761 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 16:11:48,542 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 16:11:51,341 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 16:11:54,134 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 16:11:57,533 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 16:12:00,316 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 16:12:03,403 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 16:12:06,708 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 16:12:09,743 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 16:12:12,739 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 16:12:15,944 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 16:12:19,014 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 16:12:22,037 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 16:12:25,060 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 16:12:27,839 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 16:12:31,482 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 16:12:34,595 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 16:12:38,218 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 16:12:41,340 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 
16:12:42,340 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 16:12:44,692 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 16:12:48,729 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 16:12:51,870 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 16:12:55,261 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 16:12:58,310 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 16:13:01,583 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 16:13:04,585 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 16:13:07,386 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 16:13:10,093 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 16:13:12,995 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/final_subtopics.json -2025-03-03 16:13:13,319 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:13:13,326 [INFO] __main__ - Processing completed successfully. 
-2025-03-03 16:17:39,572 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 16:17:40,359 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 16:17:40,360 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 16:17:40,832 [INFO] __main__ - Computed global offset: 4 -2025-03-03 16:17:40,832 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 16:18:36,191 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 16:18:38,048 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 16:18:38,583 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 16:18:39,096 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 16:18:39,618 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 16:18:40,200 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 16:18:40,691 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 16:18:41,167 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 16:18:41,618 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 16:18:42,191 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 16:18:42,630 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 16:18:43,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 16:18:43,615 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 16:18:44,242 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 16:18:44,697 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/img_15.jpg -2025-03-03 16:18:45,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 16:18:45,897 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 16:18:46,196 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 16:18:46,687 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 16:18:47,347 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 16:18:47,886 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 16:18:48,468 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 16:18:48,884 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 16:18:49,349 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 16:18:49,906 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 16:18:50,347 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 16:18:50,747 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 16:18:51,318 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 16:18:51,718 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 16:18:55,669 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 16:18:58,365 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 16:19:01,432 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 16:19:04,431 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 16:19:07,043 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 16:19:09,947 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 16:19:13,069 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 16:19:16,110 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 16:19:19,225 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 16:19:22,435 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 16:19:25,263 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 16:19:27,914 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 16:19:30,818 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 16:19:33,439 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 16:19:37,263 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 16:19:40,284 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 16:19:43,672 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 16:19:46,752 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 
16:19:47,590 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 16:19:49,890 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 16:19:53,362 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 16:19:56,514 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 16:19:59,760 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 16:20:02,816 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 16:20:05,964 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 16:20:08,853 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 16:20:11,414 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 16:20:14,113 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 16:20:17,013 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/final_subtopics.json -2025-03-03 16:20:17,302 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:20:17,310 [INFO] __main__ - Processing completed successfully. 
-2025-03-03 16:29:55,533 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 16:29:56,335 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 16:29:56,335 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 16:29:56,683 [INFO] __main__ - Computed global offset: 4 -2025-03-03 16:29:56,684 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 16:30:54,842 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 16:30:56,704 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 16:30:57,248 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 16:30:57,760 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 16:30:58,328 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 16:30:58,896 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 16:30:59,381 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 16:30:59,848 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 16:31:00,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 16:31:00,906 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 16:31:01,322 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 16:31:01,858 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 16:31:02,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 16:31:02,900 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 16:31:03,362 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/img_15.jpg -2025-03-03 16:31:03,956 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 16:31:04,503 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 16:31:04,806 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 16:31:05,281 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 16:31:05,889 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 16:31:06,437 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 16:31:07,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 16:31:07,493 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 16:31:07,962 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 16:31:08,501 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 16:31:09,004 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 16:31:09,512 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 16:31:10,086 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 16:31:10,503 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 16:31:14,364 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 16:31:17,142 [INFO] __main__ - Extracted 2 rows from /tmp/tmp3fu7msc1.jpg -2025-03-03 16:31:17,142 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:17,142 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:31:17,240 [WARNING] __main__ - Cell image not found: /tmp/tmp3fu7msc1.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:17,241 [WARNING] __main__ - Cell image not found: /tmp/tmp3fu7msc1.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:17,241 [WARNING] __main__ - Cell image not found: /tmp/tmp3fu7msc1.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:17,241 [WARNING] __main__ - Cell image not found: /tmp/tmp3fu7msc1.jpg_rows/row_1/col_1.jpg -2025-03-03 16:31:17,241 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 16:31:19,966 [INFO] __main__ - Extracted 4 rows from /tmp/tmp24x1tzyf.jpg -2025-03-03 16:31:19,967 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:19,967 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:31:19,967 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:31:19,967 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:31:20,069 [WARNING] __main__ - Cell image not found: /tmp/tmp24x1tzyf.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:20,069 [WARNING] __main__ - Cell image not found: /tmp/tmp24x1tzyf.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:20,070 [WARNING] __main__ - Cell image not found: /tmp/tmp24x1tzyf.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:20,070 [WARNING] __main__ - Cell image not found: /tmp/tmp24x1tzyf.jpg_rows/row_2/col_0.jpg -2025-03-03 16:31:20,070 [WARNING] __main__ - Cell image not found: /tmp/tmp24x1tzyf.jpg_rows/row_3/col_0.jpg -2025-03-03 16:31:20,070 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 16:31:22,860 [INFO] __main__ - Extracted 2 rows from /tmp/tmpqj8kzjlh.jpg -2025-03-03 16:31:22,860 [INFO] __main__ - Row 0 has 2 
cells -2025-03-03 16:31:22,861 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:31:22,974 [WARNING] __main__ - Cell image not found: /tmp/tmpqj8kzjlh.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:22,974 [WARNING] __main__ - Cell image not found: /tmp/tmpqj8kzjlh.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:22,974 [WARNING] __main__ - Cell image not found: /tmp/tmpqj8kzjlh.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:22,975 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 16:31:25,631 [INFO] __main__ - Extracted 2 rows from /tmp/tmp3i00mmdz.jpg -2025-03-03 16:31:25,632 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:25,632 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:31:25,740 [WARNING] __main__ - Cell image not found: /tmp/tmp3i00mmdz.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:25,740 [WARNING] __main__ - Cell image not found: /tmp/tmp3i00mmdz.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:25,741 [WARNING] __main__ - Cell image not found: /tmp/tmp3i00mmdz.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:25,741 [WARNING] __main__ - Cell image not found: /tmp/tmp3i00mmdz.jpg_rows/row_1/col_1.jpg -2025-03-03 16:31:25,741 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 16:31:29,048 [INFO] __main__ - Extracted 3 rows from /tmp/tmpo6lnusuq.jpg -2025-03-03 16:31:29,048 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:29,049 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:31:29,049 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:31:29,152 [WARNING] __main__ - Cell image not found: /tmp/tmpo6lnusuq.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:29,152 [WARNING] __main__ - Cell image not found: /tmp/tmpo6lnusuq.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:29,152 [WARNING] __main__ - Cell image not found: /tmp/tmpo6lnusuq.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:29,152 [WARNING] __main__ - Cell image not found: /tmp/tmpo6lnusuq.jpg_rows/row_1/col_1.jpg -2025-03-03 16:31:29,152 
[WARNING] __main__ - Cell image not found: /tmp/tmpo6lnusuq.jpg_rows/row_2/col_0.jpg -2025-03-03 16:31:29,152 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 16:31:32,089 [INFO] __main__ - Extracted 2 rows from /tmp/tmpk3rl8rco.jpg -2025-03-03 16:31:32,089 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:32,089 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:31:32,201 [WARNING] __main__ - Cell image not found: /tmp/tmpk3rl8rco.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:32,201 [WARNING] __main__ - Cell image not found: /tmp/tmpk3rl8rco.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:32,201 [WARNING] __main__ - Cell image not found: /tmp/tmpk3rl8rco.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:32,201 [WARNING] __main__ - Cell image not found: /tmp/tmpk3rl8rco.jpg_rows/row_1/col_1.jpg -2025-03-03 16:31:32,202 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 16:31:34,953 [INFO] __main__ - Extracted 3 rows from /tmp/tmp3iuybv9f.jpg -2025-03-03 16:31:34,953 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:34,953 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:31:34,953 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:31:35,071 [WARNING] __main__ - Cell image not found: /tmp/tmp3iuybv9f.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:35,072 [WARNING] __main__ - Cell image not found: /tmp/tmp3iuybv9f.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:35,072 [WARNING] __main__ - Cell image not found: /tmp/tmp3iuybv9f.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:35,072 [WARNING] __main__ - Cell image not found: /tmp/tmp3iuybv9f.jpg_rows/row_2/col_0.jpg -2025-03-03 16:31:35,072 [WARNING] __main__ - Cell image not found: /tmp/tmp3iuybv9f.jpg_rows/row_2/col_1.jpg -2025-03-03 16:31:35,073 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 16:31:37,913 [INFO] __main__ - Extracted 6 rows from /tmp/tmpbg2n4l31.jpg -2025-03-03 16:31:37,914 [INFO] 
__main__ - Row 0 has 3 cells -2025-03-03 16:31:37,914 [INFO] __main__ - Row 1 has 3 cells -2025-03-03 16:31:37,914 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:31:37,914 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:31:37,914 [INFO] __main__ - Row 4 has 2 cells -2025-03-03 16:31:37,914 [INFO] __main__ - Row 5 has 2 cells -2025-03-03 16:31:38,027 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:38,028 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:38,028 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_0/col_2.jpg -2025-03-03 16:31:38,028 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:38,028 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_1/col_1.jpg -2025-03-03 16:31:38,029 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_1/col_2.jpg -2025-03-03 16:31:38,029 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_2/col_0.jpg -2025-03-03 16:31:38,029 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_2/col_1.jpg -2025-03-03 16:31:38,029 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_3/col_0.jpg -2025-03-03 16:31:38,029 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_3/col_1.jpg -2025-03-03 16:31:38,030 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_4/col_0.jpg -2025-03-03 16:31:38,030 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_4/col_1.jpg -2025-03-03 16:31:38,030 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_5/col_0.jpg -2025-03-03 16:31:38,030 [WARNING] __main__ - Cell image not found: /tmp/tmpbg2n4l31.jpg_rows/row_5/col_1.jpg -2025-03-03 16:31:38,030 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 
16:31:41,370 [INFO] __main__ - Extracted 4 rows from /tmp/tmpuozb49tl.jpg -2025-03-03 16:31:41,370 [INFO] __main__ - Row 0 has 3 cells -2025-03-03 16:31:41,370 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:31:41,370 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:31:41,370 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:31:41,490 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:41,490 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_0/col_2.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_1/col_1.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_2/col_0.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_2/col_1.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_3/col_0.jpg -2025-03-03 16:31:41,491 [WARNING] __main__ - Cell image not found: /tmp/tmpuozb49tl.jpg_rows/row_3/col_1.jpg -2025-03-03 16:31:41,492 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 16:31:44,447 [INFO] __main__ - Extracted 4 rows from /tmp/tmplrwra3bt.jpg -2025-03-03 16:31:44,448 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:31:44,448 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:31:44,448 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:31:44,448 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:31:44,554 [WARNING] __main__ - Cell image not found: /tmp/tmplrwra3bt.jpg_rows/row_0/col_0.jpg -2025-03-03 16:31:44,555 [WARNING] __main__ - Cell image not found: /tmp/tmplrwra3bt.jpg_rows/row_0/col_1.jpg -2025-03-03 16:31:44,555 
[WARNING] __main__ - Cell image not found: /tmp/tmplrwra3bt.jpg_rows/row_1/col_0.jpg -2025-03-03 16:31:44,555 [WARNING] __main__ - Cell image not found: /tmp/tmplrwra3bt.jpg_rows/row_2/col_0.jpg -2025-03-03 16:31:44,555 [WARNING] __main__ - Cell image not found: /tmp/tmplrwra3bt.jpg_rows/row_3/col_0.jpg -2025-03-03 16:31:44,555 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 16:31:47,726 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:38:00,505 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 16:38:01,273 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 16:38:01,274 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 16:38:01,586 [INFO] __main__ - Computed global offset: 4 -2025-03-03 16:38:01,587 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 16:38:55,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 16:38:57,514 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 16:38:58,085 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 16:38:58,674 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 16:38:59,259 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 16:38:59,865 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 16:39:00,442 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 16:39:00,917 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 16:39:01,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 
16:39:01,970 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 16:39:02,391 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 16:39:02,884 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 16:39:03,352 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 16:39:03,934 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 16:39:04,441 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 16:39:04,988 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 16:39:05,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 16:39:05,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 16:39:06,313 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 16:39:06,928 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 16:39:07,507 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 16:39:08,099 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 16:39:08,546 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 16:39:09,044 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 16:39:09,667 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 16:39:10,150 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 16:39:10,586 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 16:39:11,171 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 16:39:11,553 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 16:39:16,188 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 16:39:18,827 [INFO] __main__ - Extracted 2 rows from /tmp/tmpsadbyp6s.jpg -2025-03-03 16:39:18,828 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:18,828 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:18,925 [INFO] __main__ - Files in /tmp/tmpsadbyp6s.jpg_rows: -2025-03-03 16:39:18,925 [INFO] __main__ - /tmp/tmpsadbyp6s.jpg_rows: [] -2025-03-03 16:39:18,926 [INFO] __main__ - /tmp/tmpsadbyp6s.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:18,926 [INFO] __main__ - /tmp/tmpsadbyp6s.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:18,926 [WARNING] __main__ - Cell image not found: /tmp/tmpsadbyp6s.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:18,926 [WARNING] __main__ - Cell image not found: /tmp/tmpsadbyp6s.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:18,926 [WARNING] __main__ - Cell image not found: /tmp/tmpsadbyp6s.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:18,926 [WARNING] __main__ - Cell image not found: /tmp/tmpsadbyp6s.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:18,927 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 16:39:21,658 [INFO] __main__ - Extracted 4 rows from /tmp/tmpbis7s1y9.jpg -2025-03-03 16:39:21,658 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:21,658 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:39:21,659 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:39:21,659 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:39:21,761 [INFO] __main__ - Files in /tmp/tmpbis7s1y9.jpg_rows: -2025-03-03 16:39:21,762 [INFO] __main__ - /tmp/tmpbis7s1y9.jpg_rows: [] -2025-03-03 16:39:21,762 [INFO] __main__ - /tmp/tmpbis7s1y9.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:21,763 [INFO] __main__ - /tmp/tmpbis7s1y9.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:39:21,763 [INFO] __main__ - /tmp/tmpbis7s1y9.jpg_rows/row_2: ['col_0.png'] 
-2025-03-03 16:39:21,763 [INFO] __main__ - /tmp/tmpbis7s1y9.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:39:21,763 [WARNING] __main__ - Cell image not found: /tmp/tmpbis7s1y9.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:21,764 [WARNING] __main__ - Cell image not found: /tmp/tmpbis7s1y9.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:21,764 [WARNING] __main__ - Cell image not found: /tmp/tmpbis7s1y9.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:21,764 [WARNING] __main__ - Cell image not found: /tmp/tmpbis7s1y9.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:21,765 [WARNING] __main__ - Cell image not found: /tmp/tmpbis7s1y9.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:21,765 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 16:39:24,543 [INFO] __main__ - Extracted 2 rows from /tmp/tmpv_qjs9zo.jpg -2025-03-03 16:39:24,544 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:24,544 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:39:24,656 [INFO] __main__ - Files in /tmp/tmpv_qjs9zo.jpg_rows: -2025-03-03 16:39:24,656 [INFO] __main__ - /tmp/tmpv_qjs9zo.jpg_rows: [] -2025-03-03 16:39:24,657 [INFO] __main__ - /tmp/tmpv_qjs9zo.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:24,657 [INFO] __main__ - /tmp/tmpv_qjs9zo.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:39:24,657 [WARNING] __main__ - Cell image not found: /tmp/tmpv_qjs9zo.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:24,657 [WARNING] __main__ - Cell image not found: /tmp/tmpv_qjs9zo.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:24,658 [WARNING] __main__ - Cell image not found: /tmp/tmpv_qjs9zo.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:24,658 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 16:39:27,236 [INFO] __main__ - Extracted 2 rows from /tmp/tmp4wmm7x9f.jpg -2025-03-03 16:39:27,236 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:27,237 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:27,342 [INFO] __main__ - Files 
in /tmp/tmp4wmm7x9f.jpg_rows: -2025-03-03 16:39:27,343 [INFO] __main__ - /tmp/tmp4wmm7x9f.jpg_rows: [] -2025-03-03 16:39:27,343 [INFO] __main__ - /tmp/tmp4wmm7x9f.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:27,343 [INFO] __main__ - /tmp/tmp4wmm7x9f.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:27,343 [WARNING] __main__ - Cell image not found: /tmp/tmp4wmm7x9f.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:27,344 [WARNING] __main__ - Cell image not found: /tmp/tmp4wmm7x9f.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:27,344 [WARNING] __main__ - Cell image not found: /tmp/tmp4wmm7x9f.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:27,344 [WARNING] __main__ - Cell image not found: /tmp/tmp4wmm7x9f.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:27,344 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 16:39:30,319 [INFO] __main__ - Extracted 3 rows from /tmp/tmpl2shhsgo.jpg -2025-03-03 16:39:30,319 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:30,319 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:30,319 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:39:30,419 [INFO] __main__ - Files in /tmp/tmpl2shhsgo.jpg_rows: -2025-03-03 16:39:30,419 [INFO] __main__ - /tmp/tmpl2shhsgo.jpg_rows: [] -2025-03-03 16:39:30,420 [INFO] __main__ - /tmp/tmpl2shhsgo.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:30,420 [INFO] __main__ - /tmp/tmpl2shhsgo.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:30,420 [INFO] __main__ - /tmp/tmpl2shhsgo.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:39:30,420 [WARNING] __main__ - Cell image not found: /tmp/tmpl2shhsgo.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:30,421 [WARNING] __main__ - Cell image not found: /tmp/tmpl2shhsgo.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:30,421 [WARNING] __main__ - Cell image not found: /tmp/tmpl2shhsgo.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:30,421 [WARNING] __main__ - Cell image not found: 
/tmp/tmpl2shhsgo.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:30,421 [WARNING] __main__ - Cell image not found: /tmp/tmpl2shhsgo.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:30,421 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 16:39:33,398 [INFO] __main__ - Extracted 2 rows from /tmp/tmpqlryvzi3.jpg -2025-03-03 16:39:33,399 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:33,399 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:33,490 [INFO] __main__ - Files in /tmp/tmpqlryvzi3.jpg_rows: -2025-03-03 16:39:33,490 [INFO] __main__ - /tmp/tmpqlryvzi3.jpg_rows: [] -2025-03-03 16:39:33,490 [INFO] __main__ - /tmp/tmpqlryvzi3.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:33,491 [INFO] __main__ - /tmp/tmpqlryvzi3.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:33,491 [WARNING] __main__ - Cell image not found: /tmp/tmpqlryvzi3.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:33,491 [WARNING] __main__ - Cell image not found: /tmp/tmpqlryvzi3.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:33,491 [WARNING] __main__ - Cell image not found: /tmp/tmpqlryvzi3.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:33,491 [WARNING] __main__ - Cell image not found: /tmp/tmpqlryvzi3.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:33,491 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 16:39:36,216 [INFO] __main__ - Extracted 3 rows from /tmp/tmpvpurws23.jpg -2025-03-03 16:39:36,216 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:36,216 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:39:36,216 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:39:36,355 [INFO] __main__ - Files in /tmp/tmpvpurws23.jpg_rows: -2025-03-03 16:39:36,355 [INFO] __main__ - /tmp/tmpvpurws23.jpg_rows: [] -2025-03-03 16:39:36,356 [INFO] __main__ - /tmp/tmpvpurws23.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:36,356 [INFO] __main__ - /tmp/tmpvpurws23.jpg_rows/row_1: ['col_0.png'] 
-2025-03-03 16:39:36,356 [INFO] __main__ - /tmp/tmpvpurws23.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:36,356 [WARNING] __main__ - Cell image not found: /tmp/tmpvpurws23.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:36,357 [WARNING] __main__ - Cell image not found: /tmp/tmpvpurws23.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:36,357 [WARNING] __main__ - Cell image not found: /tmp/tmpvpurws23.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:36,357 [WARNING] __main__ - Cell image not found: /tmp/tmpvpurws23.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:36,357 [WARNING] __main__ - Cell image not found: /tmp/tmpvpurws23.jpg_rows/row_2/col_1.jpg -2025-03-03 16:39:36,358 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 16:39:39,285 [INFO] __main__ - Extracted 6 rows from /tmp/tmp0m93om1m.jpg -2025-03-03 16:39:39,285 [INFO] __main__ - Row 0 has 3 cells -2025-03-03 16:39:39,286 [INFO] __main__ - Row 1 has 3 cells -2025-03-03 16:39:39,286 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:39:39,286 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:39:39,286 [INFO] __main__ - Row 4 has 2 cells -2025-03-03 16:39:39,286 [INFO] __main__ - Row 5 has 2 cells -2025-03-03 16:39:39,398 [INFO] __main__ - Files in /tmp/tmp0m93om1m.jpg_rows: -2025-03-03 16:39:39,399 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows: [] -2025-03-03 16:39:39,399 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows/row_0: ['col_0.png', 'col_1.png', 'col_2.png'] -2025-03-03 16:39:39,399 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows/row_1: ['col_0.png', 'col_1.png', 'col_2.png'] -2025-03-03 16:39:39,400 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:39,400 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows/row_3: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:39,400 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows/row_4: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:39,400 [INFO] __main__ - /tmp/tmp0m93om1m.jpg_rows/row_5: ['col_0.png', 
'col_1.png'] -2025-03-03 16:39:39,401 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:39,401 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:39,401 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_0/col_2.jpg -2025-03-03 16:39:39,402 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:39,402 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:39,402 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_1/col_2.jpg -2025-03-03 16:39:39,402 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:39,403 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_2/col_1.jpg -2025-03-03 16:39:39,403 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:39,403 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_3/col_1.jpg -2025-03-03 16:39:39,403 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_4/col_0.jpg -2025-03-03 16:39:39,404 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_4/col_1.jpg -2025-03-03 16:39:39,404 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_5/col_0.jpg -2025-03-03 16:39:39,404 [WARNING] __main__ - Cell image not found: /tmp/tmp0m93om1m.jpg_rows/row_5/col_1.jpg -2025-03-03 16:39:39,405 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 16:39:42,590 [INFO] __main__ - Extracted 4 rows from /tmp/tmpt381bxab.jpg -2025-03-03 16:39:42,590 [INFO] __main__ - Row 0 has 3 cells -2025-03-03 16:39:42,590 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:42,591 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:39:42,591 [INFO] __main__ - Row 3 has 2 cells 
-2025-03-03 16:39:42,704 [INFO] __main__ - Files in /tmp/tmpt381bxab.jpg_rows: -2025-03-03 16:39:42,705 [INFO] __main__ - /tmp/tmpt381bxab.jpg_rows: [] -2025-03-03 16:39:42,705 [INFO] __main__ - /tmp/tmpt381bxab.jpg_rows/row_0: ['col_0.png', 'col_1.png', 'col_2.png'] -2025-03-03 16:39:42,705 [INFO] __main__ - /tmp/tmpt381bxab.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:42,705 [INFO] __main__ - /tmp/tmpt381bxab.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:42,706 [INFO] __main__ - /tmp/tmpt381bxab.jpg_rows/row_3: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:42,706 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:42,706 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:42,706 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_0/col_2.jpg -2025-03-03 16:39:42,707 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:42,707 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:42,707 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:42,707 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_2/col_1.jpg -2025-03-03 16:39:42,708 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:42,708 [WARNING] __main__ - Cell image not found: /tmp/tmpt381bxab.jpg_rows/row_3/col_1.jpg -2025-03-03 16:39:42,708 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 16:39:45,399 [INFO] __main__ - Extracted 4 rows from /tmp/tmpti8mweyh.jpg -2025-03-03 16:39:45,400 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:45,400 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:39:45,400 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:39:45,400 [INFO] __main__ - Row 3 has 
1 cells -2025-03-03 16:39:45,500 [INFO] __main__ - Files in /tmp/tmpti8mweyh.jpg_rows: -2025-03-03 16:39:45,501 [INFO] __main__ - /tmp/tmpti8mweyh.jpg_rows: [] -2025-03-03 16:39:45,501 [INFO] __main__ - /tmp/tmpti8mweyh.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:45,501 [INFO] __main__ - /tmp/tmpti8mweyh.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:39:45,501 [INFO] __main__ - /tmp/tmpti8mweyh.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:39:45,502 [INFO] __main__ - /tmp/tmpti8mweyh.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:39:45,502 [WARNING] __main__ - Cell image not found: /tmp/tmpti8mweyh.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:45,502 [WARNING] __main__ - Cell image not found: /tmp/tmpti8mweyh.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:45,502 [WARNING] __main__ - Cell image not found: /tmp/tmpti8mweyh.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:45,503 [WARNING] __main__ - Cell image not found: /tmp/tmpti8mweyh.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:45,503 [WARNING] __main__ - Cell image not found: /tmp/tmpti8mweyh.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:45,503 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 16:39:48,245 [INFO] __main__ - Extracted 6 rows from /tmp/tmp_8j1l0bq.jpg -2025-03-03 16:39:48,245 [INFO] __main__ - Row 0 has 1 cells -2025-03-03 16:39:48,245 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:39:48,246 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:39:48,246 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:39:48,246 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:39:48,246 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:39:48,360 [INFO] __main__ - Files in /tmp/tmp_8j1l0bq.jpg_rows: -2025-03-03 16:39:48,361 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows: [] -2025-03-03 16:39:48,361 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows/row_0: ['col_0.png'] -2025-03-03 16:39:48,361 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows/row_1: ['col_0.png'] -2025-03-03 
16:39:48,361 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:39:48,362 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:39:48,362 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:39:48,362 [INFO] __main__ - /tmp/tmp_8j1l0bq.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:39:48,362 [WARNING] __main__ - Cell image not found: /tmp/tmp_8j1l0bq.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:48,363 [WARNING] __main__ - Cell image not found: /tmp/tmp_8j1l0bq.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:48,363 [WARNING] __main__ - Cell image not found: /tmp/tmp_8j1l0bq.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:48,363 [WARNING] __main__ - Cell image not found: /tmp/tmp_8j1l0bq.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:48,363 [WARNING] __main__ - Cell image not found: /tmp/tmp_8j1l0bq.jpg_rows/row_4/col_0.jpg -2025-03-03 16:39:48,363 [WARNING] __main__ - Cell image not found: /tmp/tmp_8j1l0bq.jpg_rows/row_5/col_0.jpg -2025-03-03 16:39:48,364 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 16:39:50,951 [INFO] __main__ - Extracted 3 rows from /tmp/tmp0x9tljbb.jpg -2025-03-03 16:39:50,951 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:50,951 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:50,952 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:39:51,123 [INFO] __main__ - Files in /tmp/tmp0x9tljbb.jpg_rows: -2025-03-03 16:39:51,123 [INFO] __main__ - /tmp/tmp0x9tljbb.jpg_rows: [] -2025-03-03 16:39:51,124 [INFO] __main__ - /tmp/tmp0x9tljbb.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:51,124 [INFO] __main__ - /tmp/tmp0x9tljbb.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:51,124 [INFO] __main__ - /tmp/tmp0x9tljbb.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:51,125 [WARNING] __main__ - Cell image not found: /tmp/tmp0x9tljbb.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:51,125 [WARNING] 
__main__ - Cell image not found: /tmp/tmp0x9tljbb.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:51,125 [WARNING] __main__ - Cell image not found: /tmp/tmp0x9tljbb.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:51,126 [WARNING] __main__ - Cell image not found: /tmp/tmp0x9tljbb.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:51,126 [WARNING] __main__ - Cell image not found: /tmp/tmp0x9tljbb.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:51,126 [WARNING] __main__ - Cell image not found: /tmp/tmp0x9tljbb.jpg_rows/row_2/col_1.jpg -2025-03-03 16:39:51,127 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 16:39:53,832 [INFO] __main__ - Extracted 4 rows from /tmp/tmp0xq8v429.jpg -2025-03-03 16:39:53,832 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:53,832 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:53,833 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:39:53,833 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:39:53,932 [INFO] __main__ - Files in /tmp/tmp0xq8v429.jpg_rows: -2025-03-03 16:39:53,933 [INFO] __main__ - /tmp/tmp0xq8v429.jpg_rows: [] -2025-03-03 16:39:53,933 [INFO] __main__ - /tmp/tmp0xq8v429.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:53,933 [INFO] __main__ - /tmp/tmp0xq8v429.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:53,933 [INFO] __main__ - /tmp/tmp0xq8v429.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:39:53,934 [INFO] __main__ - /tmp/tmp0xq8v429.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:39:53,934 [WARNING] __main__ - Cell image not found: /tmp/tmp0xq8v429.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:53,934 [WARNING] __main__ - Cell image not found: /tmp/tmp0xq8v429.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:53,934 [WARNING] __main__ - Cell image not found: /tmp/tmp0xq8v429.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:53,934 [WARNING] __main__ - Cell image not found: /tmp/tmp0xq8v429.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:53,934 [WARNING] __main__ - Cell image not 
found: /tmp/tmp0xq8v429.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:53,935 [WARNING] __main__ - Cell image not found: /tmp/tmp0xq8v429.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:53,935 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 16:39:57,417 [INFO] __main__ - Extracted 6 rows from /tmp/tmp32obdcoa.jpg -2025-03-03 16:39:57,417 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:39:57,417 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:39:57,417 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:39:57,417 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:39:57,417 [INFO] __main__ - Row 4 has 2 cells -2025-03-03 16:39:57,418 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:39:57,561 [INFO] __main__ - Files in /tmp/tmp32obdcoa.jpg_rows: -2025-03-03 16:39:57,561 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows: [] -2025-03-03 16:39:57,561 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:57,561 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:57,562 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:39:57,562 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:39:57,562 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows/row_4: ['col_0.png', 'col_1.png'] -2025-03-03 16:39:57,562 [INFO] __main__ - /tmp/tmp32obdcoa.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:39:57,562 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_0/col_0.jpg -2025-03-03 16:39:57,562 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_0/col_1.jpg -2025-03-03 16:39:57,562 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_1/col_0.jpg -2025-03-03 16:39:57,563 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_1/col_1.jpg -2025-03-03 16:39:57,563 [WARNING] __main__ - Cell image not found: 
/tmp/tmp32obdcoa.jpg_rows/row_2/col_0.jpg -2025-03-03 16:39:57,563 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_3/col_0.jpg -2025-03-03 16:39:57,563 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_4/col_0.jpg -2025-03-03 16:39:57,563 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_4/col_1.jpg -2025-03-03 16:39:57,563 [WARNING] __main__ - Cell image not found: /tmp/tmp32obdcoa.jpg_rows/row_5/col_0.jpg -2025-03-03 16:39:57,564 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 16:40:00,740 [INFO] __main__ - Extracted 5 rows from /tmp/tmpteq6tjo9.jpg -2025-03-03 16:40:00,741 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:40:00,741 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:40:00,741 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:40:00,741 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:40:00,741 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:40:00,855 [INFO] __main__ - Files in /tmp/tmpteq6tjo9.jpg_rows: -2025-03-03 16:40:00,855 [INFO] __main__ - /tmp/tmpteq6tjo9.jpg_rows: [] -2025-03-03 16:40:00,855 [INFO] __main__ - /tmp/tmpteq6tjo9.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:00,855 [INFO] __main__ - /tmp/tmpteq6tjo9.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:00,856 [INFO] __main__ - /tmp/tmpteq6tjo9.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:40:00,856 [INFO] __main__ - /tmp/tmpteq6tjo9.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:40:00,856 [INFO] __main__ - /tmp/tmpteq6tjo9.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:40:00,856 [WARNING] __main__ - Cell image not found: /tmp/tmpteq6tjo9.jpg_rows/row_0/col_0.jpg -2025-03-03 16:40:00,856 [WARNING] __main__ - Cell image not found: /tmp/tmpteq6tjo9.jpg_rows/row_0/col_1.jpg -2025-03-03 16:40:00,856 [WARNING] __main__ - Cell image not found: /tmp/tmpteq6tjo9.jpg_rows/row_1/col_0.jpg -2025-03-03 16:40:00,856 [WARNING] __main__ - Cell image 
not found: /tmp/tmpteq6tjo9.jpg_rows/row_1/col_1.jpg -2025-03-03 16:40:00,857 [WARNING] __main__ - Cell image not found: /tmp/tmpteq6tjo9.jpg_rows/row_2/col_0.jpg -2025-03-03 16:40:00,857 [WARNING] __main__ - Cell image not found: /tmp/tmpteq6tjo9.jpg_rows/row_3/col_0.jpg -2025-03-03 16:40:00,857 [WARNING] __main__ - Cell image not found: /tmp/tmpteq6tjo9.jpg_rows/row_4/col_0.jpg -2025-03-03 16:40:00,857 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 16:40:03,992 [INFO] __main__ - Extracted 6 rows from /tmp/tmpcjn56rie.jpg -2025-03-03 16:40:03,993 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:40:03,993 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:40:03,993 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:40:03,993 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:40:03,994 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:40:03,994 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:40:04,117 [INFO] __main__ - Files in /tmp/tmpcjn56rie.jpg_rows: -2025-03-03 16:40:04,117 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows: [] -2025-03-03 16:40:04,117 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:04,118 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:04,118 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:40:04,118 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows/row_3: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:04,118 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:40:04,119 [INFO] __main__ - /tmp/tmpcjn56rie.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:40:04,119 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_0/col_0.jpg -2025-03-03 16:40:04,119 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_0/col_1.jpg -2025-03-03 16:40:04,119 [WARNING] __main__ - Cell image not found: 
/tmp/tmpcjn56rie.jpg_rows/row_1/col_0.jpg -2025-03-03 16:40:04,119 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_1/col_1.jpg -2025-03-03 16:40:04,120 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_2/col_0.jpg -2025-03-03 16:40:04,120 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_3/col_0.jpg -2025-03-03 16:40:04,120 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_3/col_1.jpg -2025-03-03 16:40:04,120 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_4/col_0.jpg -2025-03-03 16:40:04,120 [WARNING] __main__ - Cell image not found: /tmp/tmpcjn56rie.jpg_rows/row_5/col_0.jpg -2025-03-03 16:40:04,121 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 16:40:07,145 [INFO] __main__ - Extracted 6 rows from /tmp/tmp_0pwqboj.jpg -2025-03-03 16:40:07,146 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:40:07,146 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:40:07,146 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:40:07,147 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:40:07,147 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:40:07,147 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:40:07,269 [INFO] __main__ - Files in /tmp/tmp_0pwqboj.jpg_rows: -2025-03-03 16:40:07,270 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows: [] -2025-03-03 16:40:07,270 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:07,270 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:40:07,270 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:07,271 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:40:07,271 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:40:07,271 [INFO] __main__ - /tmp/tmp_0pwqboj.jpg_rows/row_5: ['col_0.png'] -2025-03-03 
16:40:07,271 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_0/col_0.jpg -2025-03-03 16:40:07,272 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_0/col_1.jpg -2025-03-03 16:40:07,272 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_1/col_0.jpg -2025-03-03 16:40:07,272 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_2/col_0.jpg -2025-03-03 16:40:07,272 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_2/col_1.jpg -2025-03-03 16:40:07,272 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_3/col_0.jpg -2025-03-03 16:40:07,273 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_4/col_0.jpg -2025-03-03 16:40:07,273 [WARNING] __main__ - Cell image not found: /tmp/tmp_0pwqboj.jpg_rows/row_5/col_0.jpg -2025-03-03 16:40:07,273 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 16:40:08,066 [INFO] __main__ - Extracted 2 rows from /tmp/tmpk5z84_tz.jpg -2025-03-03 16:40:08,066 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:40:08,066 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:40:08,100 [INFO] __main__ - Files in /tmp/tmpk5z84_tz.jpg_rows: -2025-03-03 16:40:08,100 [INFO] __main__ - /tmp/tmpk5z84_tz.jpg_rows: [] -2025-03-03 16:40:08,100 [INFO] __main__ - /tmp/tmpk5z84_tz.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:08,100 [INFO] __main__ - /tmp/tmpk5z84_tz.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:40:08,100 [WARNING] __main__ - Cell image not found: /tmp/tmpk5z84_tz.jpg_rows/row_0/col_0.jpg -2025-03-03 16:40:08,100 [WARNING] __main__ - Cell image not found: /tmp/tmpk5z84_tz.jpg_rows/row_0/col_1.jpg -2025-03-03 16:40:08,101 [WARNING] __main__ - Cell image not found: /tmp/tmpk5z84_tz.jpg_rows/row_1/col_0.jpg -2025-03-03 16:40:08,101 [WARNING] __main__ - Cell image not found: /tmp/tmpk5z84_tz.jpg_rows/row_1/col_1.jpg -2025-03-03 
16:40:08,101 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 16:40:10,563 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:40:56,164 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 16:40:56,929 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 16:40:56,930 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 16:40:57,214 [INFO] __main__ - Computed global offset: 4 -2025-03-03 16:40:57,215 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 16:41:50,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 16:41:52,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 16:41:53,383 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 16:41:53,923 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 16:41:54,525 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 16:41:55,160 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 16:41:55,630 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 16:41:56,152 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 16:41:56,681 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 16:41:57,213 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 16:41:57,792 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 16:41:58,296 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 16:41:58,702 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_13.jpg -2025-03-03 16:41:59,409 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 16:41:59,869 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 16:42:00,474 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 16:42:01,030 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 16:42:01,336 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 16:42:01,851 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 16:42:02,414 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 16:42:02,940 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 16:42:03,487 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 16:42:03,925 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 16:42:04,459 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 16:42:04,999 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 16:42:05,488 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 16:42:05,884 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 16:42:06,475 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 16:42:06,866 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 16:42:10,746 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 16:42:13,127 [INFO] __main__ - Extracted 2 rows from /tmp/tmp5g6lny8d.jpg -2025-03-03 16:42:13,127 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:42:13,128 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:42:13,228 [INFO] __main__ - Files in /tmp/tmp5g6lny8d.jpg_rows: -2025-03-03 16:42:13,229 [INFO] __main__ - /tmp/tmp5g6lny8d.jpg_rows: [] -2025-03-03 16:42:13,229 [INFO] __main__ - /tmp/tmp5g6lny8d.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:42:13,229 [INFO] __main__ - /tmp/tmp5g6lny8d.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:42:13,636 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.jpg -2025-03-03 16:42:14,677 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:42:14,678 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:15,889 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:42:15,890 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:15,890 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5g6lny8d.jpg_rows/row_0/col_0.png: {'title': '', 'subtopics': []} -2025-03-03 16:42:16,159 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c1.jpg -2025-03-03 16:42:17,348 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:42:17,349 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:18,700 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": 
[] -} -``` -2025-03-03 16:42:18,704 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:18,704 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5g6lny8d.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:42:18,961 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.jpg -2025-03-03 16:42:20,132 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "1 Proof", - "subtopics": [] -} -``` -2025-03-03 16:42:20,132 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:21,478 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "1 Proof", - "subtopics": [] -} -``` -2025-03-03 16:42:21,478 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:21,478 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5g6lny8d.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': []} -2025-03-03 16:42:22,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.jpg -2025-03-03 16:42:23,562 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["1.1"] -} -``` -2025-03-03 16:42:23,563 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:25,127 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["1.1"] -} -``` -2025-03-03 16:42:25,127 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-03 16:42:25,127 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5g6lny8d.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:42:25,128 [INFO] __main__ - 
Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 16:42:27,781 [INFO] __main__ - Extracted 4 rows from /tmp/tmp1l6kamqx.jpg -2025-03-03 16:42:27,781 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:42:27,781 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:42:27,781 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:42:27,781 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:42:27,883 [INFO] __main__ - Files in /tmp/tmp1l6kamqx.jpg_rows: -2025-03-03 16:42:27,883 [INFO] __main__ - /tmp/tmp1l6kamqx.jpg_rows: [] -2025-03-03 16:42:27,883 [INFO] __main__ - /tmp/tmp1l6kamqx.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:42:27,883 [INFO] __main__ - /tmp/tmp1l6kamqx.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:42:27,884 [INFO] __main__ - /tmp/tmp1l6kamqx.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:42:27,884 [INFO] __main__ - /tmp/tmp1l6kamqx.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:42:28,438 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:49:13,093 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 16:49:13,879 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 16:49:13,880 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 16:49:14,182 [INFO] __main__ - Computed global offset: 4 -2025-03-03 16:49:14,183 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 16:50:10,112 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 16:50:11,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 16:50:12,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 16:50:12,921 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 16:50:13,455 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 16:50:13,999 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 16:50:14,487 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 16:50:15,065 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 16:50:15,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 16:50:16,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 16:50:16,631 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 16:50:17,160 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 16:50:17,578 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 16:50:18,149 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 16:50:18,605 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 16:50:19,209 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 16:50:19,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 16:50:20,142 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 16:50:20,770 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 16:50:21,435 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 16:50:22,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 16:50:22,559 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 16:50:23,020 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 16:50:23,485 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 16:50:23,997 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 16:50:24,509 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 
16:50:24,934 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 16:50:25,538 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 16:50:25,937 [INFO] __main__ - Classifying images to detect tables. -2025-03-03 16:50:29,672 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 16:50:32,751 [INFO] __main__ - Extracted 2 rows from /tmp/tmpjgdv33wy.jpg -2025-03-03 16:50:32,751 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:50:32,751 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:50:32,855 [INFO] __main__ - Files in /tmp/tmpjgdv33wy.jpg_rows: -2025-03-03 16:50:32,855 [INFO] __main__ - /tmp/tmpjgdv33wy.jpg_rows: [] -2025-03-03 16:50:32,855 [INFO] __main__ - /tmp/tmpjgdv33wy.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:50:32,856 [INFO] __main__ - /tmp/tmpjgdv33wy.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:50:33,272 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-03 16:50:33,970 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:50:33,970 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjgdv33wy.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:50:34,243 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c1.png -2025-03-03 16:50:35,079 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:50:35,080 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjgdv33wy.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:50:35,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-03 16:50:36,159 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "1 Proof", - "subtopics": [] -} -``` -2025-03-03 
16:50:36,159 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjgdv33wy.jpg_rows/row_1/col_0.png: {'title': '1 Proof', 'subtopics': []} -2025-03-03 16:50:36,773 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-03 16:50:37,869 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["1.1"] -} -``` -2025-03-03 16:50:37,870 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjgdv33wy.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['1.1']} -2025-03-03 16:50:37,871 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 16:50:41,168 [INFO] __main__ - Extracted 4 rows from /tmp/tmpe6hjqmek.jpg -2025-03-03 16:50:41,169 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:50:41,169 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:50:41,169 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:50:41,169 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:50:41,272 [INFO] __main__ - Files in /tmp/tmpe6hjqmek.jpg_rows: -2025-03-03 16:50:41,272 [INFO] __main__ - /tmp/tmpe6hjqmek.jpg_rows: [] -2025-03-03 16:50:41,273 [INFO] __main__ - /tmp/tmpe6hjqmek.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:50:41,273 [INFO] __main__ - /tmp/tmpe6hjqmek.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:50:41,273 [INFO] __main__ - /tmp/tmpe6hjqmek.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:50:41,273 [INFO] __main__ - /tmp/tmpe6hjqmek.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:50:41,535 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-03 16:50:42,613 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Algebra and functions", - "subtopics": [] -} -``` -2025-03-03 16:50:42,613 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpe6hjqmek.jpg_rows/row_0/col_0.png: {'title': '2 Algebra and functions', 'subtopics': []} -2025-03-03 16:50:43,030 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c1.png -2025-03-03 16:50:44,346 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.1"] -} -``` -2025-03-03 16:50:44,346 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpe6hjqmek.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['2.1']} -2025-03-03 16:50:44,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-03 16:50:46,137 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.2"] -} -``` -2025-03-03 16:50:46,137 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpe6hjqmek.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['2.2']} -2025-03-03 16:50:46,737 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-03 16:50:48,142 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.3"] -} -``` -2025-03-03 16:50:48,142 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpe6hjqmek.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['2.3']} -2025-03-03 16:50:48,611 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-03 16:50:49,865 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.4"] -} -``` -2025-03-03 16:50:49,865 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpe6hjqmek.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['2.4']} -2025-03-03 16:50:49,866 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 16:50:52,925 [INFO] __main__ - Extracted 2 rows from /tmp/tmpj_er9iuh.jpg -2025-03-03 16:50:52,925 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:50:52,925 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:50:53,041 [INFO] __main__ - Files in 
/tmp/tmpj_er9iuh.jpg_rows: -2025-03-03 16:50:53,041 [INFO] __main__ - /tmp/tmpj_er9iuh.jpg_rows: [] -2025-03-03 16:50:53,041 [INFO] __main__ - /tmp/tmpj_er9iuh.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:50:53,042 [INFO] __main__ - /tmp/tmpj_er9iuh.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:50:53,298 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-03 16:50:54,563 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Algebra and functions continued", - "subtopics": [] -} -``` -2025-03-03 16:50:54,564 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpj_er9iuh.jpg_rows/row_0/col_0.png: {'title': '2 Algebra and functions continued', 'subtopics': []} -2025-03-03 16:50:55,198 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c1.png -2025-03-03 16:50:56,533 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.5"] -} -``` -2025-03-03 16:50:56,534 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpj_er9iuh.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['2.5']} -2025-03-03 16:50:57,061 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-03 16:50:58,362 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.6"] -} -``` -2025-03-03 16:50:58,363 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpj_er9iuh.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['2.6']} -2025-03-03 16:50:58,363 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 16:51:01,089 [INFO] __main__ - Extracted 2 rows from /tmp/tmpq70r2u9s.jpg -2025-03-03 16:51:01,089 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:51:01,090 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:51:01,197 [INFO] __main__ - Files in /tmp/tmpq70r2u9s.jpg_rows: -2025-03-03 16:51:01,197 
[INFO] __main__ - /tmp/tmpq70r2u9s.jpg_rows: [] -2025-03-03 16:51:01,197 [INFO] __main__ - /tmp/tmpq70r2u9s.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:01,198 [INFO] __main__ - /tmp/tmpq70r2u9s.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:01,379 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r0_c0.png -2025-03-03 16:51:02,456 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:51:02,456 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpq70r2u9s.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:51:02,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r0_c1.png -2025-03-03 16:51:03,820 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:51:03,821 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpq70r2u9s.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:51:04,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-03 16:51:05,255 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Algebra and functions", - "subtopics": [] -} -``` -2025-03-03 16:51:05,256 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpq70r2u9s.jpg_rows/row_1/col_0.png: {'title': '2 Algebra and functions', 'subtopics': []} -2025-03-03 16:51:05,863 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c1.png -2025-03-03 16:51:07,343 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.7"] -} -``` -2025-03-03 16:51:07,344 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpq70r2u9s.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['2.7']} -2025-03-03 16:51:07,344 [INFO] __main__ - Processing table image: 
/topic-extraction/img_5.jpg, columns=three -2025-03-03 16:51:10,254 [INFO] __main__ - Extracted 3 rows from /tmp/tmpayf9pn2l.jpg -2025-03-03 16:51:10,254 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:51:10,255 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:51:10,255 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:51:10,355 [INFO] __main__ - Files in /tmp/tmpayf9pn2l.jpg_rows: -2025-03-03 16:51:10,355 [INFO] __main__ - /tmp/tmpayf9pn2l.jpg_rows: [] -2025-03-03 16:51:10,356 [INFO] __main__ - /tmp/tmpayf9pn2l.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:10,356 [INFO] __main__ - /tmp/tmpayf9pn2l.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:10,356 [INFO] __main__ - /tmp/tmpayf9pn2l.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:51:10,539 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-03 16:51:11,572 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:51:11,572 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpayf9pn2l.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:51:11,827 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c1.png -2025-03-03 16:51:13,102 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:51:13,102 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpayf9pn2l.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:51:13,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-03 16:51:14,495 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Algebra and functions continued", - "subtopics": [] -} -``` -2025-03-03 16:51:14,496 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpayf9pn2l.jpg_rows/row_1/col_0.png: {'title': '2 Algebra and 
functions continued', 'subtopics': []} -2025-03-03 16:51:15,005 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-03 16:51:16,393 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.8"] -} -``` -2025-03-03 16:51:16,394 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpayf9pn2l.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['2.8']} -2025-03-03 16:51:16,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r2_c0.png -2025-03-03 16:51:18,331 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.9"] -} -``` -2025-03-03 16:51:18,331 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpayf9pn2l.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['2.9']} -2025-03-03 16:51:18,331 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 16:51:21,438 [INFO] __main__ - Extracted 2 rows from /tmp/tmpjjjv9dkx.jpg -2025-03-03 16:51:21,438 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:51:21,439 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:51:21,531 [INFO] __main__ - Files in /tmp/tmpjjjv9dkx.jpg_rows: -2025-03-03 16:51:21,532 [INFO] __main__ - /tmp/tmpjjjv9dkx.jpg_rows: [] -2025-03-03 16:51:21,532 [INFO] __main__ - /tmp/tmpjjjv9dkx.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:21,532 [INFO] __main__ - /tmp/tmpjjjv9dkx.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:21,794 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-03 16:51:22,830 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Algebra and functions continued", - "subtopics": [] -} -``` -2025-03-03 16:51:22,830 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjjjv9dkx.jpg_rows/row_0/col_0.png: {'title': '2 Algebra and functions continued', 'subtopics': []} 
-2025-03-03 16:51:23,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c1.png -2025-03-03 16:51:24,714 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.11"] -} -``` -2025-03-03 16:51:24,714 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjjjv9dkx.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['2.11']} -2025-03-03 16:51:24,973 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-03 16:51:26,265 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "3 Coordinate geometry in the (x,y) plane", - "subtopics": [] -} -``` -2025-03-03 16:51:26,266 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjjjv9dkx.jpg_rows/row_1/col_0.png: {'title': '3 Coordinate geometry in the (x,y) plane', 'subtopics': []} -2025-03-03 16:51:26,779 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-03 16:51:28,085 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["3.1"] -} -``` -2025-03-03 16:51:28,086 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpjjjv9dkx.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['3.1']} -2025-03-03 16:51:28,087 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 16:51:31,131 [INFO] __main__ - Extracted 3 rows from /tmp/tmpfgqi548d.jpg -2025-03-03 16:51:31,131 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:51:31,131 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:51:31,131 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:51:31,247 [INFO] __main__ - Files in /tmp/tmpfgqi548d.jpg_rows: -2025-03-03 16:51:31,247 [INFO] __main__ - /tmp/tmpfgqi548d.jpg_rows: [] -2025-03-03 16:51:31,247 [INFO] __main__ - /tmp/tmpfgqi548d.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:31,247 [INFO] __main__ - 
/tmp/tmpfgqi548d.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:51:31,248 [INFO] __main__ - /tmp/tmpfgqi548d.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:31,550 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-03 16:51:32,727 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "3 Coordinate geometry in the (x, y) plane continued", - "subtopics": [] -} -``` -2025-03-03 16:51:32,727 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpfgqi548d.jpg_rows/row_0/col_0.png: {'title': '3 Coordinate geometry in the (x, y) plane continued', 'subtopics': []} -2025-03-03 16:51:33,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c1.png -2025-03-03 16:51:34,575 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["3.3"] -} -``` -2025-03-03 16:51:34,575 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpfgqi548d.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['3.3']} -2025-03-03 16:51:35,089 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-03 16:51:36,398 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["3.4"] -} -``` -2025-03-03 16:51:36,399 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpfgqi548d.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['3.4']} -2025-03-03 16:51:36,672 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-03 16:51:37,921 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "4 Sequences and series", - "subtopics": [] -} -``` -2025-03-03 16:51:37,921 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpfgqi548d.jpg_rows/row_2/col_0.png: {'title': '4 Sequences and series', 'subtopics': []} -2025-03-03 16:51:38,442 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-03 16:51:39,809 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.1"] -} -``` -2025-03-03 16:51:39,810 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpfgqi548d.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': ['4.1']} -2025-03-03 16:51:39,810 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 16:51:42,612 [INFO] __main__ - Extracted 6 rows from /tmp/tmpu9um0x6r.jpg -2025-03-03 16:51:42,612 [INFO] __main__ - Row 0 has 3 cells -2025-03-03 16:51:42,612 [INFO] __main__ - Row 1 has 3 cells -2025-03-03 16:51:42,613 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:51:42,613 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:51:42,613 [INFO] __main__ - Row 4 has 2 cells -2025-03-03 16:51:42,613 [INFO] __main__ - Row 5 has 2 cells -2025-03-03 16:51:42,729 [INFO] __main__ - Files in /tmp/tmpu9um0x6r.jpg_rows: -2025-03-03 16:51:42,729 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows: [] -2025-03-03 16:51:42,730 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows/row_0: ['col_0.png', 'col_1.png', 'col_2.png'] -2025-03-03 16:51:42,730 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows/row_1: ['col_0.png', 'col_1.png', 'col_2.png'] -2025-03-03 16:51:42,730 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:42,730 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows/row_3: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:42,730 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows/row_4: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:42,730 [INFO] __main__ - /tmp/tmpu9um0x6r.jpg_rows/row_5: ['col_0.png', 'col_1.png'] -2025-03-03 16:51:42,912 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c0.png -2025-03-03 16:51:44,062 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:51:44,062 [INFO] __main__ - 
Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:51:44,239 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c1.png -2025-03-03 16:51:45,372 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Content", - "subtopics": [] -} -``` -2025-03-03 16:51:45,372 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_0/col_1.png: {'title': 'Content', 'subtopics': []} -2025-03-03 16:51:45,555 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c2.png -2025-03-03 16:51:46,877 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Guidance", - "subtopics": [] -} -``` -2025-03-03 16:51:46,878 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_0/col_2.png: {'title': 'Guidance', 'subtopics': []} -2025-03-03 16:51:47,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-03 16:51:48,288 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "4 Sequences and series", - "subtopics": [] -} -``` -2025-03-03 16:51:48,288 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_1/col_0.png: {'title': '4 Sequences and series', 'subtopics': []} -2025-03-03 16:51:48,470 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c1.png -2025-03-03 16:51:49,598 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.2"] -} -``` -2025-03-03 16:51:49,599 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['4.2']} -2025-03-03 16:51:50,127 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c2.png -2025-03-03 16:51:51,350 [INFO] __main__ - Gemini subtopic extraction raw response: ```json 
-{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:51:51,350 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_1/col_2.png: {'title': '', 'subtopics': []} -2025-03-03 16:51:51,530 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-03 16:51:52,562 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.3"] -} -``` -2025-03-03 16:51:52,563 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['4.3']} -2025-03-03 16:51:52,899 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-03 16:51:54,046 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:51:54,046 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:51:54,226 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-03 16:51:55,355 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.4"] -} -``` -2025-03-03 16:51:55,355 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['4.4']} -2025-03-03 16:51:55,797 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-03 16:51:56,917 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:51:56,917 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_3/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:51:57,100 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-03 16:51:58,267 [INFO] __main__ - Gemini subtopic extraction raw 
response: ```json -{ - "title": "", - "subtopics": ["4.5"] -} -``` -2025-03-03 16:51:58,268 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_4/col_0.png: {'title': '', 'subtopics': ['4.5']} -2025-03-03 16:51:58,786 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c1.png -2025-03-03 16:52:00,140 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:00,140 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_4/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:00,324 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r5_c0.png -2025-03-03 16:52:01,472 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.6"] -} -``` -2025-03-03 16:52:01,473 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_5/col_0.png: {'title': '', 'subtopics': ['4.6']} -2025-03-03 16:52:01,921 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r5_c1.png -2025-03-03 16:52:03,113 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:03,114 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpu9um0x6r.jpg_rows/row_5/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:03,114 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-03 16:52:06,183 [INFO] __main__ - Extracted 4 rows from /tmp/tmpxq0b65ph.jpg -2025-03-03 16:52:06,184 [INFO] __main__ - Row 0 has 3 cells -2025-03-03 16:52:06,184 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:52:06,185 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:52:06,185 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:52:06,316 [INFO] __main__ - Files in /tmp/tmpxq0b65ph.jpg_rows: -2025-03-03 16:52:06,316 [INFO] __main__ 
- /tmp/tmpxq0b65ph.jpg_rows: [] -2025-03-03 16:52:06,316 [INFO] __main__ - /tmp/tmpxq0b65ph.jpg_rows/row_0: ['col_0.png', 'col_1.png', 'col_2.png'] -2025-03-03 16:52:06,317 [INFO] __main__ - /tmp/tmpxq0b65ph.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:06,317 [INFO] __main__ - /tmp/tmpxq0b65ph.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:06,317 [INFO] __main__ - /tmp/tmpxq0b65ph.jpg_rows/row_3: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:06,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-03 16:52:07,616 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "gonometry", - "subtopics": [] -} -``` -2025-03-03 16:52:07,616 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_0/col_0.png: {'title': 'gonometry', 'subtopics': []} -2025-03-03 16:52:07,794 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c1.png -2025-03-03 16:52:08,978 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.1"] -} -``` -2025-03-03 16:52:08,978 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['5.1']} -2025-03-03 16:52:09,517 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c2.png -2025-03-03 16:52:10,754 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:10,755 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_0/col_2.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:10,935 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-03 16:52:12,101 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.2"] -} -``` -2025-03-03 16:52:12,101 [INFO] __main__ - Gemini subtopic extraction 
result for cell /tmp/tmpxq0b65ph.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['5.2']} -2025-03-03 16:52:12,541 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-03 16:52:13,802 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:13,802 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:13,983 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-03 16:52:15,170 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.3"] -} -``` -2025-03-03 16:52:15,170 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['5.3']} -2025-03-03 16:52:15,698 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-03 16:52:17,172 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:17,172 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:17,354 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-03 16:52:18,459 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.4"] -} -``` -2025-03-03 16:52:18,459 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['5.4']} -2025-03-03 16:52:18,888 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-03 16:52:20,032 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:20,032 [INFO] __main__ - Gemini 
subtopic extraction result for cell /tmp/tmpxq0b65ph.jpg_rows/row_3/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:20,033 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 16:52:22,850 [INFO] __main__ - Extracted 4 rows from /tmp/tmp_2ut8_lc.jpg -2025-03-03 16:52:22,850 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:52:22,851 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:52:22,851 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:52:22,851 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:52:22,952 [INFO] __main__ - Files in /tmp/tmp_2ut8_lc.jpg_rows: -2025-03-03 16:52:22,952 [INFO] __main__ - /tmp/tmp_2ut8_lc.jpg_rows: [] -2025-03-03 16:52:22,952 [INFO] __main__ - /tmp/tmp_2ut8_lc.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:22,952 [INFO] __main__ - /tmp/tmp_2ut8_lc.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:52:22,952 [INFO] __main__ - /tmp/tmp_2ut8_lc.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:52:22,953 [INFO] __main__ - /tmp/tmp_2ut8_lc.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:52:23,217 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-03 16:52:24,288 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "5 Trigonometry continued", - "subtopics": [] -} -``` -2025-03-03 16:52:24,289 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp_2ut8_lc.jpg_rows/row_0/col_0.png: {'title': '5 Trigonometry continued', 'subtopics': []} -2025-03-03 16:52:24,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c1.png -2025-03-03 16:52:25,983 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.5"] -} -``` -2025-03-03 16:52:25,983 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp_2ut8_lc.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['5.5']} -2025-03-03 16:52:26,510 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-03 16:52:27,832 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.6"] -} -``` -2025-03-03 16:52:27,832 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp_2ut8_lc.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['5.6']} -2025-03-03 16:52:28,355 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-03 16:52:29,721 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [ - "5.7" - ] -} -``` -2025-03-03 16:52:29,721 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp_2ut8_lc.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['5.7']} -2025-03-03 16:52:30,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r3_c0.png -2025-03-03 16:52:31,247 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.8"] -} -``` -2025-03-03 16:52:31,247 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp_2ut8_lc.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['5.8']} -2025-03-03 16:52:31,247 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 16:52:33,946 [INFO] __main__ - Extracted 6 rows from /tmp/tmpdzof0nqh.jpg -2025-03-03 16:52:33,946 [INFO] __main__ - Row 0 has 1 cells -2025-03-03 16:52:33,947 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:52:33,947 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:52:33,947 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:52:33,947 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:52:33,947 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:52:34,053 [INFO] __main__ - Files in /tmp/tmpdzof0nqh.jpg_rows: -2025-03-03 16:52:34,053 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows: [] -2025-03-03 16:52:34,053 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows/row_0: ['col_0.png'] -2025-03-03 
16:52:34,054 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:52:34,054 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:52:34,054 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:52:34,054 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:52:34,054 [INFO] __main__ - /tmp/tmpdzof0nqh.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:52:34,507 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-03 16:52:35,769 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.1"] -} -``` -2025-03-03 16:52:35,769 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpdzof0nqh.jpg_rows/row_0/col_0.png: {'title': '', 'subtopics': ['6.1']} -2025-03-03 16:52:36,208 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-03 16:52:37,386 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.2"] -} -``` -2025-03-03 16:52:37,386 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpdzof0nqh.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['6.2']} -2025-03-03 16:52:37,873 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-03 16:52:39,097 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.3"] -} -``` -2025-03-03 16:52:39,097 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpdzof0nqh.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['6.3']} -2025-03-03 16:52:39,543 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r3_c0.png -2025-03-03 16:52:40,805 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.4"] -} -``` -2025-03-03 16:52:40,805 [INFO] __main__ - Gemini subtopic extraction result for cell 
/tmp/tmpdzof0nqh.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['6.4']} -2025-03-03 16:52:41,148 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r4_c0.png -2025-03-03 16:52:42,398 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.5"] -} -``` -2025-03-03 16:52:42,399 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpdzof0nqh.jpg_rows/row_4/col_0.png: {'title': '', 'subtopics': ['6.5']} -2025-03-03 16:52:42,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r5_c0.png -2025-03-03 16:52:44,182 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.6"] -} -``` -2025-03-03 16:52:44,182 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpdzof0nqh.jpg_rows/row_5/col_0.png: {'title': '', 'subtopics': ['6.6']} -2025-03-03 16:52:44,183 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 16:52:47,439 [INFO] __main__ - Extracted 3 rows from /tmp/tmpmgvkecwg.jpg -2025-03-03 16:52:47,439 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:52:47,439 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:52:47,440 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:52:47,555 [INFO] __main__ - Files in /tmp/tmpmgvkecwg.jpg_rows: -2025-03-03 16:52:47,555 [INFO] __main__ - /tmp/tmpmgvkecwg.jpg_rows: [] -2025-03-03 16:52:47,556 [INFO] __main__ - /tmp/tmpmgvkecwg.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:47,556 [INFO] __main__ - /tmp/tmpmgvkecwg.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:47,556 [INFO] __main__ - /tmp/tmpmgvkecwg.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:47,736 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-03 16:52:48,841 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 
16:52:48,841 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpmgvkecwg.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:52:49,100 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c1.png -2025-03-03 16:52:50,449 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:52:50,449 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpmgvkecwg.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:52:50,729 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-03 16:52:51,872 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "6 Exponentials and logarithms continued", - "subtopics": [] -} -``` -2025-03-03 16:52:51,872 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpmgvkecwg.jpg_rows/row_1/col_0.png: {'title': '6 Exponentials and logarithms continued', 'subtopics': []} -2025-03-03 16:52:52,401 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-03 16:52:53,784 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["6.7"] -} -``` -2025-03-03 16:52:53,784 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpmgvkecwg.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['6.7']} -2025-03-03 16:52:54,061 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-03 16:52:55,153 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "7 Differentiation", - "subtopics": [] -} -``` -2025-03-03 16:52:55,153 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpmgvkecwg.jpg_rows/row_2/col_0.png: {'title': '7 Differentiation', 'subtopics': []} -2025-03-03 16:52:55,749 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c1.png -2025-03-03 
16:52:57,134 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.1"] -} -``` -2025-03-03 16:52:57,135 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpmgvkecwg.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': ['7.1']} -2025-03-03 16:52:57,135 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 16:52:59,666 [INFO] __main__ - Extracted 4 rows from /tmp/tmpnbsei4yv.jpg -2025-03-03 16:52:59,666 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:52:59,666 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:52:59,666 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:52:59,667 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:52:59,765 [INFO] __main__ - Files in /tmp/tmpnbsei4yv.jpg_rows: -2025-03-03 16:52:59,765 [INFO] __main__ - /tmp/tmpnbsei4yv.jpg_rows: [] -2025-03-03 16:52:59,766 [INFO] __main__ - /tmp/tmpnbsei4yv.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:59,766 [INFO] __main__ - /tmp/tmpnbsei4yv.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:52:59,766 [INFO] __main__ - /tmp/tmpnbsei4yv.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:52:59,766 [INFO] __main__ - /tmp/tmpnbsei4yv.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:52:59,947 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-03 16:53:00,682 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:53:00,682 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpnbsei4yv.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:53:00,945 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-03 16:53:01,803 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:53:01,803 [INFO] __main__ - Gemini subtopic extraction 
result for cell /tmp/tmpnbsei4yv.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:53:02,084 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-03 16:53:03,507 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "7 Differentiation continued", - "subtopics": [] -} -``` -2025-03-03 16:53:03,507 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpnbsei4yv.jpg_rows/row_1/col_0.png: {'title': '7 Differentiation continued', 'subtopics': []} -2025-03-03 16:53:04,007 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-03 16:53:05,305 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.1"] -} -``` -2025-03-03 16:53:05,306 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpnbsei4yv.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['7.1']} -2025-03-03 16:53:05,843 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-03 16:53:07,128 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.2"] -} -``` -2025-03-03 16:53:07,129 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpnbsei4yv.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['7.2']} -2025-03-03 16:53:07,644 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-03 16:53:08,941 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.3"] -} -``` -2025-03-03 16:53:08,941 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpnbsei4yv.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['7.3']} -2025-03-03 16:53:08,942 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 16:53:12,348 [INFO] __main__ - Extracted 6 rows from /tmp/tmp5544wa8m.jpg -2025-03-03 16:53:12,348 
[INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:53:12,348 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:53:12,348 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:53:12,348 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:53:12,348 [INFO] __main__ - Row 4 has 2 cells -2025-03-03 16:53:12,348 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:53:12,526 [INFO] __main__ - Files in /tmp/tmp5544wa8m.jpg_rows: -2025-03-03 16:53:12,527 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows: [] -2025-03-03 16:53:12,527 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:12,527 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:12,528 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:53:12,528 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:53:12,528 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows/row_4: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:12,528 [INFO] __main__ - /tmp/tmp5544wa8m.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:53:12,707 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-03 16:53:13,953 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:53:13,953 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:53:14,214 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c1.png -2025-03-03 16:53:15,293 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:53:15,294 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:53:15,552 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-03 
16:53:16,760 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "7 Differentiation continued", - "subtopics": [] -} -``` -2025-03-03 16:53:16,760 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_1/col_0.png: {'title': '7 Differentiation continued', 'subtopics': []} -2025-03-03 16:53:17,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-03 16:53:18,608 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.4"] -} -``` -2025-03-03 16:53:18,608 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['7.4']} -2025-03-03 16:53:19,047 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-03 16:53:20,289 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.5"] -} -``` -2025-03-03 16:53:20,289 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['7.5']} -2025-03-03 16:53:20,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r3_c0.png -2025-03-03 16:53:22,021 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.6"] -} -``` -2025-03-03 16:53:22,021 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['7.6']} -2025-03-03 16:53:22,205 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r4_c0.png -2025-03-03 16:53:23,216 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "8 Integration", - "subtopics": [] -} -``` -2025-03-03 16:53:23,216 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_4/col_0.png: {'title': '8 Integration', 'subtopics': []} 
-2025-03-03 16:53:23,631 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r4_c1.png -2025-03-03 16:53:26,472 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.1"] -} -``` -2025-03-03 16:53:26,473 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_4/col_1.png: {'title': '', 'subtopics': ['8.1']} -2025-03-03 16:53:27,016 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r5_c0.png -2025-03-03 16:53:28,277 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.2"] -} -``` -2025-03-03 16:53:28,278 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp5544wa8m.jpg_rows/row_5/col_0.png: {'title': '', 'subtopics': ['8.2']} -2025-03-03 16:53:28,278 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 16:53:31,395 [INFO] __main__ - Extracted 5 rows from /tmp/tmpqhuw60_q.jpg -2025-03-03 16:53:31,395 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:53:31,396 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:53:31,396 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:53:31,396 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:53:31,396 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:53:31,510 [INFO] __main__ - Files in /tmp/tmpqhuw60_q.jpg_rows: -2025-03-03 16:53:31,510 [INFO] __main__ - /tmp/tmpqhuw60_q.jpg_rows: [] -2025-03-03 16:53:31,510 [INFO] __main__ - /tmp/tmpqhuw60_q.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:31,510 [INFO] __main__ - /tmp/tmpqhuw60_q.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:31,510 [INFO] __main__ - /tmp/tmpqhuw60_q.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:53:31,511 [INFO] __main__ - /tmp/tmpqhuw60_q.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:53:31,511 [INFO] __main__ - /tmp/tmpqhuw60_q.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:53:31,691 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-03 16:53:32,525 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:53:32,526 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:53:32,790 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c1.png -2025-03-03 16:53:33,921 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:53:33,922 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:53:34,198 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-03 16:53:35,309 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "8 Integration continued", - "subtopics": [] -} -``` -2025-03-03 16:53:35,309 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_1/col_0.png: {'title': '8 Integration continued', 'subtopics': []} -2025-03-03 16:53:35,881 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-03 16:53:37,127 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.3"] -} -``` -2025-03-03 16:53:37,128 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['8.3']} -2025-03-03 16:53:37,462 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-03 16:53:37,506 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: 503 UNAVAILABLE. 
{'error': {'code': 503, 'message': 'The service is currently unavailable.', 'status': 'UNAVAILABLE'}} -2025-03-03 16:53:39,255 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.4"] -} -``` -2025-03-03 16:53:39,256 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['8.4']} -2025-03-03 16:53:39,782 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-03 16:53:41,051 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.5"] -} -``` -2025-03-03 16:53:41,051 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['8.5']} -2025-03-03 16:53:41,491 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-03 16:53:42,743 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.6"] -} -``` -2025-03-03 16:53:42,743 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqhuw60_q.jpg_rows/row_4/col_0.png: {'title': '', 'subtopics': ['8.6']} -2025-03-03 16:53:42,744 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 16:53:45,953 [INFO] __main__ - Extracted 6 rows from /tmp/tmp87vvblht.jpg -2025-03-03 16:53:45,954 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:53:45,954 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:53:45,954 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:53:45,954 [INFO] __main__ - Row 3 has 2 cells -2025-03-03 16:53:45,954 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:53:45,955 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:53:46,080 [INFO] __main__ - Files in /tmp/tmp87vvblht.jpg_rows: -2025-03-03 16:53:46,080 [INFO] __main__ - /tmp/tmp87vvblht.jpg_rows: [] -2025-03-03 16:53:46,080 [INFO] __main__ - 
/tmp/tmp87vvblht.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:46,080 [INFO] __main__ - /tmp/tmp87vvblht.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:46,081 [INFO] __main__ - /tmp/tmp87vvblht.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:53:46,081 [INFO] __main__ - /tmp/tmp87vvblht.jpg_rows/row_3: ['col_0.png', 'col_1.png'] -2025-03-03 16:53:46,081 [INFO] __main__ - /tmp/tmp87vvblht.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:53:46,081 [INFO] __main__ - /tmp/tmp87vvblht.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:53:46,267 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-03 16:53:47,366 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:53:47,367 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:53:47,624 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c1.png -2025-03-03 16:53:48,766 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:53:48,766 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:53:49,039 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-03 16:53:50,123 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "8 Integration continued", - "subtopics": [] -} -``` -2025-03-03 16:53:50,123 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_1/col_0.png: {'title': '8 Integration continued', 'subtopics': []} -2025-03-03 16:53:50,559 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-03 16:53:51,776 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - 
"title": "", - "subtopics": ["8.7"] -} -``` -2025-03-03 16:53:51,776 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['8.7']} -2025-03-03 16:53:52,251 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-03 16:53:53,411 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.8"] -} -``` -2025-03-03 16:53:53,411 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['8.8']} -2025-03-03 16:53:53,671 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-03 16:53:54,777 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "9 Numerical methods", - "subtopics": [] -} -``` -2025-03-03 16:53:54,778 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_3/col_0.png: {'title': '9 Numerical methods', 'subtopics': []} -2025-03-03 16:53:55,312 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-03 16:53:56,621 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["9.1"] -} -``` -2025-03-03 16:53:56,621 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_3/col_1.png: {'title': '', 'subtopics': ['9.1']} -2025-03-03 16:53:57,153 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-03 16:53:58,446 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["9.2"] -} -``` -2025-03-03 16:53:58,446 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_4/col_0.png: {'title': '', 'subtopics': ['9.2']} -2025-03-03 16:53:59,191 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-03 
16:54:00,384 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["9.3"] -} -``` -2025-03-03 16:54:00,384 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp87vvblht.jpg_rows/row_5/col_0.png: {'title': '', 'subtopics': ['9.3']} -2025-03-03 16:54:00,385 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 16:54:03,446 [INFO] __main__ - Extracted 6 rows from /tmp/tmpbrhxkkvc.jpg -2025-03-03 16:54:03,447 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:54:03,447 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:54:03,447 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:54:03,447 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:54:03,448 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:54:03,448 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:54:03,563 [INFO] __main__ - Files in /tmp/tmpbrhxkkvc.jpg_rows: -2025-03-03 16:54:03,564 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows: [] -2025-03-03 16:54:03,564 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:03,564 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:54:03,564 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:03,565 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:54:03,565 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:54:03,565 [INFO] __main__ - /tmp/tmpbrhxkkvc.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:54:03,817 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r0_c0.png -2025-03-03 16:54:04,894 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "9 Numerical methods", - "subtopics": [] -} -``` -2025-03-03 16:54:04,894 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_0/col_0.png: {'title': '9 Numerical methods', 
'subtopics': []} -2025-03-03 16:54:05,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r0_c1.png -2025-03-03 16:54:06,749 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["9.4"] -} -``` -2025-03-03 16:54:06,749 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['9.4']} -2025-03-03 16:54:07,085 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-03 16:54:08,362 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["9.5"] -} -``` -2025-03-03 16:54:08,362 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['9.5']} -2025-03-03 16:54:08,542 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-03 16:54:09,637 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "10 Vectors", - "subtopics": [] -} -``` -2025-03-03 16:54:09,638 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_2/col_0.png: {'title': '10 Vectors', 'subtopics': []} -2025-03-03 16:54:10,104 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-03 16:54:11,240 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["10.1"] -} -``` -2025-03-03 16:54:11,240 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': ['10.1']} -2025-03-03 16:54:11,662 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-03 16:54:12,846 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["10.2"] -} -``` -2025-03-03 16:54:12,847 [INFO] __main__ - Gemini subtopic extraction result for 
cell /tmp/tmpbrhxkkvc.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['10.2']} -2025-03-03 16:54:13,284 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-03 16:54:14,499 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["10.3"] -} -``` -2025-03-03 16:54:14,499 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_4/col_0.png: {'title': '', 'subtopics': ['10.3']} -2025-03-03 16:54:15,037 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r5_c0.png -2025-03-03 16:54:16,323 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["10.4"] -} -``` -2025-03-03 16:54:16,323 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpbrhxkkvc.jpg_rows/row_5/col_0.png: {'title': '', 'subtopics': ['10.4']} -2025-03-03 16:54:16,323 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 16:54:17,306 [INFO] __main__ - Extracted 2 rows from /tmp/tmp12cg6m8z.jpg -2025-03-03 16:54:17,307 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:54:17,307 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:54:17,340 [INFO] __main__ - Files in /tmp/tmp12cg6m8z.jpg_rows: -2025-03-03 16:54:17,340 [INFO] __main__ - /tmp/tmp12cg6m8z.jpg_rows: [] -2025-03-03 16:54:17,341 [INFO] __main__ - /tmp/tmp12cg6m8z.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:17,341 [INFO] __main__ - /tmp/tmp12cg6m8z.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:17,522 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-03 16:54:18,295 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:54:18,296 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp12cg6m8z.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} 
-2025-03-03 16:54:18,552 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c1.png -2025-03-03 16:54:19,901 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:54:19,901 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp12cg6m8z.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:54:20,165 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-03 16:54:21,332 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "10 Vectors continued", - "subtopics": [] -} -``` -2025-03-03 16:54:21,332 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp12cg6m8z.jpg_rows/row_1/col_0.png: {'title': '10 Vectors continued', 'subtopics': []} -2025-03-03 16:54:21,814 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-03 16:54:23,092 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["10.5"] -} -``` -2025-03-03 16:54:23,092 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp12cg6m8z.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['10.5']} -2025-03-03 16:54:23,092 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 16:54:25,573 [INFO] __main__ - Extracted 3 rows from /tmp/tmp4alh1pi2.jpg -2025-03-03 16:54:25,574 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:54:25,574 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:54:25,574 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:54:25,667 [INFO] __main__ - Files in /tmp/tmp4alh1pi2.jpg_rows: -2025-03-03 16:54:25,668 [INFO] __main__ - /tmp/tmp4alh1pi2.jpg_rows: [] -2025-03-03 16:54:25,668 [INFO] __main__ - /tmp/tmp4alh1pi2.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:25,668 [INFO] __main__ - /tmp/tmp4alh1pi2.jpg_rows/row_1: ['col_0.png', 'col_1.png'] 
-2025-03-03 16:54:25,668 [INFO] __main__ - /tmp/tmp4alh1pi2.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:25,847 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-03 16:54:26,909 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:54:26,909 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4alh1pi2.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:54:27,163 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c1.png -2025-03-03 16:54:28,386 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:54:28,386 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4alh1pi2.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:54:28,640 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-03 16:54:29,916 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "1 Statistical sampling", - "subtopics": [] -} -``` -2025-03-03 16:54:29,916 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4alh1pi2.jpg_rows/row_1/col_0.png: {'title': '1 Statistical sampling', 'subtopics': []} -2025-03-03 16:54:30,526 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-03 16:54:31,805 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["1.1"] -} -``` -2025-03-03 16:54:31,805 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4alh1pi2.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['1.1']} -2025-03-03 16:54:32,060 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-03 16:54:33,114 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Data presentation 
and interpretation", - "subtopics": [] -} -``` -2025-03-03 16:54:33,114 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4alh1pi2.jpg_rows/row_2/col_0.png: {'title': '2 Data presentation and interpretation', 'subtopics': []} -2025-03-03 16:54:33,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-03 16:54:34,916 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.1"] -} -``` -2025-03-03 16:54:34,916 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4alh1pi2.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': ['2.1']} -2025-03-03 16:54:34,917 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 16:54:38,618 [INFO] __main__ - Extracted 2 rows from /tmp/tmphhr8lzjg.jpg -2025-03-03 16:54:38,618 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:54:38,618 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:54:38,765 [INFO] __main__ - Files in /tmp/tmphhr8lzjg.jpg_rows: -2025-03-03 16:54:38,766 [INFO] __main__ - /tmp/tmphhr8lzjg.jpg_rows: [] -2025-03-03 16:54:38,766 [INFO] __main__ - /tmp/tmphhr8lzjg.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:38,767 [INFO] __main__ - /tmp/tmphhr8lzjg.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:38,946 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-03 16:54:40,123 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:54:40,124 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphhr8lzjg.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:54:40,380 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c1.png -2025-03-03 16:54:41,509 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 
16:54:41,510 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphhr8lzjg.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:54:41,814 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-03 16:54:43,026 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Data presentation and interpretation continued", - "subtopics": [] -} -``` -2025-03-03 16:54:43,027 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphhr8lzjg.jpg_rows/row_1/col_0.png: {'title': '2 Data presentation and interpretation continued', 'subtopics': []} -2025-03-03 16:54:43,617 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-03 16:54:44,968 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.2"] -} -``` -2025-03-03 16:54:44,969 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphhr8lzjg.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['2.2']} -2025-03-03 16:54:44,969 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 16:54:47,957 [INFO] __main__ - Extracted 2 rows from /tmp/tmp4c9val7_.jpg -2025-03-03 16:54:47,958 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:54:47,958 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:54:48,061 [INFO] __main__ - Files in /tmp/tmp4c9val7_.jpg_rows: -2025-03-03 16:54:48,061 [INFO] __main__ - /tmp/tmp4c9val7_.jpg_rows: [] -2025-03-03 16:54:48,061 [INFO] __main__ - /tmp/tmp4c9val7_.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:48,062 [INFO] __main__ - /tmp/tmp4c9val7_.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:48,353 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-03 16:54:49,591 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "2 Data presentation and interpretation continued", - 
"subtopics": [] -} -``` -2025-03-03 16:54:49,592 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4c9val7_.jpg_rows/row_0/col_0.png: {'title': '2 Data presentation and interpretation continued', 'subtopics': []} -2025-03-03 16:54:50,191 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c1.png -2025-03-03 16:54:51,535 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["2.4"] -} -``` -2025-03-03 16:54:51,535 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4c9val7_.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['2.4']} -2025-03-03 16:54:51,792 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-03 16:54:53,147 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "3 Probability", - "subtopics": [] -} -``` -2025-03-03 16:54:53,148 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4c9val7_.jpg_rows/row_1/col_0.png: {'title': '3 Probability', 'subtopics': []} -2025-03-03 16:54:53,679 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-03 16:54:54,919 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["3.1"] -} -``` -2025-03-03 16:54:54,920 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp4c9val7_.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['3.1']} -2025-03-03 16:54:54,920 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 16:54:58,097 [INFO] __main__ - Extracted 4 rows from /tmp/tmp63gon_7v.jpg -2025-03-03 16:54:58,097 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:54:58,098 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:54:58,098 [INFO] __main__ - Row 2 has 2 cells -2025-03-03 16:54:58,098 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:54:58,238 [INFO] __main__ - Files in /tmp/tmp63gon_7v.jpg_rows: 
-2025-03-03 16:54:58,238 [INFO] __main__ - /tmp/tmp63gon_7v.jpg_rows: [] -2025-03-03 16:54:58,239 [INFO] __main__ - /tmp/tmp63gon_7v.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:58,239 [INFO] __main__ - /tmp/tmp63gon_7v.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:58,239 [INFO] __main__ - /tmp/tmp63gon_7v.jpg_rows/row_2: ['col_0.png', 'col_1.png'] -2025-03-03 16:54:58,239 [INFO] __main__ - /tmp/tmp63gon_7v.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:54:58,418 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-03 16:54:59,458 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:54:59,459 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:54:59,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c1.png -2025-03-03 16:55:00,929 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:55:00,929 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:55:01,182 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-03 16:55:02,236 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "3 Probability", - "subtopics": [] -} -``` -2025-03-03 16:55:02,236 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_1/col_0.png: {'title': '3 Probability', 'subtopics': []} -2025-03-03 16:55:02,676 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-03 16:55:04,054 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["3.3"] -} -``` -2025-03-03 16:55:04,055 [INFO] __main__ - 
Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['3.3']} -2025-03-03 16:55:04,310 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-03 16:55:05,396 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "4 Statistical distributions", - "subtopics": [] -} -``` -2025-03-03 16:55:05,396 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_2/col_0.png: {'title': '4 Statistical distributions', 'subtopics': []} -2025-03-03 16:55:05,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c1.png -2025-03-03 16:55:07,185 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.1"] -} -``` -2025-03-03 16:55:07,185 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_2/col_1.png: {'title': '', 'subtopics': ['4.1']} -2025-03-03 16:55:07,751 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r3_c0.png -2025-03-03 16:55:09,218 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [ - "4.2" - ] -} -``` -2025-03-03 16:55:09,219 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp63gon_7v.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['4.2']} -2025-03-03 16:55:09,219 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 16:55:11,902 [INFO] __main__ - Extracted 2 rows from /tmp/tmpt88s2l84.jpg -2025-03-03 16:55:11,903 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:55:11,903 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:55:12,010 [INFO] __main__ - Files in /tmp/tmpt88s2l84.jpg_rows: -2025-03-03 16:55:12,011 [INFO] __main__ - /tmp/tmpt88s2l84.jpg_rows: [] -2025-03-03 16:55:12,011 [INFO] __main__ - /tmp/tmpt88s2l84.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 
16:55:12,011 [INFO] __main__ - /tmp/tmpt88s2l84.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:12,266 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-03 16:55:13,323 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "4 Statistical distributions continued", - "subtopics": [] -} -``` -2025-03-03 16:55:13,323 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpt88s2l84.jpg_rows/row_0/col_0.png: {'title': '4 Statistical distributions continued', 'subtopics': []} -2025-03-03 16:55:13,779 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c1.png -2025-03-03 16:55:15,008 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["4.3"] -} -``` -2025-03-03 16:55:15,009 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpt88s2l84.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['4.3']} -2025-03-03 16:55:15,280 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-03 16:55:16,590 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "5 Statistical hypothesis testing", - "subtopics": [] -} -``` -2025-03-03 16:55:16,590 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpt88s2l84.jpg_rows/row_1/col_0.png: {'title': '5 Statistical hypothesis testing', 'subtopics': []} -2025-03-03 16:55:17,196 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-03 16:55:18,652 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.1"] -} -``` -2025-03-03 16:55:18,652 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpt88s2l84.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['5.1']} -2025-03-03 16:55:18,653 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 16:55:22,326 [INFO] 
__main__ - Extracted 3 rows from /tmp/tmphwi4k81_.jpg -2025-03-03 16:55:22,327 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:55:22,327 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:55:22,327 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:55:22,474 [INFO] __main__ - Files in /tmp/tmphwi4k81_.jpg_rows: -2025-03-03 16:55:22,474 [INFO] __main__ - /tmp/tmphwi4k81_.jpg_rows: [] -2025-03-03 16:55:22,474 [INFO] __main__ - /tmp/tmphwi4k81_.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:22,474 [INFO] __main__ - /tmp/tmphwi4k81_.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:22,474 [INFO] __main__ - /tmp/tmphwi4k81_.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:55:22,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-03 16:55:23,761 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:55:23,761 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphwi4k81_.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:55:24,016 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c1.png -2025-03-03 16:55:25,173 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:55:25,174 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphwi4k81_.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:55:25,433 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-03 16:55:26,618 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "5 Statistical hypothesis testing continued", - "subtopics": [] -} -``` -2025-03-03 16:55:26,618 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphwi4k81_.jpg_rows/row_1/col_0.png: {'title': '5 Statistical hypothesis testing continued', 'subtopics': []} -2025-03-03 
16:55:27,138 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-03 16:55:28,361 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.2"] -} -``` -2025-03-03 16:55:28,361 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphwi4k81_.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['5.2']} -2025-03-03 16:55:28,869 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r2_c0.png -2025-03-03 16:55:30,150 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["5.3"] -} -``` -2025-03-03 16:55:30,151 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmphwi4k81_.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['5.3']} -2025-03-03 16:55:30,152 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-03 16:55:33,707 [INFO] __main__ - Extracted 6 rows from /tmp/tmp8k4gz82r.jpg -2025-03-03 16:55:33,707 [INFO] __main__ - Row 0 has 1 cells -2025-03-03 16:55:33,708 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:55:33,708 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:55:33,708 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:55:33,708 [INFO] __main__ - Row 4 has 1 cells -2025-03-03 16:55:33,708 [INFO] __main__ - Row 5 has 1 cells -2025-03-03 16:55:33,869 [INFO] __main__ - Files in /tmp/tmp8k4gz82r.jpg_rows: -2025-03-03 16:55:33,870 [INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows: [] -2025-03-03 16:55:33,871 [INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows/row_0: ['col_0.png'] -2025-03-03 16:55:33,871 [INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:55:33,872 [INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:55:33,872 [INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:55:33,872 [INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows/row_4: ['col_0.png'] -2025-03-03 16:55:33,873 
[INFO] __main__ - /tmp/tmp8k4gz82r.jpg_rows/row_5: ['col_0.png'] -2025-03-03 16:55:34,135 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r0_c0.png -2025-03-03 16:55:35,500 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:55:35,500 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp8k4gz82r.jpg_rows/row_0/col_0.png: {'title': '', 'subtopics': []} -2025-03-03 16:55:35,912 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-03 16:55:37,173 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.1"] -} -``` -2025-03-03 16:55:37,174 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp8k4gz82r.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['7.1']} -2025-03-03 16:55:37,617 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-03 16:55:38,970 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.2"] -} -``` -2025-03-03 16:55:38,970 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp8k4gz82r.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['7.2']} -2025-03-03 16:55:39,477 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-03 16:55:40,895 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.3"] -} -``` -2025-03-03 16:55:40,895 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp8k4gz82r.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['7.3']} -2025-03-03 16:55:41,404 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r4_c0.png -2025-03-03 16:55:42,657 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.4"] -} -``` -2025-03-03 16:55:42,657 [INFO] __main__ - Gemini subtopic 
extraction result for cell /tmp/tmp8k4gz82r.jpg_rows/row_4/col_0.png: {'title': '', 'subtopics': ['7.4']} -2025-03-03 16:55:43,074 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r5_c0.png -2025-03-03 16:55:44,291 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["7.5"] -} -``` -2025-03-03 16:55:44,292 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp8k4gz82r.jpg_rows/row_5/col_0.png: {'title': '', 'subtopics': ['7.5']} -2025-03-03 16:55:44,292 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 16:55:46,975 [INFO] __main__ - Extracted 3 rows from /tmp/tmp3psmonev.jpg -2025-03-03 16:55:46,975 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:55:46,975 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:55:46,976 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:55:47,068 [INFO] __main__ - Files in /tmp/tmp3psmonev.jpg_rows: -2025-03-03 16:55:47,068 [INFO] __main__ - /tmp/tmp3psmonev.jpg_rows: [] -2025-03-03 16:55:47,068 [INFO] __main__ - /tmp/tmp3psmonev.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:47,068 [INFO] __main__ - /tmp/tmp3psmonev.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:55:47,068 [INFO] __main__ - /tmp/tmp3psmonev.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:55:47,326 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-03 16:55:48,475 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "8 Forces and Newton's laws", - "subtopics": [] -} -``` -2025-03-03 16:55:48,476 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp3psmonev.jpg_rows/row_0/col_0.png: {'title': "8 Forces and Newton's laws", 'subtopics': []} -2025-03-03 16:55:48,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c1.png -2025-03-03 16:55:50,036 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - 
"subtopics": ["8.1"] -} -``` -2025-03-03 16:55:50,036 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp3psmonev.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': ['8.1']} -2025-03-03 16:55:50,547 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-03 16:55:51,893 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.2"] -} -``` -2025-03-03 16:55:51,893 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp3psmonev.jpg_rows/row_1/col_0.png: {'title': '', 'subtopics': ['8.2']} -2025-03-03 16:55:52,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r2_c0.png -2025-03-03 16:55:53,833 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.3"] -} -``` -2025-03-03 16:55:53,834 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp3psmonev.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['8.3']} -2025-03-03 16:55:53,834 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 16:55:56,214 [INFO] __main__ - Extracted 5 rows from /tmp/tmpqchaiboq.jpg -2025-03-03 16:55:56,214 [INFO] __main__ - Row 0 has 2 cells -2025-03-03 16:55:56,214 [INFO] __main__ - Row 1 has 2 cells -2025-03-03 16:55:56,214 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:55:56,215 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:55:56,215 [INFO] __main__ - Row 4 has 2 cells -2025-03-03 16:55:56,327 [INFO] __main__ - Files in /tmp/tmpqchaiboq.jpg_rows: -2025-03-03 16:55:56,327 [INFO] __main__ - /tmp/tmpqchaiboq.jpg_rows: [] -2025-03-03 16:55:56,327 [INFO] __main__ - /tmp/tmpqchaiboq.jpg_rows/row_0: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:56,328 [INFO] __main__ - /tmp/tmpqchaiboq.jpg_rows/row_1: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:56,328 [INFO] __main__ - /tmp/tmpqchaiboq.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:55:56,328 [INFO] 
__main__ - /tmp/tmpqchaiboq.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:55:56,329 [INFO] __main__ - /tmp/tmpqchaiboq.jpg_rows/row_4: ['col_0.png', 'col_1.png'] -2025-03-03 16:55:56,508 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-03 16:55:57,607 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Topics", - "subtopics": [] -} -``` -2025-03-03 16:55:57,607 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_0/col_0.png: {'title': 'Topics', 'subtopics': []} -2025-03-03 16:55:57,861 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c1.png -2025-03-03 16:55:59,108 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:55:59,108 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_0/col_1.png: {'title': '', 'subtopics': []} -2025-03-03 16:55:59,367 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-03 16:56:00,531 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "8 Forces and Newton's laws continued", - "subtopics": [] -} -``` -2025-03-03 16:56:00,532 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_1/col_0.png: {'title': "8 Forces and Newton's laws continued", 'subtopics': []} -2025-03-03 16:56:01,098 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-03 16:56:02,438 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.4"] -} -``` -2025-03-03 16:56:02,438 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_1/col_1.png: {'title': '', 'subtopics': ['8.4']} -2025-03-03 16:56:02,909 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-03 16:56:04,246 [INFO] __main__ 
- Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.5"] -} -``` -2025-03-03 16:56:04,247 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_2/col_0.png: {'title': '', 'subtopics': ['8.5']} -2025-03-03 16:56:04,679 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-03 16:56:06,141 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["8.6"] -} -``` -2025-03-03 16:56:06,141 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': ['8.6']} -2025-03-03 16:56:06,320 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r4_c0.png -2025-03-03 16:56:07,361 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "9 Moments", - "subtopics": [] -} -``` -2025-03-03 16:56:07,361 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_4/col_0.png: {'title': '9 Moments', 'subtopics': []} -2025-03-03 16:56:07,768 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r4_c1.png -2025-03-03 16:56:09,003 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": ["9.1"] -} -``` -2025-03-03 16:56:09,004 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmpqchaiboq.jpg_rows/row_4/col_1.png: {'title': '', 'subtopics': ['9.1']} -2025-03-03 16:56:09,004 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 16:56:11,650 [INFO] __main__ - Extracted 4 rows from /tmp/tmp2wlfsyzp.jpg -2025-03-03 16:56:11,650 [INFO] __main__ - Row 0 has 1 cells -2025-03-03 16:56:11,650 [INFO] __main__ - Row 1 has 1 cells -2025-03-03 16:56:11,651 [INFO] __main__ - Row 2 has 1 cells -2025-03-03 16:56:11,651 [INFO] __main__ - Row 3 has 1 cells -2025-03-03 16:56:11,791 [INFO] __main__ - Files in 
/tmp/tmp2wlfsyzp.jpg_rows: -2025-03-03 16:56:11,792 [INFO] __main__ - /tmp/tmp2wlfsyzp.jpg_rows: [] -2025-03-03 16:56:11,792 [INFO] __main__ - /tmp/tmp2wlfsyzp.jpg_rows/row_0: ['col_0.png'] -2025-03-03 16:56:11,792 [INFO] __main__ - /tmp/tmp2wlfsyzp.jpg_rows/row_1: ['col_0.png'] -2025-03-03 16:56:11,792 [INFO] __main__ - /tmp/tmp2wlfsyzp.jpg_rows/row_2: ['col_0.png'] -2025-03-03 16:56:11,792 [INFO] __main__ - /tmp/tmp2wlfsyzp.jpg_rows/row_3: ['col_0.png'] -2025-03-03 16:56:12,204 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-03 16:56:13,408 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:56:13,408 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp2wlfsyzp.jpg_rows/row_0/col_0.png: {'title': '', 'subtopics': []} -2025-03-03 16:56:14,006 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-03 16:56:15,317 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Reason, interpret and communicate mathematically", - "subtopics": [] -} -``` -2025-03-03 16:56:15,317 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp2wlfsyzp.jpg_rows/row_1/col_0.png: {'title': 'Reason, interpret and communicate mathematically', 'subtopics': []} -2025-03-03 16:56:15,918 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-03 16:56:17,351 [INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "Solve problems within mathematics and in other contexts", - "subtopics": [] -} -``` -2025-03-03 16:56:17,352 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp2wlfsyzp.jpg_rows/row_2/col_0.png: {'title': 'Solve problems within mathematics and in other contexts', 'subtopics': []} -2025-03-03 16:56:17,530 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r3_c0.png -2025-03-03 16:56:18,630 
[INFO] __main__ - Gemini subtopic extraction raw response: ```json -{ - "title": "", - "subtopics": [] -} -``` -2025-03-03 16:56:18,630 [INFO] __main__ - Gemini subtopic extraction result for cell /tmp/tmp2wlfsyzp.jpg_rows/row_3/col_0.png: {'title': '', 'subtopics': []} -2025-03-03 16:56:18,636 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/final_subtopics.json -2025-03-03 16:56:18,994 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 16:56:19,001 [INFO] __main__ - Processing completed successfully. -2025-03-03 17:06:40,772 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 17:06:41,552 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 17:06:41,553 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 17:06:41,838 [INFO] __main__ - Computed global offset: 4 -2025-03-03 17:06:41,838 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 17:07:37,751 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 17:07:39,610 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 17:07:40,176 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 17:07:40,700 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 17:07:41,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 17:07:41,865 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 17:07:42,325 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 17:07:42,814 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 17:07:43,336 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 17:07:43,875 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 17:07:44,352 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 17:07:44,863 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 17:07:45,368 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 17:07:45,949 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 17:07:46,449 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 17:07:47,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 17:07:47,635 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 17:07:47,954 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 17:07:48,473 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 17:07:49,058 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 17:07:49,631 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 17:07:50,237 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 17:07:50,722 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 17:07:51,356 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 17:10:00,962 [INFO] __main__ - Running in test mode: using local image writer. 
-2025-03-03 17:10:00,963 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 17:10:01,747 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 17:10:01,748 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 17:10:02,192 [INFO] __main__ - Computed global offset: 4 -2025-03-03 17:10:02,193 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 17:10:59,443 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 17:10:59,444 [ERROR] __main__ - Processing failed: 'MineruNoTextProcessor' object has no attribute 's3_writer' -2025-03-03 17:12:12,164 [INFO] __main__ - Running in test mode: using local image writer. -2025-03-03 17:12:12,164 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 17:12:12,974 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 17:12:12,975 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 17:12:13,337 [INFO] __main__ - Computed global offset: 4 -2025-03-03 17:12:13,338 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 17:13:11,161 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 17:13:13,235 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 17:13:13,932 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 
17:13:14,578 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 17:13:15,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-03 17:13:15,951 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 17:13:16,553 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 17:13:17,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 17:13:17,632 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 17:13:18,251 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 17:13:18,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 17:13:19,255 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 17:13:19,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 17:13:20,358 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 17:13:20,851 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 17:13:21,407 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 17:13:21,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 17:13:22,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 17:13:22,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 17:13:23,287 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 17:13:23,861 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 17:13:24,500 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 17:13:25,077 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 17:13:25,555 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 17:13:26,100 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 17:13:26,547 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_26.jpg -2025-03-03 17:13:27,032 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 17:13:27,654 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-03 17:13:28,028 [INFO] __main__ - Classifying images to detect tables. -2025-03-03 17:13:33,336 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 17:13:36,419 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-03 17:13:37,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c1.png -2025-03-03 17:13:39,435 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-03 17:13:41,175 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-03 17:13:42,923 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 17:18:11,461 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 17:18:12,239 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 17:18:12,240 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 17:18:12,539 [INFO] __main__ - Computed global offset: 4 -2025-03-03 17:18:12,540 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 17:19:17,416 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 17:19:44,737 [INFO] __main__ - Processing table image => img_1.jpg, columns=three -2025-03-03 17:19:47,862 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-03 17:19:50,752 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-03 17:19:53,861 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-03 17:19:56,969 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-03 17:20:00,413 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-03 17:20:04,193 [INFO] __main__ - Processing table image => img_7.jpg, columns=three -2025-03-03 17:20:07,968 [INFO] __main__ - Processing table image => img_8.jpg, columns=three -2025-03-03 17:20:11,413 [INFO] __main__ - Processing table image => img_9.jpg, columns=three -2025-03-03 17:20:14,895 [INFO] __main__ - Processing table image => img_10.jpg, columns=three -2025-03-03 17:20:18,856 [INFO] __main__ - Processing table image => img_11.jpg, columns=three -2025-03-03 17:20:22,697 [INFO] __main__ - Processing table image => img_12.jpg, columns=three -2025-03-03 17:20:25,904 [INFO] __main__ - Processing table image => img_13.jpg, columns=three -2025-03-03 17:20:29,171 [INFO] __main__ - Processing table image => img_14.jpg, columns=two -2025-03-03 17:20:34,203 [INFO] __main__ - Processing table image => img_15.jpg, columns=three -2025-03-03 17:20:39,122 [INFO] __main__ - Processing table image => img_16.jpg, columns=three -2025-03-03 17:20:43,452 [INFO] __main__ - Processing table image => img_17.jpg, columns=three -2025-03-03 17:20:47,324 [INFO] __main__ - Processing table image => img_18.jpg, columns=three -2025-03-03 17:20:48,254 [INFO] __main__ - Processing table image => img_19.jpg, columns=three -2025-03-03 17:20:50,879 [INFO] __main__ - Processing table image => img_20.jpg, columns=two -2025-03-03 17:20:54,283 [INFO] __main__ - Processing table image => img_21.jpg, columns=three -2025-03-03 
17:20:57,784 [INFO] __main__ - Processing table image => img_22.jpg, columns=three -2025-03-03 17:21:01,431 [INFO] __main__ - Processing table image => img_23.jpg, columns=three -2025-03-03 17:21:04,472 [INFO] __main__ - Processing table image => img_24.jpg, columns=three -2025-03-03 17:21:07,947 [INFO] __main__ - Processing table image => img_25.jpg, columns=three -2025-03-03 17:21:11,586 [INFO] __main__ - Processing table image => img_26.jpg, columns=three -2025-03-03 17:21:14,419 [INFO] __main__ - Processing table image => img_27.jpg, columns=three -2025-03-03 17:21:17,542 [INFO] __main__ - Processing table image => img_28.jpg, columns=two -2025-03-03 17:21:21,034 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 17:21:21,035 [ERROR] __main__ - Processing failed: 'LocalImageWriter' object has no attribute 'extracted_subtopics' -2025-03-03 17:23:42,864 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 17:23:43,614 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 17:23:43,615 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 17:23:43,926 [INFO] __main__ - Computed global offset: 4 -2025-03-03 17:23:43,927 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 17:24:49,282 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 17:25:17,902 [INFO] __main__ - Processing table image => img_1.jpg, columns=three -2025-03-03 17:25:20,696 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-03 17:25:23,663 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-03 17:25:26,866 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-03 17:25:30,504 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-03 17:25:35,390 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-03 17:25:39,596 [INFO] __main__ - Processing table image => img_7.jpg, columns=three -2025-03-03 17:25:43,260 [INFO] __main__ - Processing table image => img_8.jpg, columns=three -2025-03-03 17:25:46,325 [INFO] __main__ - Processing table image => img_9.jpg, columns=three -2025-03-03 17:25:50,400 [INFO] __main__ - Processing table image => img_10.jpg, columns=three -2025-03-03 17:25:53,575 [INFO] __main__ - Processing table image => img_11.jpg, columns=three -2025-03-03 17:25:57,251 [INFO] __main__ - Processing table image => img_12.jpg, columns=three -2025-03-03 17:26:00,714 [INFO] __main__ - Processing table image => img_13.jpg, columns=three -2025-03-03 17:26:03,441 [INFO] __main__ - Processing table image => img_14.jpg, columns=two -2025-03-03 17:26:06,715 [INFO] __main__ - Processing table image => img_15.jpg, columns=three -2025-03-03 17:26:09,908 [INFO] __main__ - Processing table image => img_16.jpg, columns=three -2025-03-03 17:26:13,326 [INFO] __main__ - Processing table image => img_17.jpg, columns=three -2025-03-03 17:26:16,837 [INFO] __main__ - Processing table image => img_18.jpg, columns=three -2025-03-03 17:26:17,845 [INFO] __main__ - Processing table image => img_19.jpg, columns=three -2025-03-03 17:26:20,525 [INFO] __main__ - Processing table image => img_20.jpg, columns=two -2025-03-03 17:26:25,219 [INFO] __main__ - Processing table image => img_21.jpg, columns=three -2025-03-03 
17:26:29,725 [INFO] __main__ - Processing table image => img_22.jpg, columns=three -2025-03-03 17:26:33,240 [INFO] __main__ - Processing table image => img_23.jpg, columns=three -2025-03-03 17:26:36,140 [INFO] __main__ - Processing table image => img_24.jpg, columns=three -2025-03-03 17:26:39,453 [INFO] __main__ - Processing table image => img_25.jpg, columns=three -2025-03-03 17:26:42,611 [INFO] __main__ - Processing table image => img_26.jpg, columns=three -2025-03-03 17:26:45,575 [INFO] __main__ - Processing table image => img_27.jpg, columns=three -2025-03-03 17:26:48,347 [INFO] __main__ - Processing table image => img_28.jpg, columns=two -2025-03-03 17:26:52,250 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 17:26:52,250 [ERROR] __main__ - Processing failed: 'LocalImageWriter' object has no attribute 'extracted_subtopics' -2025-03-03 17:29:15,373 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 17:29:16,145 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 17:29:16,146 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 17:29:16,646 [INFO] __main__ - Computed global offset: 4 -2025-03-03 17:29:16,647 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 17:30:37,147 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-03 17:31:06,356 [INFO] __main__ - Processing table image => img_1.jpg, columns=three -2025-03-03 17:31:10,484 [INFO] __main__ - Processing table image => img_2.jpg, columns=three -2025-03-03 17:31:14,433 [INFO] __main__ - Processing table image => img_3.jpg, columns=three -2025-03-03 17:31:18,298 [INFO] __main__ - Processing table image => img_4.jpg, columns=three -2025-03-03 17:31:22,342 [INFO] __main__ - Processing table image => img_5.jpg, columns=three -2025-03-03 17:31:25,875 [INFO] __main__ - Processing table image => img_6.jpg, columns=three -2025-03-03 17:31:30,066 [INFO] __main__ - Processing table image => img_7.jpg, columns=three -2025-03-03 17:31:34,442 [INFO] __main__ - Processing table image => img_8.jpg, columns=three -2025-03-03 17:31:38,680 [INFO] __main__ - Processing table image => img_9.jpg, columns=three -2025-03-03 17:31:42,870 [INFO] __main__ - Processing table image => img_10.jpg, columns=three -2025-03-03 17:31:46,601 [INFO] __main__ - Processing table image => img_11.jpg, columns=three -2025-03-03 17:31:50,338 [INFO] __main__ - Processing table image => img_12.jpg, columns=three -2025-03-03 17:31:54,258 [INFO] __main__ - Processing table image => img_13.jpg, columns=three -2025-03-03 17:31:57,952 [INFO] __main__ - Processing table image => img_14.jpg, columns=two -2025-03-03 17:32:02,306 [INFO] __main__ - Processing table image => img_15.jpg, columns=three -2025-03-03 17:32:06,554 [INFO] __main__ - Processing table image => img_16.jpg, columns=three -2025-03-03 17:32:11,743 [INFO] __main__ - Processing table image => img_17.jpg, columns=three -2025-03-03 17:32:16,224 [INFO] __main__ - Processing table image => img_18.jpg, columns=three -2025-03-03 17:32:17,405 [INFO] __main__ - Processing table image => img_19.jpg, columns=three -2025-03-03 17:32:20,461 [INFO] __main__ - Processing table image => img_20.jpg, columns=two -2025-03-03 17:32:24,355 [INFO] __main__ - Processing table image => img_21.jpg, columns=three -2025-03-03 
17:32:28,605 [INFO] __main__ - Processing table image => img_22.jpg, columns=three -2025-03-03 17:32:33,370 [INFO] __main__ - Processing table image => img_23.jpg, columns=three -2025-03-03 17:32:36,960 [INFO] __main__ - Processing table image => img_24.jpg, columns=three -2025-03-03 17:32:41,190 [INFO] __main__ - Processing table image => img_25.jpg, columns=three -2025-03-03 17:32:45,789 [INFO] __main__ - Processing table image => img_26.jpg, columns=three -2025-03-03 17:32:49,145 [INFO] __main__ - Processing table image => img_27.jpg, columns=three -2025-03-03 17:32:51,859 [INFO] __main__ - Processing table image => img_28.jpg, columns=two -2025-03-03 17:32:55,099 [INFO] __main__ - GPU memory cleaned up. -2025-03-03 17:32:55,099 [ERROR] __main__ - Processing failed: 'LocalImageWriter' object has no attribute 'extracted_subtopics' -2025-03-03 18:01:47,365 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-03 18:01:48,166 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-03 18:01:48,167 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-03 18:01:48,529 [INFO] __main__ - Computed global offset: 4 -2025-03-03 18:01:48,530 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-03 18:02:45,151 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-03 18:02:47,389 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-03 18:02:47,996 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-03 18:02:48,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-03 18:02:49,352 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg 
-2025-03-03 18:02:49,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-03 18:02:50,659 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-03 18:02:51,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-03 18:02:51,742 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-03 18:02:52,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-03 18:02:52,901 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-03 18:02:53,548 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-03 18:02:54,179 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-03 18:02:54,858 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-03 18:02:55,462 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-03 18:02:56,140 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-03 18:02:56,834 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-03 18:02:57,186 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-03 18:02:57,895 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-03 18:02:58,699 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-03 18:02:59,469 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-03 18:03:00,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-03 18:03:00,715 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-03 18:03:01,305 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-03 18:03:01,790 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-03 18:03:02,427 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-03 18:03:03,086 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-03 18:03:03,840 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_28.jpg -2025-03-03 18:03:04,394 [INFO] __main__ - Classifying images to detect tables. -2025-03-03 18:03:09,642 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-03 18:03:13,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-03 18:03:14,713 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c1.png -2025-03-03 18:03:16,386 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-03 18:03:18,238 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-03 18:03:19,729 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-03 18:03:23,829 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-03 18:03:25,255 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c1.png -2025-03-03 18:03:26,663 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-03 18:03:28,211 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-03 18:03:29,861 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-03 18:03:31,766 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-03 18:03:33,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r4_c0.png -2025-03-03 18:03:35,112 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-03 18:03:38,486 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-03 18:03:39,547 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c1.png -2025-03-03 18:03:40,727 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-03 18:03:42,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-03 
18:03:43,977 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-03 18:03:47,007 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r0_c0.png -2025-03-03 18:03:47,944 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r0_c1.png -2025-03-03 18:03:49,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-03 18:03:51,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c1.png -2025-03-03 18:03:52,822 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-03 18:03:57,664 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-03 18:03:59,070 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c1.png -2025-03-03 18:04:00,494 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-03 18:04:02,240 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-03 18:04:04,099 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r2_c0.png -2025-03-03 18:04:05,512 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-03 18:04:09,932 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-03 18:04:11,364 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c1.png -2025-03-03 18:04:12,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-03 18:04:14,521 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-03 18:04:16,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-03 18:04:17,799 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c1.png -2025-03-03 18:04:19,184 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-03 18:04:23,663 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-03 18:04:24,739 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c1.png -2025-03-03 18:04:26,232 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-03 18:04:28,388 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-03 18:04:30,206 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-03 18:04:31,473 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-03 18:04:34,576 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c0.png -2025-03-03 18:04:35,800 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c1.png -2025-03-03 18:04:37,238 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c2.png -2025-03-03 18:04:38,721 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-03 18:04:40,069 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c1.png -2025-03-03 18:04:41,915 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c2.png -2025-03-03 18:04:43,317 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-03 18:04:44,758 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-03 18:04:46,354 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-03 18:04:47,962 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-03 18:04:49,441 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-03 18:04:51,291 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c1.png -2025-03-03 18:04:52,840 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=two -2025-03-03 18:04:56,905 [WARNING] __main__ - Cell 
image not found: /tmp/tmp5hkh4jpv.jpg_rows/row_0/col_0.png -2025-03-03 18:04:57,168 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c1.png -2025-03-03 18:04:58,554 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-03 18:05:00,216 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-03 18:05:02,113 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-03 18:05:03,482 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-03 18:05:07,505 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-03 18:05:08,634 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c1.png -2025-03-03 18:05:09,820 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-03 18:05:11,434 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-03 18:05:13,181 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-03 18:05:15,198 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r3_c0.png -2025-03-03 18:05:16,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r4_c0.png -2025-03-03 18:05:18,564 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r5_c0.png -2025-03-03 18:05:19,773 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-03 18:05:23,555 [WARNING] __main__ - Cell image not found: /tmp/tmpmqfa7baf.jpg_rows/row_0/col_0.png -2025-03-03 18:05:23,816 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c1.png -2025-03-03 18:05:25,422 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-03 18:05:27,100 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-03 18:05:28,749 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/cells/img_11.jpg_r3_c0.png -2025-03-03 18:05:30,528 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r4_c0.png -2025-03-03 18:05:32,582 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r5_c0.png -2025-03-03 18:05:34,027 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-03 18:05:38,149 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-03 18:05:39,165 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c1.png -2025-03-03 18:05:40,592 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-03 18:05:42,353 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-03 18:05:44,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-03 18:05:45,790 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c1.png -2025-03-03 18:05:47,303 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-03 18:05:50,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-03 18:05:51,470 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-03 18:05:53,168 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-03 18:05:54,502 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-03 18:05:56,341 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-03 18:05:58,223 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-03 18:05:59,584 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-03 18:06:02,910 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-03 18:06:03,811 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/cells/img_14.jpg_r0_c1.png -2025-03-03 18:06:04,999 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-03 18:06:06,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-03 18:06:08,714 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-03 18:06:10,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r3_c0.png -2025-03-03 18:06:12,055 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r4_c0.png -2025-03-03 18:06:13,546 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r4_c1.png -2025-03-03 18:06:14,846 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-03 18:06:18,269 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-03 18:06:19,578 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c1.png -2025-03-03 18:06:20,905 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c2.png -2025-03-03 18:06:22,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-03 18:06:23,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-03 18:06:25,318 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c2.png -2025-03-03 18:06:26,709 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-03 18:06:28,080 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-03 18:06:29,474 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-03 18:06:31,739 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-03 18:06:33,031 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-03 18:06:36,448 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-03 18:06:37,446 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c1.png -2025-03-03 18:06:38,581 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-03 18:06:40,126 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-03 18:06:41,796 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-03 18:06:43,330 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-03 18:06:45,025 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-03 18:06:46,832 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-03 18:06:48,096 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-03 18:06:51,269 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r0_c0.png -2025-03-03 18:06:52,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r0_c1.png -2025-03-03 18:06:53,734 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-03 18:06:55,297 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c1.png -2025-03-03 18:06:57,056 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-03 18:06:58,396 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-03 18:06:59,922 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-03 18:07:01,718 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-03 18:07:03,388 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r5_c0.png -2025-03-03 18:07:04,651 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-03 18:07:05,752 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-03 18:07:06,841 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c1.png -2025-03-03 18:07:08,263 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-03 18:07:09,611 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-03 18:07:11,064 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-03 18:07:13,613 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-03 18:07:14,973 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c1.png -2025-03-03 18:07:16,339 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-03 18:07:18,493 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-03 18:07:20,139 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-03 18:07:21,807 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-03 18:07:23,091 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-03 18:07:26,528 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-03 18:07:27,947 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c1.png -2025-03-03 18:07:29,439 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-03 18:07:31,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-03 18:07:32,705 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-03 18:07:35,869 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-03 18:07:37,403 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c1.png -2025-03-03 18:07:38,804 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-03 18:07:40,651 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-03 18:07:42,292 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-03 18:07:43,964 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c1.png -2025-03-03 18:07:45,304 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-03 18:07:48,679 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-03 18:07:49,776 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c1.png -2025-03-03 18:07:51,108 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-03 18:07:52,641 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-03 18:07:54,136 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-03 18:07:55,772 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c1.png -2025-03-03 18:07:57,135 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-03 18:08:00,271 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-03 18:08:01,671 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c1.png -2025-03-03 18:08:03,176 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-03 18:08:04,801 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-03 18:08:06,459 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r2_c0.png -2025-03-03 18:08:08,351 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r2_c1.png -2025-03-03 18:08:10,016 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-03 18:08:13,320 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-03 18:08:14,451 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c1.png -2025-03-03 18:08:15,533 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-03 18:08:17,196 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-03 18:08:19,105 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r2_c0.png -2025-03-03 18:08:20,442 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-03 18:08:23,613 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r0_c0.png -2025-03-03 18:08:24,927 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r0_c1.png -2025-03-03 18:08:26,271 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r0_c2.png -2025-03-03 18:08:27,756 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-03 18:08:29,059 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c1.png -2025-03-03 18:08:30,619 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c2.png -2025-03-03 18:08:32,028 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-03 18:08:33,702 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c1.png -2025-03-03 18:08:35,546 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-03 18:08:37,241 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c1.png -2025-03-03 18:08:38,602 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-03 18:08:41,789 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-03 18:08:42,904 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c1.png -2025-03-03 18:08:44,299 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-03 18:08:45,765 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-03 18:08:47,625 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r2_c0.png -2025-03-03 18:08:49,450 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r3_c0.png -2025-03-03 18:08:50,706 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-03 18:08:53,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-03 18:08:55,294 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c1.png -2025-03-03 18:08:56,673 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-03 18:08:58,397 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-03 18:09:00,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-03 18:09:01,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-03 18:09:03,256 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r4_c0.png -2025-03-03 18:09:04,820 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r4_c1.png -2025-03-03 18:09:06,037 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=two -2025-03-03 18:09:09,419 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-03 18:09:11,243 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-03 18:09:13,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-03 18:09:15,022 [INFO] __main__ - GPU memory cleaned up. 
-2025-03-03 18:09:15,023 [ERROR] __main__ - Processing failed: name 'merge_topics' is not defined -2025-03-04 14:56:39,218 [INFO] __main__ - Processing PDF: /home/user/app/input_output/a-level-pearson-mathematics-specification.pdf -2025-03-04 14:56:40,018 [INFO] __main__ - Gemini returned subtopics: {'Paper 1 and Paper 2: Pure Mathematics': [11, 29], 'Paper 3: Statistics and Mechanics': [30, 40]} -2025-03-04 14:56:40,019 [INFO] __main__ - Loaded 1135473 bytes from local file '/home/user/app/input_output/a-level-pearson-mathematics-specification.pdf' -2025-03-04 14:56:40,316 [INFO] __main__ - Computed global offset: 4 -2025-03-04 14:56:40,316 [INFO] __main__ - Processing pages (0-based): [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] -2025-03-04 14:58:48,246 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 14:58:50,037 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 14:58:50,583 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 14:58:51,114 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 14:58:51,657 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 14:58:52,211 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 14:58:52,686 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 14:58:53,167 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 14:58:53,667 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 14:58:54,285 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 14:58:54,850 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 14:58:55,401 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 14:58:55,916 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 14:58:56,524 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 14:58:56,999 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-04 14:58:57,542 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 14:58:58,071 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 14:58:58,366 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 14:58:58,849 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 14:58:59,428 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 14:58:59,995 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 14:59:00,597 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 14:59:01,070 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 14:59:01,567 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 14:59:02,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 14:59:02,569 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 14:59:03,024 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 14:59:03,607 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 14:59:04,016 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 14:59:20,581 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 14:59:23,252 [WARNING] __main__ - Cell image not found: /tmp/tmpijzc040v.jpg_rows/row_0/col_0.png -2025-03-04 14:59:23,252 [WARNING] __main__ - Cell image not found: /tmp/tmpijzc040v.jpg_rows/row_0/col_1.png -2025-03-04 14:59:23,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 14:59:25,146 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 14:59:26,469 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 14:59:27,272 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 14:59:30,158 [WARNING] __main__ - Cell image not found: /tmp/tmplbse6rk2.jpg_rows/row_0/col_0.png -2025-03-04 14:59:30,158 [WARNING] __main__ - Cell image not found: /tmp/tmplbse6rk2.jpg_rows/row_0/col_1.png -2025-03-04 14:59:30,420 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 14:59:31,612 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 14:59:34,174 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 14:59:35,585 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 14:59:36,908 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r4_c0.png -2025-03-04 14:59:38,024 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r5_c0.png -2025-03-04 14:59:38,783 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 14:59:41,887 [WARNING] __main__ - Cell image not found: /tmp/tmp9jfrqv6f.jpg_rows/row_0/col_0.png -2025-03-04 14:59:41,887 [WARNING] __main__ - Cell image not found: /tmp/tmp9jfrqv6f.jpg_rows/row_0/col_1.png -2025-03-04 14:59:42,148 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 14:59:43,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 14:59:45,241 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r2_c0.png -2025-03-04 14:59:46,499 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r3_c0.png -2025-03-04 14:59:47,500 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r4_c0.png -2025-03-04 14:59:48,309 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 14:59:51,311 [WARNING] __main__ - Cell image not found: /tmp/tmpbrv43l7_.jpg_rows/row_0/col_0.png -2025-03-04 14:59:51,311 [WARNING] __main__ - Cell image not found: /tmp/tmpbrv43l7_.jpg_rows/row_0/col_1.png -2025-03-04 14:59:51,311 [WARNING] __main__ - Cell image not found: /tmp/tmpbrv43l7_.jpg_rows/row_1/col_0.png -2025-03-04 14:59:51,311 [WARNING] __main__ - Cell image not found: /tmp/tmpbrv43l7_.jpg_rows/row_1/col_1.png -2025-03-04 14:59:51,579 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 14:59:53,042 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 14:59:54,470 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 14:59:55,460 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 14:59:58,401 [WARNING] __main__ - Cell image not found: /tmp/tmpdj8vn5v4.jpg_rows/row_0/col_0.png -2025-03-04 14:59:58,401 [WARNING] __main__ - Cell image not found: /tmp/tmpdj8vn5v4.jpg_rows/row_0/col_1.png -2025-03-04 14:59:58,659 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 15:00:00,036 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 15:00:01,411 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r2_c0.png -2025-03-04 15:00:02,747 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/img_5.jpg_r3_c0.png -2025-03-04 15:00:03,656 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 15:00:06,880 [WARNING] __main__ - Cell image not found: /tmp/tmpw4hdm_vm.jpg_rows/row_0/col_0.png -2025-03-04 15:00:06,881 [WARNING] __main__ - Cell image not found: /tmp/tmpw4hdm_vm.jpg_rows/row_0/col_1.png -2025-03-04 15:00:07,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 15:00:08,578 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 15:00:09,789 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 15:00:12,763 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c1.png -2025-03-04 15:00:14,173 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r3_c0.png -2025-03-04 15:00:15,229 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 15:00:18,336 [WARNING] __main__ - Cell image not found: /tmp/tmpier2e_jn.jpg_rows/row_0/col_0.png -2025-03-04 15:00:18,336 [WARNING] __main__ - Cell image not found: /tmp/tmpier2e_jn.jpg_rows/row_0/col_1.png -2025-03-04 15:00:18,607 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 15:00:19,964 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 15:00:21,423 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 15:00:22,514 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 15:00:23,784 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c1.png -2025-03-04 15:00:25,023 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r4_c0.png -2025-03-04 15:00:26,014 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 15:00:30,110 [WARNING] 
__main__ - Cell image not found: /tmp/tmpwzp5zo9m.jpg_rows/row_0/col_0.png -2025-03-04 15:00:30,295 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r0_c1.png -2025-03-04 15:00:30,957 [WARNING] __main__ - Cell image not found: /tmp/tmpwzp5zo9m.jpg_rows/row_1/col_0.png -2025-03-04 15:00:30,958 [WARNING] __main__ - Cell image not found: /tmp/tmpwzp5zo9m.jpg_rows/row_1/col_1.png -2025-03-04 15:00:30,958 [WARNING] __main__ - Cell image not found: /tmp/tmpwzp5zo9m.jpg_rows/row_1/col_2.png -2025-03-04 15:00:31,219 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 15:00:32,311 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 15:00:33,619 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c2.png -2025-03-04 15:00:34,694 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-04 15:00:35,762 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 15:00:36,796 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 15:00:37,972 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c1.png -2025-03-04 15:00:39,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r5_c0.png -2025-03-04 15:00:40,404 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r5_c1.png -2025-03-04 15:00:41,716 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:00:43,487 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:00:43,665 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r6_c0.png -2025-03-04 15:00:44,879 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r6_c1.png -2025-03-04 15:00:45,862 [ERROR] __main__ - Gemini subtopic identification error on 
attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:00:47,337 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:00:47,338 [WARNING] __main__ - Cell image not found: /tmp/tmpwzp5zo9m.jpg_rows/row_7/col_0.png -2025-03-04 15:00:47,338 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 15:00:50,852 [WARNING] __main__ - Cell image not found: /tmp/tmp45kbg898.jpg_rows/row_0/col_0.png -2025-03-04 15:00:50,853 [WARNING] __main__ - Cell image not found: /tmp/tmp45kbg898.jpg_rows/row_0/col_1.png -2025-03-04 15:00:50,853 [WARNING] __main__ - Cell image not found: /tmp/tmp45kbg898.jpg_rows/row_0/col_2.png -2025-03-04 15:00:52,290 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 15:00:53,354 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 15:00:54,709 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c2.png -2025-03-04 15:00:55,877 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 15:00:57,178 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 15:00:58,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 15:00:59,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 15:01:00,944 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 15:01:02,239 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c1.png -2025-03-04 15:01:03,416 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r5_c0.png -2025-03-04 15:01:04,618 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r5_c1.png -2025-03-04 15:01:05,434 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 
15:01:08,588 [WARNING] __main__ - Cell image not found: /tmp/tmpqskyhmda.jpg_rows/row_0/col_0.png -2025-03-04 15:01:08,588 [WARNING] __main__ - Cell image not found: /tmp/tmpqskyhmda.jpg_rows/row_0/col_1.png -2025-03-04 15:01:08,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 15:01:10,100 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 15:01:11,458 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 15:01:13,002 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r3_c0.png -2025-03-04 15:01:14,421 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r4_c0.png -2025-03-04 15:01:15,795 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r5_c0.png -2025-03-04 15:01:16,778 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=two -2025-03-04 15:01:19,849 [WARNING] __main__ - Cell image not found: /tmp/tmpragajvqv.jpg_rows/row_0/col_0.png -2025-03-04 15:01:20,292 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 15:01:21,681 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 15:01:23,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r3_c0.png -2025-03-04 15:01:24,256 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r4_c0.png -2025-03-04 15:01:25,614 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r5_c0.png -2025-03-04 15:01:26,879 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r6_c0.png -2025-03-04 15:01:28,027 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r7_c0.png -2025-03-04 15:01:28,867 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 15:01:31,707 [WARNING] __main__ - Cell image not found: /tmp/tmptajrb9oq.jpg_rows/row_0/col_0.png 
-2025-03-04 15:01:31,708 [WARNING] __main__ - Cell image not found: /tmp/tmptajrb9oq.jpg_rows/row_0/col_1.png -2025-03-04 15:01:31,708 [WARNING] __main__ - Cell image not found: /tmp/tmptajrb9oq.jpg_rows/row_1/col_0.png -2025-03-04 15:01:31,708 [WARNING] __main__ - Cell image not found: /tmp/tmptajrb9oq.jpg_rows/row_1/col_1.png -2025-03-04 15:01:31,968 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 15:01:33,379 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c1.png -2025-03-04 15:01:34,597 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r3_c0.png -2025-03-04 15:01:35,923 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r3_c1.png -2025-03-04 15:01:37,229 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r4_c0.png -2025-03-04 15:01:38,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r5_c0.png -2025-03-04 15:01:39,166 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 15:01:42,003 [WARNING] __main__ - Cell image not found: /tmp/tmpzd8rmysx.jpg_rows/row_0/col_0.png -2025-03-04 15:01:42,004 [WARNING] __main__ - Cell image not found: /tmp/tmpzd8rmysx.jpg_rows/row_0/col_1.png -2025-03-04 15:01:42,004 [WARNING] __main__ - Cell image not found: /tmp/tmpzd8rmysx.jpg_rows/row_1/col_0.png -2025-03-04 15:01:42,004 [WARNING] __main__ - Cell image not found: /tmp/tmpzd8rmysx.jpg_rows/row_1/col_1.png -2025-03-04 15:01:42,258 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 15:01:43,581 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c1.png -2025-03-04 15:01:44,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-04 15:01:46,192 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 15:01:47,564 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_13.jpg_r5_c0.png -2025-03-04 15:01:48,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r6_c0.png -2025-03-04 15:01:49,480 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 15:01:53,309 [WARNING] __main__ - Cell image not found: /tmp/tmp6agbobyu.jpg_rows/row_0/col_0.png -2025-03-04 15:01:53,310 [WARNING] __main__ - Cell image not found: /tmp/tmp6agbobyu.jpg_rows/row_0/col_1.png -2025-03-04 15:01:53,583 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 15:01:54,959 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 15:01:56,286 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 15:01:57,618 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r3_c0.png -2025-03-04 15:01:58,711 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r4_c0.png -2025-03-04 15:01:59,972 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r4_c1.png -2025-03-04 15:02:01,443 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r5_c0.png -2025-03-04 15:02:02,711 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r6_c0.png -2025-03-04 15:02:03,674 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 15:02:06,780 [WARNING] __main__ - Cell image not found: /tmp/tmp3lbuxp25.jpg_rows/row_0/col_0.png -2025-03-04 15:02:06,781 [WARNING] __main__ - Cell image not found: /tmp/tmp3lbuxp25.jpg_rows/row_0/col_1.png -2025-03-04 15:02:07,040 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 15:02:08,455 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 15:02:09,838 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 15:02:11,221 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 15:02:12,570 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 15:02:13,800 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r5_c0.png -2025-03-04 15:02:14,741 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 15:02:18,051 [WARNING] __main__ - Cell image not found: /tmp/tmpqve047e1.jpg_rows/row_0/col_0.png -2025-03-04 15:02:18,051 [WARNING] __main__ - Cell image not found: /tmp/tmpqve047e1.jpg_rows/row_0/col_1.png -2025-03-04 15:02:18,051 [WARNING] __main__ - Cell image not found: /tmp/tmpqve047e1.jpg_rows/row_1/col_0.png -2025-03-04 15:02:18,052 [WARNING] __main__ - Cell image not found: /tmp/tmpqve047e1.jpg_rows/row_1/col_1.png -2025-03-04 15:02:18,310 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 15:02:19,484 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 15:02:20,750 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 15:02:21,962 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 15:02:23,279 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 15:02:24,677 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 15:02:25,990 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r6_c0.png -2025-03-04 15:02:27,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r7_c0.png -2025-03-04 15:02:27,953 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 15:02:31,142 [WARNING] __main__ - Cell image not found: /tmp/tmp580zpmu1.jpg_rows/row_0/col_0.png -2025-03-04 15:02:31,142 [WARNING] __main__ - Cell image not found: /tmp/tmp580zpmu1.jpg_rows/row_0/col_1.png -2025-03-04 
15:02:31,397 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 15:02:32,685 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c1.png -2025-03-04 15:02:34,235 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 15:02:35,330 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 15:02:36,635 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 15:02:37,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 15:02:39,401 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r5_c0.png -2025-03-04 15:02:40,763 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r6_c0.png -2025-03-04 15:02:41,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r7_c0.png -2025-03-04 15:02:42,875 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 15:02:43,771 [WARNING] __main__ - Cell image not found: /tmp/tmpccm4skpd.jpg_rows/row_0/col_0.png -2025-03-04 15:02:43,772 [WARNING] __main__ - Cell image not found: /tmp/tmpccm4skpd.jpg_rows/row_0/col_1.png -2025-03-04 15:02:43,772 [WARNING] __main__ - Cell image not found: /tmp/tmpccm4skpd.jpg_rows/row_1/col_0.png -2025-03-04 15:02:43,772 [WARNING] __main__ - Cell image not found: /tmp/tmpccm4skpd.jpg_rows/row_1/col_1.png -2025-03-04 15:02:44,032 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 15:02:45,366 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 15:02:46,585 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 15:02:47,559 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 15:02:50,123 [WARNING] __main__ - Cell image not found: 
/tmp/tmpclhr29f1.jpg_rows/row_0/col_0.png -2025-03-04 15:02:50,124 [WARNING] __main__ - Cell image not found: /tmp/tmpclhr29f1.jpg_rows/row_0/col_1.png -2025-03-04 15:02:50,124 [WARNING] __main__ - Cell image not found: /tmp/tmpclhr29f1.jpg_rows/row_1/col_0.png -2025-03-04 15:02:50,124 [WARNING] __main__ - Cell image not found: /tmp/tmpclhr29f1.jpg_rows/row_1/col_1.png -2025-03-04 15:02:50,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 15:02:51,859 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 15:02:53,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 15:02:54,584 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c1.png -2025-03-04 15:02:55,736 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r4_c0.png -2025-03-04 15:02:56,672 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 15:03:00,454 [WARNING] __main__ - Cell image not found: /tmp/tmptx9dz9xc.jpg_rows/row_0/col_0.png -2025-03-04 15:03:00,454 [WARNING] __main__ - Cell image not found: /tmp/tmptx9dz9xc.jpg_rows/row_0/col_1.png -2025-03-04 15:03:00,737 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 15:03:02,337 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 15:03:03,839 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 15:03:04,889 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 15:03:08,043 [WARNING] __main__ - Cell image not found: /tmp/tmp18_5p4lj.jpg_rows/row_0/col_0.png -2025-03-04 15:03:08,044 [WARNING] __main__ - Cell image not found: /tmp/tmp18_5p4lj.jpg_rows/row_0/col_1.png -2025-03-04 15:03:08,322 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 15:03:09,913 
[INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-04 15:03:11,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 15:03:12,387 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c1.png -2025-03-04 15:03:13,743 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r3_c0.png -2025-03-04 15:03:14,671 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 15:03:17,999 [WARNING] __main__ - Cell image not found: /tmp/tmppc_cs35e.jpg_rows/row_0/col_0.png -2025-03-04 15:03:18,000 [WARNING] __main__ - Cell image not found: /tmp/tmppc_cs35e.jpg_rows/row_0/col_1.png -2025-03-04 15:03:18,271 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 15:03:19,493 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 15:03:20,669 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 15:03:22,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c1.png -2025-03-04 15:03:23,431 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r3_c0.png -2025-03-04 15:03:24,490 [WARNING] __main__ - Cell image not found: /tmp/tmppc_cs35e.jpg_rows/row_4/col_0.png -2025-03-04 15:03:24,491 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 15:03:27,293 [WARNING] __main__ - Cell image not found: /tmp/tmpk98o_fpp.jpg_rows/row_0/col_0.png -2025-03-04 15:03:27,294 [WARNING] __main__ - Cell image not found: /tmp/tmpk98o_fpp.jpg_rows/row_0/col_1.png -2025-03-04 15:03:27,553 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 15:03:28,769 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-04 15:03:29,940 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r2_c0.png 
-2025-03-04 15:03:31,452 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r2_c1.png -2025-03-04 15:03:32,738 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r3_c0.png -2025-03-04 15:03:33,643 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 15:03:36,892 [WARNING] __main__ - Cell image not found: /tmp/tmpsdjidh_w.jpg_rows/row_0/col_0.png -2025-03-04 15:03:36,892 [WARNING] __main__ - Cell image not found: /tmp/tmpsdjidh_w.jpg_rows/row_0/col_1.png -2025-03-04 15:03:36,892 [WARNING] __main__ - Cell image not found: /tmp/tmpsdjidh_w.jpg_rows/row_1/col_0.png -2025-03-04 15:03:36,892 [WARNING] __main__ - Cell image not found: /tmp/tmpsdjidh_w.jpg_rows/row_1/col_1.png -2025-03-04 15:03:37,188 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r2_c0.png -2025-03-04 15:03:38,642 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r2_c1.png -2025-03-04 15:03:40,017 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r3_c0.png -2025-03-04 15:03:41,095 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r4_c0.png -2025-03-04 15:03:42,514 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r4_c1.png -2025-03-04 15:03:43,481 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=two -2025-03-04 15:03:46,397 [WARNING] __main__ - Cell image not found: /tmp/tmpt9roe876.jpg_rows/row_0/col_0.png -2025-03-04 15:03:46,809 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 15:03:48,153 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 15:03:49,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 15:03:51,232 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r4_c0.png -2025-03-04 15:03:52,577 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_25.jpg_r5_c0.png -2025-03-04 15:03:53,542 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r6_c0.png -2025-03-04 15:03:54,702 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-04 15:03:57,292 [WARNING] __main__ - Cell image not found: /tmp/tmpkt4w7cqg.jpg_rows/row_0/col_0.png -2025-03-04 15:03:57,292 [WARNING] __main__ - Cell image not found: /tmp/tmpkt4w7cqg.jpg_rows/row_0/col_1.png -2025-03-04 15:03:57,547 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 15:03:58,694 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-04 15:04:00,096 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r2_c0.png -2025-03-04 15:04:01,892 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r3_c0.png -2025-03-04 15:04:03,198 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r4_c0.png -2025-03-04 15:04:04,066 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 15:04:06,633 [WARNING] __main__ - Cell image not found: /tmp/tmp1z8ov49i.jpg_rows/row_0/col_0.png -2025-03-04 15:04:06,633 [WARNING] __main__ - Cell image not found: /tmp/tmp1z8ov49i.jpg_rows/row_0/col_1.png -2025-03-04 15:04:06,892 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 15:04:08,314 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-04 15:04:09,655 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 15:04:10,910 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 15:04:12,042 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r4_c0.png -2025-03-04 15:04:13,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r4_c1.png -2025-03-04 15:04:14,345 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/img_27.jpg_r5_c0.png -2025-03-04 15:04:15,180 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 15:04:18,179 [WARNING] __main__ - Cell image not found: /tmp/tmpsij1nmfi.jpg_rows/row_0/col_0.png -2025-03-04 15:04:18,179 [WARNING] __main__ - Cell image not found: /tmp/tmpsij1nmfi.jpg_rows/row_0/col_1.png -2025-03-04 15:04:18,363 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 15:04:19,871 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c1.png -2025-03-04 15:04:21,379 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 15:04:23,137 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c1.png -2025-03-04 15:04:24,801 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r3_c0.png -2025-03-04 15:04:26,569 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r3_c1.png -2025-03-04 15:04:28,289 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r4_c0.png -2025-03-04 15:04:29,718 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r4_c1.png -2025-03-04 15:04:31,009 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r5_c0.png -2025-03-04 15:04:31,836 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/subtopics.json -2025-03-04 15:04:32,192 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 15:04:32,199 [INFO] __main__ - Processing completed successfully. 
-2025-03-04 15:07:34,872 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 15:07:35,728 [INFO] __main__ - Gemini returned subtopics: {'2.1 AS units': [7, 22], '2.2 A2 units': [23, 43]} -2025-03-04 15:07:35,729 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 15:07:36,005 [INFO] __main__ - Computed global offset: 0 -2025-03-04 15:07:36,005 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 15:08:18,476 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 15:08:19,052 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 15:08:19,485 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 15:08:19,996 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 15:08:20,330 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 15:08:20,785 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 15:08:21,362 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 15:08:21,835 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 15:08:22,330 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 15:08:22,610 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 15:08:23,113 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 15:08:23,648 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 15:08:24,228 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 15:08:24,516 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 15:08:25,163 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_15.jpg -2025-03-04 15:08:25,702 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 15:08:26,226 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 15:08:26,622 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 15:08:27,198 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 15:08:27,589 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 15:08:28,157 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 15:08:28,694 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 15:08:29,074 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 15:08:29,674 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 15:08:30,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 15:08:30,699 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 15:08:31,319 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 15:08:31,856 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 15:08:32,225 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 15:08:32,654 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 15:08:38,155 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 15:08:41,440 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 15:08:43,051 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 15:08:44,790 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 15:08:46,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 15:08:48,289 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 15:08:50,030 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 15:08:51,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 15:08:53,091 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 15:08:54,178 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 15:08:57,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 15:08:59,235 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 15:09:01,301 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 15:09:03,017 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 15:09:04,724 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 15:09:06,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 15:09:08,083 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 15:09:09,461 [WARNING] __main__ - Cell image not found: /tmp/tmpuzc7lqpg.jpg_rows/row_4/col_0.png -2025-03-04 15:09:09,462 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 
15:09:11,637 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 15:09:13,267 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 15:09:15,340 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 15:09:16,878 [WARNING] __main__ - Cell image not found: /tmp/tmpbgf7aoqt.jpg_rows/row_2/col_0.png -2025-03-04 15:09:16,878 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 15:09:19,372 [WARNING] __main__ - Cell image not found: /tmp/tmpgwefauyg.jpg_rows/row_0/col_0.png -2025-03-04 15:09:19,372 [WARNING] __main__ - Cell image not found: /tmp/tmpgwefauyg.jpg_rows/row_0/col_1.png -2025-03-04 15:09:19,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 15:09:20,932 [WARNING] __main__ - Cell image not found: /tmp/tmpgwefauyg.jpg_rows/row_1/col_1.png -2025-03-04 15:09:21,190 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 15:09:23,180 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 15:09:24,778 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 15:09:26,580 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 15:09:28,290 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 15:09:29,503 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 15:09:30,733 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 15:09:32,418 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 15:09:34,126 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 15:09:35,427 [WARNING] __main__ - Cell image not found: /tmp/tmp3qju9xi4.jpg_rows/row_2/col_0.png 
-2025-03-04 15:09:35,428 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 15:09:37,453 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 15:09:38,887 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 15:09:40,661 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 15:09:42,154 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 15:09:43,258 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 15:09:46,654 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 15:09:48,255 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 15:09:50,127 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 15:09:51,929 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 15:09:53,594 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 15:09:55,004 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 15:09:56,133 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 15:09:58,374 [WARNING] __main__ - Cell image not found: /tmp/tmpnmurttjv.jpg_rows/row_0/col_0.png -2025-03-04 15:09:58,374 [WARNING] __main__ - Cell image not found: /tmp/tmpnmurttjv.jpg_rows/row_0/col_1.png -2025-03-04 15:09:58,644 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 15:10:00,157 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 15:10:02,018 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 15:10:03,587 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png 
-2025-03-04 15:10:05,453 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 15:10:07,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 15:10:08,483 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 15:10:10,643 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 15:10:12,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 15:10:13,746 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 15:10:15,368 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 15:10:17,221 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 15:10:18,799 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 15:10:20,352 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 15:10:21,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 15:10:22,945 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 15:10:23,949 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 15:10:25,481 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 15:10:27,282 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 15:10:28,772 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:10:30,373 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:10:30,561 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 15:10:31,738 [INFO] __main__ - Processing 
table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 15:10:35,608 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 15:10:37,150 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 15:10:39,138 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c1.png -2025-03-04 15:10:40,817 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 15:10:41,894 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 15:10:44,480 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 15:10:45,992 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 15:10:47,943 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-04 15:10:49,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 15:10:50,952 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 15:10:54,196 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 15:10:55,559 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 15:10:57,073 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 15:10:58,980 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-04 15:11:00,622 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 15:11:01,971 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-04 15:11:03,664 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c1.png -2025-03-04 15:11:05,231 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 15:11:06,385 [INFO] __main__ - Processing 
table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 15:11:07,531 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 15:11:09,149 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 15:11:10,975 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 15:11:12,556 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 15:11:13,674 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 15:11:17,626 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 15:11:19,192 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 15:11:21,217 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 15:11:22,882 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 15:11:24,676 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-04 15:11:26,155 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 15:11:27,848 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-04 15:11:29,314 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 15:11:30,495 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 15:11:33,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 15:11:35,696 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 15:11:37,307 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-04 15:11:38,716 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 
15:11:40,191 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:11:40,464 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 15:11:41,992 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 15:11:43,718 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 15:11:45,485 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-04 15:11:47,113 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 15:11:48,839 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 15:11:50,381 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 15:11:51,513 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 15:11:54,800 [WARNING] __main__ - Cell image not found: /tmp/tmpcglkchog.jpg_rows/row_0/col_0.png -2025-03-04 15:11:55,060 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 15:11:56,583 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 15:11:58,616 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-04 15:12:00,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 15:12:02,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 15:12:03,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 15:12:04,862 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 15:12:06,725 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 15:12:08,394 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 15:12:10,083 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-04 15:12:11,845 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 15:12:13,639 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 15:12:15,212 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 15:12:16,263 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 15:12:18,983 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 15:12:20,534 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 15:12:22,357 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-04 15:12:23,998 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 15:12:25,852 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 15:12:27,374 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 15:12:28,500 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 15:12:30,317 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 15:12:32,200 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 15:12:33,801 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 15:12:35,456 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 15:12:37,330 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c1.png -2025-03-04 15:12:38,568 [WARNING] __main__ - Cell image not found: /tmp/tmpeotevv3y.jpg_rows/row_3/col_0.png -2025-03-04 15:12:38,568 [INFO] __main__ - Processing table 
image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 15:12:41,505 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 15:12:43,047 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 15:12:45,139 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-04 15:12:47,041 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 15:12:48,152 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 15:12:50,853 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 15:12:52,481 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 15:12:54,553 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 15:12:56,211 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 15:12:57,326 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 15:12:59,497 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 15:13:01,006 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 15:13:02,796 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-04 15:13:04,145 [WARNING] __main__ - Cell image not found: /tmp/tmpzevajlre.jpg_rows/row_2/col_0.png -2025-03-04 15:13:04,146 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 15:13:07,820 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 15:13:09,496 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-04 15:13:12,256 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-04 15:13:13,968 [WARNING] __main__ - 
Cell image not found: /tmp/tmpi6nq05bn.jpg_rows/row_2/col_0.png -2025-03-04 15:13:13,969 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-04 15:13:16,358 [WARNING] __main__ - Cell image not found: /tmp/tmpixy8r3f3.jpg_rows/row_0/col_0.png -2025-03-04 15:13:16,614 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 15:13:18,269 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 15:13:20,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c1.png -2025-03-04 15:13:22,010 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 15:13:23,175 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-04 15:13:25,470 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 15:13:27,019 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 15:13:28,851 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-04 15:13:30,395 [WARNING] __main__ - Cell image not found: /tmp/tmpbifewf1l.jpg_rows/row_2/col_0.png -2025-03-04 15:13:30,396 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 15:13:33,786 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-04 15:13:35,319 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 15:13:37,434 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-04 15:13:39,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 15:13:40,922 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c1.png -2025-03-04 15:13:42,619 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 15:13:43,978 
[INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 15:13:46,723 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 15:13:49,624 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 15:13:51,484 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c1.png -2025-03-04 15:13:53,174 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 15:13:54,373 [INFO] __main__ - Processing table image: /topic-extraction/img_29.jpg, columns=three -2025-03-04 15:13:56,158 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 15:13:57,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c0.png -2025-03-04 15:13:59,475 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c1.png -2025-03-04 15:14:00,939 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 15:14:02,128 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/_subtopics.json -2025-03-04 15:14:02,444 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 15:14:02,450 [INFO] __main__ - Processing completed successfully. 
-2025-03-04 15:26:11,606 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 15:26:12,410 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 15:26:12,412 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 15:26:12,650 [INFO] __main__ - Computed global offset: 0 -2025-03-04 15:26:12,650 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 15:26:55,517 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 15:26:56,096 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 15:26:56,477 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 15:26:57,075 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 15:26:57,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 15:26:57,830 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 15:26:58,382 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 15:26:58,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 15:26:59,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 15:26:59,725 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 15:27:00,296 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 15:27:00,791 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 15:27:01,351 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 15:27:01,702 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 15:27:02,383 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_15.jpg -2025-03-04 15:27:02,941 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 15:27:03,499 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 15:27:03,970 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 15:27:04,560 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 15:27:04,953 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 15:27:05,456 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 15:27:05,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 15:27:06,338 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 15:27:06,923 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 15:27:07,442 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 15:27:07,865 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 15:27:08,441 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 15:27:08,965 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 15:27:09,345 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 15:27:09,786 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 15:27:13,955 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 15:27:17,352 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 15:27:18,387 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 15:27:19,645 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 15:27:20,863 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 15:27:22,510 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 15:27:23,673 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 15:27:24,916 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 15:27:26,082 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 15:27:26,985 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 15:27:30,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 15:27:31,595 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 15:27:33,178 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 15:27:34,516 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 15:27:35,797 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 15:27:37,197 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 15:27:38,530 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 15:27:39,449 [WARNING] __main__ - Cell image not found: /tmp/tmpze_p_nup.jpg_rows/row_4/col_0.png -2025-03-04 15:27:39,450 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 
15:27:41,705 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 15:27:42,889 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 15:27:44,424 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 15:27:45,569 [WARNING] __main__ - Cell image not found: /tmp/tmpo8ru6sy_.jpg_rows/row_2/col_0.png -2025-03-04 15:27:45,569 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 15:27:48,061 [WARNING] __main__ - Cell image not found: /tmp/tmpt5jss0kc.jpg_rows/row_0/col_0.png -2025-03-04 15:27:48,062 [WARNING] __main__ - Cell image not found: /tmp/tmpt5jss0kc.jpg_rows/row_0/col_1.png -2025-03-04 15:27:48,323 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 15:27:49,192 [WARNING] __main__ - Cell image not found: /tmp/tmpt5jss0kc.jpg_rows/row_1/col_1.png -2025-03-04 15:27:49,474 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 15:27:50,966 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 15:27:52,206 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 15:27:53,715 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 15:27:55,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 15:27:55,785 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 15:27:56,907 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 15:27:58,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 15:27:59,610 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 15:28:00,577 [WARNING] __main__ - Cell image not found: /tmp/tmprv2zp1hm.jpg_rows/row_2/col_0.png 
-2025-03-04 15:28:00,577 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 15:28:02,610 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 15:28:03,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 15:28:05,127 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 15:28:06,463 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 15:28:07,196 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 15:28:10,593 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 15:28:11,716 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 15:28:13,375 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 15:28:14,728 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 15:28:16,139 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 15:28:17,217 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 15:28:18,058 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 15:28:20,457 [WARNING] __main__ - Cell image not found: /tmp/tmp6ltfh6qj.jpg_rows/row_0/col_0.png -2025-03-04 15:28:20,457 [WARNING] __main__ - Cell image not found: /tmp/tmp6ltfh6qj.jpg_rows/row_0/col_1.png -2025-03-04 15:28:20,714 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 15:28:21,842 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 15:28:23,262 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 15:28:24,534 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png 
-2025-03-04 15:28:25,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 15:28:27,061 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 15:28:28,042 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 15:31:20,307 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 15:31:21,077 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 15:31:21,078 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 15:31:21,319 [INFO] __main__ - Computed global offset: 0 -2025-03-04 15:31:21,320 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 15:32:04,082 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 15:32:04,698 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 15:32:05,123 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 15:32:05,683 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 15:32:06,004 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 15:32:06,411 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 15:32:07,043 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 15:32:07,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 15:32:08,012 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 15:32:08,366 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 15:32:08,923 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 15:32:09,551 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_12.jpg -2025-03-04 15:32:10,148 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 15:32:10,450 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 15:32:11,087 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-04 15:32:11,680 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 15:32:12,187 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 15:32:12,536 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 15:32:13,115 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 15:32:13,515 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 15:32:14,070 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 15:32:14,595 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 15:32:15,024 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 15:32:15,650 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 15:32:16,166 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 15:32:16,606 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 15:32:17,167 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 15:32:17,859 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 15:32:18,307 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 15:32:18,743 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 15:32:22,809 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 15:32:26,320 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 15:32:27,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 15:32:28,804 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 15:32:30,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 15:32:31,667 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 15:32:33,090 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 15:32:34,232 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 15:32:35,692 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 15:32:36,518 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 15:32:39,939 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 15:32:41,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 15:32:42,689 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 15:32:44,209 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 15:32:45,583 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 15:32:46,772 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 15:32:48,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 15:32:49,593 [WARNING] __main__ - Cell image not found: /tmp/tmpcjta4ejx.jpg_rows/row_4/col_0.png -2025-03-04 15:32:49,594 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 
15:32:52,006 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 15:32:53,384 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 15:32:55,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 15:32:56,392 [WARNING] __main__ - Cell image not found: /tmp/tmpjceth5ap.jpg_rows/row_2/col_0.png -2025-03-04 15:32:56,392 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 15:32:58,916 [WARNING] __main__ - Cell image not found: /tmp/tmpjesxg7w1.jpg_rows/row_0/col_0.png -2025-03-04 15:32:58,916 [WARNING] __main__ - Cell image not found: /tmp/tmpjesxg7w1.jpg_rows/row_0/col_1.png -2025-03-04 15:32:59,184 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 15:32:59,957 [WARNING] __main__ - Cell image not found: /tmp/tmpjesxg7w1.jpg_rows/row_1/col_1.png -2025-03-04 15:33:00,223 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 15:33:01,718 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 15:33:02,863 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 15:33:04,278 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 15:33:05,792 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 15:33:06,945 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 15:33:08,060 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 15:33:09,333 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 15:33:10,656 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 15:33:11,725 [WARNING] __main__ - Cell image not found: /tmp/tmpdrwulp2b.jpg_rows/row_2/col_0.png 
-2025-03-04 15:33:11,725 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 15:33:13,831 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 15:33:14,899 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 15:33:16,339 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 15:33:17,708 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 15:33:18,529 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 15:33:21,947 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 15:33:23,272 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 15:33:24,983 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 15:33:26,412 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 15:33:27,848 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 15:33:28,968 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 15:33:29,794 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 15:33:32,150 [WARNING] __main__ - Cell image not found: /tmp/tmpoymnq201.jpg_rows/row_0/col_0.png -2025-03-04 15:33:32,150 [WARNING] __main__ - Cell image not found: /tmp/tmpoymnq201.jpg_rows/row_0/col_1.png -2025-03-04 15:33:32,419 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 15:33:33,526 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 15:33:35,020 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 15:33:36,182 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png 
-2025-03-04 15:33:37,673 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 15:33:39,074 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 15:33:40,353 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 15:33:42,592 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 15:33:43,721 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 15:33:45,061 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 15:33:46,416 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 15:33:48,214 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 15:33:49,543 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 15:33:50,716 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 15:33:51,950 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 15:33:52,707 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 15:33:53,740 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 15:33:54,807 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 15:33:56,303 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 15:33:57,601 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:33:59,454 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:33:59,641 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 15:34:00,464 [INFO] __main__ - Processing 
table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 15:34:03,055 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 15:34:04,328 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 15:34:05,909 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c1.png -2025-03-04 15:34:07,407 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 15:34:08,360 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 15:34:11,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 15:34:12,353 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 15:34:14,028 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-04 15:34:15,330 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 15:34:16,155 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 15:34:19,216 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 15:34:20,181 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 15:34:21,365 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 15:34:22,972 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-04 15:34:24,271 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 15:34:25,591 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-04 15:34:26,970 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c1.png -2025-03-04 15:34:28,287 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 15:34:29,173 [INFO] __main__ - Processing 
table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 15:34:30,310 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 15:34:31,707 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 15:34:33,077 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 15:34:34,432 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 15:34:35,419 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 15:34:39,349 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 15:34:40,557 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 15:34:42,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 15:34:43,543 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 15:34:44,837 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-04 15:34:46,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 15:34:47,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-04 15:34:48,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 15:34:49,444 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 15:34:52,808 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 15:34:54,069 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 15:34:55,414 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-04 15:34:56,389 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 
15:34:58,032 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:34:58,307 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 15:34:59,644 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 15:35:00,806 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 15:35:02,244 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-04 15:35:03,525 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 15:35:05,011 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 15:35:06,227 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 15:35:07,220 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 15:35:10,467 [WARNING] __main__ - Cell image not found: /tmp/tmpd939azq8.jpg_rows/row_0/col_0.png -2025-03-04 15:35:10,753 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 15:35:11,823 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 15:35:13,210 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-04 15:35:14,509 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 15:35:16,175 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 15:35:17,299 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 15:35:18,697 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 15:35:20,625 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 15:35:21,938 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 15:35:23,301 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-04 15:35:24,620 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 15:35:25,928 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 15:35:27,213 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 15:35:28,427 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 15:35:32,305 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 15:35:33,482 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 15:35:34,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-04 15:35:36,210 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 15:35:37,612 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 15:35:38,720 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 15:35:39,472 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 15:35:41,593 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 15:35:42,686 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 15:35:43,820 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 15:35:45,122 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 15:35:46,695 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c1.png -2025-03-04 15:35:47,832 [WARNING] __main__ - Cell image not found: /tmp/tmpa3p4artc.jpg_rows/row_3/col_0.png -2025-03-04 15:35:47,833 [INFO] __main__ - Processing table 
image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 15:35:50,764 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 15:35:52,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 15:35:53,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-04 15:35:54,988 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 15:35:55,981 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 15:35:58,560 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 15:35:59,764 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 15:36:01,449 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 15:36:02,774 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 15:36:03,574 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 15:36:05,658 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 15:36:06,688 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 15:36:08,081 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-04 15:36:09,126 [WARNING] __main__ - Cell image not found: /tmp/tmp0rg93e80.jpg_rows/row_2/col_0.png -2025-03-04 15:36:09,126 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 15:36:12,796 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 15:36:13,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-04 15:36:15,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-04 15:36:16,836 [WARNING] __main__ - 
Cell image not found: /tmp/tmpqgp1wyzs.jpg_rows/row_2/col_0.png -2025-03-04 15:36:16,837 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-04 15:36:19,174 [WARNING] __main__ - Cell image not found: /tmp/tmppozr1vmo.jpg_rows/row_0/col_0.png -2025-03-04 15:36:19,454 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 15:36:20,842 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 15:36:22,358 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c1.png -2025-03-04 15:36:23,648 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 15:36:24,679 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-04 15:36:26,976 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 15:36:28,135 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 15:36:29,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-04 15:36:30,704 [WARNING] __main__ - Cell image not found: /tmp/tmp1jkuoh_j.jpg_rows/row_2/col_0.png -2025-03-04 15:36:30,704 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 15:36:34,134 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-04 15:36:35,688 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 15:36:37,106 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-04 15:36:38,347 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 15:36:39,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c1.png -2025-03-04 15:36:41,543 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 15:36:42,479 
[INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 15:36:45,183 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 15:36:46,805 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 15:36:48,340 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c1.png -2025-03-04 15:36:49,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 15:36:50,923 [INFO] __main__ - Processing table image: /topic-extraction/img_29.jpg, columns=three -2025-03-04 15:36:52,729 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 15:36:53,854 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c0.png -2025-03-04 15:36:55,251 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c1.png -2025-03-04 15:36:56,613 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 15:36:57,328 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/_subtopics.json -2025-03-04 15:36:57,652 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 15:36:57,657 [INFO] __main__ - Processing completed successfully. 
-2025-03-04 15:44:00,650 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 15:44:01,442 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 15:44:01,443 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 15:44:01,715 [INFO] __main__ - Computed global offset: 0 -2025-03-04 15:44:01,716 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 15:44:43,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 15:44:44,549 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 15:44:44,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 15:44:45,551 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 15:44:45,878 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 15:44:46,265 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 15:44:46,842 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 15:44:47,339 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 15:44:47,774 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 15:44:48,059 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 15:44:48,605 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 15:44:49,108 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 15:44:49,671 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 15:44:50,018 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 15:44:50,653 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_15.jpg -2025-03-04 15:44:51,235 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 15:44:51,736 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 15:44:52,128 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 15:44:52,671 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 15:44:53,117 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 15:44:53,630 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 15:44:54,158 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 15:44:54,526 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 15:44:55,169 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 15:44:55,684 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 15:44:56,086 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 15:44:56,643 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 15:44:57,208 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 15:44:57,645 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 15:44:58,114 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 15:45:02,187 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 15:45:05,587 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 15:45:06,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 15:45:08,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 15:45:09,231 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 15:45:10,729 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 15:45:12,249 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 15:45:13,912 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 15:45:15,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 15:45:15,824 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 15:45:19,264 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 15:45:20,487 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 15:45:22,028 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 15:45:23,388 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 15:45:24,683 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 15:45:25,895 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 15:45:27,299 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 15:45:28,354 [WARNING] __main__ - Cell image not found: /tmp/tmp7p9w5xjt.jpg_rows/row_4/col_0.png -2025-03-04 15:45:28,354 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 
15:45:31,250 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 15:45:32,461 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 15:45:34,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 15:45:35,248 [WARNING] __main__ - Cell image not found: /tmp/tmp83wn5naq.jpg_rows/row_2/col_0.png -2025-03-04 15:45:35,248 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 15:45:38,582 [WARNING] __main__ - Cell image not found: /tmp/tmp2yankvwp.jpg_rows/row_0/col_0.png -2025-03-04 15:45:38,582 [WARNING] __main__ - Cell image not found: /tmp/tmp2yankvwp.jpg_rows/row_0/col_1.png -2025-03-04 15:45:38,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 15:45:39,738 [WARNING] __main__ - Cell image not found: /tmp/tmp2yankvwp.jpg_rows/row_1/col_1.png -2025-03-04 15:45:40,012 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 15:45:41,254 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 15:45:42,576 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 15:45:44,007 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 15:45:45,165 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 15:45:46,063 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 15:45:47,192 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 15:45:48,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 15:45:50,004 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 15:45:50,923 [WARNING] __main__ - Cell image not found: /tmp/tmpbto794g4.jpg_rows/row_2/col_0.png 
-2025-03-04 15:45:50,924 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 15:45:52,949 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 15:45:54,097 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 15:45:55,733 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 15:45:57,005 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 15:45:57,698 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 15:46:01,244 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 15:46:02,420 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 15:46:04,105 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 15:46:05,590 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 15:46:07,238 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 15:46:08,382 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 15:46:09,222 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 15:46:11,549 [WARNING] __main__ - Cell image not found: /tmp/tmpaw7kcrz0.jpg_rows/row_0/col_0.png -2025-03-04 15:46:11,550 [WARNING] __main__ - Cell image not found: /tmp/tmpaw7kcrz0.jpg_rows/row_0/col_1.png -2025-03-04 15:46:11,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 15:46:12,964 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 15:46:14,263 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 15:46:15,260 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png 
-2025-03-04 15:46:16,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 15:46:17,811 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 15:46:18,625 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 15:46:21,056 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 15:46:22,143 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 15:46:23,340 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 15:46:24,596 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 15:46:25,932 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 15:46:27,152 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 15:46:28,333 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 15:46:29,514 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 15:46:30,282 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 15:46:31,306 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 15:46:32,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 15:46:34,269 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 15:46:35,344 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:46:36,964 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:46:37,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 15:46:38,000 [INFO] __main__ - Processing 
table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 15:46:40,503 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 15:46:41,691 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 15:46:43,095 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c1.png -2025-03-04 15:46:44,434 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 15:46:45,336 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 15:46:48,012 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 15:46:49,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 15:46:50,868 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-04 15:46:52,065 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 15:46:53,034 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 15:46:56,122 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 15:46:57,193 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 15:46:58,373 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 15:46:59,942 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-04 15:47:01,171 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 15:47:02,323 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-04 15:47:03,731 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c1.png -2025-03-04 15:47:05,010 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 15:47:05,931 [INFO] __main__ - Processing 
table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 15:47:06,990 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 15:47:08,069 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 15:47:09,567 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 15:47:10,897 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 15:47:11,705 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 15:47:15,681 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 15:47:16,762 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 15:47:18,634 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 15:47:19,946 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 15:47:21,228 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-04 15:47:22,312 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 15:47:23,556 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-04 15:47:24,710 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 15:47:25,584 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 15:47:28,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 15:47:30,058 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 15:47:31,329 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-04 15:47:32,430 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 
15:47:33,984 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 15:47:34,242 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 15:47:35,512 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 15:47:36,669 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 15:47:37,917 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-04 15:47:39,213 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 15:47:40,627 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 15:47:41,853 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 15:47:42,656 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 15:47:45,896 [WARNING] __main__ - Cell image not found: /tmp/tmpsve2k44o.jpg_rows/row_0/col_0.png -2025-03-04 15:47:46,152 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 15:47:47,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 15:47:48,703 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-04 15:47:49,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 15:47:51,465 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 15:47:52,636 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 15:47:53,387 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 15:47:55,236 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 15:47:56,577 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 15:47:57,868 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-04 15:47:59,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 15:48:00,723 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 15:48:01,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 15:48:02,711 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 15:48:05,434 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 15:48:06,804 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 15:48:08,227 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-04 15:48:09,307 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 15:48:10,638 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 15:48:11,777 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 15:48:12,692 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 15:48:14,522 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 15:48:15,588 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 15:48:16,846 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 15:48:18,287 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 15:48:19,866 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c1.png -2025-03-04 15:48:20,869 [WARNING] __main__ - Cell image not found: /tmp/tmp4g3jjqpe.jpg_rows/row_3/col_0.png -2025-03-04 15:48:20,870 [INFO] __main__ - Processing table 
image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 15:48:24,308 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 15:48:25,890 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 16:03:17,179 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 16:03:18,019 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 16:03:18,021 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 16:03:18,325 [INFO] __main__ - Computed global offset: 0 -2025-03-04 16:03:18,326 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 16:04:01,319 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 16:04:02,043 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 16:04:02,530 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 16:04:03,057 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 16:04:03,381 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 16:04:03,816 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 16:04:04,400 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 16:04:04,909 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 16:04:05,301 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 16:04:05,593 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 16:04:06,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 16:04:06,567 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 16:04:07,102 
[INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 16:04:07,396 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 16:04:08,064 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-04 16:04:08,639 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 16:04:09,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 16:04:09,603 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 16:04:10,193 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 16:04:10,587 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 16:04:11,143 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 16:04:11,668 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 16:04:12,063 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 16:04:12,722 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 16:04:13,222 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 16:04:13,723 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 16:04:14,266 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 16:04:14,809 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 16:04:15,168 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 16:04:15,629 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 16:04:29,447 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 16:04:33,151 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 16:04:34,320 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 16:04:35,668 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 16:04:36,741 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 16:04:38,478 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 16:04:39,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 16:04:41,274 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 16:04:42,492 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 16:04:43,600 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 16:05:01,293 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 16:05:01,293 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 16:05:01,294 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 16:05:05,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 16:05:06,139 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 16:05:06,140 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 16:05:06,471 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 16:05:08,036 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 16:05:09,983 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 16:05:11,448 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 16:05:12,646 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 16:05:14,088 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 16:05:15,041 [WARNING] __main__ - Cell image not found: /tmp/tmpx1hjrs13.jpg_rows/row_4/col_0.png -2025-03-04 16:05:15,041 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 16:05:17,176 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 16:05:45,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 16:05:47,269 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 16:05:48,375 [WARNING] __main__ - Cell image not found: /tmp/tmp34e7emqv.jpg_rows/row_2/col_0.png -2025-03-04 16:05:48,376 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 16:05:50,843 [WARNING] __main__ - Cell image not found: /tmp/tmpa_s846g8.jpg_rows/row_0/col_0.png -2025-03-04 16:05:50,843 [WARNING] __main__ - Cell image not found: /tmp/tmpa_s846g8.jpg_rows/row_0/col_1.png -2025-03-04 16:05:51,130 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 16:05:52,027 [WARNING] __main__ - Cell image not found: /tmp/tmpa_s846g8.jpg_rows/row_1/col_1.png -2025-03-04 16:05:52,283 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 16:05:53,776 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 16:05:55,038 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 16:05:56,538 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 16:06:23,247 [INFO] __main__ - Uploaded 
to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 16:06:24,239 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 16:06:24,239 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 16:06:24,240 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 16:06:25,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 16:06:26,475 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 16:06:27,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 16:06:50,733 [WARNING] __main__ - Cell image not found: /tmp/tmp70so49qw.jpg_rows/row_2/col_0.png -2025-03-04 16:06:50,733 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 16:06:53,033 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 16:06:54,115 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 16:06:55,601 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 16:06:56,853 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 16:06:57,887 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 16:06:57,887 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 16:06:57,887 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 16:07:01,402 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 16:07:02,400 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 16:07:02,400 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_7.jpg_r0_c0.png 
-2025-03-04 16:07:02,744 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 16:07:04,356 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 16:07:05,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 16:07:07,097 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 16:07:09,236 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 16:07:10,106 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 16:07:10,106 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 16:07:10,106 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 16:07:12,358 [WARNING] __main__ - Cell image not found: /tmp/tmpk6x2ncvt.jpg_rows/row_0/col_0.png -2025-03-04 16:07:12,358 [WARNING] __main__ - Cell image not found: /tmp/tmpk6x2ncvt.jpg_rows/row_0/col_1.png -2025-03-04 16:07:12,622 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 16:07:13,644 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 16:07:13,645 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 16:07:13,919 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 16:07:15,216 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 16:07:16,322 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-04 16:07:17,573 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 16:07:18,755 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 16:07:19,769 [INFO] __main__ - Deleted from S3: 
/topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 16:07:19,769 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 16:07:19,770 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 16:07:21,920 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 16:07:23,109 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 16:07:24,492 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 16:07:25,709 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 16:07:27,108 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 16:07:28,348 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 16:07:29,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 16:07:30,743 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 16:07:31,711 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 16:07:31,712 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 16:07:31,712 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 16:07:32,719 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 16:07:33,732 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 16:07:33,732 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 16:07:33,985 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 16:07:35,294 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 16:07:36,334 
[ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 16:07:37,915 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 16:07:38,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 16:07:39,146 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 16:07:39,147 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 16:07:39,147 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 16:07:41,700 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 16:07:42,597 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 16:07:42,597 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 16:07:42,938 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 16:07:44,504 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c1.png -2025-03-04 16:07:45,787 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 16:07:46,740 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 16:07:46,740 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 16:07:46,740 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 16:07:49,383 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 16:07:50,335 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 16:07:50,335 [INFO] __main__ - Deleted empty cell image from S3: 
/topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 16:07:50,664 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 16:08:24,955 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-04 16:08:26,302 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 16:08:27,235 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 16:08:27,236 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 16:08:27,236 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 16:08:30,452 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 16:08:31,317 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 16:08:31,318 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 16:08:31,577 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 16:08:32,560 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 16:08:32,561 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 16:08:32,897 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 16:08:34,349 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-04 16:08:35,700 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 16:08:36,848 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 16:08:36,848 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 16:08:37,105 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png 
-2025-03-04 16:08:38,569 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c1.png -2025-03-04 16:09:07,706 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 16:09:08,764 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 16:09:08,765 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 16:09:08,765 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 16:09:09,918 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 16:09:11,070 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 16:09:11,070 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 16:09:11,332 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 16:09:12,851 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 16:09:14,040 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 16:09:15,159 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 16:09:15,159 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 16:09:15,160 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 16:09:18,791 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 16:09:20,084 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 16:09:53,993 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 16:09:55,306 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 16:09:56,701 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-04 16:10:45,161 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 16:10:46,603 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-04 16:10:47,763 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 16:10:48,717 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 16:10:48,717 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 16:10:48,718 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 16:10:52,043 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 16:10:53,182 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 16:10:54,673 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-04 16:10:55,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 16:10:57,230 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 16:11:51,406 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 16:11:52,824 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-04 16:11:54,109 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 16:11:55,701 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 16:12:58,650 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 16:14:12,385 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 16:14:12,385 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 16:14:12,385 [INFO] 
__main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 16:14:15,668 [WARNING] __main__ - Cell image not found: /tmp/tmpjbxpfrax.jpg_rows/row_0/col_0.png -2025-03-04 16:14:15,932 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 16:14:17,496 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 16:14:17,496 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 16:14:17,757 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 16:14:19,186 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-04 16:14:20,372 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 16:14:21,809 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 16:14:23,047 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 16:14:23,875 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 16:14:23,875 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 16:14:23,876 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 16:14:25,815 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 16:14:27,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 16:14:28,388 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-04 16:14:29,579 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 16:14:30,976 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 16:14:32,148 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 16:14:33,102 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 16:14:33,102 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 16:14:33,103 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 16:14:35,699 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 16:14:36,808 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 16:14:38,161 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-04 16:14:39,288 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 16:14:40,644 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 16:14:41,853 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 16:14:42,911 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 16:14:42,912 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 16:14:42,912 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 16:14:44,787 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 16:14:45,920 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 16:14:47,192 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 16:14:48,562 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 16:14:49,981 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c1.png -2025-03-04 16:14:50,977 [WARNING] __main__ - Cell image not found: /tmp/tmpg265bgw6.jpg_rows/row_3/col_0.png -2025-03-04 16:14:50,978 
[INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 16:14:53,847 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 16:14:54,827 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 16:14:54,827 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 16:14:55,202 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 16:14:56,926 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-04 16:14:58,282 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 16:14:59,389 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 16:14:59,389 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 16:14:59,389 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 16:15:02,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 16:15:03,294 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 16:15:04,939 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 16:15:06,157 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 16:15:07,359 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 16:15:07,359 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 16:15:07,359 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 16:15:09,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 16:15:10,689 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 16:15:12,173 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-04 16:15:13,139 [WARNING] __main__ - Cell image not found: /tmp/tmp2pu0ajzo.jpg_rows/row_2/col_0.png -2025-03-04 16:15:13,140 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 16:15:16,860 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 16:15:17,764 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 16:15:17,764 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 16:15:18,157 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-04 16:15:19,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-04 16:15:21,161 [WARNING] __main__ - Cell image not found: /tmp/tmprpd5elhh.jpg_rows/row_2/col_0.png -2025-03-04 16:15:21,161 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-04 16:15:23,636 [WARNING] __main__ - Cell image not found: /tmp/tmpg3004neh.jpg_rows/row_0/col_0.png -2025-03-04 16:15:23,898 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 16:15:25,193 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 16:15:26,896 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c1.png -2025-03-04 16:17:10,247 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 16:17:11,142 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 16:17:11,142 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 16:17:11,142 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three 
-2025-03-04 16:17:13,261 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 16:17:14,191 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 16:17:14,192 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 16:17:14,531 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 16:17:16,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-04 16:17:17,185 [WARNING] __main__ - Cell image not found: /tmp/tmppqjti3ld.jpg_rows/row_2/col_0.png -2025-03-04 16:17:17,185 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 16:17:20,592 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-04 16:17:21,798 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 16:17:23,372 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-04 16:17:25,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 16:17:26,720 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c1.png -2025-03-04 16:17:27,948 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 16:17:29,024 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 16:17:29,024 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 16:17:29,025 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 16:17:31,697 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 16:17:33,119 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 16:17:34,654 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_28.jpg_r1_c1.png -2025-03-04 16:17:36,121 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 16:17:36,990 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 16:17:36,991 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 16:17:36,991 [INFO] __main__ - Processing table image: /topic-extraction/img_29.jpg, columns=three -2025-03-04 16:17:38,977 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 16:17:40,031 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c0.png -2025-03-04 16:17:41,536 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c1.png -2025-03-04 16:17:42,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 16:17:43,678 [INFO] __main__ - Deleted from S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 16:17:43,678 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 16:17:43,682 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/_subtopics.json -2025-03-04 16:17:43,995 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 16:17:44,000 [INFO] __main__ - Processing completed successfully. 
-2025-03-04 16:39:05,313 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 16:39:06,086 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 16:39:06,088 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 16:39:06,326 [INFO] __main__ - Computed global offset: 0 -2025-03-04 16:39:06,326 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 16:39:49,136 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 16:39:49,708 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 16:39:50,157 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 16:39:50,688 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 16:39:51,083 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 16:39:51,533 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 16:39:52,100 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 16:39:52,532 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 16:39:52,942 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 16:39:53,244 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 16:39:53,742 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 16:39:54,213 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 16:39:54,761 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 16:39:55,050 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 16:39:55,740 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_15.jpg -2025-03-04 16:39:56,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 16:39:56,780 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 16:39:57,175 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 16:39:57,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 16:39:58,140 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 16:39:58,682 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 16:39:59,190 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 16:39:59,577 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 16:40:00,229 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 16:40:00,732 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 16:40:01,136 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 16:40:01,706 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 16:40:02,236 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 16:40:02,621 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 16:40:03,051 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 16:40:06,927 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 16:40:10,403 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 16:40:11,481 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 16:40:12,796 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 16:40:13,767 [ERROR] __main__ - Gemini subtopic identification error on attempt 0: Expecting value: line 1 column 1 (char 0) -2025-03-04 16:40:15,308 [ERROR] __main__ - Gemini subtopic identification error on attempt 1: Expecting value: line 1 column 1 (char 0) -2025-03-04 16:40:15,585 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 16:40:18,265 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 16:40:19,708 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 16:40:20,908 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 16:40:22,033 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 16:40:22,999 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 16:40:22,999 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 16:40:26,396 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 16:40:27,834 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 16:40:29,314 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 16:40:30,652 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 16:40:32,068 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 16:40:33,239 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 16:40:34,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 16:40:35,597 [WARNING] __main__ - Cell image not found: /tmp/tmpkr6p74mz.jpg_rows/row_4/col_0.png -2025-03-04 16:40:35,598 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 16:40:38,470 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 16:40:39,732 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 16:40:41,236 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 16:40:42,293 [WARNING] __main__ - Cell image not found: /tmp/tmp_4ioykgq.jpg_rows/row_2/col_0.png -2025-03-04 16:40:42,293 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 16:40:44,974 [WARNING] __main__ - Cell image not found: /tmp/tmp8qnr07bo.jpg_rows/row_0/col_0.png -2025-03-04 16:40:44,974 [WARNING] __main__ - Cell image not found: /tmp/tmp8qnr07bo.jpg_rows/row_0/col_1.png -2025-03-04 16:40:45,250 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 16:40:46,109 [WARNING] __main__ - Cell image not found: /tmp/tmp8qnr07bo.jpg_rows/row_1/col_1.png -2025-03-04 16:40:46,385 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 16:40:47,759 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 16:40:58,221 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 16:40:59,680 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 16:41:00,920 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 16:41:01,935 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 
16:41:01,936 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 16:41:03,077 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 16:41:04,211 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 16:41:05,575 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 16:41:06,564 [WARNING] __main__ - Cell image not found: /tmp/tmph28hdp5v.jpg_rows/row_2/col_0.png -2025-03-04 16:41:06,564 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 16:41:08,734 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 16:41:09,813 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 16:41:11,241 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 16:41:12,566 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 16:41:13,534 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 16:41:13,535 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 16:41:17,251 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 16:41:18,440 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 16:41:20,101 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 16:41:21,465 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 16:41:22,836 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 16:41:24,168 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 16:41:25,738 [INFO] __main__ - Deleted empty cell image from S3: 
/topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 16:41:25,738 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 16:41:28,044 [WARNING] __main__ - Cell image not found: /tmp/tmp2s0xxpac.jpg_rows/row_0/col_0.png -2025-03-04 16:41:28,044 [WARNING] __main__ - Cell image not found: /tmp/tmp2s0xxpac.jpg_rows/row_0/col_1.png -2025-03-04 16:41:28,317 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 16:41:29,465 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 16:41:30,769 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 16:41:32,249 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-04 16:41:33,665 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 16:41:34,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 16:41:35,736 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 16:41:35,737 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 16:41:38,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 16:41:39,194 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 16:41:40,315 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 16:41:41,569 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 16:41:42,914 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 16:41:44,447 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 16:41:45,567 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 16:41:46,689 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 16:41:47,934 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 16:41:47,935 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 16:41:48,856 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 16:41:49,986 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 16:41:51,601 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 16:41:52,769 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 16:41:53,870 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 16:41:53,871 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 16:41:56,443 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 16:41:57,555 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 16:41:59,137 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c1.png -2025-03-04 16:42:00,476 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 16:42:01,609 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 16:42:01,610 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 16:42:04,361 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 16:42:05,501 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 16:42:07,023 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-04 16:42:08,304 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png 
-2025-03-04 16:42:09,424 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 16:42:09,425 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 16:42:12,614 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 16:42:13,436 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 16:42:13,691 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 16:42:14,930 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 16:42:16,395 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-04 16:42:17,794 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 16:42:18,797 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 16:42:19,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-04 16:42:20,378 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c1.png -2025-03-04 16:42:21,565 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 16:42:22,635 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 16:42:22,635 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 16:42:23,713 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 16:42:24,787 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 16:42:26,077 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 16:42:27,195 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 16:42:28,273 [INFO] __main__ 
- Deleted empty cell image from S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 16:42:28,274 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 16:42:32,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 16:42:33,414 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 16:42:35,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 16:42:36,362 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 16:42:37,790 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-04 16:42:38,877 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 16:42:40,011 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-04 16:42:41,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 16:42:42,019 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 16:42:42,020 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 16:42:45,163 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 16:42:46,253 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 16:42:47,665 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-04 16:42:48,812 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 16:42:50,033 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 16:42:51,432 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 16:42:52,858 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-04 
16:42:54,216 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 16:42:55,778 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 16:42:56,931 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 16:42:57,851 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 16:42:57,851 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 16:43:01,201 [WARNING] __main__ - Cell image not found: /tmp/tmpdmvh3rc8.jpg_rows/row_0/col_0.png -2025-03-04 16:43:01,475 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 16:43:02,567 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 16:43:04,176 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-04 16:43:05,365 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 16:43:06,802 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 16:43:07,969 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 16:43:08,946 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 16:43:08,946 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 16:43:10,806 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 16:43:11,925 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 16:43:13,286 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-04 16:43:14,651 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 16:43:16,070 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 16:43:17,209 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 16:43:18,206 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 16:43:18,206 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 16:43:21,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 16:43:22,479 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 16:43:24,003 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-04 16:43:25,317 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 16:43:26,815 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 16:43:28,078 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 16:43:29,086 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 16:43:29,086 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 16:43:30,918 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 16:43:32,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 16:43:33,282 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 16:43:34,592 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 16:43:36,080 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c1.png -2025-03-04 16:43:37,530 [WARNING] __main__ - Cell image not found: /tmp/tmp6_d2lvpn.jpg_rows/row_3/col_0.png -2025-03-04 16:43:37,531 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 
16:43:40,529 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 16:43:41,854 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 16:43:43,415 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-04 16:43:45,170 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 16:43:46,291 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 16:43:46,292 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 16:43:48,973 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 16:43:50,109 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 16:43:51,618 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 16:43:52,724 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 16:43:52,904 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 16:43:54,163 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 16:43:54,163 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 16:43:56,200 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 16:43:57,589 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 16:43:59,010 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-04 16:44:00,106 [WARNING] __main__ - Cell image not found: /tmp/tmp5l7mn427.jpg_rows/row_2/col_0.png -2025-03-04 16:44:00,107 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 16:44:03,906 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 16:44:05,120 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-04 16:44:06,699 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-04 16:44:08,013 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-04 16:44:08,014 [WARNING] __main__ - Cell image not found: /tmp/tmp7f4e012v.jpg_rows/row_2/col_0.png -2025-03-04 16:44:08,019 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-04 16:44:10,562 [WARNING] __main__ - Cell image not found: /tmp/tmpsr1107vb.jpg_rows/row_0/col_0.png -2025-03-04 16:44:10,823 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 16:44:12,067 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 16:44:13,630 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c1.png -2025-03-04 16:44:15,001 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 16:44:16,162 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 16:44:16,163 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-04 16:44:18,257 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 16:44:19,367 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 16:44:20,866 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-04 16:44:22,029 [WARNING] __main__ - Cell image not found: /tmp/tmpm4jr09co.jpg_rows/row_2/col_0.png -2025-03-04 16:44:22,030 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 16:44:25,458 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png 
-2025-03-04 16:44:26,636 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 16:44:28,117 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-04 16:44:29,316 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 16:44:30,892 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c1.png -2025-03-04 16:44:32,031 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 16:44:32,983 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 16:44:32,984 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 16:44:35,702 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 16:44:37,077 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 16:44:38,586 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c1.png -2025-03-04 16:44:40,000 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 16:44:41,005 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 16:44:41,006 [INFO] __main__ - Processing table image: /topic-extraction/img_29.jpg, columns=three -2025-03-04 16:44:42,801 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 16:44:43,877 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c0.png -2025-03-04 16:44:45,297 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c1.png -2025-03-04 16:44:46,572 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 16:44:47,560 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 16:44:47,564 [INFO] __main__ - Final subtopics 
JSON saved locally at /home/user/app/pearson_json/_subtopics.json -2025-03-04 16:44:47,893 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 16:44:47,898 [INFO] __main__ - Processing completed successfully. -2025-03-04 17:13:14,000 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 17:13:14,813 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 17:13:14,814 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 17:13:15,049 [INFO] __main__ - Computed global offset: 0 -2025-03-04 17:13:15,049 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 17:13:55,840 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 17:13:56,487 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 17:13:56,943 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 17:13:57,441 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 17:13:57,816 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 17:13:58,206 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 17:13:58,724 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 17:13:59,172 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 17:13:59,579 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 17:13:59,870 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 17:14:00,375 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 17:14:00,860 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 17:14:01,418 [INFO] 
__main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 17:14:01,705 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 17:14:02,299 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_15.jpg -2025-03-04 17:14:02,835 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 17:14:03,343 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 17:14:03,722 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 17:14:04,256 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 17:14:04,622 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 17:14:05,155 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 17:14:05,630 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 17:14:06,016 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 17:14:06,624 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 17:14:07,057 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 17:14:07,468 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 17:14:08,013 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 17:14:08,559 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 17:14:08,944 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 17:14:09,370 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 17:14:13,356 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 17:14:16,548 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 17:14:17,824 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 17:14:19,207 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c1.png -2025-03-04 17:14:20,785 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 17:14:22,337 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c1.png -2025-03-04 17:14:24,117 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 17:14:25,468 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c1.png -2025-03-04 17:14:26,797 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 17:14:27,715 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 17:14:27,715 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 17:14:31,016 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 17:14:32,468 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 17:14:34,010 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c1.png -2025-03-04 17:14:37,127 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 17:14:38,574 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c1.png -2025-03-04 17:14:40,014 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 17:14:41,453 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c1.png -2025-03-04 17:14:43,026 [WARNING] __main__ - Cell image not found: /tmp/tmpgz3m3b9n.jpg_rows/row_4/col_0.png 
-2025-03-04 17:14:43,026 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 17:14:45,066 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 17:14:46,513 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 17:14:48,054 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c1.png -2025-03-04 17:14:49,517 [WARNING] __main__ - Cell image not found: /tmp/tmpkn9damk4.jpg_rows/row_2/col_0.png -2025-03-04 17:14:49,518 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 17:14:51,857 [WARNING] __main__ - Cell image not found: /tmp/tmp_7v9cvwb.jpg_rows/row_0/col_0.png -2025-03-04 17:14:51,857 [WARNING] __main__ - Cell image not found: /tmp/tmp_7v9cvwb.jpg_rows/row_0/col_1.png -2025-03-04 17:14:52,128 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 17:14:53,246 [WARNING] __main__ - Cell image not found: /tmp/tmp_7v9cvwb.jpg_rows/row_1/col_1.png -2025-03-04 17:14:53,522 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 17:14:54,896 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c1.png -2025-03-04 17:14:56,522 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 17:14:57,958 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c1.png -2025-03-04 17:14:59,510 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 17:15:00,493 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 17:15:00,494 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 17:15:01,571 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 17:15:02,997 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 17:15:04,424 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c1.png -2025-03-04 17:15:05,776 [WARNING] __main__ - Cell image not found: /tmp/tmpr68pawul.jpg_rows/row_2/col_0.png -2025-03-04 17:15:05,776 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 17:15:07,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 17:15:09,073 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 17:15:10,711 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c1.png -2025-03-04 17:15:12,507 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 17:15:13,630 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 17:15:13,631 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 17:15:16,878 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 17:15:18,344 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 17:15:19,949 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c1.png -2025-03-04 17:15:22,552 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 17:15:23,888 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c1.png -2025-03-04 17:15:25,222 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 17:15:26,200 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 17:15:26,200 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 17:15:28,378 [WARNING] __main__ - Cell image not found: /tmp/tmpeauayzcm.jpg_rows/row_0/col_0.png -2025-03-04 17:15:28,378 
[WARNING] __main__ - Cell image not found: /tmp/tmpeauayzcm.jpg_rows/row_0/col_1.png -2025-03-04 17:15:28,642 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 17:15:30,092 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 17:15:31,485 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c1.png -2025-03-04 17:15:33,367 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-04 17:15:34,783 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c1.png -2025-03-04 17:15:36,384 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 17:15:37,395 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 17:15:37,396 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 17:15:39,469 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 17:15:40,865 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 17:15:42,177 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c1.png -2025-03-04 17:15:43,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 17:15:45,111 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c1.png -2025-03-04 17:15:47,334 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 17:15:48,513 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c1.png -2025-03-04 17:15:49,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 17:15:50,582 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 17:15:50,582 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three 
-2025-03-04 17:15:51,570 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 17:15:53,041 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 17:15:54,468 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c1.png -2025-03-04 17:15:55,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 17:15:56,966 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 17:15:56,967 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 17:15:59,374 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 17:16:00,804 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 17:16:02,580 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c1.png -2025-03-04 17:16:04,173 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 17:16:05,062 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 17:16:05,062 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 17:16:07,653 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 17:16:09,201 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 17:16:10,928 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c1.png -2025-03-04 17:16:12,739 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 17:16:13,735 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 17:16:13,735 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 17:16:16,756 [INFO] __main__ - Uploaded 
to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 17:16:17,689 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 17:16:17,947 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 17:16:19,521 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 17:16:21,310 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c1.png -2025-03-04 17:16:23,370 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 17:16:24,380 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 17:16:24,634 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png -2025-03-04 17:16:26,009 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c1.png -2025-03-04 17:16:27,859 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 17:16:28,943 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 17:16:28,943 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 17:16:30,062 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 17:16:31,485 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 17:16:32,831 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c1.png -2025-03-04 17:16:34,357 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 17:16:35,420 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 17:16:35,420 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 17:16:38,997 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 17:16:40,384 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 17:16:42,086 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c1.png -2025-03-04 17:16:43,960 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 17:16:45,362 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c1.png -2025-03-04 17:16:47,152 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 17:16:48,540 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c1.png -2025-03-04 17:16:49,983 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 17:16:51,054 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 17:16:51,054 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 17:16:54,130 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 17:16:55,482 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 17:16:56,770 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c1.png -2025-03-04 17:16:58,649 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 17:16:59,944 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c1.png -2025-03-04 17:17:01,397 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 17:17:02,716 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c1.png -2025-03-04 17:17:04,306 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 17:17:05,735 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c1.png -2025-03-04 17:17:07,796 [INFO] __main__ - Uploaded to 
S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 17:17:08,712 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 17:17:08,712 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 17:17:11,791 [WARNING] __main__ - Cell image not found: /tmp/tmp8t199g9l.jpg_rows/row_0/col_0.png -2025-03-04 17:17:12,053 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 17:17:13,388 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 17:17:14,808 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c1.png -2025-03-04 17:17:16,828 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 17:17:18,305 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c1.png -2025-03-04 17:17:20,126 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 17:17:21,082 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 17:17:21,082 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 17:17:22,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 17:17:24,349 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 17:17:25,674 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c1.png -2025-03-04 17:17:27,576 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 17:17:28,875 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c1.png -2025-03-04 17:17:30,338 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 17:17:31,459 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_18.jpg_r3_c0.png 
-2025-03-04 17:17:31,460 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 17:17:33,895 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 17:17:35,505 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 17:17:36,920 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c1.png -2025-03-04 17:17:38,707 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 17:17:40,159 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c1.png -2025-03-04 17:17:42,150 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 17:17:43,069 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 17:17:43,069 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 17:17:44,770 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 17:17:46,112 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 17:17:47,369 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c1.png -2025-03-04 17:17:48,764 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 17:17:50,279 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c1.png -2025-03-04 17:17:52,008 [WARNING] __main__ - Cell image not found: /tmp/tmpyuhd9sl8.jpg_rows/row_3/col_0.png -2025-03-04 17:17:52,009 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 17:17:54,856 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 17:17:56,238 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 17:17:58,121 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_21.jpg_r1_c1.png -2025-03-04 17:18:00,408 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 17:18:01,418 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 17:18:01,418 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 17:18:03,917 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 17:18:05,292 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 17:18:07,082 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c1.png -2025-03-04 17:18:08,934 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 17:18:10,012 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 17:18:10,012 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 17:18:11,952 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 17:18:13,275 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 17:18:14,714 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c1.png -2025-03-04 17:18:16,140 [WARNING] __main__ - Cell image not found: /tmp/tmp91opcy4g.jpg_rows/row_2/col_0.png -2025-03-04 17:18:16,140 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 17:18:19,748 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 17:18:21,092 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-04 17:18:23,324 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r1_c1.png -2025-03-04 17:18:26,880 [WARNING] __main__ - Cell image not found: /tmp/tmpigalpv91.jpg_rows/row_2/col_0.png -2025-03-04 
17:18:26,880 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-04 17:18:29,208 [WARNING] __main__ - Cell image not found: /tmp/tmppaoedyal.jpg_rows/row_0/col_0.png -2025-03-04 17:18:29,475 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 17:18:30,947 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 17:18:33,064 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c1.png -2025-03-04 17:18:36,316 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 17:18:37,482 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 17:18:37,483 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-04 17:18:39,543 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 17:18:40,901 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 17:18:42,749 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c1.png -2025-03-04 17:18:44,332 [WARNING] __main__ - Cell image not found: /tmp/tmp22n5c_8q.jpg_rows/row_2/col_0.png -2025-03-04 17:18:44,332 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 17:18:47,634 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-04 17:18:49,048 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 17:18:50,572 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c1.png -2025-03-04 17:18:52,196 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 17:18:53,636 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r2_c1.png -2025-03-04 17:18:55,054 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 17:18:56,002 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 17:18:56,003 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 17:18:58,520 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 17:18:59,970 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 17:19:01,773 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c1.png -2025-03-04 17:19:03,587 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 17:19:04,755 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 17:19:04,755 [INFO] __main__ - Processing table image: /topic-extraction/img_29.jpg, columns=three -2025-03-04 17:19:06,526 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 17:19:07,817 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c0.png -2025-03-04 17:19:09,284 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c1.png -2025-03-04 17:19:10,915 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 17:19:11,969 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 17:19:11,972 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/_subtopics.json -2025-03-04 17:19:12,278 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 17:19:12,283 [INFO] __main__ - Processing completed successfully. 
-2025-03-04 17:28:37,803 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 17:28:38,622 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 17:28:38,624 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 17:28:38,859 [INFO] __main__ - Computed global offset: 0 -2025-03-04 17:28:38,860 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 17:29:19,633 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 17:29:20,237 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 17:29:20,620 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 17:29:21,124 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 17:29:21,413 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 17:29:21,792 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 17:29:22,350 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 17:29:22,827 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 17:29:23,260 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 17:29:23,574 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 17:29:24,083 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 17:29:24,602 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 17:29:25,141 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 17:29:25,442 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 17:29:26,082 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_15.jpg -2025-03-04 17:29:26,668 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 17:29:27,176 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 17:29:27,575 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 17:29:28,110 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 17:29:28,509 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 17:29:29,046 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 17:29:29,553 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 17:29:29,936 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 17:29:30,523 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 17:29:31,034 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 17:29:31,417 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 17:29:31,991 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 17:29:32,506 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 17:29:32,884 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 17:29:33,308 [INFO] __main__ - Classifying images to detect tables. -2025-03-04 17:59:52,883 [INFO] __main__ - GPU memory cleaned up. 
-2025-03-04 18:24:55,659 [INFO] __main__ - Processing PDF: /home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf -2025-03-04 18:24:56,486 [INFO] __main__ - Gemini returned subtopics: {'2.1AS units': [7, 22], '2.2A2 units': [23, 43]} -2025-03-04 18:24:56,487 [INFO] __main__ - Loaded 3543551 bytes from local file '/home/user/app/input_output/wjec-gce-as-a-economics-specification-from-2015.pdf' -2025-03-04 18:24:56,724 [INFO] __main__ - Computed global offset: 0 -2025-03-04 18:24:56,725 [INFO] __main__ - Processing pages (0-based): [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42] -2025-03-04 18:26:37,627 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_1.jpg -2025-03-04 18:26:38,287 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_2.jpg -2025-03-04 18:26:38,720 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_3.jpg -2025-03-04 18:26:39,215 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_4.jpg -2025-03-04 18:26:39,531 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_5.jpg -2025-03-04 18:26:39,917 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_6.jpg -2025-03-04 18:26:40,490 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_7.jpg -2025-03-04 18:26:40,968 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_8.jpg -2025-03-04 18:26:41,372 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_9.jpg -2025-03-04 18:26:41,675 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_10.jpg -2025-03-04 18:26:42,251 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_11.jpg -2025-03-04 18:26:42,757 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_12.jpg -2025-03-04 18:26:43,326 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_13.jpg -2025-03-04 18:26:43,626 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_14.jpg -2025-03-04 18:26:44,254 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/img_15.jpg -2025-03-04 18:26:44,797 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_16.jpg -2025-03-04 18:26:45,300 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_17.jpg -2025-03-04 18:26:45,689 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_18.jpg -2025-03-04 18:26:46,237 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_19.jpg -2025-03-04 18:26:46,642 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_20.jpg -2025-03-04 18:26:47,162 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_21.jpg -2025-03-04 18:26:47,668 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_22.jpg -2025-03-04 18:26:48,043 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_23.jpg -2025-03-04 18:26:48,639 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_24.jpg -2025-03-04 18:26:49,154 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_25.jpg -2025-03-04 18:26:49,534 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_26.jpg -2025-03-04 18:26:50,096 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_27.jpg -2025-03-04 18:26:50,670 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_28.jpg -2025-03-04 18:26:51,044 [INFO] __main__ - Uploaded to S3: /topic-extraction/img_29.jpg -2025-03-04 18:26:51,475 [INFO] __main__ - Classifying images to detect tables. 
-2025-03-04 18:26:56,074 [INFO] __main__ - Processing table image: /topic-extraction/img_1.jpg, columns=three -2025-03-04 18:26:59,389 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 18:27:00,348 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_1.jpg_r0_c0.png -2025-03-04 18:27:00,601 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r1_c0.png -2025-03-04 18:27:10,689 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r2_c0.png -2025-03-04 18:27:11,820 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r3_c0.png -2025-03-04 18:27:12,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 18:27:13,889 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_1.jpg_r4_c0.png -2025-03-04 18:27:13,890 [INFO] __main__ - Processing table image: /topic-extraction/img_2.jpg, columns=three -2025-03-04 18:27:17,341 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r0_c0.png -2025-03-04 18:27:18,536 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r1_c0.png -2025-03-04 18:27:19,842 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r2_c0.png -2025-03-04 18:27:20,887 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_2.jpg_r3_c0.png -2025-03-04 18:27:22,626 [WARNING] __main__ - Cell image not found: /tmp/tmpns_p2pw7.jpg_rows/row_4/col_0.png -2025-03-04 18:27:22,626 [INFO] __main__ - Processing table image: /topic-extraction/img_3.jpg, columns=three -2025-03-04 18:27:24,756 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 18:27:25,630 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_3.jpg_r0_c0.png -2025-03-04 18:27:25,976 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_3.jpg_r1_c0.png -2025-03-04 18:27:26,909 [WARNING] __main__ - Cell image not 
found: /tmp/tmpmkqp5iik.jpg_rows/row_2/col_0.png -2025-03-04 18:27:26,910 [INFO] __main__ - Processing table image: /topic-extraction/img_4.jpg, columns=three -2025-03-04 18:27:29,569 [WARNING] __main__ - Cell image not found: /tmp/tmpnakrpg49.jpg_rows/row_0/col_0.png -2025-03-04 18:27:29,569 [WARNING] __main__ - Cell image not found: /tmp/tmpnakrpg49.jpg_rows/row_0/col_1.png -2025-03-04 18:27:29,835 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 18:27:30,823 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_4.jpg_r1_c0.png -2025-03-04 18:27:30,823 [WARNING] __main__ - Cell image not found: /tmp/tmpnakrpg49.jpg_rows/row_1/col_1.png -2025-03-04 18:27:31,085 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r2_c0.png -2025-03-04 18:27:33,674 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r3_c0.png -2025-03-04 18:27:34,672 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 18:27:35,592 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_4.jpg_r4_c0.png -2025-03-04 18:27:35,593 [INFO] __main__ - Processing table image: /topic-extraction/img_5.jpg, columns=three -2025-03-04 18:27:36,679 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 18:27:37,655 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_5.jpg_r0_c0.png -2025-03-04 18:27:37,997 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_5.jpg_r1_c0.png -2025-03-04 18:27:38,787 [WARNING] __main__ - Cell image not found: /tmp/tmp59baffv6.jpg_rows/row_2/col_0.png -2025-03-04 18:27:38,787 [INFO] __main__ - Processing table image: /topic-extraction/img_6.jpg, columns=three -2025-03-04 18:27:40,808 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 18:27:41,806 [INFO] __main__ - Deleted empty cell image from S3: 
/topic-extraction/cells/img_6.jpg_r0_c0.png -2025-03-04 18:27:42,094 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r1_c0.png -2025-03-04 18:27:43,132 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 18:27:44,097 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_6.jpg_r2_c0.png -2025-03-04 18:27:44,097 [INFO] __main__ - Processing table image: /topic-extraction/img_7.jpg, columns=three -2025-03-04 18:27:47,411 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 18:27:48,353 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_7.jpg_r0_c0.png -2025-03-04 18:27:48,705 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r1_c0.png -2025-03-04 18:27:49,963 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r2_c0.png -2025-03-04 18:27:50,936 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 18:27:52,024 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_7.jpg_r3_c0.png -2025-03-04 18:27:52,025 [INFO] __main__ - Processing table image: /topic-extraction/img_8.jpg, columns=three -2025-03-04 18:27:54,377 [WARNING] __main__ - Cell image not found: /tmp/tmpsppe7tt4.jpg_rows/row_0/col_0.png -2025-03-04 18:27:54,378 [WARNING] __main__ - Cell image not found: /tmp/tmpsppe7tt4.jpg_rows/row_0/col_1.png -2025-03-04 18:27:54,639 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 18:27:55,574 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_8.jpg_r1_c0.png -2025-03-04 18:27:55,856 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r2_c0.png -2025-03-04 18:27:56,935 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r3_c0.png -2025-03-04 18:27:57,936 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_8.jpg_r4_c0.png 
-2025-03-04 18:27:58,830 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_8.jpg_r4_c0.png -2025-03-04 18:27:58,830 [INFO] __main__ - Processing table image: /topic-extraction/img_9.jpg, columns=three -2025-03-04 18:28:00,927 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 18:28:01,839 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_9.jpg_r0_c0.png -2025-03-04 18:28:02,124 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r1_c0.png -2025-03-04 18:28:03,147 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r2_c0.png -2025-03-04 18:28:04,318 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r3_c0.png -2025-03-04 18:28:05,234 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 18:28:06,333 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_9.jpg_r4_c0.png -2025-03-04 18:28:06,333 [INFO] __main__ - Processing table image: /topic-extraction/img_10.jpg, columns=three -2025-03-04 18:28:07,300 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 18:28:08,246 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_10.jpg_r0_c0.png -2025-03-04 18:28:08,508 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r1_c0.png -2025-03-04 18:28:09,569 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 18:28:10,602 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_10.jpg_r2_c0.png -2025-03-04 18:28:10,603 [INFO] __main__ - Processing table image: /topic-extraction/img_11.jpg, columns=three -2025-03-04 18:28:13,214 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r0_c0.png -2025-03-04 18:28:14,131 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_11.jpg_r0_c0.png 
-2025-03-04 18:28:14,477 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r1_c0.png -2025-03-04 18:28:15,765 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 18:28:16,868 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_11.jpg_r2_c0.png -2025-03-04 18:28:16,869 [INFO] __main__ - Processing table image: /topic-extraction/img_12.jpg, columns=three -2025-03-04 18:28:19,488 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 18:28:20,477 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_12.jpg_r0_c0.png -2025-03-04 18:28:20,850 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r1_c0.png -2025-03-04 18:28:21,976 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 18:28:22,922 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_12.jpg_r2_c0.png -2025-03-04 18:28:22,923 [INFO] __main__ - Processing table image: /topic-extraction/img_13.jpg, columns=three -2025-03-04 18:28:26,026 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 18:28:26,939 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r0_c0.png -2025-03-04 18:28:27,213 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 18:28:28,270 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r0_c1.png -2025-03-04 18:28:28,611 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r1_c0.png -2025-03-04 18:28:29,683 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 18:28:30,673 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r2_c0.png -2025-03-04 18:28:30,933 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r3_c0.png 
-2025-03-04 18:28:31,996 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 18:28:32,949 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_13.jpg_r4_c0.png -2025-03-04 18:28:32,950 [INFO] __main__ - Processing table image: /topic-extraction/img_14.jpg, columns=three -2025-03-04 18:28:34,332 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 18:28:35,272 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_14.jpg_r0_c0.png -2025-03-04 18:28:35,541 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r1_c0.png -2025-03-04 18:28:36,537 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 18:28:37,794 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_14.jpg_r2_c0.png -2025-03-04 18:28:37,794 [INFO] __main__ - Processing table image: /topic-extraction/img_15.jpg, columns=three -2025-03-04 18:28:43,119 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 18:28:44,084 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_15.jpg_r0_c0.png -2025-03-04 18:28:44,353 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r1_c0.png -2025-03-04 18:28:45,692 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r2_c0.png -2025-03-04 18:28:46,679 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r3_c0.png -2025-03-04 18:28:47,545 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 18:28:48,749 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_15.jpg_r4_c0.png -2025-03-04 18:28:48,749 [INFO] __main__ - Processing table image: /topic-extraction/img_16.jpg, columns=three -2025-03-04 18:28:51,810 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 
18:28:52,802 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_16.jpg_r0_c0.png -2025-03-04 18:28:53,064 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r1_c0.png -2025-03-04 18:28:54,144 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r2_c0.png -2025-03-04 18:28:55,133 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r3_c0.png -2025-03-04 18:28:57,845 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r4_c0.png -2025-03-04 18:28:58,855 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 18:28:59,722 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_16.jpg_r5_c0.png -2025-03-04 18:28:59,722 [INFO] __main__ - Processing table image: /topic-extraction/img_17.jpg, columns=three -2025-03-04 18:29:02,875 [WARNING] __main__ - Cell image not found: /tmp/tmp0emfx_zt.jpg_rows/row_0/col_0.png -2025-03-04 18:29:03,148 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 18:29:04,098 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_17.jpg_r1_c0.png -2025-03-04 18:29:04,361 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r2_c0.png -2025-03-04 18:29:05,885 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r3_c0.png -2025-03-04 18:29:06,881 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 18:29:07,738 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_17.jpg_r4_c0.png -2025-03-04 18:29:07,739 [INFO] __main__ - Processing table image: /topic-extraction/img_18.jpg, columns=three -2025-03-04 18:29:09,552 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r0_c0.png -2025-03-04 18:29:10,757 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r1_c0.png -2025-03-04 18:29:11,784 [INFO] __main__ - 
Uploaded to S3: /topic-extraction/cells/img_18.jpg_r2_c0.png -2025-03-04 18:29:12,800 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 18:29:13,609 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_18.jpg_r3_c0.png -2025-03-04 18:29:13,610 [INFO] __main__ - Processing table image: /topic-extraction/img_19.jpg, columns=three -2025-03-04 18:29:16,305 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 18:29:17,210 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_19.jpg_r0_c0.png -2025-03-04 18:29:17,472 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r1_c0.png -2025-03-04 18:29:18,587 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r2_c0.png -2025-03-04 18:29:19,610 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 18:29:20,792 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_19.jpg_r3_c0.png -2025-03-04 18:29:20,792 [INFO] __main__ - Processing table image: /topic-extraction/img_20.jpg, columns=three -2025-03-04 18:29:22,579 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 18:29:23,599 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_20.jpg_r0_c0.png -2025-03-04 18:29:23,861 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r1_c0.png -2025-03-04 18:29:24,796 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_20.jpg_r2_c0.png -2025-03-04 18:29:25,612 [WARNING] __main__ - Cell image not found: /tmp/tmpmxenc_0d.jpg_rows/row_3/col_0.png -2025-03-04 18:29:25,613 [INFO] __main__ - Processing table image: /topic-extraction/img_21.jpg, columns=three -2025-03-04 18:29:28,446 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 18:29:29,404 [INFO] __main__ - Deleted empty cell image 
from S3: /topic-extraction/cells/img_21.jpg_r0_c0.png -2025-03-04 18:29:29,814 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r1_c0.png -2025-03-04 18:29:30,864 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 18:29:31,899 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_21.jpg_r2_c0.png -2025-03-04 18:29:31,899 [INFO] __main__ - Processing table image: /topic-extraction/img_22.jpg, columns=three -2025-03-04 18:29:34,452 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 18:29:35,395 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_22.jpg_r0_c0.png -2025-03-04 18:29:35,740 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r1_c0.png -2025-03-04 18:29:36,880 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 18:29:37,830 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_22.jpg_r2_c0.png -2025-03-04 18:29:37,830 [INFO] __main__ - Processing table image: /topic-extraction/img_23.jpg, columns=three -2025-03-04 18:29:39,773 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 18:29:40,725 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_23.jpg_r0_c0.png -2025-03-04 18:29:40,986 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_23.jpg_r1_c0.png -2025-03-04 18:29:41,800 [WARNING] __main__ - Cell image not found: /tmp/tmp1_2b4e5z.jpg_rows/row_2/col_0.png -2025-03-04 18:29:41,800 [INFO] __main__ - Processing table image: /topic-extraction/img_24.jpg, columns=three -2025-03-04 18:29:45,437 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 18:29:46,443 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_24.jpg_r0_c0.png -2025-03-04 18:29:46,788 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_24.jpg_r1_c0.png -2025-03-04 18:29:47,654 [WARNING] __main__ - Cell image not found: /tmp/tmpyd5fc1x8.jpg_rows/row_2/col_0.png -2025-03-04 18:29:47,654 [INFO] __main__ - Processing table image: /topic-extraction/img_25.jpg, columns=three -2025-03-04 18:29:49,997 [WARNING] __main__ - Cell image not found: /tmp/tmpje6qj8ty.jpg_rows/row_0/col_0.png -2025-03-04 18:29:50,258 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 18:29:51,237 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_25.jpg_r1_c0.png -2025-03-04 18:29:51,649 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r2_c0.png -2025-03-04 18:29:52,817 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 18:29:53,849 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_25.jpg_r3_c0.png -2025-03-04 18:29:53,849 [INFO] __main__ - Processing table image: /topic-extraction/img_26.jpg, columns=three -2025-03-04 18:29:55,903 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 18:29:56,784 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_26.jpg_r0_c0.png -2025-03-04 18:29:57,121 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_26.jpg_r1_c0.png -2025-03-04 18:29:58,092 [WARNING] __main__ - Cell image not found: /tmp/tmple_xivqw.jpg_rows/row_2/col_0.png -2025-03-04 18:29:58,092 [INFO] __main__ - Processing table image: /topic-extraction/img_27.jpg, columns=three -2025-03-04 18:30:01,339 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-04 18:30:02,324 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_27.jpg_r0_c0.png -2025-03-04 18:30:02,680 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r1_c0.png -2025-03-04 18:30:03,795 [INFO] __main__ - Uploaded to S3: 
/topic-extraction/cells/img_27.jpg_r2_c0.png -2025-03-04 18:30:04,805 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 18:30:05,808 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_27.jpg_r3_c0.png -2025-03-04 18:30:05,809 [INFO] __main__ - Processing table image: /topic-extraction/img_28.jpg, columns=three -2025-03-04 18:30:08,340 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 18:30:09,205 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_28.jpg_r0_c0.png -2025-03-04 18:30:09,541 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r1_c0.png -2025-03-04 18:30:11,786 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 18:30:12,603 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_28.jpg_r2_c0.png -2025-03-04 18:30:12,603 [INFO] __main__ - Processing table image: /topic-extraction/img_29.jpg, columns=three -2025-03-04 18:30:14,423 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 18:30:15,408 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_29.jpg_r0_c0.png -2025-03-04 18:30:15,669 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r1_c0.png -2025-03-04 18:30:18,844 [INFO] __main__ - Uploaded to S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 18:30:20,616 [INFO] __main__ - Deleted empty cell image from S3: /topic-extraction/cells/img_29.jpg_r2_c0.png -2025-03-04 18:30:20,620 [INFO] __main__ - Final subtopics JSON saved locally at /home/user/app/pearson_json/_subtopics.json -2025-03-04 18:30:20,956 [INFO] __main__ - GPU memory cleaned up. -2025-03-04 18:30:20,961 [INFO] __main__ - Processing completed successfully.