trevorpfiz committed on
Commit
ae9d014
·
1 Parent(s): 8f90477

iterating to get pdfs ocr working

PRD.md ADDED
@@ -0,0 +1,186 @@
+ # VLM Playground (PreviewSpace): Product Requirements Document
+
+ ## Summary
+
+ An internal Gradio Blocks app for rapid, structured experimentation with a Vision-Language Model (initially `dots.ocr`). It mirrors the reference playground but is deliberately minimal: stateless by default, no run history, focused on getting a feel for model performance. Supports PDF/image upload, page preview and navigation, page-range parsing, and result views (Markdown Render, Markdown Raw Text, Current Page JSON) with preserved scroll position. Designed to run locally or on Hugging Face Spaces.
+
+ ## Goals
+
+ - **Fast iteration**: Upload, prompt, parse, iterate in seconds with minimal ceremony.
+ - **Model-light**: Start with one model (`dots.ocr`), optional model selector later. No provider switching UI.
+ - **Structured output**: First-class JSON output and markdown preview.
+ - **Stateless by default**: No run history or persistence beyond the current browser session unless explicitly downloading.
+ - **Document-centric UX**: Multi-page PDF preview, page navigation, per-page execution, and page-range parsing.
+
+ ## Non-Goals
+
+ - Not a full labeling platform or production extraction pipeline.
+ - Not a dataset hosting service or long-term data store for PHI.
+ - Not a fine-tuning/training product; inference playground only.
+ - No bounding-box drawing or manual annotation tools in v1.
+
+ ## Primary Users / Personas
+
+ - **Applied Researcher / Data Scientist**: Tries different prompts/models, collects structured outputs.
+ - **ML Engineer**: Prototypes pipelines, compares providers, validates latency/cost.
+ - **Domain Expert (e.g., Clinical Analyst)**: Uses curated templates to extract specific fields.
+
+ ## Key User Stories
+
+ - As a user, I can upload a PDF or image, select a template prompt, and click Parse to see Markdown and JSON results.
+ - As a user, I can preview pages, specify a page range to parse, and run per-page extraction.
+ - As a user, I can jump to a specific page index in a PDF and use Prev/Next controls.
+ - As a user, I can switch between result tabs (Markdown Render, Markdown Raw Text, Current Page JSON) without losing scroll position.
+ - As a user, I can download the results for my current session as a ZIP or JSON/Markdown.
+ - As a user, I can tweak the prompt and basic model settings and quickly re-run.
+
+ ## UX Requirements (inspired by dots.ocr playground)
+
+ - **Left Panel: Upload & Select**
+   - Drag-and-drop or file picker for PNG/JPG/PDF; show file name and size.
+   - Optional Examples dropdown (curated sample docs and pre-baked prompts).
+   - File ingestion for PDFs extracts page thumbnails and page count.
+ - **Left Panel: Prompt & Actions**
+   - Prompt Template select; Current Prompt editor (multiline with variable chips).
+   - Actions: Parse (primary), Clear (secondary).
+   - Show prompt variables, e.g., `bbox`, `category`, `page_number`.
+ - **Left Panel: Advanced Configuration**
+   - Preprocessing toggle (fitz-like DPI upsample for low-res images).
+   - Minimal server/model config: Host/Port for local inference or a dropdown for on-host models.
+   - Page selection: single page, page range, or all.
+ - **Center: File Preview**
+   - Large page preview with pan/zoom; page navigator (Prev/Next and page picker).
+   - Page jump field to go directly to page N.
+ - **Right Panel: Result Display**
+   - Tabs: Markdown Render Preview, Markdown Raw Text, Current Page JSON.
+   - Preserve scroll position when switching tabs.
+   - Copy-to-clipboard and a Download Results button.
+
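The page-selection rule above (single page, page range, or all) could be handled by a small parser. The following is a minimal sketch, not the app's actual implementation: the helper name `parse_page_selection` is hypothetical, and the accepted syntax (`1-3,5`, blank for the current page, `all` for every page, 1-based input) is an assumption modeled on the placeholder text of the page-selection field.

```python
from typing import List


def parse_page_selection(selection: str, current_page: int, total_pages: int) -> List[int]:
    """Turn a selection such as "1-3,5" into sorted 0-based page indices.

    Hypothetical helper: blank means the current page, "all" means every page.
    Page numbers in the string are 1-based; out-of-range pages are dropped.
    Non-numeric input raises ValueError.
    """
    text = (selection or "").strip().lower()
    if not text:
        return [current_page] if 0 <= current_page < total_pages else []
    if text == "all":
        return list(range(total_pages))

    pages = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_str, end_str = part.split("-", 1)
            start, end = int(start_str), int(end_str)
            if start > end:  # tolerate reversed ranges such as "5-3"
                start, end = end, start
        else:
            start = end = int(part)
        for page in range(start, end + 1):
            if 1 <= page <= total_pages:
                pages.add(page - 1)  # convert to 0-based index
    return sorted(pages)


print(parse_page_selection("1-3,5", current_page=0, total_pages=10))  # [0, 1, 2, 4]
```

Returning 0-based indices keeps the result directly usable for indexing a list of page images, while the user-facing syntax stays 1-based.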
+ ## Functional Requirements
+
+ - **File Handling**
+   - Accept PDF (up to 300 pages) and images (PNG/JPG/WebP). Max upload 50 MB (configurable).
+   - Extract page images for preview; store temp files locally (ephemeral) with TTL.
+   - Provide page-level selection and batching.
+ - **Prompting**
+   - Template library with variables and descriptions. Variables can be sourced from UI state (page, bbox list) or user input.
+   - System prompt + user prompt fields; allow few-shot examples.
+   - Presets for common tasks (layout extraction, table extraction, key-value extraction, captioning).
+ - **Model Support**
+   - Start with `dots.ocr` via the official parser or REST endpoint.
+   - Optional: dropdown to switch among `dots.ocr` model variants if present on the host. No cross-provider switching UI.
+ - **Execution**
+   - Run per-page or whole-document, controlled by UI. Concurrency limit (default 3).
+   - Timeouts and retries surfaced to UI; cancellation supported.
+   - Caching: request hash on (file checksum, page, prompt, params, model) to avoid recomputation.
+ - **Outputs**
+   - Markdown Render, Raw Markdown, and Current Page JSON.
+   - Export: Download button to export combined Markdown, per-page JSONL, and all artifacts as a ZIP.
+ - **Examples Gallery**
+   - Preloaded example docs and templates to demonstrate patterns (OCR table, K/V extraction, figure captioning, layout detection).
+ - **Observability**
+   - Show basic runtime info (latency, model id) inline; no history or centralized logs in v1.
+
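The caching requirement above (a request hash over file checksum, page, prompt, params, and model) might look like the sketch below. The function name `make_cache_key` and the choice of SHA-256 over a canonical JSON payload are assumptions for illustration; the PRD only lists which inputs belong in the hash.

```python
import hashlib
import json
from typing import Any, Dict


def make_cache_key(
    file_checksum: str,
    page_index: int,
    prompt: str,
    params: Dict[str, Any],
    model_id: str,
) -> str:
    """Build a deterministic key from everything that affects the model output."""
    payload = {
        "file": file_checksum,
        "page": page_index,
        "prompt": prompt,
        "params": params,
        "model": model_id,
    }
    # sort_keys makes the serialization independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


key_a = make_cache_key("abc123", 0, "Extract layout", {"max_new_tokens": 24000, "min_pixels": 3136}, "dots.ocr")
key_b = make_cache_key("abc123", 0, "Extract layout", {"min_pixels": 3136, "max_new_tokens": 24000}, "dots.ocr")
print(key_a == key_b)  # True: parameter order does not change the key
```

Hashing the full prompt text rather than a template name means that any manual edit to the prompt invalidates the cached result, which matches the "tweak and re-run" workflow.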
+ ## Data Model (high-level)
+
+ - In-memory, per-session structures only; no database.
+ - **Document**: id, name, type, checksum, page count, temp storage path, created_at.
+ - **Page**: document_id, page_index, image_path, width, height, preview_thumbnail.
+ - **Template**: id, name, description, model_defaults, prompt_text, output_schema (optional JSON Schema), variables.
+
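The three entities listed above map naturally onto plain dataclasses. This is a sketch under assumptions: the field names follow the PRD list, but the types, the defaults, and the choice to store the thumbnail as a file path are illustrative guesses rather than anything this commit defines.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class Document:
    id: str
    name: str
    type: str  # assumed values: "pdf" or "image"
    checksum: str
    page_count: int
    temp_path: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Page:
    document_id: str
    page_index: int
    image_path: str
    width: int
    height: int
    preview_thumbnail: Optional[str] = None  # assumed: path to a thumbnail file


@dataclass
class Template:
    id: str
    name: str
    description: str
    prompt_text: str
    model_defaults: Dict[str, Any] = field(default_factory=dict)
    output_schema: Optional[Dict[str, Any]] = None  # optional JSON Schema
    variables: List[str] = field(default_factory=list)


doc = Document(
    id="doc-1",
    name="sample.pdf",
    type="pdf",
    checksum="abc123",
    page_count=2,
    temp_path="/tmp/previewspace/doc-1",
)
pages = [
    Page(doc.id, i, f"{doc.temp_path}/page-{i}.png", width=1224, height=1584)
    for i in range(doc.page_count)
]
print(len(pages), pages[1].image_path)
```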
+ ## JSON Output Guidance
+
+ - For structured tasks, templates may specify an output schema. The UI validates model JSON and highlights issues.
+ - All results stored as JSON lines per page with summary aggregation.
+
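Storing results as JSON Lines with a summary could be sketched as follows. The helper names `results_to_jsonl` and `summarize` are hypothetical, and the record fields are an assumption; the `layout_result` and `markdown` keys mirror the per-page result dictionary built in `preview_app.py`.

```python
import json
from typing import Any, Dict, List, Optional


def results_to_jsonl(results: List[Optional[Dict[str, Any]]]) -> str:
    """Serialize parsed pages as JSON Lines, one record per page; unparsed pages are skipped."""
    lines = []
    for index, result in enumerate(results):
        if result is None:
            continue
        record = {
            "page": index + 1,  # 1-based page number, as shown in the UI
            "layout": result.get("layout_result"),
            "markdown": result.get("markdown"),
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


def summarize(results: List[Optional[Dict[str, Any]]]) -> Dict[str, int]:
    """Aggregate simple counts across all pages of a document."""
    parsed = sum(1 for r in results if r is not None)
    return {"total_pages": len(results), "parsed_pages": parsed}


results = [
    {"layout_result": {"elements": []}, "markdown": "# Title"},
    None,  # page 2 has not been parsed yet
]
print(results_to_jsonl(results))
print(summarize(results))
```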
+ ## Security & Compliance
+
+ - Internal-only; access requires SSO or VPN.
+ - Sensitive documents (e.g., PHI) processed only against approved providers/endpoints. Warn when a provider is external.
+ - Ephemeral storage with TTL auto-clean; configurable retention. Redact logs where needed.
+
+ ## Performance Targets
+
+ - Cold start to first parse: < 10s on typical PDFs (<= 20 pages) with network providers.
+ - Per-page preview render: < 500ms after page image generation.
+ - Concurrency: default 3 parallel page requests; configurable up to 10.
+ - Throughput: 1,000 pages/day per user on average use without manual scaling.
+
+ ## Error States & Edge Cases
+
+ - Unsupported file types or oversize files; clear messaging and guardrails.
+ - Pages with extreme aspect ratios or very small text; suggest preprocessing.
+ - Provider rate limits; exponential backoff and UI feedback.
+ - Invalid model JSON; surface diffs and attempt best-effort JSON repair (opt-in).
+
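The rate-limit handling mentioned above (exponential backoff) could be sketched like this. Everything here is an assumption for illustration: `RateLimitError` stands in for whatever error a provider client raises, and `call_with_backoff`, its defaults, and the 10% jitter are not taken from this commit.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a provider client might raise on HTTP 429."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted: let the UI surface the error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # A little jitter spreads out retries from parallel page requests.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")


attempts = {"count": 0}


def flaky_request() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("slow down")
    return "ok"


recorded = []
print(call_with_backoff(flaky_request, sleep=recorded.append), len(recorded))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; a UI could reuse the same hook to show "retrying in N seconds" feedback.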
+ ## Architecture (proposed)
+
+ - **App**: Single Gradio Blocks app (Python). No separate backend required.
+ - **Execution**: Use `uv run` locally. Designed to run as-is on Hugging Face Spaces.
+ - **Model**: `dots.ocr` via local parser or REST endpoint; configurable host/port.
+ - **Storage**: Ephemeral `/tmp/previewspace/*`; cleared at session end or TTL.
+ - **Caching**: Optional on-disk cache keyed by content hash + prompt + params + model.
+
+ ## API Surface (v1)
+
+ - Pure Gradio callbacks; no public REST API. Optional: expose simple `/healthz`.
+
+ ## Templates (initial set)
+
+ - **Layout Extraction**: Return list of elements with `bbox`, `category`, and `text` within bbox.
+ - **Table Extraction**: Return rows/columns as structured JSON; include confidence and cell bboxes.
+ - **Key-Value Extraction**: Extract specified fields with locations and normalized values.
+ - **Captioning/Description**: Summarize or caption selected regions or whole pages.
+
+ ## Privacy-by-Design Defaults
+
+ - Local processing preferred where possible; clear visual indicator when sending to external APIs.
+ - Redaction utilities for logs; toggle to disable request logging entirely.
+
+ ## Success Metrics
+
+ - Time-to-first-result after upload.
+ - Number of saved runs and templates re-used.
+ - Reduction in manual extraction time for a representative task.
+ - User satisfaction (quick pulse after saved runs).
+
+ ## Release Plan
+
+ - **M1 (v0.1): Core Playground**
+   - Upload PDF/image; page preview and navigation.
+   - Parse with one provider; show Markdown and JSON; save runs; export JSON.
+   - Basic provider config (host/port/api key) and preprocessing toggle.
+   - Acceptance: A user can replicate a layout extraction example end-to-end in < 2 minutes.
+ - **M2 (v0.2): Templates, Regions, and Examples**
+   - Template library + editor; draw/save bboxes; per-page runs; examples gallery.
+   - Multiple providers; concurrency and caching; logs and token usage.
+   - Acceptance: A user can create a new template with variables and run it across 10 pages with regions in one click.
+ - **M3 (v0.3): Projects and Evals**
+   - Projects grouping; batch runs over documents; dataset export; simple eval harness with spot checks.
+   - Acceptance: A user can run a project over 100 pages and export an evaluation-ready JSONL in < 10 minutes.
+
+ ## Open Questions
+
+ - Do we require strict JSON schema validation with auto-repair, or soft validation with warnings?
+ - What are the approved external providers for sensitive documents?
+ - Should we include table renderers in the UI, or keep to JSON/Markdown only?
+ - How long should run artifacts persist by default (e.g., 7 days)?
+
+ ## Risks & Mitigations
+
+ - **External API variability**: Abstract through connectors; provide stubs/mocks for local dev.
+ - **Document diversity**: Offer preprocessing toggles and template variables; maintain an examples gallery.
+ - **Cost visibility**: Track token usage and estimated cost per run; warn when large batches are selected.
+
+ ## Appendices
+
+ ### Example: Layout Extraction Prompt (concept)
+
+ ```text
+ System: You are a vision-language model that outputs structured JSON only.
+ User: Please output the layout information from the PDF page image. For each element, return:
+ - bbox: [x1, y1, x2, y2] in image pixels
+ - category: string label from {"title","header","paragraph","table","figure","footnote"}
+ - text: content within bbox
+ Return JSON: {"elements": [{"bbox": [..], "category": "..", "text": ".."}], "page": <number>}.
+ ```
pyproject.toml CHANGED
@@ -5,7 +5,18 @@ description = "Minimal Gradio demo containerized with uv"
  readme = "README.md"
  requires-python = ">=3.12"
  dependencies = [
+     "accelerate>=0.33.0",
+     "einops>=0.7.0",
      "gradio>=5.41.1",
+     "huggingface-hub[cli]>=0.34.3",
+     "pillow>=10.3.0",
+     "pymupdf>=1.26.3",
+     "qwen-vl-utils>=0.0.11",
+     "requests>=2.32.0",
+     "safetensors>=0.4.5",
+     "torch>=2.8.0",
+     "torchvision>=0.23.0",
+     "transformers>=4.55.0",
  ]

  [build-system]
src/vlm_playground/app.py CHANGED
@@ -1,43 +1,9 @@
- import gradio as gr
+ from .preview_app import create_blocks_app


- def greet(name: str) -> str:
-     return f"Hello {name}!!"
-
-
- demo = gr.Interface(
-     fn=greet,
-     inputs="text",
-     outputs="text",
-     title="VLM Playground Gradio Demo",
-     description="Simple demo app to verify Gradio + uv Docker setup.",
- )
-
-
- def main() -> None:
-     demo.launch(server_name="0.0.0.0", server_port=7860)
-
-
- if __name__ == "__main__":
-     main()
-
- import gradio as gr
-
-
- def greet(name: str) -> str:
-     return f"Hello {name}!!"
-
-
- def run():
-     demo = gr.Interface(
-         fn=greet,
-         inputs="text",
-         outputs="text",
-         title="VLM Playground Gradio Demo",
-         description="Simple demo app to verify Gradio + uv Docker setup.",
-     )
-
-     demo.launch(server_name="0.0.0.0", server_port=7860)
+ def run() -> None:
+     demo = create_blocks_app()
+     demo.queue(max_size=16).launch(server_name="0.0.0.0", server_port=7860)


  if __name__ == "__main__":
src/vlm_playground/preview_app.py ADDED
@@ -0,0 +1,835 @@
+ import gc
+ import hashlib
+ import json
+ import math
+ import os
+ import re
+ from io import BytesIO
+ from typing import Any, Dict, List, Optional, Tuple
+
+ import fitz  # PyMuPDF
+ import gradio as gr
+ import requests
+ import torch
+ from huggingface_hub import snapshot_download
+ from PIL import Image, ImageDraw, ImageFont
+ from qwen_vl_utils import process_vision_info
+ from transformers import AutoModelForCausalLM, AutoProcessor
+
+ from .utils.constants import IMAGE_FACTOR, MAX_PIXELS, MIN_PIXELS
+ from .utils.prompts import dict_promptmode_to_prompt
+
+ # ============================
+ # Constants and configuration
+ # ============================
+ APP_TITLE = "PreviewSpace — VLM Playground"
+ TMP_DIR = "/tmp/previewspace"
+ MODELS_DIR = os.path.join(TMP_DIR, "models")
+ DOTS_REPO_ID = "rednote-hilab/dots.ocr"
+ DOTS_LOCAL_DIR = os.path.join(MODELS_DIR, "dots.ocr")
+
+ DEFAULT_PROMPT = dict_promptmode_to_prompt.get(
+     "prompt_layout_all_en",
+     (
+         "Please output the layout information from the PDF page image. For each element, return: "
+         'bbox: [x1, y1, x2, y2], category from {"title","header","paragraph","table","figure","footnote"}, and text. '
+         'Return JSON: {"elements": [{"bbox": [..], "category": "..", "text": ".."}], "page": <number>}'
+     ),
+ )
+
+
+ os.makedirs(TMP_DIR, exist_ok=True)
+ os.makedirs(MODELS_DIR, exist_ok=True)
+
+
+ # ===========
+ # Utilities
+ # ===========
+ def round_by_factor(number: int, factor: int) -> int:
+     return round(number / factor) * factor
+
+
+ def smart_resize(
+     height: int,
+     width: int,
+     factor: int = IMAGE_FACTOR,
+     min_pixels: int = MIN_PIXELS,
+     max_pixels: int = MAX_PIXELS,
+ ) -> Tuple[int, int]:
+     if max(height, width) / min(height, width) > 200:
+         raise ValueError("absolute aspect ratio must be smaller than 200")
+     h_bar = max(factor, round_by_factor(height, factor))
+     w_bar = max(factor, round_by_factor(width, factor))
+
+     if h_bar * w_bar > max_pixels:
+         beta = math.sqrt((height * width) / max_pixels)
+         h_bar = round_by_factor(height / beta, factor)
+         w_bar = round_by_factor(width / beta, factor)
+     elif h_bar * w_bar < min_pixels:
+         beta = math.sqrt(min_pixels / (height * width))
+         h_bar = round_by_factor(height * beta, factor)
+         w_bar = round_by_factor(width * beta, factor)
+     return int(h_bar), int(w_bar)
+
+
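To make the effect of `smart_resize` concrete, here is a standalone worked example. It restates the two functions from the diff so that it runs on its own; the constant values (`IMAGE_FACTOR = 28`, `MIN_PIXELS = 3136`, `MAX_PIXELS = 1_000_000`) are assumptions chosen for illustration, since the real ones live in `utils/constants.py`, which this diff does not show.

```python
import math
from typing import Tuple

# Assumed values for illustration only; the real constants are defined in
# utils/constants.py, which is not part of this diff.
IMAGE_FACTOR = 28
MIN_PIXELS = 3136
MAX_PIXELS = 1_000_000


def round_by_factor(number: float, factor: int) -> int:
    """Round to the nearest multiple of factor."""
    return round(number / factor) * factor


def smart_resize(
    height: int,
    width: int,
    factor: int = IMAGE_FACTOR,
    min_pixels: int = MIN_PIXELS,
    max_pixels: int = MAX_PIXELS,
) -> Tuple[int, int]:
    if max(height, width) / min(height, width) > 200:
        raise ValueError("absolute aspect ratio must be smaller than 200")
    h_bar = max(factor, round_by_factor(height, factor))
    w_bar = max(factor, round_by_factor(width, factor))

    if h_bar * w_bar > max_pixels:
        # Too large: shrink both sides by the same ratio.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = round_by_factor(height / beta, factor)
        w_bar = round_by_factor(width / beta, factor)
    elif h_bar * w_bar < min_pixels:
        # Too small: grow both sides by the same ratio.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = round_by_factor(height * beta, factor)
        w_bar = round_by_factor(width * beta, factor)
    return int(h_bar), int(w_bar)


# A US Letter page rendered at 2x zoom is 1224 x 1584 pixels.
new_h, new_w = smart_resize(height=1584, width=1224)
print(new_h, new_w, new_h * new_w)
```

Both sides come back as multiples of the factor, with the aspect ratio approximately preserved. Because each side is rounded to the nearest multiple rather than rounded down, the product can in principle land slightly above `max_pixels` for some inputs, so treat the budget as approximate.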
+ def fetch_image(
+     image_input: Any,
+     min_pixels: Optional[int] = None,
+     max_pixels: Optional[int] = None,
+ ) -> Image.Image:
+     if isinstance(image_input, str):
+         if image_input.startswith(("http://", "https://")):
+             response = requests.get(image_input, timeout=60)
+             # Fail early on HTTP errors instead of handing an error page to PIL.
+             response.raise_for_status()
+             image = Image.open(BytesIO(response.content)).convert("RGB")
+         else:
+             image = Image.open(image_input).convert("RGB")
+     elif isinstance(image_input, Image.Image):
+         image = image_input.convert("RGB")
+     else:
+         raise ValueError(f"Invalid image input type: {type(image_input)}")
+
+     if min_pixels is not None or max_pixels is not None:
+         min_pixels = min_pixels or MIN_PIXELS
+         max_pixels = max_pixels or MAX_PIXELS
+         new_h, new_w = smart_resize(
+             image.height,
+             image.width,
+             factor=IMAGE_FACTOR,
+             min_pixels=min_pixels,
+             max_pixels=max_pixels,
+         )
+         image = image.resize((new_w, new_h), Image.LANCZOS)
+     return image
+
+
+ def load_images_from_pdf(pdf_path: str) -> List[Image.Image]:
+     images: List[Image.Image] = []
+     pdf_document = fitz.open(pdf_path)
+     try:
+         for page_idx in range(len(pdf_document)):
+             page = pdf_document.load_page(page_idx)
+             # Render at 2x zoom for sharper text.
+             pix = page.get_pixmap(matrix=fitz.Matrix(2.0, 2.0))
+             img_data = pix.tobytes("ppm")
+             image = Image.open(BytesIO(img_data)).convert("RGB")
+             images.append(image)
+     finally:
+         pdf_document.close()
+     return images
+
+
+ def file_checksum(path: str, chunk_size: int = 1 << 20) -> str:
+     hasher = hashlib.sha256()
+     with open(path, "rb") as f:
+         while True:
+             chunk = f.read(chunk_size)
+             if not chunk:
+                 break
+             hasher.update(chunk)
+     return hasher.hexdigest()
+
+
+ def draw_layout_on_image(image: Image.Image, layout_data: List[Dict]) -> Image.Image:
+     img = image.copy()
+     draw = ImageDraw.Draw(img)
+     colors = {
+         "Caption": "#FF6B6B",
+         "Footnote": "#4ECDC4",
+         "Formula": "#45B7D1",
+         "List-item": "#96CEB4",
+         "Page-footer": "#FFEAA7",
+         "Page-header": "#DDA0DD",
+         "Picture": "#FFD93D",
+         "Section-header": "#6C5CE7",
+         "Table": "#FD79A8",
+         "Text": "#74B9FF",
+         "Title": "#E17055",
+     }
+
+     try:
+         try:
+             font = ImageFont.truetype(
+                 "/System/Library/Fonts/Supplemental/Arial Bold.ttf", 12
+             )
+         except Exception:
+             try:
+                 font = ImageFont.truetype(
+                     "/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", 12
+                 )
+             except Exception:
+                 font = ImageFont.load_default()
+
+         for item in layout_data:
+             bbox = item.get("bbox")
+             category = item.get("category")
+             if not bbox or not category:
+                 continue
+             color = colors.get(category, "#000000")
+             draw.rectangle(bbox, outline=color, width=2)
+             label = str(category)
+             label_bbox = draw.textbbox((0, 0), label, font=font)
+             label_w = label_bbox[2] - label_bbox[0]
+             label_h = label_bbox[3] - label_bbox[1]
+             x1, y1 = int(bbox[0]), int(bbox[1])
+             lx = x1
+             ly = max(0, y1 - label_h - 2)
+             draw.rectangle([lx, ly, lx + label_w + 4, ly + label_h + 2], fill=color)
+             draw.text((lx + 2, ly + 1), label, fill="white", font=font)
+     except Exception:
+         pass
+     return img
+
+
+ def is_arabic_text(text: str) -> bool:
+     if not text:
+         return False
+     header_pattern = r"^#{1,6}\s+(.+)$"
+     paragraph_pattern = r"^(?!#{1,6}\s|!\[|```|\||\s*[-*+]\s|\s*\d+\.\s)(.+)$"
+     content_lines: List[str] = []
+     for line in text.split("\n"):
+         s = line.strip()
+         if not s:
+             continue
+         m = re.match(header_pattern, s)
+         if m:
+             content_lines.append(m.group(1))
+             continue
+         if re.match(paragraph_pattern, s):
+             content_lines.append(s)
+     if not content_lines:
+         return False
+     combined = " ".join(content_lines)
+     arabic = 0
+     total = 0
+     for ch in combined:
+         if ch.isalpha():
+             total += 1
+             if (
+                 ("\u0600" <= ch <= "\u06ff")
+                 or ("\u0750" <= ch <= "\u077f")
+                 or ("\u08a0" <= ch <= "\u08ff")
+             ):
+                 arabic += 1
+     if total == 0:
+         return False
+     return (arabic / total) > 0.5
+
+
+ def extract_json(text: str) -> Optional[Dict[str, Any]]:
+     if not text:
+         return None
+     try:
+         return json.loads(text)
+     except Exception:
+         pass
+     # Try to extract JSON block
+     brace_start = text.find("{")
+     brace_end = text.rfind("}")
+     if 0 <= brace_start < brace_end:
+         snippet = text[brace_start : brace_end + 1]
+         try:
+             return json.loads(snippet)
+         except Exception:
+             pass
+     fenced = re.findall(r"```json\s*([\s\S]*?)\s*```", text)
+     for block in fenced:
+         try:
+             return json.loads(block)
+         except Exception:
+             continue
+     return None
+
+
+ def layoutjson2md(
+     image: Image.Image, layout_data: List[Dict], text_key: str = "text"
+ ) -> str:
+     lines: List[str] = []
+     try:
+         items = sorted(
+             layout_data,
+             key=lambda x: (
+                 x.get("bbox", [0, 0, 0, 0])[1],
+                 x.get("bbox", [0, 0, 0, 0])[0],
+             ),
+         )
+         for item in items:
+             category = item.get("category", "")
+             text = item.get(text_key, "")
+             if category == "Title" and text:
+                 lines.append(f"# {text}\n")
+             elif category == "Section-header" and text:
+                 lines.append(f"## {text}\n")
+             elif category == "List-item" and text:
+                 lines.append(f"- {text}\n")
+             elif category == "Table" and text:
+                 if text.strip().startswith("<"):
+                     lines.append(text + "\n")
+                 else:
+                     lines.append(f"**Table:** {text}\n")
+             elif category == "Formula" and text:
+                 if text.strip().startswith("$") or "\\" in text:
+                     lines.append(f"$$\n{text}\n$$\n")
+                 else:
+                     lines.append(f"**Formula:** {text}\n")
+             elif category == "Caption" and text:
+                 lines.append(f"*{text}*\n")
+             elif category in ["Page-header", "Page-footer"]:
+                 continue
+             elif category == "Picture":
+                 # Skip embedding image fragments in markdown for now
+                 continue
+             elif text:
+                 lines.append(f"{text}\n")
+             lines.append("")
+     except Exception:
+         return json.dumps(layout_data, ensure_ascii=False)
+     return "\n".join(lines)
+
+
+ # =====================
+ # Model initialization
+ # =====================
+ model: Optional[AutoModelForCausalLM] = None
+ processor: Optional[AutoProcessor] = None
+ device = (
+     "cuda"
+     if torch.cuda.is_available()
+     else ("mps" if torch.backends.mps.is_available() else "cpu")
+ )
+
+
+ def get_torch_dtype() -> torch.dtype:
+     if device == "cuda":
+         return torch.bfloat16
+     if device == "mps":
+         return torch.float16
+     return torch.float32
+
+
+ def ensure_model_loaded() -> Tuple[AutoModelForCausalLM, AutoProcessor]:
+     global model, processor
+     if model is not None and processor is not None:
+         return model, processor
+
+     os.environ.setdefault("HF_HUB_DISABLE_SYMLINKS_WARNING", "1")
+     snapshot_download(
+         repo_id=DOTS_REPO_ID,
+         local_dir=DOTS_LOCAL_DIR,
+         local_dir_use_symlinks=False,
+     )
+
+     dtype = get_torch_dtype()
+
+     model = AutoModelForCausalLM.from_pretrained(
+         DOTS_LOCAL_DIR,
+         torch_dtype=dtype,
+         device_map="auto",
+         trust_remote_code=True,
+     )
+     proc = AutoProcessor.from_pretrained(DOTS_LOCAL_DIR, trust_remote_code=True)
+     processor = proc
+     return model, processor
+
+
+ def run_inference(
+     image: Image.Image, prompt_text: str, max_new_tokens: int = 24000
+ ) -> str:
+     mdl, proc = ensure_model_loaded()
+     messages = [
+         {
+             "role": "user",
+             "content": [
+                 {"type": "image", "image": image},
+                 {"type": "text", "text": prompt_text},
+             ],
+         }
+     ]
+     text = proc.apply_chat_template(
+         messages, tokenize=False, add_generation_prompt=True
+     )
+     image_inputs, video_inputs = process_vision_info(messages)
+     inputs = proc(
+         text=[text],
+         images=image_inputs,
+         videos=video_inputs,
+         padding=True,
+         return_tensors="pt",
+     )
+     inputs = {k: v.to(device) if hasattr(v, "to") else v for k, v in inputs.items()}
+     with torch.no_grad():
+         # Greedy decoding; temperature only applies when do_sample=True, so it is omitted.
+         generated_ids = mdl.generate(
+             **inputs,
+             max_new_tokens=int(max_new_tokens),
+             do_sample=False,
+         )
+     # Drop the prompt tokens so only newly generated tokens are decoded.
+     trimmed = [
+         out_ids[len(in_ids) :]
+         for in_ids, out_ids in zip(inputs["input_ids"], generated_ids)
+     ]
+     output_text = proc.batch_decode(
+         trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
+     )
+     return output_text[0] if output_text else ""
+
+
+ def process_single_image(
+     image: Image.Image,
+     prompt_text: str,
+     min_pixels: Optional[int],
+     max_pixels: Optional[int],
+     max_new_tokens: int,
+ ) -> Dict[str, Any]:
+     img = fetch_image(image, min_pixels=min_pixels, max_pixels=max_pixels)
+     raw = run_inference(img, prompt_text, max_new_tokens=max_new_tokens)
+     result: Dict[str, Any] = {
+         "original_image": img,
+         "processed_image": img,
+         "raw_output": raw,
+         "layout_result": None,
+         "markdown": None,
+     }
+     data = extract_json(raw)
+     if isinstance(data, dict):
+         result["layout_result"] = data
+         items = data.get("elements", data.get("elements_list", data.get("content", [])))
+         if isinstance(items, list):
+             result["processed_image"] = draw_layout_on_image(img, items)
+             result["markdown"] = layoutjson2md(img, items)
+     if result["markdown"] is None:
+         result["markdown"] = raw
+     return result
+
+
+ # =================
+ # Gradio Interface
+ # =================
+ def create_blocks_app():
+     css = """
+     .main-container { max-width: 1500px; margin: 0 auto; }
+     .header-text { text-align: center; color: #1f2937; margin-bottom: 12px; }
+     .page-info { text-align: center; padding: 8px 16px; border-radius: 20px; font-weight: 600; }
+     .process-button { border: none !important; color: white !important; font-weight: 700 !important; }
+     """
+
+     with gr.Blocks(theme=gr.themes.Soft(), css=css, title=APP_TITLE) as demo:
+         # App state
+         doc_state = gr.State(
+             {
+                 "images": [],
+                 "current_page": 0,
+                 "total_pages": 0,
+                 "file_type": None,
+                 "checksum": None,
+                 "results": [],
+                 "parsed": False,
+             }
+         )
+
+         cache_state = gr.State({})  # (checksum, page, prompt_hash) -> result
+
+         gr.HTML(
+             """
+             <div class=\"header-text\">
+ <h2>VLM Playground — dots.ocr</h2>
+                 <p>Upload a PDF or image, preview pages, and parse with a layout-extraction prompt.</p>
+             </div>
+             """
+         )
+
+         with gr.Row(elem_classes=["main-container"]):
+             # Left: upload + controls
+             with gr.Column(scale=4):
+                 file_input = gr.File(
+                     label="Upload PDF or Image",
+                     file_types=[
+                         ".pdf",
+                         ".png",
+                         ".jpg",
+                         ".jpeg",
+                         ".bmp",
+                         ".tiff",
+                         ".webp",
+                     ],
+                     type="filepath",
+                 )
+
+                 with gr.Group():
+                     template = gr.Dropdown(
+                         label="Prompt Template",
+                         choices=["Layout Extraction"],
+                         value="Layout Extraction",
+                     )
+                     prompt_text = gr.Textbox(
+                         label="Current Prompt",
+                         value=DEFAULT_PROMPT,
+                         lines=6,
+                     )
+
+                 with gr.Row():
+                     parse_button = gr.Button(
+                         "Parse", variant="primary", elem_classes=["process-button"]
+                     )
+                     clear_button = gr.Button("Clear")
+
+                 with gr.Accordion("Advanced", open=False):
+                     max_new_tokens = gr.Slider(
+                         minimum=512,
+                         maximum=32000,
+                         value=24000,
+                         step=256,
+                         label="Max new tokens",
+                     )
+                     min_pixels_in = gr.Number(value=MIN_PIXELS, label="Min pixels")
+                     max_pixels_in = gr.Number(value=MAX_PIXELS, label="Max pixels")
+                     page_range = gr.Textbox(
+                         label="Page selection",
+                         placeholder="e.g., 1-3,5 (blank = current page, 'all' = all pages)",
+                     )
+
+             # Center: page preview + nav
+             with gr.Column(scale=5):
+                 preview_image = gr.Image(label="Page Preview", type="pil", height=520)
+                 with gr.Row():
+                     prev_btn = gr.Button("◀ Prev")
+                     page_info = gr.HTML('<div class="page-info">No file</div>')
+                     next_btn = gr.Button("Next ▶")
+                 with gr.Row():
+                     page_jump = gr.Number(value=1, label="Page #", precision=0)
+                     jump_btn = gr.Button("Go")
+
+             # Right: results
+             with gr.Column(scale=6):
+                 with gr.Tabs():
+                     with gr.Tab("Markdown Render"):
+                         md_render = gr.Markdown(
+                             value="Upload and parse to view results", height=520
+                         )
+                     with gr.Tab("Raw Markdown"):
+                         md_raw = gr.Textbox(value="", lines=20)
+                     with gr.Tab("Current Page JSON"):
+                         json_view = gr.JSON(value=None)
+                     with gr.Tab("Processed Image"):
+                         processed_view = gr.Image(type="pil", height=520)
+
+                 with gr.Row():
+                     download_jsonl = gr.DownloadButton(
+                         label="Download JSONL", file_name="results.jsonl"
+                     )
+                     download_markdown = gr.DownloadButton(
+                         label="Download Markdown", file_name="results.md"
+                     )
+
+         # ===== Handlers =====
+         def on_template_change(choice: str) -> str:
+             return DEFAULT_PROMPT
+
+         def on_file_change(path: Optional[str]):
+             if not path or not os.path.exists(path):
+                 return (
+                     {
+                         "images": [],
+                         "current_page": 0,
+                         "total_pages": 0,
+                         "file_type": None,
+                         "checksum": None,
+                         "results": [],
+                         "parsed": False,
+                     },
+                     None,
+                     '<div class="page-info">No file</div>',
+                 )
+             checksum = file_checksum(path)
+             ext = os.path.splitext(path)[1].lower()
+             if ext == ".pdf":
+                 images = load_images_from_pdf(path)
+                 state = {
+                     "images": images,
+                     "current_page": 0,
+                     "total_pages": len(images),
+                     "file_type": "pdf",
+                     "checksum": checksum,
+                     "results": [None] * len(images),
+                     "parsed": False,
+                 }
+                 return (
+                     state,
+                     images[0] if images else None,
+                     f'<div class="page-info">Page 1 / {len(images)}</div>',
+                 )
+             else:
+                 image = Image.open(path).convert("RGB")
+                 state = {
+                     "images": [image],
+                     "current_page": 0,
+                     "total_pages": 1,
+                     "file_type": "image",
+                     "checksum": checksum,
+                     "results": [None],
+                     "parsed": False,
+                 }
+                 return state, image, '<div class="page-info">Page 1 / 1</div>'
+
572
+ def nav_page(state: Dict[str, Any], direction: str):
573
+ if not state.get("images"):
574
+ return (
575
+ state,
576
+ None,
577
+ '<div class="page-info">No file</div>',
578
+ "No results",
579
+ "",
580
+ None,
581
+ None,
582
+ )
583
+ if direction == "prev":
584
+ state["current_page"] = max(0, state["current_page"] - 1)
585
+ elif direction == "next":
586
+ state["current_page"] = min(
587
+ state["total_pages"] - 1, state["current_page"] + 1
588
+ )
589
+ idx = state["current_page"]
590
+ img = state["images"][idx]
591
+ info = (
592
+ f'<div class="page-info">Page {idx + 1} / {state["total_pages"]}</div>'
593
+ )
594
+ result = (
595
+ state["results"][idx]
596
+ if state.get("parsed") and idx < len(state["results"])
597
+ else None
598
+ )
599
+ md = (result or {}).get("markdown") or "Page not processed yet"
600
+ md_out = gr.update(value=md, rtl=True) if is_arabic_text(md) else md
601
+ md_raw_text = md
602
+ proc_img = result.get("processed_image") if result else None
603
+ js = result.get("layout_result") if result else None
604
+ return state, img, info, md_out, md_raw_text, proc_img, js
605
+
606
+ def jump_to_page(state: Dict[str, Any], page_num: Any):
607
+ if not state.get("images"):
608
+ return (
609
+ state,
610
+ None,
611
+ '<div class="page-info">No file</div>',
612
+ "No results",
613
+ "",
614
+ None,
615
+ None,
616
+ )
617
+ try:
618
+ n = int(page_num)
619
+ except Exception:
620
+ n = 1
621
+ n = max(1, min(state["total_pages"], n))
622
+ state["current_page"] = n - 1
623
+ return nav_page(state, direction="stay")
624
+
625
+ def parse_pages(
626
+ state: Dict[str, Any],
627
+ prompt: str,
628
+ max_tokens: int,
629
+ min_pix: Optional[float],
630
+ max_pix: Optional[float],
631
+ selection: Optional[str],
632
+ ):
633
+ if not state.get("images"):
634
+ return state, None, '<div class="page-info">No file</div>', "No content", "", None, None
635
+
636
+ # Determine pages to process
637
+ indices: List[int] = []
638
+ if not selection or selection.strip() == "":
639
+ indices = [state["current_page"]]
640
+ elif selection.strip().lower() == "all":
641
+ indices = list(range(state["total_pages"]))
642
+ else:
643
+ # parse like 1-3,5
644
+ parts = [p.strip() for p in selection.split(",") if p.strip()]
645
+ for p in parts:
646
+ if "-" in p:
647
+ a, b = p.split("-", 1)
648
+ try:
649
+ a_i = max(1, int(a))
650
+ b_i = min(state["total_pages"], int(b))
651
+ for i in range(a_i - 1, b_i):
652
+ indices.append(i)
653
+ except Exception:
654
+ continue
655
+ else:
656
+ try:
657
+ i = max(1, min(state["total_pages"], int(p)))
658
+ indices.append(i - 1)
659
+ except Exception:
660
+ continue
661
+ indices = sorted(
662
+ {i for i in indices if 0 <= i < state["total_pages"]}
663
+ )
664
+
665
+ # Process sequentially for stability
666
+ results = state.get("results") or [None] * state["total_pages"]
667
+ for i in indices:
668
+ img = state["images"][i]
669
+ prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]
670
+ cache_key = (
671
+ state["checksum"],
672
+ i,
673
+ prompt_hash,
674
+ int(min_pix or 0),
675
+ int(max_pix or 0),
676
+ int(max_tokens),
677
+ )
678
+ cached = cache_state.value.get(cache_key)
679
+ if cached:
680
+ results[i] = cached
681
+ continue
682
+ res = process_single_image(
683
+ img,
684
+ prompt_text=prompt,
685
+ min_pixels=int(min_pix) if min_pix else None,
686
+ max_pixels=int(max_pix) if max_pix else None,
687
+ max_new_tokens=int(max_tokens),
688
+ )
689
+ results[i] = res
690
+ cache_state.value[cache_key] = res
691
+ state["results"] = results
692
+ state["parsed"] = True
693
+
694
+ # Return current page outputs
695
+ idx = state["current_page"]
696
+ curr = results[idx]
697
+ md = (curr or {}).get("markdown") or "No content"
698
+ md_out = gr.update(value=md, rtl=True) if is_arabic_text(md) else md
699
+ md_raw_text = md
700
+ proc_img = curr.get("processed_image") if curr else None
701
+ js = curr.get("layout_result") if curr else None
702
+ info = (
703
+ f'<div class="page-info">Page {idx + 1} / {state["total_pages"]}</div>'
704
+ )
705
+ prev = state["images"][idx]
706
+ return state, prev, info, md_out, md_raw_text, proc_img, js
707
+
708
+ def clear_all():
709
+ gc.collect()
710
+ return (
711
+ {
712
+ "images": [],
713
+ "current_page": 0,
714
+ "total_pages": 0,
715
+ "file_type": None,
716
+ "checksum": None,
717
+ "results": [],
718
+ "parsed": False,
719
+ },
720
+ None,
721
+ '<div class="page-info">No file</div>',
722
+ "Upload and parse to view results",
723
+ "",
724
+ None,
725
+ None,
726
+ )
727
+
728
+ def download_current_jsonl(state: Dict[str, Any]):
729
+ if not state.get("parsed"):
730
+ return gr.DownloadButton.update(value=b"")
731
+ lines: List[str] = []
732
+ for i, res in enumerate(state.get("results", [])):
733
+ if res and res.get("layout_result") is not None:
734
+ obj = {"page": i + 1, "layout": res["layout_result"]}
735
+ lines.append(json.dumps(obj, ensure_ascii=False))
736
+ content = "\n".join(lines) if lines else ""
737
+ return gr.DownloadButton.update(value=content.encode("utf-8"))
738
+
739
+ def download_current_markdown(state: Dict[str, Any]):
740
+ if not state.get("parsed"):
741
+ return gr.DownloadButton.update(value=b"")
742
+ chunks: List[str] = []
743
+ for i, res in enumerate(state.get("results", [])):
744
+ if res and res.get("markdown"):
745
+ chunks.append(f"## Page {i + 1}\n\n{res['markdown']}")
746
+ content = "\n\n---\n\n".join(chunks) if chunks else ""
747
+ return gr.DownloadButton.update(value=content.encode("utf-8"))
748
+
749
+ # Wire events
750
+ template.change(on_template_change, inputs=[template], outputs=[prompt_text])
751
+ file_input.change(
752
+ on_file_change,
753
+ inputs=[file_input],
754
+ outputs=[doc_state, preview_image, page_info],
755
+ )
756
+ prev_btn.click(
757
+ lambda s: nav_page(s, "prev"),
758
+ inputs=[doc_state],
759
+ outputs=[
760
+ doc_state,
761
+ preview_image,
762
+ page_info,
763
+ md_render,
764
+ md_raw,
765
+ processed_view,
766
+ json_view,
767
+ ],
768
+ )
769
+ next_btn.click(
770
+ lambda s: nav_page(s, "next"),
771
+ inputs=[doc_state],
772
+ outputs=[
773
+ doc_state,
774
+ preview_image,
775
+ page_info,
776
+ md_render,
777
+ md_raw,
778
+ processed_view,
779
+ json_view,
780
+ ],
781
+ )
782
+ jump_btn.click(
783
+ jump_to_page,
784
+ inputs=[doc_state, page_jump],
785
+ outputs=[
786
+ doc_state,
787
+ preview_image,
788
+ page_info,
789
+ md_render,
790
+ md_raw,
791
+ processed_view,
792
+ json_view,
793
+ ],
794
+ )
795
+ parse_button.click(
796
+ parse_pages,
797
+ inputs=[
798
+ doc_state,
799
+ prompt_text,
800
+ max_new_tokens,
801
+ min_pixels_in,
802
+ max_pixels_in,
803
+ page_range,
804
+ ],
805
+ outputs=[
806
+ doc_state,
807
+ preview_image,
808
+ page_info,
809
+ md_render,
810
+ md_raw,
811
+ processed_view,
812
+ json_view,
813
+ ],
814
+ )
815
+ clear_button.click(
816
+ clear_all,
817
+ outputs=[
818
+ doc_state,
819
+ preview_image,
820
+ page_info,
821
+ md_render,
822
+ md_raw,
823
+ processed_view,
824
+ json_view,
825
+ ],
826
+ )
827
+
828
+ download_jsonl.click(
829
+ download_current_jsonl, inputs=[doc_state], outputs=[download_jsonl]
830
+ )
831
+ download_markdown.click(
832
+ download_current_markdown, inputs=[doc_state], outputs=[download_markdown]
833
+ )
834
+
835
+ return demo
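Review note: the handlers above mix page-selection parsing and export formatting with Gradio wiring, which makes both hard to unit test. Below is a minimal sketch of two pure helpers that could be factored out. `parse_page_selection` and `write_jsonl_export` are hypothetical names, not part of this commit. The first mirrors the selection branch of `parse_pages` as written (blank means the current page, `all` means every page, otherwise comma-separated 1-based pages and ranges, with malformed parts skipped and out-of-range values clamped). The second assumes the export should land in a temporary file: as far as I know, `gr.DownloadButton` takes a file path as its value rather than raw bytes, and the per-component `.update()` classmethods used in the download handlers are no longer available in recent Gradio releases. How the returned path is wired to the button is left open.

```python
import json
import os
import tempfile
from typing import Any, Dict, List, Optional


def parse_page_selection(
    selection: Optional[str], current_page: int, total_pages: int
) -> List[int]:
    """Return sorted, de-duplicated zero-based page indices (hypothetical helper)."""
    text = (selection or "").strip()
    if not text:
        return [current_page]  # blank selection: only the page on screen
    if text.lower() == "all":
        return list(range(total_pages))

    indices = set()
    for part in (p.strip() for p in text.split(",")):
        if not part:
            continue
        try:
            if "-" in part:
                start, end = part.split("-", 1)
                first = max(1, int(start))
                last = min(total_pages, int(end))
                indices.update(range(first - 1, last))
            else:
                page = max(1, min(total_pages, int(part)))
                indices.add(page - 1)
        except ValueError:
            continue  # skip parts that are not integers, e.g. "x" or "3-"
    return sorted(i for i in indices if 0 <= i < total_pages)


def write_jsonl_export(results: List[Optional[Dict[str, Any]]]) -> Optional[str]:
    """Write one JSON object per parsed page and return the file path.

    Returns None when no page has a layout result. The temp-file location
    is an assumption for illustration, not something the commit specifies.
    """
    rows = [
        json.dumps({"page": i + 1, "layout": res["layout_result"]}, ensure_ascii=False)
        for i, res in enumerate(results)
        if res and res.get("layout_result") is not None
    ]
    if not rows:
        return None
    path = os.path.join(tempfile.mkdtemp(), "results.jsonl")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(rows) + "\n")
    return path
```

For example, `parse_page_selection("1-3,5", 0, 10)` yields `[0, 1, 2, 4]`, and a reversed range such as `3-1` yields an empty list, which matches what the loop in `parse_pages` would do.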
src/vlm_playground/utils/__init__.py ADDED
File without changes
src/vlm_playground/utils/constants.py ADDED
@@ -0,0 +1,5 @@
1
+ MIN_PIXELS = 3136
2
+ MAX_PIXELS = 11289600
3
+ IMAGE_FACTOR = 28
4
+
5
+ image_extensions = {".jpg", ".jpeg", ".png"}
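Review note: the three numeric constants are related (`MIN_PIXELS` is 4 × 28 × 28 and `MAX_PIXELS` is 14400 × 28 × 28), which suggests they bound the image area in units of 28-pixel patches. How they are consumed is not visible in this part of the diff, so the following is only a sketch of the usual "snap to the factor, then fit the pixel budget" resize under that assumption. `fit_to_pixel_budget` is a hypothetical name and may differ from what the project actually does.

```python
import math
from typing import Tuple

MIN_PIXELS = 3136        # 4 * 28 * 28
MAX_PIXELS = 11289600    # 14400 * 28 * 28
IMAGE_FACTOR = 28


def fit_to_pixel_budget(
    height: int,
    width: int,
    factor: int = IMAGE_FACTOR,
    min_pixels: int = MIN_PIXELS,
    max_pixels: int = MAX_PIXELS,
) -> Tuple[int, int]:
    """Return (height, width) snapped to multiples of `factor` within the budget.

    Assumed behavior for illustration only; extreme aspect ratios are not handled.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    # Snap each side to the nearest multiple of the factor (at least one patch).
    h = max(factor, round(height / factor) * factor)
    w = max(factor, round(width / factor) * factor)
    if h * w > max_pixels:
        # Too large: shrink both sides by the same ratio, rounding down.
        scale = math.sqrt((height * width) / max_pixels)
        h = max(factor, math.floor(height / scale / factor) * factor)
        w = max(factor, math.floor(width / scale / factor) * factor)
    elif h * w < min_pixels:
        # Too small: grow both sides by the same ratio, rounding up.
        scale = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * scale / factor) * factor
        w = math.ceil(width * scale / factor) * factor
    return h, w
```

A 1000 × 800 page would come out as 1008 × 812 under this sketch, since both sides are only snapped to the nearest multiple of 28 and the area is already within the budget.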
src/vlm_playground/utils/prompts.py ADDED
@@ -0,0 +1,30 @@
1
+ dict_promptmode_to_prompt = {
2
+ # prompt_layout_all_en: parse all layout info in json format.
3
+ "prompt_layout_all_en": """Please output the layout information from the PDF image, including each layout element's bbox, its category, and the corresponding text content within the bbox.
4
+
5
+ 1. Bbox format: [x1, y1, x2, y2]
6
+
7
+ 2. Layout Categories: The possible categories are ['Caption', 'Footnote', 'Formula', 'List-item', 'Page-footer', 'Page-header', 'Picture', 'Section-header', 'Table', 'Text', 'Title'].
8
+
9
+ 3. Text Extraction & Formatting Rules:
10
+ - Picture: For the 'Picture' category, the text field should be omitted.
11
+ - Formula: Format its text as LaTeX.
12
+ - Table: Format its text as HTML.
13
+ - All Others (Text, Title, etc.): Format their text as Markdown.
14
+
15
+ 4. Constraints:
16
+ - The output text must be the original text from the image, with no translation.
17
+ - All layout elements must be sorted according to human reading order.
18
+
19
+ 5. Final Output: The entire output must be a single JSON object.
20
+ """,
21
+ # prompt_layout_only_en: layout detection
22
+ "prompt_layout_only_en": """Please output the layout information from this PDF image, including each layout's bbox and its category. The bbox should be in the format [x1, y1, x2, y2]. The layout categories for the PDF document include ['Caption', 'Footnote', 'Formula', 'List-item', 'Page-footer', 'Page-header', 'Picture', 'Section-header', 'Table', 'Text', 'Title']. Do not output the corresponding text. The layout result should be in JSON format.""",
23
+ # prompt_ocr: parse ocr text except the Page-header and Page-footer
24
+ "prompt_ocr": """Extract the text content from this image.""",
25
+ # prompt_grounding_ocr: extract text content in the given bounding box
26
+ "prompt_grounding_ocr": """Extract text from the given bounding box on the image (format: [x1, y1, x2, y2]).\nBounding Box:\n""",
27
+ # "prompt_table_html": """Convert the table in this image to HTML.""",
28
+ # "prompt_table_latex": """Convert the table in this image to LaTeX.""",
29
+ # "prompt_formula_latex": """Convert the formula in this image to LaTeX.""",
30
+ }
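Review note: `prompt_grounding_ocr` ends with `Bounding Box:\n`, so the caller is expected to append the box before sending the prompt. A minimal sketch of a lookup helper follows. `build_prompt` is a hypothetical name, the dictionary below copies only two of the entries above to keep the example self-contained, and the appended `[x1, y1, x2, y2]` formatting is an assumption based on the format string inside the prompt itself.

```python
from typing import Dict, Optional, Sequence

# Two entries copied from dict_promptmode_to_prompt for a self-contained example.
PROMPTS: Dict[str, str] = {
    "prompt_ocr": "Extract the text content from this image.",
    "prompt_grounding_ocr": (
        "Extract text from the given bounding box on the image "
        "(format: [x1, y1, x2, y2]).\nBounding Box:\n"
    ),
}


def build_prompt(
    mode: str,
    bbox: Optional[Sequence[int]] = None,
    prompts: Dict[str, str] = PROMPTS,
) -> str:
    """Look up a prompt by mode and append the bbox for grounding OCR."""
    if mode not in prompts:
        raise ValueError(f"Unknown prompt mode: {mode!r}")
    prompt = prompts[mode]
    if mode == "prompt_grounding_ocr":
        if bbox is None or len(bbox) != 4:
            raise ValueError("prompt_grounding_ocr needs a bbox of four numbers")
        x1, y1, x2, y2 = (int(v) for v in bbox)
        # Assumed formatting: the box is appended as a bracketed list.
        prompt += f"[{x1}, {y1}, {x2}, {y2}]"
    return prompt
```

Failing loudly on an unknown mode or a missing box keeps a malformed prompt from reaching the model, where the failure would be harder to diagnose.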
uv.lock CHANGED
@@ -6,6 +6,24 @@ resolution-markers = [
6
  "python_full_version < '3.13'",
7
  ]
8
 
9
  [[package]]
10
  name = "aiofiles"
11
  version = "24.1.0"
@@ -94,6 +112,32 @@ wheels = [
94
  { url = "https://files.pythonhosted.org/packages/f6/22/91616fe707a5c5510de2cac9b046a30defe7007ba8a0c04f9c08f27df312/audioop_lts-0.2.2-cp314-cp314t-win_arm64.whl", hash = "sha256:b492c3b040153e68b9fdaff5913305aaaba5bb433d8a7f73d5cf6a64ed3cc1dd", size = 25206, upload-time = "2025-08-05T16:43:16.444Z" },
95
  ]
96
 
97
  [[package]]
98
  name = "brotli"
99
  version = "1.1.0"
@@ -197,6 +241,15 @@ wheels = [
197
  { url = "https://files.pythonhosted.org/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335, upload-time = "2022-10-25T02:36:20.889Z" },
198
  ]
199
 
200
  [[package]]
201
  name = "fastapi"
202
  version = "0.116.1"
@@ -376,6 +429,11 @@ wheels = [
376
  { url = "https://files.pythonhosted.org/packages/59/a8/4677014e771ed1591a87b63a2392ce6923baf807193deef302dcfde17542/huggingface_hub-0.34.3-py3-none-any.whl", hash = "sha256:5444550099e2d86e68b2898b09e85878fbd788fc2957b506c6a79ce060e39492", size = 558847, upload-time = "2025-07-29T08:38:51.904Z" },
377
  ]
378
 
379
  [[package]]
380
  name = "idna"
381
  version = "3.10"
@@ -385,6 +443,19 @@ wheels = [
385
  { url = "https://files.pythonhosted.org/packages/76/c6/c88e154df9c4e1a2a66ccf0005a88dfb2650c1dffb6f5ce603dfbd452ce3/idna-3.10-py3-none-any.whl", hash = "sha256:946d195a0d259cbba61165e88e65941f16e9b36ea6ddb97f00452bae8b1287d3", size = 70442, upload-time = "2024-09-15T18:07:37.964Z" },
386
  ]
387
 
388
  [[package]]
389
  name = "jinja2"
390
  version = "3.1.6"
@@ -456,6 +527,24 @@ wheels = [
456
  { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
457
  ]
458
 
459
  [[package]]
460
  name = "numpy"
461
  version = "2.3.2"
@@ -519,6 +608,132 @@ wheels = [
519
  { url = "https://files.pythonhosted.org/packages/c1/9e/1652778bce745a67b5fe05adde60ed362d38eb17d919a540e813d30f6874/numpy-2.3.2-cp314-cp314t-win_arm64.whl", hash = "sha256:092aeb3449833ea9c0bf0089d70c29ae480685dd2377ec9cdbbb620257f84631", size = 10544226, upload-time = "2025-07-24T20:56:34.509Z" },
520
  ]
521
 
522
  [[package]]
523
  name = "orjson"
524
  version = "3.11.1"
@@ -611,6 +826,15 @@ wheels = [
611
  { url = "https://files.pythonhosted.org/packages/d5/f9/07086f5b0f2a19872554abeea7658200824f5835c58a106fa8f2ae96a46c/pandas-2.3.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:5db9637dbc24b631ff3707269ae4559bce4b7fd75c1c4d7e13f40edc42df4444", size = 13189044, upload-time = "2025-07-07T19:19:39.999Z" },
612
  ]
613
 
614
  [[package]]
615
  name = "pillow"
616
  version = "11.3.0"
@@ -677,6 +901,33 @@ wheels = [
677
  { url = "https://files.pythonhosted.org/packages/89/c7/5572fa4a3f45740eaab6ae86fcdf7195b55beac1371ac8c619d880cfe948/pillow-11.3.0-cp314-cp314t-win_arm64.whl", hash = "sha256:79ea0d14d3ebad43ec77ad5272e6ff9bba5b679ef73375ea760261207fa8e0aa", size = 2512835, upload-time = "2025-07-01T09:15:50.399Z" },
678
  ]
679
 
680
  [[package]]
681
  name = "pydantic"
682
  version = "2.11.7"
@@ -752,6 +1003,21 @@ wheels = [
752
  { url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, upload-time = "2025-06-21T13:39:07.939Z" },
753
  ]
754
 
755
  [[package]]
756
  name = "python-dateutil"
757
  version = "2.9.0.post0"
@@ -808,6 +1074,71 @@ wheels = [
808
  { url = "https://files.pythonhosted.org/packages/fa/de/02b54f42487e3d3c6efb3f89428677074ca7bf43aae402517bc7cca949f3/PyYAML-6.0.2-cp313-cp313-win_amd64.whl", hash = "sha256:8388ee1976c416731879ac16da0aff3f63b286ffdd57cdeb95f3f2e085687563", size = 156446, upload-time = "2024-08-06T20:33:04.33Z" },
809
  ]
810
 
811
  [[package]]
812
  name = "requests"
813
  version = "2.32.4"
@@ -873,6 +1204,28 @@ wheels = [
873
  { url = "https://files.pythonhosted.org/packages/4d/c0/1108ad9f01567f66b3154063605b350b69c3c9366732e09e45f9fd0d1deb/safehttpx-0.1.6-py3-none-any.whl", hash = "sha256:407cff0b410b071623087c63dd2080c3b44dc076888d8c5823c00d1e58cb381c", size = 8692, upload-time = "2024-12-02T18:44:08.555Z" },
874
  ]
875
 
876
  [[package]]
877
  name = "semantic-version"
878
  version = "2.10.0"
@@ -882,6 +1235,15 @@ wheels = [
882
  { url = "https://files.pythonhosted.org/packages/6a/23/8146aad7d88f4fcb3a6218f41a60f6c2d4e3a72de72da1825dc7c8f7877c/semantic_version-2.10.0-py2.py3-none-any.whl", hash = "sha256:de78a3b8e0feda74cabc54aab2da702113e33ac9d9eb9d2389bcf1f58b7d9177", size = 15552, upload-time = "2022-05-26T13:35:21.206Z" },
883
  ]
884
 
885
  [[package]]
886
  name = "shellingham"
887
  version = "1.5.4"
@@ -922,6 +1284,43 @@ wheels = [
922
  { url = "https://files.pythonhosted.org/packages/f7/1f/b876b1f83aef204198a42dc101613fefccb32258e5428b5f9259677864b4/starlette-0.47.2-py3-none-any.whl", hash = "sha256:c5847e96134e5c5371ee9fac6fdf1a67336d5815e09eb2a01fdb57a351ef915b", size = 72984, upload-time = "2025-07-20T17:31:56.738Z" },
923
  ]
924
 
925
  [[package]]
926
  name = "tomlkit"
927
  version = "0.13.3"
@@ -931,6 +1330,73 @@ wheels = [
931
  { url = "https://files.pythonhosted.org/packages/bd/75/8539d011f6be8e29f339c42e633aae3cb73bffa95dd0f9adec09b9c58e85/tomlkit-0.13.3-py3-none-any.whl", hash = "sha256:c89c649d79ee40629a9fda55f8ace8c6a1b42deb912b2a8fd8d942ddadb606b0", size = 38901, upload-time = "2025-06-05T07:13:43.546Z" },
932
  ]
933
 
934
  [[package]]
935
  name = "tqdm"
936
  version = "4.67.1"
@@ -943,6 +1409,40 @@ wheels = [
943
  { url = "https://files.pythonhosted.org/packages/d0/30/dc54f88dd4a2b5dc8a0279bdd7270e735851848b762aeb1c1184ed1f6b14/tqdm-4.67.1-py3-none-any.whl", hash = "sha256:26445eca388f82e72884e0d580d5464cd801a3ea01e63e5601bdff9ba6a48de2", size = 78540, upload-time = "2024-11-24T20:12:19.698Z" },
944
  ]
945
 
946
  [[package]]
947
  name = "typer"
948
  version = "0.16.0"
@@ -1015,11 +1515,44 @@ name = "vlm-playground"
1015
  version = "0.1.0"
1016
  source = { editable = "." }
1017
  dependencies = [
1018
  { name = "gradio" },
1019
  ]
1020
 
1021
  [package.metadata]
1022
- requires-dist = [{ name = "gradio", specifier = ">=5.41.1" }]
1023
 
1024
  [[package]]
1025
  name = "websockets"
 
6
  "python_full_version < '3.13'",
7
  ]
8
 
9
+ [[package]]
10
+ name = "accelerate"
11
+ version = "1.10.0"
12
+ source = { registry = "https://pypi.org/simple" }
13
+ dependencies = [
14
+ { name = "huggingface-hub" },
15
+ { name = "numpy" },
16
+ { name = "packaging" },
17
+ { name = "psutil" },
18
+ { name = "pyyaml" },
19
+ { name = "safetensors" },
20
+ { name = "torch" },
21
+ ]
22
+ sdist = { url = "https://files.pythonhosted.org/packages/f7/66/be171836d86dc5b8698b3a9bf4b9eb10cb53369729939f88bf650167588b/accelerate-1.10.0.tar.gz", hash = "sha256:8270568fda9036b5cccdc09703fef47872abccd56eb5f6d53b54ea5fb7581496", size = 392261, upload-time = "2025-08-07T10:54:51.664Z" }
23
+ wheels = [
24
+ { url = "https://files.pythonhosted.org/packages/30/dd/0107f0aa179869ee9f47ef5a2686abd5e022fdc82af901d535e52fe91ce1/accelerate-1.10.0-py3-none-any.whl", hash = "sha256:260a72b560e100e839b517a331ec85ed495b3889d12886e79d1913071993c5a3", size = 374718, upload-time = "2025-08-07T10:54:49.988Z" },
25
+ ]
26
+
27
  [[package]]
28
  name = "aiofiles"
29
  version = "24.1.0"
 
112
  { url = "https://files.pythonhosted.org/packages/f6/22/91616fe707a5c5510de2cac9b046a30defe7007ba8a0c04f9c08f27df312/audioop_lts-0.2.2-cp314-cp314t-win_arm64.whl", hash = "sha256:b492c3b040153e68b9fdaff5913305aaaba5bb433d8a7f73d5cf6a64ed3cc1dd", size = 25206, upload-time = "2025-08-05T16:43:16.444Z" },
113
  ]
114
 
115
+ [[package]]
116
+ name = "av"
117
+ version = "15.0.0"
118
+ source = { registry = "https://pypi.org/simple" }
119
+ sdist = { url = "https://files.pythonhosted.org/packages/17/89/940a509ee7e9449f0c877fa984b37b7cc485546035cc67bbc353f2ac20f3/av-15.0.0.tar.gz", hash = "sha256:871c1a9becddf00b60b1294dc0bff9ff193ac31286aeec1a34039bd27e650183", size = 3833128, upload-time = "2025-07-03T16:23:48.455Z" }
120
+ wheels = [
121
+ { url = "https://files.pythonhosted.org/packages/89/81/c5d009ea9c01a513b7af6aac2ac49c0f2f7193345071cd6dd4d91bef3ab9/av-15.0.0-cp312-cp312-macosx_13_0_arm64.whl", hash = "sha256:84e2ede9459e64e768f4bc56d9df65da9e94b704ee3eccfe2e5b1da1da754313", size = 21782026, upload-time = "2025-07-03T16:22:18.41Z" },
122
+ { url = "https://files.pythonhosted.org/packages/16/8a/ffe9fcac35a07efc6aa0d765015efa499d88823c01499f318760460f8088/av-15.0.0-cp312-cp312-macosx_13_0_x86_64.whl", hash = "sha256:9473ed92d6942c5a449a2c79d49f3425eb0272499d1a3559b32c1181ff736a08", size = 26974939, upload-time = "2025-07-03T16:22:21.493Z" },
123
+ { url = "https://files.pythonhosted.org/packages/a0/e7/0816e52134dc2d0259bb1aaad78573eacaf2bebc1a643de34e3384b520d6/av-15.0.0-cp312-cp312-manylinux2014_i686.manylinux_2_17_i686.whl", hash = "sha256:56a53fe4e09bebd99355eaa0ce221b681eaf205bdda114f5e17fb79f3c3746ad", size = 34573486, upload-time = "2025-07-03T16:22:24.684Z" },
124
+ { url = "https://files.pythonhosted.org/packages/a3/f4/07cc05712e9824a4bb68beea44eb5a7369dee3f00fa258879190004b7fc5/av-15.0.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:247dd9a99d7ed3577b8c1e9977e811f423b04504ff36c9dcd7a4de3e6e5fe5ad", size = 38418908, upload-time = "2025-07-03T16:22:27.799Z" },
125
+ { url = "https://files.pythonhosted.org/packages/19/48/7f3a21a41e291f8c5b8a98f95cfef308ce1b024a634413ce910c270efd7d/av-15.0.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:fc50a7d5f60109221ccf44f8fa4c56ce73f22948b7f19b1717fcc58f7fbc383e", size = 40010257, upload-time = "2025-07-03T16:22:31.15Z" },
126
+ { url = "https://files.pythonhosted.org/packages/6d/c9/ced392e82d39084544d2d0c05decb36446028928eddf0d40ec3d8fe6c050/av-15.0.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:77deaec8943abfebd4e262924f2f452d6594cf0bc67d8d98aac0462b476e4182", size = 40381801, upload-time = "2025-07-03T16:22:34.254Z" },
127
+ { url = "https://files.pythonhosted.org/packages/d2/73/a23ad111200e27f5773e94b0b6f9e2ea492a72ded7f4787a358d9d504a8b/av-15.0.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:601d9b0740e47a17ec96ba2a537ebfd4d6edc859ae6f298475c06caa51f0a019", size = 37219417, upload-time = "2025-07-03T16:22:37.497Z" },
128
+ { url = "https://files.pythonhosted.org/packages/45/0c/2ac20143b74e3792ede40bfd397ce72fa4e76a03999c2fd0aee3997b6971/av-15.0.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e021f67e0db7256c9f5d3d6a2a4237a4a4a804b131b33e7f2778981070519b20", size = 41242077, upload-time = "2025-07-03T16:22:40.86Z" },
129
+ { url = "https://files.pythonhosted.org/packages/bd/30/40452705dffbfef0f5505d36218970dfeff0a86048689910219c8717b310/av-15.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:383f1b57520d790069d85fc75f43cfa32fca07f5fb3fb842be37bd596638602c", size = 31357617, upload-time = "2025-07-03T16:22:43.934Z" },
130
+ { url = "https://files.pythonhosted.org/packages/a6/27/c2e248498ce78dd504b0b1818ce88e71e30a7e26c348bdf5d6467d7b06f7/av-15.0.0-cp313-cp313-macosx_13_0_arm64.whl", hash = "sha256:0701c116f32bd9478023f610722f6371d15ca0c068ff228d355f54a7cf23d9cb", size = 21746400, upload-time = "2025-07-03T16:22:46.604Z" },
131
+ { url = "https://files.pythonhosted.org/packages/1d/d8/11f8452f19f4ddc189e978b215420131db40e3919135c14a0d13520f7c94/av-15.0.0-cp313-cp313-macosx_13_0_x86_64.whl", hash = "sha256:57fb6232494ec575b8e78e5a9ef9b811d78f8d67324476ec8430ca3146751124", size = 26939576, upload-time = "2025-07-03T16:22:49.255Z" },
132
+ { url = "https://files.pythonhosted.org/packages/00/1c/b109fd41487d91b8843f9e199b65e89ca533a612ec788b11ed0ba9812ea3/av-15.0.0-cp313-cp313-manylinux2014_i686.manylinux_2_17_i686.whl", hash = "sha256:801a3e0afd5c36df70d012d083bfca67ab22d0ebd2c860c0d9432ac875bc0ad6", size = 34284344, upload-time = "2025-07-03T16:22:52.373Z" },
133
+ { url = "https://files.pythonhosted.org/packages/99/71/aee35fa182d0a41227fbd3f4250fd94c54acdd2995025ee59dd948bba930/av-15.0.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:d5e97791b96741b344bf6dbea4fb14481c117b1f7fe8113721e8d80e26cbb388", size = 38130346, upload-time = "2025-07-03T16:22:56.755Z" },
134
+ { url = "https://files.pythonhosted.org/packages/b7/c4/2d9bbc9c42a804c99bc571eeacb2fe1582fe9cfdb726616876cada937d6a/av-15.0.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:acb4e4aa6bb394d3a9e60feb4cb7a856fc7bac01f3c99019b1d0f11c898c682c", size = 39728857, upload-time = "2025-07-03T16:23:00.392Z" },
135
+ { url = "https://files.pythonhosted.org/packages/7c/d6/a5746e9fb4fdf326e9897abd7538413210e66f35ad4793fe30f87859249d/av-15.0.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:02d2d80bdbe184f1f3f49b3f5eae7f0ff7cba0a62ab3b18be0505715e586ad29", size = 40109012, upload-time = "2025-07-03T16:23:04.1Z" },
136
+ { url = "https://files.pythonhosted.org/packages/77/1f/da89798231ad0feacfaaea4efec4f1779060226986f97498eabe2c7c54a8/av-15.0.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:603f3ae751f6678df5d8b949f92c6f8257064bba8b3e8db606a24c29d31b4e25", size = 36929211, upload-time = "2025-07-03T16:23:07.694Z" },
137
+ { url = "https://files.pythonhosted.org/packages/d5/4c/2bcabe65a1c19e552f03540f16155a0d02cb9b7a90d31242ab3e0c7ea0d8/av-15.0.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:682686a9ea2745e63c8878641ec26b1787b9210533f3e945a6e07e24ab788c2e", size = 40967172, upload-time = "2025-07-03T16:23:13.488Z" },
138
+ { url = "https://files.pythonhosted.org/packages/c9/f0/fe14adaa670ab7a3f709805a8494fd0a2eeb6a5b18b8c59dc6014639a5b1/av-15.0.0-cp313-cp313-win_amd64.whl", hash = "sha256:5758231163b5486dfbf664036be010b7f5ebb24564aaeb62577464be5ea996e0", size = 31332650, upload-time = "2025-07-03T16:23:16.558Z" },
139
+ ]
140
+
141
  [[package]]
142
  name = "brotli"
143
  version = "1.1.0"
 
241
  { url = "https://files.pythonhosted.org/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335, upload-time = "2022-10-25T02:36:20.889Z" },
242
  ]
243
 
244
+ [[package]]
245
+ name = "einops"
246
+ version = "0.8.1"
247
+ source = { registry = "https://pypi.org/simple" }
248
+ sdist = { url = "https://files.pythonhosted.org/packages/e5/81/df4fbe24dff8ba3934af99044188e20a98ed441ad17a274539b74e82e126/einops-0.8.1.tar.gz", hash = "sha256:de5d960a7a761225532e0f1959e5315ebeafc0cd43394732f103ca44b9837e84", size = 54805, upload-time = "2025-02-09T03:17:00.434Z" }
249
+ wheels = [
250
+ { url = "https://files.pythonhosted.org/packages/87/62/9773de14fe6c45c23649e98b83231fffd7b9892b6cf863251dc2afa73643/einops-0.8.1-py3-none-any.whl", hash = "sha256:919387eb55330f5757c6bea9165c5ff5cfe63a642682ea788a6d472576d81737", size = 64359, upload-time = "2025-02-09T03:17:01.998Z" },
251
+ ]
252
+
253
  [[package]]
254
  name = "fastapi"
255
  version = "0.116.1"
 
429
  { url = "https://files.pythonhosted.org/packages/59/a8/4677014e771ed1591a87b63a2392ce6923baf807193deef302dcfde17542/huggingface_hub-0.34.3-py3-none-any.whl", hash = "sha256:5444550099e2d86e68b2898b09e85878fbd788fc2957b506c6a79ce060e39492", size = 558847, upload-time = "2025-07-29T08:38:51.904Z" },
430
  ]
431
 
432
+ [package.optional-dependencies]
433
+ cli = [
434
+ { name = "inquirerpy" },
435
+ ]
436
+
437
  [[package]]
438
  name = "idna"
439
  version = "3.10"
 
443
  { url = "https://files.pythonhosted.org/packages/76/c6/c88e154df9c4e1a2a66ccf0005a88dfb2650c1dffb6f5ce603dfbd452ce3/idna-3.10-py3-none-any.whl", hash = "sha256:946d195a0d259cbba61165e88e65941f16e9b36ea6ddb97f00452bae8b1287d3", size = 70442, upload-time = "2024-09-15T18:07:37.964Z" },
444
  ]
445
 
446
+ [[package]]
447
+ name = "inquirerpy"
448
+ version = "0.3.4"
449
+ source = { registry = "https://pypi.org/simple" }
450
+ dependencies = [
451
+ { name = "pfzy" },
452
+ { name = "prompt-toolkit" },
453
+ ]
454
+ sdist = { url = "https://files.pythonhosted.org/packages/64/73/7570847b9da026e07053da3bbe2ac7ea6cde6bb2cbd3c7a5a950fa0ae40b/InquirerPy-0.3.4.tar.gz", hash = "sha256:89d2ada0111f337483cb41ae31073108b2ec1e618a49d7110b0d7ade89fc197e", size = 44431, upload-time = "2022-06-27T23:11:20.598Z" }
455
+ wheels = [
456
+ { url = "https://files.pythonhosted.org/packages/ce/ff/3b59672c47c6284e8005b42e84ceba13864aa0f39f067c973d1af02f5d91/InquirerPy-0.3.4-py3-none-any.whl", hash = "sha256:c65fdfbac1fa00e3ee4fb10679f4d3ed7a012abf4833910e63c295827fe2a7d4", size = 67677, upload-time = "2022-06-27T23:11:17.723Z" },
457
+ ]
458
+
459
  [[package]]
460
  name = "jinja2"
461
  version = "3.1.6"
 
527
  { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
528
  ]
529
 
530
+ [[package]]
531
+ name = "mpmath"
532
+ version = "1.3.0"
533
+ source = { registry = "https://pypi.org/simple" }
534
+ sdist = { url = "https://files.pythonhosted.org/packages/e0/47/dd32fa426cc72114383ac549964eecb20ecfd886d1e5ccf5340b55b02f57/mpmath-1.3.0.tar.gz", hash = "sha256:7a28eb2a9774d00c7bc92411c19a89209d5da7c4c9a9e227be8330a23a25b91f", size = 508106, upload-time = "2023-03-07T16:47:11.061Z" }
535
+ wheels = [
536
+ { url = "https://files.pythonhosted.org/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl", hash = "sha256:a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c", size = 536198, upload-time = "2023-03-07T16:47:09.197Z" },
537
+ ]
538
+
539
+ [[package]]
540
+ name = "networkx"
541
+ version = "3.5"
542
+ source = { registry = "https://pypi.org/simple" }
543
+ sdist = { url = "https://files.pythonhosted.org/packages/6c/4f/ccdb8ad3a38e583f214547fd2f7ff1fc160c43a75af88e6aec213404b96a/networkx-3.5.tar.gz", hash = "sha256:d4c6f9cf81f52d69230866796b82afbccdec3db7ae4fbd1b65ea750feed50037", size = 2471065, upload-time = "2025-05-29T11:35:07.804Z" }
544
+ wheels = [
545
+ { url = "https://files.pythonhosted.org/packages/eb/8d/776adee7bbf76365fdd7f2552710282c79a4ead5d2a46408c9043a2b70ba/networkx-3.5-py3-none-any.whl", hash = "sha256:0030d386a9a06dee3565298b4a734b68589749a544acbb6c412dc9e2489ec6ec", size = 2034406, upload-time = "2025-05-29T11:35:04.961Z" },
546
+ ]
547
+
548
  [[package]]
549
  name = "numpy"
550
  version = "2.3.2"
 
608
  { url = "https://files.pythonhosted.org/packages/c1/9e/1652778bce745a67b5fe05adde60ed362d38eb17d919a540e813d30f6874/numpy-2.3.2-cp314-cp314t-win_arm64.whl", hash = "sha256:092aeb3449833ea9c0bf0089d70c29ae480685dd2377ec9cdbbb620257f84631", size = 10544226, upload-time = "2025-07-24T20:56:34.509Z" },
609
  ]
610
 
611
+ [[package]]
612
+ name = "nvidia-cublas-cu12"
613
+ version = "12.8.4.1"
614
+ source = { registry = "https://pypi.org/simple" }
615
+ wheels = [
616
+ { url = "https://files.pythonhosted.org/packages/dc/61/e24b560ab2e2eaeb3c839129175fb330dfcfc29e5203196e5541a4c44682/nvidia_cublas_cu12-12.8.4.1-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:8ac4e771d5a348c551b2a426eda6193c19aa630236b418086020df5ba9667142", size = 594346921, upload-time = "2025-03-07T01:44:31.254Z" },
617
+ ]
618
+
619
+ [[package]]
620
+ name = "nvidia-cuda-cupti-cu12"
621
+ version = "12.8.90"
622
+ source = { registry = "https://pypi.org/simple" }
623
+ wheels = [
624
+ { url = "https://files.pythonhosted.org/packages/f8/02/2adcaa145158bf1a8295d83591d22e4103dbfd821bcaf6f3f53151ca4ffa/nvidia_cuda_cupti_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ea0cb07ebda26bb9b29ba82cda34849e73c166c18162d3913575b0c9db9a6182", size = 10248621, upload-time = "2025-03-07T01:40:21.213Z" },
625
+ ]
626
+
627
+ [[package]]
628
+ name = "nvidia-cuda-nvrtc-cu12"
629
+ version = "12.8.93"
630
+ source = { registry = "https://pypi.org/simple" }
631
+ wheels = [
632
+ { url = "https://files.pythonhosted.org/packages/05/6b/32f747947df2da6994e999492ab306a903659555dddc0fbdeb9d71f75e52/nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl", hash = "sha256:a7756528852ef889772a84c6cd89d41dfa74667e24cca16bb31f8f061e3e9994", size = 88040029, upload-time = "2025-03-07T01:42:13.562Z" },
633
+ ]
634
+
635
+ [[package]]
636
+ name = "nvidia-cuda-runtime-cu12"
637
+ version = "12.8.90"
638
+ source = { registry = "https://pypi.org/simple" }
639
+ wheels = [
640
+ { url = "https://files.pythonhosted.org/packages/0d/9b/a997b638fcd068ad6e4d53b8551a7d30fe8b404d6f1804abf1df69838932/nvidia_cuda_runtime_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:adade8dcbd0edf427b7204d480d6066d33902cab2a4707dcfc48a2d0fd44ab90", size = 954765, upload-time = "2025-03-07T01:40:01.615Z" },
641
+ ]
642
+
643
+ [[package]]
644
+ name = "nvidia-cudnn-cu12"
645
+ version = "9.10.2.21"
646
+ source = { registry = "https://pypi.org/simple" }
647
+ dependencies = [
648
+ { name = "nvidia-cublas-cu12" },
649
+ ]
650
+ wheels = [
651
+ { url = "https://files.pythonhosted.org/packages/ba/51/e123d997aa098c61d029f76663dedbfb9bc8dcf8c60cbd6adbe42f76d049/nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:949452be657fa16687d0930933f032835951ef0892b37d2d53824d1a84dc97a8", size = 706758467, upload-time = "2025-06-06T21:54:08.597Z" },
652
+ ]
653
+
654
+ [[package]]
655
+ name = "nvidia-cufft-cu12"
656
+ version = "11.3.3.83"
657
+ source = { registry = "https://pypi.org/simple" }
658
+ dependencies = [
659
+ { name = "nvidia-nvjitlink-cu12" },
660
+ ]
661
+ wheels = [
662
+ { url = "https://files.pythonhosted.org/packages/1f/13/ee4e00f30e676b66ae65b4f08cb5bcbb8392c03f54f2d5413ea99a5d1c80/nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d2dd21ec0b88cf61b62e6b43564355e5222e4a3fb394cac0db101f2dd0d4f74", size = 193118695, upload-time = "2025-03-07T01:45:27.821Z" },
663
+ ]
664
+
665
+ [[package]]
666
+ name = "nvidia-cufile-cu12"
667
+ version = "1.13.1.3"
668
+ source = { registry = "https://pypi.org/simple" }
669
+ wheels = [
670
+ { url = "https://files.pythonhosted.org/packages/bb/fe/1bcba1dfbfb8d01be8d93f07bfc502c93fa23afa6fd5ab3fc7c1df71038a/nvidia_cufile_cu12-1.13.1.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1d069003be650e131b21c932ec3d8969c1715379251f8d23a1860554b1cb24fc", size = 1197834, upload-time = "2025-03-07T01:45:50.723Z" },
671
+ ]
672
+
673
+ [[package]]
674
+ name = "nvidia-curand-cu12"
675
+ version = "10.3.9.90"
676
+ source = { registry = "https://pypi.org/simple" }
677
+ wheels = [
678
+ { url = "https://files.pythonhosted.org/packages/fb/aa/6584b56dc84ebe9cf93226a5cde4d99080c8e90ab40f0c27bda7a0f29aa1/nvidia_curand_cu12-10.3.9.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:b32331d4f4df5d6eefa0554c565b626c7216f87a06a4f56fab27c3b68a830ec9", size = 63619976, upload-time = "2025-03-07T01:46:23.323Z" },
679
+ ]
680
+
681
+ [[package]]
682
+ name = "nvidia-cusolver-cu12"
683
+ version = "11.7.3.90"
684
+ source = { registry = "https://pypi.org/simple" }
685
+ dependencies = [
686
+ { name = "nvidia-cublas-cu12" },
687
+ { name = "nvidia-cusparse-cu12" },
688
+ { name = "nvidia-nvjitlink-cu12" },
689
+ ]
690
+ wheels = [
691
+ { url = "https://files.pythonhosted.org/packages/85/48/9a13d2975803e8cf2777d5ed57b87a0b6ca2cc795f9a4f59796a910bfb80/nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:4376c11ad263152bd50ea295c05370360776f8c3427b30991df774f9fb26c450", size = 267506905, upload-time = "2025-03-07T01:47:16.273Z" },
692
+ ]
693
+
694
+ [[package]]
695
+ name = "nvidia-cusparse-cu12"
696
+ version = "12.5.8.93"
697
+ source = { registry = "https://pypi.org/simple" }
698
+ dependencies = [
699
+ { name = "nvidia-nvjitlink-cu12" },
700
+ ]
701
+ wheels = [
702
+ { url = "https://files.pythonhosted.org/packages/c2/f5/e1854cb2f2bcd4280c44736c93550cc300ff4b8c95ebe370d0aa7d2b473d/nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1ec05d76bbbd8b61b06a80e1eaf8cf4959c3d4ce8e711b65ebd0443bb0ebb13b", size = 288216466, upload-time = "2025-03-07T01:48:13.779Z" },
703
+ ]
704
+
705
+ [[package]]
706
+ name = "nvidia-cusparselt-cu12"
707
+ version = "0.7.1"
708
+ source = { registry = "https://pypi.org/simple" }
709
+ wheels = [
710
+ { url = "https://files.pythonhosted.org/packages/56/79/12978b96bd44274fe38b5dde5cfb660b1d114f70a65ef962bcbbed99b549/nvidia_cusparselt_cu12-0.7.1-py3-none-manylinux2014_x86_64.whl", hash = "sha256:f1bb701d6b930d5a7cea44c19ceb973311500847f81b634d802b7b539dc55623", size = 287193691, upload-time = "2025-02-26T00:15:44.104Z" },
711
+ ]
712
+
713
+ [[package]]
714
+ name = "nvidia-nccl-cu12"
715
+ version = "2.27.3"
716
+ source = { registry = "https://pypi.org/simple" }
717
+ wheels = [
718
+ { url = "https://files.pythonhosted.org/packages/5c/5b/4e4fff7bad39adf89f735f2bc87248c81db71205b62bcc0d5ca5b606b3c3/nvidia_nccl_cu12-2.27.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:adf27ccf4238253e0b826bce3ff5fa532d65fc42322c8bfdfaf28024c0fbe039", size = 322364134, upload-time = "2025-06-03T21:58:04.013Z" },
719
+ ]
720
+
721
+ [[package]]
722
+ name = "nvidia-nvjitlink-cu12"
723
+ version = "12.8.93"
724
+ source = { registry = "https://pypi.org/simple" }
725
+ wheels = [
726
+ { url = "https://files.pythonhosted.org/packages/f6/74/86a07f1d0f42998ca31312f998bd3b9a7eff7f52378f4f270c8679c77fb9/nvidia_nvjitlink_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl", hash = "sha256:81ff63371a7ebd6e6451970684f916be2eab07321b73c9d244dc2b4da7f73b88", size = 39254836, upload-time = "2025-03-07T01:49:55.661Z" },
727
+ ]
728
+
729
+ [[package]]
730
+ name = "nvidia-nvtx-cu12"
731
+ version = "12.8.90"
732
+ source = { registry = "https://pypi.org/simple" }
733
+ wheels = [
734
+ { url = "https://files.pythonhosted.org/packages/a2/eb/86626c1bbc2edb86323022371c39aa48df6fd8b0a1647bc274577f72e90b/nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5b17e2001cc0d751a5bc2c6ec6d26ad95913324a4adb86788c944f8ce9ba441f", size = 89954, upload-time = "2025-03-07T01:42:44.131Z" },
735
+ ]
736
+
737
  [[package]]
738
  name = "orjson"
739
  version = "3.11.1"
 
826
  { url = "https://files.pythonhosted.org/packages/d5/f9/07086f5b0f2a19872554abeea7658200824f5835c58a106fa8f2ae96a46c/pandas-2.3.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:5db9637dbc24b631ff3707269ae4559bce4b7fd75c1c4d7e13f40edc42df4444", size = 13189044, upload-time = "2025-07-07T19:19:39.999Z" },
827
  ]
828
 
829
+ [[package]]
830
+ name = "pfzy"
831
+ version = "0.3.4"
832
+ source = { registry = "https://pypi.org/simple" }
833
+ sdist = { url = "https://files.pythonhosted.org/packages/d9/5a/32b50c077c86bfccc7bed4881c5a2b823518f5450a30e639db5d3711952e/pfzy-0.3.4.tar.gz", hash = "sha256:717ea765dd10b63618e7298b2d98efd819e0b30cd5905c9707223dceeb94b3f1", size = 8396, upload-time = "2022-01-28T02:26:17.946Z" }
834
+ wheels = [
835
+ { url = "https://files.pythonhosted.org/packages/8c/d7/8ff98376b1acc4503253b685ea09981697385ce344d4e3935c2af49e044d/pfzy-0.3.4-py3-none-any.whl", hash = "sha256:5f50d5b2b3207fa72e7ec0ef08372ef652685470974a107d0d4999fc5a903a96", size = 8537, upload-time = "2022-01-28T02:26:16.047Z" },
836
+ ]
837
+
838
  [[package]]
839
  name = "pillow"
840
  version = "11.3.0"
 
901
  { url = "https://files.pythonhosted.org/packages/89/c7/5572fa4a3f45740eaab6ae86fcdf7195b55beac1371ac8c619d880cfe948/pillow-11.3.0-cp314-cp314t-win_arm64.whl", hash = "sha256:79ea0d14d3ebad43ec77ad5272e6ff9bba5b679ef73375ea760261207fa8e0aa", size = 2512835, upload-time = "2025-07-01T09:15:50.399Z" },
902
  ]
903
 
904
+ [[package]]
905
+ name = "prompt-toolkit"
906
+ version = "3.0.51"
907
+ source = { registry = "https://pypi.org/simple" }
908
+ dependencies = [
909
+ { name = "wcwidth" },
910
+ ]
911
+ sdist = { url = "https://files.pythonhosted.org/packages/bb/6e/9d084c929dfe9e3bfe0c6a47e31f78a25c54627d64a66e884a8bf5474f1c/prompt_toolkit-3.0.51.tar.gz", hash = "sha256:931a162e3b27fc90c86f1b48bb1fb2c528c2761475e57c9c06de13311c7b54ed", size = 428940, upload-time = "2025-04-15T09:18:47.731Z" }
912
+ wheels = [
913
+ { url = "https://files.pythonhosted.org/packages/ce/4f/5249960887b1fbe561d9ff265496d170b55a735b76724f10ef19f9e40716/prompt_toolkit-3.0.51-py3-none-any.whl", hash = "sha256:52742911fde84e2d423e2f9a4cf1de7d7ac4e51958f648d9540e0fb8db077b07", size = 387810, upload-time = "2025-04-15T09:18:44.753Z" },
914
+ ]
915
+
916
+ [[package]]
917
+ name = "psutil"
918
+ version = "7.0.0"
919
+ source = { registry = "https://pypi.org/simple" }
920
+ sdist = { url = "https://files.pythonhosted.org/packages/2a/80/336820c1ad9286a4ded7e845b2eccfcb27851ab8ac6abece774a6ff4d3de/psutil-7.0.0.tar.gz", hash = "sha256:7be9c3eba38beccb6495ea33afd982a44074b78f28c434a1f51cc07fd315c456", size = 497003, upload-time = "2025-02-13T21:54:07.946Z" }
921
+ wheels = [
922
+ { url = "https://files.pythonhosted.org/packages/ed/e6/2d26234410f8b8abdbf891c9da62bee396583f713fb9f3325a4760875d22/psutil-7.0.0-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:101d71dc322e3cffd7cea0650b09b3d08b8e7c4109dd6809fe452dfd00e58b25", size = 238051, upload-time = "2025-02-13T21:54:12.36Z" },
923
+ { url = "https://files.pythonhosted.org/packages/04/8b/30f930733afe425e3cbfc0e1468a30a18942350c1a8816acfade80c005c4/psutil-7.0.0-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:39db632f6bb862eeccf56660871433e111b6ea58f2caea825571951d4b6aa3da", size = 239535, upload-time = "2025-02-13T21:54:16.07Z" },
924
+ { url = "https://files.pythonhosted.org/packages/2a/ed/d362e84620dd22876b55389248e522338ed1bf134a5edd3b8231d7207f6d/psutil-7.0.0-cp36-abi3-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:1fcee592b4c6f146991ca55919ea3d1f8926497a713ed7faaf8225e174581e91", size = 275004, upload-time = "2025-02-13T21:54:18.662Z" },
925
+ { url = "https://files.pythonhosted.org/packages/bf/b9/b0eb3f3cbcb734d930fdf839431606844a825b23eaf9a6ab371edac8162c/psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4b1388a4f6875d7e2aff5c4ca1cc16c545ed41dd8bb596cefea80111db353a34", size = 277986, upload-time = "2025-02-13T21:54:21.811Z" },
926
+ { url = "https://files.pythonhosted.org/packages/eb/a2/709e0fe2f093556c17fbafda93ac032257242cabcc7ff3369e2cb76a97aa/psutil-7.0.0-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a5f098451abc2828f7dc6b58d44b532b22f2088f4999a937557b603ce72b1993", size = 279544, upload-time = "2025-02-13T21:54:24.68Z" },
927
+ { url = "https://files.pythonhosted.org/packages/50/e6/eecf58810b9d12e6427369784efe814a1eec0f492084ce8eb8f4d89d6d61/psutil-7.0.0-cp37-abi3-win32.whl", hash = "sha256:ba3fcef7523064a6c9da440fc4d6bd07da93ac726b5733c29027d7dc95b39d99", size = 241053, upload-time = "2025-02-13T21:54:34.31Z" },
928
+ { url = "https://files.pythonhosted.org/packages/50/1b/6921afe68c74868b4c9fa424dad3be35b095e16687989ebbb50ce4fceb7c/psutil-7.0.0-cp37-abi3-win_amd64.whl", hash = "sha256:4cf3d4eb1aa9b348dec30105c55cd9b7d4629285735a102beb4441e38db90553", size = 244885, upload-time = "2025-02-13T21:54:37.486Z" },
929
+ ]
930
+
931
  [[package]]
932
  name = "pydantic"
933
  version = "2.11.7"
 
1003
  { url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = "sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, upload-time = "2025-06-21T13:39:07.939Z" },
1004
  ]
1005
 
1006
+ [[package]]
1007
+ name = "pymupdf"
1008
+ version = "1.26.3"
1009
+ source = { registry = "https://pypi.org/simple" }
1010
+ sdist = { url = "https://files.pythonhosted.org/packages/6d/d4/70a265e4bcd43e97480ae62da69396ef4507c8f9cfd179005ee731c92a04/pymupdf-1.26.3.tar.gz", hash = "sha256:b7d2c3ffa9870e1e4416d18862f5ccd356af5fe337b4511093bbbce2ca73b7e5", size = 75990308, upload-time = "2025-07-02T21:34:22.243Z" }
1011
+ wheels = [
1012
+ { url = "https://files.pythonhosted.org/packages/70/d3/c7af70545cd3097a869fd635bb6222108d3a0fb28c0b8254754a126c4cbb/pymupdf-1.26.3-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:ded891963944e5f13b03b88f6d9e982e816a4ec8689fe360876eef000c161f2b", size = 23057205, upload-time = "2025-07-02T21:26:16.326Z" },
1013
+ { url = "https://files.pythonhosted.org/packages/04/3d/ec5b69bfeaa5deefa7141fc0b20d77bb20404507cf17196b4eb59f1f2977/pymupdf-1.26.3-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:436a33c738bb10eadf00395d18a6992b801ffb26521ee1f361ae786dd283327a", size = 22406630, upload-time = "2025-07-02T21:27:10.112Z" },
1014
+ { url = "https://files.pythonhosted.org/packages/fc/20/661d3894bb05ad75ed6ca103ee2c3fa44d88a458b5c8d4a946b9c0f2569b/pymupdf-1.26.3-cp39-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:a2d7a3cd442f12f05103cb3bb1415111517f0a97162547a3720f3bbbc5e0b51c", size = 23450287, upload-time = "2025-07-03T07:22:19.317Z" },
1015
+ { url = "https://files.pythonhosted.org/packages/9c/7f/21828f018e65b16a033731d21f7b46d93fa81c6e8257f769ca4a1c2a1cb0/pymupdf-1.26.3-cp39-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:454f38c8cf07eb333eb4646dca10517b6e90f57ce2daa2265a78064109d85555", size = 24057319, upload-time = "2025-07-02T21:28:26.697Z" },
1016
+ { url = "https://files.pythonhosted.org/packages/71/5d/e8f88cd5a45b8f5fa6590ce8cef3ce0fad30eac6aac8aea12406f95bee7d/pymupdf-1.26.3-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:759b75d2f710ff4edf8d097d2e98f60e9ecef47632cead6f949b3412facdb9f0", size = 24261350, upload-time = "2025-07-02T21:29:21.733Z" },
1017
+ { url = "https://files.pythonhosted.org/packages/82/22/ecc560e4f281b5dffafbf3a81f023d268b1746d028044f495115b74a2e70/pymupdf-1.26.3-cp39-abi3-win32.whl", hash = "sha256:a839ed44742faa1cd4956bb18068fe5aae435d67ce915e901318646c4e7bbea6", size = 17116371, upload-time = "2025-07-02T21:30:23.253Z" },
1018
+ { url = "https://files.pythonhosted.org/packages/4a/26/8c72973b8833a72785cedc3981eb59b8ac7075942718bbb7b69b352cdde4/pymupdf-1.26.3-cp39-abi3-win_amd64.whl", hash = "sha256:b4cd5124d05737944636cf45fc37ce5824f10e707b0342efe109c7b6bd37a9cc", size = 18735124, upload-time = "2025-07-02T21:31:10.992Z" },
1019
+ ]
1020
+
1021
  [[package]]
1022
  name = "python-dateutil"
1023
  version = "2.9.0.post0"
 
1074
  { url = "https://files.pythonhosted.org/packages/fa/de/02b54f42487e3d3c6efb3f89428677074ca7bf43aae402517bc7cca949f3/PyYAML-6.0.2-cp313-cp313-win_amd64.whl", hash = "sha256:8388ee1976c416731879ac16da0aff3f63b286ffdd57cdeb95f3f2e085687563", size = 156446, upload-time = "2024-08-06T20:33:04.33Z" },
1075
  ]
1076
 
1077
+ [[package]]
1078
+ name = "qwen-vl-utils"
1079
+ version = "0.0.11"
1080
+ source = { registry = "https://pypi.org/simple" }
1081
+ dependencies = [
1082
+ { name = "av" },
1083
+ { name = "packaging" },
1084
+ { name = "pillow" },
1085
+ { name = "requests" },
1086
+ ]
1087
+ sdist = { url = "https://files.pythonhosted.org/packages/42/9f/1229a40ebd49f689a0252144126f3865f31bb4151e942cf781a2936f0c4d/qwen_vl_utils-0.0.11.tar.gz", hash = "sha256:083ba1e5cfa5002165b1e3bddd4d6d26d1d6d34473884033ef12ae3fe8496cd5", size = 7924, upload-time = "2025-04-21T10:38:47.461Z" }
1088
+ wheels = [
1089
+ { url = "https://files.pythonhosted.org/packages/0a/c2/ad7f93e1eea4ea0aefd1cc6fbe7a7095fd2f03a4d8fe2c3707e612b0866e/qwen_vl_utils-0.0.11-py3-none-any.whl", hash = "sha256:7fd5287ac04d6c1f01b93bf053b0be236a35149e414c9e864e3cc5bf2fe8cb7b", size = 7584, upload-time = "2025-04-21T10:38:45.595Z" },
1090
+ ]
1091
+
1092
+ [[package]]
1093
+ name = "regex"
1094
+ version = "2025.7.34"
1095
+ source = { registry = "https://pypi.org/simple" }
1096
+ sdist = { url = "https://files.pythonhosted.org/packages/0b/de/e13fa6dc61d78b30ba47481f99933a3b49a57779d625c392d8036770a60d/regex-2025.7.34.tar.gz", hash = "sha256:9ead9765217afd04a86822dfcd4ed2747dfe426e887da413b15ff0ac2457e21a", size = 400714, upload-time = "2025-07-31T00:21:16.262Z" }
1097
+ wheels = [
1098
+ { url = "https://files.pythonhosted.org/packages/ff/f0/31d62596c75a33f979317658e8d261574785c6cd8672c06741ce2e2e2070/regex-2025.7.34-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:7f7211a746aced993bef487de69307a38c5ddd79257d7be83f7b202cb59ddb50", size = 485492, upload-time = "2025-07-31T00:19:35.57Z" },
1099
+ { url = "https://files.pythonhosted.org/packages/d8/16/b818d223f1c9758c3434be89aa1a01aae798e0e0df36c1f143d1963dd1ee/regex-2025.7.34-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:fb31080f2bd0681484b275461b202b5ad182f52c9ec606052020fe13eb13a72f", size = 290000, upload-time = "2025-07-31T00:19:37.175Z" },
1100
+ { url = "https://files.pythonhosted.org/packages/cd/70/69506d53397b4bd6954061bae75677ad34deb7f6ca3ba199660d6f728ff5/regex-2025.7.34-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0200a5150c4cf61e407038f4b4d5cdad13e86345dac29ff9dab3d75d905cf130", size = 286072, upload-time = "2025-07-31T00:19:38.612Z" },
1101
+ { url = "https://files.pythonhosted.org/packages/b0/73/536a216d5f66084fb577bb0543b5cb7de3272eb70a157f0c3a542f1c2551/regex-2025.7.34-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:739a74970e736df0773788377969c9fea3876c2fc13d0563f98e5503e5185f46", size = 797341, upload-time = "2025-07-31T00:19:40.119Z" },
1102
+ { url = "https://files.pythonhosted.org/packages/26/af/733f8168449e56e8f404bb807ea7189f59507cbea1b67a7bbcd92f8bf844/regex-2025.7.34-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:4fef81b2f7ea6a2029161ed6dea9ae13834c28eb5a95b8771828194a026621e4", size = 862556, upload-time = "2025-07-31T00:19:41.556Z" },
1103
+ { url = "https://files.pythonhosted.org/packages/19/dd/59c464d58c06c4f7d87de4ab1f590e430821345a40c5d345d449a636d15f/regex-2025.7.34-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:ea74cf81fe61a7e9d77989050d0089a927ab758c29dac4e8e1b6c06fccf3ebf0", size = 910762, upload-time = "2025-07-31T00:19:43Z" },
1104
+ { url = "https://files.pythonhosted.org/packages/37/a8/b05ccf33ceca0815a1e253693b2c86544932ebcc0049c16b0fbdf18b688b/regex-2025.7.34-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e4636a7f3b65a5f340ed9ddf53585c42e3ff37101d383ed321bfe5660481744b", size = 801892, upload-time = "2025-07-31T00:19:44.645Z" },
1105
+ { url = "https://files.pythonhosted.org/packages/5f/9a/b993cb2e634cc22810afd1652dba0cae156c40d4864285ff486c73cd1996/regex-2025.7.34-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:6cef962d7834437fe8d3da6f9bfc6f93f20f218266dcefec0560ed7765f5fe01", size = 786551, upload-time = "2025-07-31T00:19:46.127Z" },
1106
+ { url = "https://files.pythonhosted.org/packages/2d/79/7849d67910a0de4e26834b5bb816e028e35473f3d7ae563552ea04f58ca2/regex-2025.7.34-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:cbe1698e5b80298dbce8df4d8d1182279fbdaf1044e864cbc9d53c20e4a2be77", size = 856457, upload-time = "2025-07-31T00:19:47.562Z" },
1107
+ { url = "https://files.pythonhosted.org/packages/91/c6/de516bc082524b27e45cb4f54e28bd800c01efb26d15646a65b87b13a91e/regex-2025.7.34-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:32b9f9bcf0f605eb094b08e8da72e44badabb63dde6b83bd530580b488d1c6da", size = 848902, upload-time = "2025-07-31T00:19:49.312Z" },
1108
+ { url = "https://files.pythonhosted.org/packages/7d/22/519ff8ba15f732db099b126f039586bd372da6cd4efb810d5d66a5daeda1/regex-2025.7.34-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:524c868ba527eab4e8744a9287809579f54ae8c62fbf07d62aacd89f6026b282", size = 788038, upload-time = "2025-07-31T00:19:50.794Z" },
1109
+ { url = "https://files.pythonhosted.org/packages/3f/7d/aabb467d8f57d8149895d133c88eb809a1a6a0fe262c1d508eb9dfabb6f9/regex-2025.7.34-cp312-cp312-win32.whl", hash = "sha256:d600e58ee6d036081c89696d2bdd55d507498a7180df2e19945c6642fac59588", size = 264417, upload-time = "2025-07-31T00:19:52.292Z" },
1110
+ { url = "https://files.pythonhosted.org/packages/3b/39/bd922b55a4fc5ad5c13753274e5b536f5b06ec8eb9747675668491c7ab7a/regex-2025.7.34-cp312-cp312-win_amd64.whl", hash = "sha256:9a9ab52a466a9b4b91564437b36417b76033e8778e5af8f36be835d8cb370d62", size = 275387, upload-time = "2025-07-31T00:19:53.593Z" },
1111
+ { url = "https://files.pythonhosted.org/packages/f7/3c/c61d2fdcecb754a40475a3d1ef9a000911d3e3fc75c096acf44b0dfb786a/regex-2025.7.34-cp312-cp312-win_arm64.whl", hash = "sha256:c83aec91af9c6fbf7c743274fd952272403ad9a9db05fe9bfc9df8d12b45f176", size = 268482, upload-time = "2025-07-31T00:19:55.183Z" },
1112
+ { url = "https://files.pythonhosted.org/packages/15/16/b709b2119975035169a25aa8e4940ca177b1a2e25e14f8d996d09130368e/regex-2025.7.34-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:c3c9740a77aeef3f5e3aaab92403946a8d34437db930a0280e7e81ddcada61f5", size = 485334, upload-time = "2025-07-31T00:19:56.58Z" },
1113
+ { url = "https://files.pythonhosted.org/packages/94/a6/c09136046be0595f0331bc58a0e5f89c2d324cf734e0b0ec53cf4b12a636/regex-2025.7.34-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:69ed3bc611540f2ea70a4080f853741ec698be556b1df404599f8724690edbcd", size = 289942, upload-time = "2025-07-31T00:19:57.943Z" },
1114
+ { url = "https://files.pythonhosted.org/packages/36/91/08fc0fd0f40bdfb0e0df4134ee37cfb16e66a1044ac56d36911fd01c69d2/regex-2025.7.34-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:d03c6f9dcd562c56527c42b8530aad93193e0b3254a588be1f2ed378cdfdea1b", size = 285991, upload-time = "2025-07-31T00:19:59.837Z" },
1115
+ { url = "https://files.pythonhosted.org/packages/be/2f/99dc8f6f756606f0c214d14c7b6c17270b6bbe26d5c1f05cde9dbb1c551f/regex-2025.7.34-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6164b1d99dee1dfad33f301f174d8139d4368a9fb50bf0a3603b2eaf579963ad", size = 797415, upload-time = "2025-07-31T00:20:01.668Z" },
1116
+ { url = "https://files.pythonhosted.org/packages/62/cf/2fcdca1110495458ba4e95c52ce73b361cf1cafd8a53b5c31542cde9a15b/regex-2025.7.34-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:1e4f4f62599b8142362f164ce776f19d79bdd21273e86920a7b604a4275b4f59", size = 862487, upload-time = "2025-07-31T00:20:03.142Z" },
1117
+ { url = "https://files.pythonhosted.org/packages/90/38/899105dd27fed394e3fae45607c1983e138273ec167e47882fc401f112b9/regex-2025.7.34-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:72a26dcc6a59c057b292f39d41465d8233a10fd69121fa24f8f43ec6294e5415", size = 910717, upload-time = "2025-07-31T00:20:04.727Z" },
1118
+ { url = "https://files.pythonhosted.org/packages/ee/f6/4716198dbd0bcc9c45625ac4c81a435d1c4d8ad662e8576dac06bab35b17/regex-2025.7.34-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d5273fddf7a3e602695c92716c420c377599ed3c853ea669c1fe26218867002f", size = 801943, upload-time = "2025-07-31T00:20:07.1Z" },
1119
+ { url = "https://files.pythonhosted.org/packages/40/5d/cff8896d27e4e3dd11dd72ac78797c7987eb50fe4debc2c0f2f1682eb06d/regex-2025.7.34-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:c1844be23cd40135b3a5a4dd298e1e0c0cb36757364dd6cdc6025770363e06c1", size = 786664, upload-time = "2025-07-31T00:20:08.818Z" },
1120
+ { url = "https://files.pythonhosted.org/packages/10/29/758bf83cf7b4c34f07ac3423ea03cee3eb3176941641e4ccc05620f6c0b8/regex-2025.7.34-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:dde35e2afbbe2272f8abee3b9fe6772d9b5a07d82607b5788e8508974059925c", size = 856457, upload-time = "2025-07-31T00:20:10.328Z" },
1121
+ { url = "https://files.pythonhosted.org/packages/d7/30/c19d212b619963c5b460bfed0ea69a092c6a43cba52a973d46c27b3e2975/regex-2025.7.34-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:f3f6e8e7af516a7549412ce57613e859c3be27d55341a894aacaa11703a4c31a", size = 849008, upload-time = "2025-07-31T00:20:11.823Z" },
1122
+ { url = "https://files.pythonhosted.org/packages/9e/b8/3c35da3b12c87e3cc00010ef6c3a4ae787cff0bc381aa3d251def219969a/regex-2025.7.34-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:469142fb94a869beb25b5f18ea87646d21def10fbacb0bcb749224f3509476f0", size = 788101, upload-time = "2025-07-31T00:20:13.729Z" },
1123
+ { url = "https://files.pythonhosted.org/packages/47/80/2f46677c0b3c2b723b2c358d19f9346e714113865da0f5f736ca1a883bde/regex-2025.7.34-cp313-cp313-win32.whl", hash = "sha256:da7507d083ee33ccea1310447410c27ca11fb9ef18c95899ca57ff60a7e4d8f1", size = 264401, upload-time = "2025-07-31T00:20:15.233Z" },
1124
+ { url = "https://files.pythonhosted.org/packages/be/fa/917d64dd074682606a003cba33585c28138c77d848ef72fc77cbb1183849/regex-2025.7.34-cp313-cp313-win_amd64.whl", hash = "sha256:9d644de5520441e5f7e2db63aec2748948cc39ed4d7a87fd5db578ea4043d997", size = 275368, upload-time = "2025-07-31T00:20:16.711Z" },
1125
+ { url = "https://files.pythonhosted.org/packages/65/cd/f94383666704170a2154a5df7b16be28f0c27a266bffcd843e58bc84120f/regex-2025.7.34-cp313-cp313-win_arm64.whl", hash = "sha256:7bf1c5503a9f2cbd2f52d7e260acb3131b07b6273c470abb78568174fe6bde3f", size = 268482, upload-time = "2025-07-31T00:20:18.189Z" },
1126
+ { url = "https://files.pythonhosted.org/packages/ac/23/6376f3a23cf2f3c00514b1cdd8c990afb4dfbac3cb4a68b633c6b7e2e307/regex-2025.7.34-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:8283afe7042d8270cecf27cca558873168e771183d4d593e3c5fe5f12402212a", size = 485385, upload-time = "2025-07-31T00:20:19.692Z" },
1127
+ { url = "https://files.pythonhosted.org/packages/73/5b/6d4d3a0b4d312adbfd6d5694c8dddcf1396708976dd87e4d00af439d962b/regex-2025.7.34-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:6c053f9647e3421dd2f5dff8172eb7b4eec129df9d1d2f7133a4386319b47435", size = 289788, upload-time = "2025-07-31T00:20:21.941Z" },
1128
+ { url = "https://files.pythonhosted.org/packages/92/71/5862ac9913746e5054d01cb9fb8125b3d0802c0706ef547cae1e7f4428fa/regex-2025.7.34-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:a16dd56bbcb7d10e62861c3cd000290ddff28ea142ffb5eb3470f183628011ac", size = 286136, upload-time = "2025-07-31T00:20:26.146Z" },
1129
+ { url = "https://files.pythonhosted.org/packages/27/df/5b505dc447eb71278eba10d5ec940769ca89c1af70f0468bfbcb98035dc2/regex-2025.7.34-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:69c593ff5a24c0d5c1112b0df9b09eae42b33c014bdca7022d6523b210b69f72", size = 797753, upload-time = "2025-07-31T00:20:27.919Z" },
1130
+ { url = "https://files.pythonhosted.org/packages/86/38/3e3dc953d13998fa047e9a2414b556201dbd7147034fbac129392363253b/regex-2025.7.34-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:98d0ce170fcde1a03b5df19c5650db22ab58af375aaa6ff07978a85c9f250f0e", size = 863263, upload-time = "2025-07-31T00:20:29.803Z" },
1131
+ { url = "https://files.pythonhosted.org/packages/68/e5/3ff66b29dde12f5b874dda2d9dec7245c2051f2528d8c2a797901497f140/regex-2025.7.34-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:d72765a4bff8c43711d5b0f5b452991a9947853dfa471972169b3cc0ba1d0751", size = 910103, upload-time = "2025-07-31T00:20:31.313Z" },
1132
+ { url = "https://files.pythonhosted.org/packages/9e/fe/14176f2182125977fba3711adea73f472a11f3f9288c1317c59cd16ad5e6/regex-2025.7.34-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4494f8fd95a77eb434039ad8460e64d57baa0434f1395b7da44015bef650d0e4", size = 801709, upload-time = "2025-07-31T00:20:33.323Z" },
1133
+ { url = "https://files.pythonhosted.org/packages/5a/0d/80d4e66ed24f1ba876a9e8e31b709f9fd22d5c266bf5f3ab3c1afe683d7d/regex-2025.7.34-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:4f42b522259c66e918a0121a12429b2abcf696c6f967fa37bdc7b72e61469f98", size = 786726, upload-time = "2025-07-31T00:20:35.252Z" },
1134
+ { url = "https://files.pythonhosted.org/packages/12/75/c3ebb30e04a56c046f5c85179dc173818551037daae2c0c940c7b19152cb/regex-2025.7.34-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:aaef1f056d96a0a5d53ad47d019d5b4c66fe4be2da87016e0d43b7242599ffc7", size = 857306, upload-time = "2025-07-31T00:20:37.12Z" },
1135
+ { url = "https://files.pythonhosted.org/packages/b1/b2/a4dc5d8b14f90924f27f0ac4c4c4f5e195b723be98adecc884f6716614b6/regex-2025.7.34-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:656433e5b7dccc9bc0da6312da8eb897b81f5e560321ec413500e5367fcd5d47", size = 848494, upload-time = "2025-07-31T00:20:38.818Z" },
1136
+ { url = "https://files.pythonhosted.org/packages/0d/21/9ac6e07a4c5e8646a90b56b61f7e9dac11ae0747c857f91d3d2bc7c241d9/regex-2025.7.34-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:e91eb2c62c39705e17b4d42d4b86c4e86c884c0d15d9c5a47d0835f8387add8e", size = 787850, upload-time = "2025-07-31T00:20:40.478Z" },
1137
+ { url = "https://files.pythonhosted.org/packages/be/6c/d51204e28e7bc54f9a03bb799b04730d7e54ff2718862b8d4e09e7110a6a/regex-2025.7.34-cp314-cp314-win32.whl", hash = "sha256:f978ddfb6216028c8f1d6b0f7ef779949498b64117fc35a939022f67f810bdcb", size = 269730, upload-time = "2025-07-31T00:20:42.253Z" },
1138
+ { url = "https://files.pythonhosted.org/packages/74/52/a7e92d02fa1fdef59d113098cb9f02c5d03289a0e9f9e5d4d6acccd10677/regex-2025.7.34-cp314-cp314-win_amd64.whl", hash = "sha256:4b7dc33b9b48fb37ead12ffc7bdb846ac72f99a80373c4da48f64b373a7abeae", size = 278640, upload-time = "2025-07-31T00:20:44.42Z" },
1139
+ { url = "https://files.pythonhosted.org/packages/d1/78/a815529b559b1771080faa90c3ab401730661f99d495ab0071649f139ebd/regex-2025.7.34-cp314-cp314-win_arm64.whl", hash = "sha256:4b8c4d39f451e64809912c82392933d80fe2e4a87eeef8859fcc5380d0173c64", size = 271757, upload-time = "2025-07-31T00:20:46.355Z" },
1140
+ ]
1141
+
1142
  [[package]]
1143
  name = "requests"
1144
  version = "2.32.4"
 
1204
  { url = "https://files.pythonhosted.org/packages/4d/c0/1108ad9f01567f66b3154063605b350b69c3c9366732e09e45f9fd0d1deb/safehttpx-0.1.6-py3-none-any.whl", hash = "sha256:407cff0b410b071623087c63dd2080c3b44dc076888d8c5823c00d1e58cb381c", size = 8692, upload-time = "2024-12-02T18:44:08.555Z" },
1205
  ]
1206
 
1207
+ [[package]]
1208
+ name = "safetensors"
1209
+ version = "0.6.1"
1210
+ source = { registry = "https://pypi.org/simple" }
1211
+ sdist = { url = "https://files.pythonhosted.org/packages/6c/d2/94fe37355a1d4ff86b0f43b9a018515d5d29bf7ad6d01318a80f5db2fd6a/safetensors-0.6.1.tar.gz", hash = "sha256:a766ba6e19b198eff09be05f24cd89eda1670ed404ae828e2aa3fc09816ba8d8", size = 197968, upload-time = "2025-08-06T09:39:38.376Z" }
1212
+ wheels = [
1213
+ { url = "https://files.pythonhosted.org/packages/6b/c0/40263a2103511917f9a92b4e114ecaff68586df07f12d1d877312f1261f3/safetensors-0.6.1-cp38-abi3-macosx_10_12_x86_64.whl", hash = "sha256:81ed1b69d6f8acd7e759a71197ce3a69da4b7e9faa9dbb005eb06a83b1a4e52d", size = 455232, upload-time = "2025-08-06T09:39:32.037Z" },
1214
+ { url = "https://files.pythonhosted.org/packages/86/bf/432cb4bb1c336d338dd9b29f78622b1441ee06e5868bf1de2ca2bec74c08/safetensors-0.6.1-cp38-abi3-macosx_11_0_arm64.whl", hash = "sha256:01b51af8cb7a3870203f2735e3c7c24d1a65fb2846e75613c8cf9d284271eccc", size = 432150, upload-time = "2025-08-06T09:39:31.008Z" },
1215
+ { url = "https://files.pythonhosted.org/packages/05/d7/820c99032a53d57279ae199df7d114a8c9e2bbce4fa69bc0de53743495f0/safetensors-0.6.1-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:64a733886d79e726899b9d9643813e48a2eec49f3ef0fdb8cd4b8152046101c3", size = 471634, upload-time = "2025-08-06T09:39:22.17Z" },
1216
+ { url = "https://files.pythonhosted.org/packages/ea/8b/bcd960087eded7690f118ceeda294912f92a3b508a1d9a504f9c2e02041b/safetensors-0.6.1-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f233dc3b12fb641b36724844754b6bb41349615a0e258087560968d6da92add5", size = 487855, upload-time = "2025-08-06T09:39:24.142Z" },
1217
+ { url = "https://files.pythonhosted.org/packages/41/64/b44eac4ad87c4e1c0cf5ba5e204c032b1b1eac8ce2b8f65f87791e647bd6/safetensors-0.6.1-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6f16289e2af54affd591dd78ed12b5465e4dc5823f818beaeddd49a010cf3ba7", size = 607240, upload-time = "2025-08-06T09:39:25.463Z" },
+ { url = "https://files.pythonhosted.org/packages/52/75/0347fa0c080af8bd3341af26a30b85939f6362d4f5240add1a0c9d793354/safetensors-0.6.1-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:1b62eab84e2c69918b598272504c5d2ebfe64da6c16fdf8682054eec9572534d", size = 519864, upload-time = "2025-08-06T09:39:26.872Z" },
+ { url = "https://files.pythonhosted.org/packages/ea/f3/83843d1fe9164f44a267373c55cba706530b209b58415f807b40edddcd3e/safetensors-0.6.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d498363746555dccffc02a47dfe1dee70f7784f3f37f1d66b408366c5d3a989e", size = 485926, upload-time = "2025-08-06T09:39:29.109Z" },
+ { url = "https://files.pythonhosted.org/packages/b8/26/f6b0cb5210bab0e343214fdba7c2df80a69b019e62e760ddc61b18bec383/safetensors-0.6.1-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:eed2079dca3ca948d7b0d7120396e776bbc6680637cf199d393e157fde25c937", size = 518999, upload-time = "2025-08-06T09:39:28.054Z" },
+ { url = "https://files.pythonhosted.org/packages/90/b7/8910b165c97d3bd6d445c6ca8b704ec23d0fa33849ce9a51dc783827a302/safetensors-0.6.1-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:294040ff20ebe079a2b4976cfa9a5be0202f56ca4f7f190b4e52009e8c026ceb", size = 650669, upload-time = "2025-08-06T09:39:32.997Z" },
+ { url = "https://files.pythonhosted.org/packages/00/bc/2eeb025381d0834ae038aae2d383dfa830c2e0068e2e4e512ea99b135a4b/safetensors-0.6.1-cp38-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:75693208b492a026b926edeebbae888cc644433bee4993573ead2dc44810b519", size = 750019, upload-time = "2025-08-06T09:39:34.397Z" },
+ { url = "https://files.pythonhosted.org/packages/f9/38/5dda9a8e056eb1f17ed3a7846698fd94623a1648013cdf522538845755da/safetensors-0.6.1-cp38-abi3-musllinux_1_2_i686.whl", hash = "sha256:a8687b71ac67a0b3f8ce87df9e8024edf087e94c34ef46eaaad694dce8d2f83f", size = 689888, upload-time = "2025-08-06T09:39:35.584Z" },
+ { url = "https://files.pythonhosted.org/packages/dd/60/15ee3961996d951002378d041bd82863a5c70738a71375b42d6dd5d2a6d3/safetensors-0.6.1-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:5dd969a01c738104f707fa0e306b757f5beb3ebdcd682fe0724170a0bf1c21fb", size = 655539, upload-time = "2025-08-06T09:39:37.093Z" },
+ { url = "https://files.pythonhosted.org/packages/91/d6/01172a9c77c566800286d379bfc341d75370eae2118dfd339edfd0394c4a/safetensors-0.6.1-cp38-abi3-win32.whl", hash = "sha256:7c3d8d34d01673d1a917445c9437ee73a9d48bc6af10352b84bbd46c5da93ca5", size = 308594, upload-time = "2025-08-06T09:39:40.916Z" },
+ { url = "https://files.pythonhosted.org/packages/6c/5d/195dc1917d7ae93dd990d9b2f8b9c88e451bcc78e0b63ee107beebc1e4be/safetensors-0.6.1-cp38-abi3-win_amd64.whl", hash = "sha256:4720957052d57c5ac48912c3f6e07e9a334d9632758c9b0c054afba477fcbe2d", size = 320282, upload-time = "2025-08-06T09:39:39.54Z" },
+ ]
+
  [[package]]
  name = "semantic-version"
  version = "2.10.0"
 
  { url = "https://files.pythonhosted.org/packages/6a/23/8146aad7d88f4fcb3a6218f41a60f6c2d4e3a72de72da1825dc7c8f7877c/semantic_version-2.10.0-py2.py3-none-any.whl", hash = "sha256:de78a3b8e0feda74cabc54aab2da702113e33ac9d9eb9d2389bcf1f58b7d9177", size = 15552, upload-time = "2022-05-26T13:35:21.206Z" },
  ]
 
+ [[package]]
+ name = "setuptools"
+ version = "80.9.0"
+ source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/18/5d/3bf57dcd21979b887f014ea83c24ae194cfcd12b9e0fda66b957c69d1fca/setuptools-80.9.0.tar.gz", hash = "sha256:f36b47402ecde768dbfafc46e8e4207b4360c654f1f3bb84475f0a28628fb19c", size = 1319958, upload-time = "2025-05-27T00:56:51.443Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/a3/dc/17031897dae0efacfea57dfd3a82fdd2a2aeb58e0ff71b77b87e44edc772/setuptools-80.9.0-py3-none-any.whl", hash = "sha256:062d34222ad13e0cc312a4c02d73f059e86a4acbfbdea8f8f76b28c99f306922", size = 1201486, upload-time = "2025-05-27T00:56:49.664Z" },
+ ]
+
  [[package]]
  name = "shellingham"
  version = "1.5.4"
 
  { url = "https://files.pythonhosted.org/packages/f7/1f/b876b1f83aef204198a42dc101613fefccb32258e5428b5f9259677864b4/starlette-0.47.2-py3-none-any.whl", hash = "sha256:c5847e96134e5c5371ee9fac6fdf1a67336d5815e09eb2a01fdb57a351ef915b", size = 72984, upload-time = "2025-07-20T17:31:56.738Z" },
  ]
 
+ [[package]]
+ name = "sympy"
+ version = "1.14.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "mpmath" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/83/d3/803453b36afefb7c2bb238361cd4ae6125a569b4db67cd9e79846ba2d68c/sympy-1.14.0.tar.gz", hash = "sha256:d3d3fe8df1e5a0b42f0e7bdf50541697dbe7d23746e894990c030e2b05e72517", size = 7793921, upload-time = "2025-04-27T18:05:01.611Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/a2/09/77d55d46fd61b4a135c444fc97158ef34a095e5681d0a6c10b75bf356191/sympy-1.14.0-py3-none-any.whl", hash = "sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5", size = 6299353, upload-time = "2025-04-27T18:04:59.103Z" },
+ ]
+
+ [[package]]
+ name = "tokenizers"
+ version = "0.21.4"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "huggingface-hub" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/c2/2f/402986d0823f8d7ca139d969af2917fefaa9b947d1fb32f6168c509f2492/tokenizers-0.21.4.tar.gz", hash = "sha256:fa23f85fbc9a02ec5c6978da172cdcbac23498c3ca9f3645c5c68740ac007880", size = 351253, upload-time = "2025-07-28T15:48:54.325Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/98/c6/fdb6f72bf6454f52eb4a2510be7fb0f614e541a2554d6210e370d85efff4/tokenizers-0.21.4-cp39-abi3-macosx_10_12_x86_64.whl", hash = "sha256:2ccc10a7c3bcefe0f242867dc914fc1226ee44321eb618cfe3019b5df3400133", size = 2863987, upload-time = "2025-07-28T15:48:44.877Z" },
+ { url = "https://files.pythonhosted.org/packages/8d/a6/28975479e35ddc751dc1ddc97b9b69bf7fcf074db31548aab37f8116674c/tokenizers-0.21.4-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:5e2f601a8e0cd5be5cc7506b20a79112370b9b3e9cb5f13f68ab11acd6ca7d60", size = 2732457, upload-time = "2025-07-28T15:48:43.265Z" },
+ { url = "https://files.pythonhosted.org/packages/aa/8f/24f39d7b5c726b7b0be95dca04f344df278a3fe3a4deb15a975d194cbb32/tokenizers-0.21.4-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:39b376f5a1aee67b4d29032ee85511bbd1b99007ec735f7f35c8a2eb104eade5", size = 3012624, upload-time = "2025-07-28T13:22:43.895Z" },
+ { url = "https://files.pythonhosted.org/packages/58/47/26358925717687a58cb74d7a508de96649544fad5778f0cd9827398dc499/tokenizers-0.21.4-cp39-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2107ad649e2cda4488d41dfd031469e9da3fcbfd6183e74e4958fa729ffbf9c6", size = 2939681, upload-time = "2025-07-28T13:22:47.499Z" },
+ { url = "https://files.pythonhosted.org/packages/99/6f/cc300fea5db2ab5ddc2c8aea5757a27b89c84469899710c3aeddc1d39801/tokenizers-0.21.4-cp39-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3c73012da95afafdf235ba80047699df4384fdc481527448a078ffd00e45a7d9", size = 3247445, upload-time = "2025-07-28T15:48:39.711Z" },
+ { url = "https://files.pythonhosted.org/packages/be/bf/98cb4b9c3c4afd8be89cfa6423704337dc20b73eb4180397a6e0d456c334/tokenizers-0.21.4-cp39-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:f23186c40395fc390d27f519679a58023f368a0aad234af145e0f39ad1212732", size = 3428014, upload-time = "2025-07-28T13:22:49.569Z" },
+ { url = "https://files.pythonhosted.org/packages/75/c7/96c1cc780e6ca7f01a57c13235dd05b7bc1c0f3588512ebe9d1331b5f5ae/tokenizers-0.21.4-cp39-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:cc88bb34e23a54cc42713d6d98af5f1bf79c07653d24fe984d2d695ba2c922a2", size = 3193197, upload-time = "2025-07-28T13:22:51.471Z" },
+ { url = "https://files.pythonhosted.org/packages/f2/90/273b6c7ec78af547694eddeea9e05de771278bd20476525ab930cecaf7d8/tokenizers-0.21.4-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:51b7eabb104f46c1c50b486520555715457ae833d5aee9ff6ae853d1130506ff", size = 3115426, upload-time = "2025-07-28T15:48:41.439Z" },
+ { url = "https://files.pythonhosted.org/packages/91/43/c640d5a07e95f1cf9d2c92501f20a25f179ac53a4f71e1489a3dcfcc67ee/tokenizers-0.21.4-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:714b05b2e1af1288bd1bc56ce496c4cebb64a20d158ee802887757791191e6e2", size = 9089127, upload-time = "2025-07-28T15:48:46.472Z" },
+ { url = "https://files.pythonhosted.org/packages/44/a1/dd23edd6271d4dca788e5200a807b49ec3e6987815cd9d0a07ad9c96c7c2/tokenizers-0.21.4-cp39-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:1340ff877ceedfa937544b7d79f5b7becf33a4cfb58f89b3b49927004ef66f78", size = 9055243, upload-time = "2025-07-28T15:48:48.539Z" },
+ { url = "https://files.pythonhosted.org/packages/21/2b/b410d6e9021c4b7ddb57248304dc817c4d4970b73b6ee343674914701197/tokenizers-0.21.4-cp39-abi3-musllinux_1_2_i686.whl", hash = "sha256:3c1f4317576e465ac9ef0d165b247825a2a4078bcd01cba6b54b867bdf9fdd8b", size = 9298237, upload-time = "2025-07-28T15:48:50.443Z" },
+ { url = "https://files.pythonhosted.org/packages/b7/0a/42348c995c67e2e6e5c89ffb9cfd68507cbaeb84ff39c49ee6e0a6dd0fd2/tokenizers-0.21.4-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:c212aa4e45ec0bb5274b16b6f31dd3f1c41944025c2358faaa5782c754e84c24", size = 9461980, upload-time = "2025-07-28T15:48:52.325Z" },
+ { url = "https://files.pythonhosted.org/packages/3d/d3/dacccd834404cd71b5c334882f3ba40331ad2120e69ded32cf5fda9a7436/tokenizers-0.21.4-cp39-abi3-win32.whl", hash = "sha256:6c42a930bc5f4c47f4ea775c91de47d27910881902b0f20e4990ebe045a415d0", size = 2329871, upload-time = "2025-07-28T15:48:56.841Z" },
+ { url = "https://files.pythonhosted.org/packages/41/f2/fd673d979185f5dcbac4be7d09461cbb99751554ffb6718d0013af8604cb/tokenizers-0.21.4-cp39-abi3-win_amd64.whl", hash = "sha256:475d807a5c3eb72c59ad9b5fcdb254f6e17f53dfcbb9903233b0dfa9c943b597", size = 2507568, upload-time = "2025-07-28T15:48:55.456Z" },
+ ]
+
  [[package]]
  name = "tomlkit"
  version = "0.13.3"
 
  { url = "https://files.pythonhosted.org/packages/bd/75/8539d011f6be8e29f339c42e633aae3cb73bffa95dd0f9adec09b9c58e85/tomlkit-0.13.3-py3-none-any.whl", hash = "sha256:c89c649d79ee40629a9fda55f8ace8c6a1b42deb912b2a8fd8d942ddadb606b0", size = 38901, upload-time = "2025-06-05T07:13:43.546Z" },
  ]
 
+ [[package]]
+ name = "torch"
+ version = "2.8.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "filelock" },
+ { name = "fsspec" },
+ { name = "jinja2" },
+ { name = "networkx" },
+ { name = "nvidia-cublas-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cuda-cupti-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cuda-nvrtc-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cuda-runtime-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cudnn-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cufft-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cufile-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-curand-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cusolver-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cusparse-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-cusparselt-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-nccl-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-nvjitlink-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "nvidia-nvtx-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "setuptools" },
+ { name = "sympy" },
+ { name = "triton", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+ { name = "typing-extensions" },
+ ]
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/49/0c/2fd4df0d83a495bb5e54dca4474c4ec5f9c62db185421563deeb5dabf609/torch-2.8.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:e2fab4153768d433f8ed9279c8133a114a034a61e77a3a104dcdf54388838705", size = 101906089, upload-time = "2025-08-06T14:53:52.631Z" },
+ { url = "https://files.pythonhosted.org/packages/99/a8/6acf48d48838fb8fe480597d98a0668c2beb02ee4755cc136de92a0a956f/torch-2.8.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:b2aca0939fb7e4d842561febbd4ffda67a8e958ff725c1c27e244e85e982173c", size = 887913624, upload-time = "2025-08-06T14:56:44.33Z" },
+ { url = "https://files.pythonhosted.org/packages/af/8a/5c87f08e3abd825c7dfecef5a0f1d9aa5df5dd0e3fd1fa2f490a8e512402/torch-2.8.0-cp312-cp312-win_amd64.whl", hash = "sha256:2f4ac52f0130275d7517b03a33d2493bab3693c83dcfadf4f81688ea82147d2e", size = 241326087, upload-time = "2025-08-06T14:53:46.503Z" },
+ { url = "https://files.pythonhosted.org/packages/be/66/5c9a321b325aaecb92d4d1855421e3a055abd77903b7dab6575ca07796db/torch-2.8.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:619c2869db3ada2c0105487ba21b5008defcc472d23f8b80ed91ac4a380283b0", size = 73630478, upload-time = "2025-08-06T14:53:57.144Z" },
+ { url = "https://files.pythonhosted.org/packages/10/4e/469ced5a0603245d6a19a556e9053300033f9c5baccf43a3d25ba73e189e/torch-2.8.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:2b2f96814e0345f5a5aed9bf9734efa913678ed19caf6dc2cddb7930672d6128", size = 101936856, upload-time = "2025-08-06T14:54:01.526Z" },
+ { url = "https://files.pythonhosted.org/packages/16/82/3948e54c01b2109238357c6f86242e6ecbf0c63a1af46906772902f82057/torch-2.8.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:65616ca8ec6f43245e1f5f296603e33923f4c30f93d65e103d9e50c25b35150b", size = 887922844, upload-time = "2025-08-06T14:55:50.78Z" },
+ { url = "https://files.pythonhosted.org/packages/e3/54/941ea0a860f2717d86a811adf0c2cd01b3983bdd460d0803053c4e0b8649/torch-2.8.0-cp313-cp313-win_amd64.whl", hash = "sha256:659df54119ae03e83a800addc125856effda88b016dfc54d9f65215c3975be16", size = 241330968, upload-time = "2025-08-06T14:54:45.293Z" },
+ { url = "https://files.pythonhosted.org/packages/de/69/8b7b13bba430f5e21d77708b616f767683629fc4f8037564a177d20f90ed/torch-2.8.0-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:1a62a1ec4b0498930e2543535cf70b1bef8c777713de7ceb84cd79115f553767", size = 73915128, upload-time = "2025-08-06T14:54:34.769Z" },
+ { url = "https://files.pythonhosted.org/packages/15/0e/8a800e093b7f7430dbaefa80075aee9158ec22e4c4fc3c1a66e4fb96cb4f/torch-2.8.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:83c13411a26fac3d101fe8035a6b0476ae606deb8688e904e796a3534c197def", size = 102020139, upload-time = "2025-08-06T14:54:39.047Z" },
+ { url = "https://files.pythonhosted.org/packages/4a/15/5e488ca0bc6162c86a33b58642bc577c84ded17c7b72d97e49b5833e2d73/torch-2.8.0-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:8f0a9d617a66509ded240add3754e462430a6c1fc5589f86c17b433dd808f97a", size = 887990692, upload-time = "2025-08-06T14:56:18.286Z" },
+ { url = "https://files.pythonhosted.org/packages/b4/a8/6a04e4b54472fc5dba7ca2341ab219e529f3c07b6941059fbf18dccac31f/torch-2.8.0-cp313-cp313t-win_amd64.whl", hash = "sha256:a7242b86f42be98ac674b88a4988643b9bc6145437ec8f048fea23f72feb5eca", size = 241603453, upload-time = "2025-08-06T14:55:22.945Z" },
+ { url = "https://files.pythonhosted.org/packages/04/6e/650bb7f28f771af0cb791b02348db8b7f5f64f40f6829ee82aa6ce99aabe/torch-2.8.0-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:7b677e17f5a3e69fdef7eb3b9da72622f8d322692930297e4ccb52fefc6c8211", size = 73632395, upload-time = "2025-08-06T14:55:28.645Z" },
+ ]
+
+ [[package]]
+ name = "torchvision"
+ version = "0.23.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "numpy" },
+ { name = "pillow" },
+ { name = "torch" },
+ ]
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/df/1d/0ea0b34bde92a86d42620f29baa6dcbb5c2fc85990316df5cb8f7abb8ea2/torchvision-0.23.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:e0e2c04a91403e8dd3af9756c6a024a1d9c0ed9c0d592a8314ded8f4fe30d440", size = 1856885, upload-time = "2025-08-06T14:58:06.503Z" },
+ { url = "https://files.pythonhosted.org/packages/e2/00/2f6454decc0cd67158c7890364e446aad4b91797087a57a78e72e1a8f8bc/torchvision-0.23.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:6dd7c4d329a0e03157803031bc856220c6155ef08c26d4f5bbac938acecf0948", size = 2396614, upload-time = "2025-08-06T14:58:03.116Z" },
+ { url = "https://files.pythonhosted.org/packages/e4/b5/3e580dcbc16f39a324f3dd71b90edbf02a42548ad44d2b4893cc92b1194b/torchvision-0.23.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:4e7d31c43bc7cbecbb1a5652ac0106b436aa66e26437585fc2c4b2cf04d6014c", size = 8627108, upload-time = "2025-08-06T14:58:12.956Z" },
+ { url = "https://files.pythonhosted.org/packages/82/c1/c2fe6d61e110a8d0de2f94276899a2324a8f1e6aee559eb6b4629ab27466/torchvision-0.23.0-cp312-cp312-win_amd64.whl", hash = "sha256:a2e45272abe7b8bf0d06c405e78521b5757be1bd0ed7e5cd78120f7fdd4cbf35", size = 1600723, upload-time = "2025-08-06T14:57:57.986Z" },
+ { url = "https://files.pythonhosted.org/packages/91/37/45a5b9407a7900f71d61b2b2f62db4b7c632debca397f205fdcacb502780/torchvision-0.23.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:1c37e325e09a184b730c3ef51424f383ec5745378dc0eca244520aca29722600", size = 1856886, upload-time = "2025-08-06T14:58:05.491Z" },
+ { url = "https://files.pythonhosted.org/packages/ac/da/a06c60fc84fc849377cf035d3b3e9a1c896d52dbad493b963c0f1cdd74d0/torchvision-0.23.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:2f7fd6c15f3697e80627b77934f77705f3bc0e98278b989b2655de01f6903e1d", size = 2353112, upload-time = "2025-08-06T14:58:26.265Z" },
+ { url = "https://files.pythonhosted.org/packages/a0/27/5ce65ba5c9d3b7d2ccdd79892ab86a2f87ac2ca6638f04bb0280321f1a9c/torchvision-0.23.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:a76fafe113b2977be3a21bf78f115438c1f88631d7a87203acb3dd6ae55889e6", size = 8627658, upload-time = "2025-08-06T14:58:15.999Z" },
+ { url = "https://files.pythonhosted.org/packages/1f/e4/028a27b60aa578a2fa99d9d7334ff1871bb17008693ea055a2fdee96da0d/torchvision-0.23.0-cp313-cp313-win_amd64.whl", hash = "sha256:07d069cb29691ff566e3b7f11f20d91044f079e1dbdc9d72e0655899a9b06938", size = 1600749, upload-time = "2025-08-06T14:58:10.719Z" },
+ { url = "https://files.pythonhosted.org/packages/05/35/72f91ad9ac7c19a849dedf083d347dc1123f0adeb401f53974f84f1d04c8/torchvision-0.23.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:2df618e1143805a7673aaf82cb5720dd9112d4e771983156aaf2ffff692eebf9", size = 2047192, upload-time = "2025-08-06T14:58:11.813Z" },
+ { url = "https://files.pythonhosted.org/packages/1d/9d/406cea60a9eb9882145bcd62a184ee61e823e8e1d550cdc3c3ea866a9445/torchvision-0.23.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:2a3299d2b1d5a7aed2d3b6ffb69c672ca8830671967eb1cee1497bacd82fe47b", size = 2359295, upload-time = "2025-08-06T14:58:17.469Z" },
+ { url = "https://files.pythonhosted.org/packages/2b/f4/34662f71a70fa1e59de99772142f22257ca750de05ccb400b8d2e3809c1d/torchvision-0.23.0-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:76bc4c0b63d5114aa81281390f8472a12a6a35ce9906e67ea6044e5af4cab60c", size = 8800474, upload-time = "2025-08-06T14:58:22.53Z" },
+ { url = "https://files.pythonhosted.org/packages/6e/f5/b5a2d841a8d228b5dbda6d524704408e19e7ca6b7bb0f24490e081da1fa1/torchvision-0.23.0-cp313-cp313t-win_amd64.whl", hash = "sha256:b9e2dabf0da9c8aa9ea241afb63a8f3e98489e706b22ac3f30416a1be377153b", size = 1527667, upload-time = "2025-08-06T14:58:14.446Z" },
+ ]
+
  [[package]]
  name = "tqdm"
  version = "4.67.1"
 
  { url = "https://files.pythonhosted.org/packages/d0/30/dc54f88dd4a2b5dc8a0279bdd7270e735851848b762aeb1c1184ed1f6b14/tqdm-4.67.1-py3-none-any.whl", hash = "sha256:26445eca388f82e72884e0d580d5464cd801a3ea01e63e5601bdff9ba6a48de2", size = 78540, upload-time = "2024-11-24T20:12:19.698Z" },
  ]
 
+ [[package]]
+ name = "transformers"
+ version = "4.55.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "filelock" },
+ { name = "huggingface-hub" },
+ { name = "numpy" },
+ { name = "packaging" },
+ { name = "pyyaml" },
+ { name = "regex" },
+ { name = "requests" },
+ { name = "safetensors" },
+ { name = "tokenizers" },
+ { name = "tqdm" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/27/5d/f7dc746eef83336a6b34197311fe0c1da0d1192f637c726c6a5cf0d83502/transformers-4.55.0.tar.gz", hash = "sha256:15aa138a05d07a15b30d191ea2c45e23061ebf9fcc928a1318e03fe2234f3ae1", size = 9569089, upload-time = "2025-08-05T16:13:48.997Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/1c/93/bcb22fb52ed65084c0199270832aa4cdd4b41296d896f3e7ade188bccb68/transformers-4.55.0-py3-none-any.whl", hash = "sha256:29d9b8800e32a4a831bb16efb5f762f6a9742fef9fce5d693ed018d19b106490", size = 11267905, upload-time = "2025-08-05T16:13:34.814Z" },
+ ]
+
+ [[package]]
+ name = "triton"
+ version = "3.4.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "setuptools" },
+ ]
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/d0/66/b1eb52839f563623d185f0927eb3530ee4d5ffe9d377cdaf5346b306689e/triton-3.4.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:31c1d84a5c0ec2c0f8e8a072d7fd150cab84a9c239eaddc6706c081bfae4eb04", size = 155560068, upload-time = "2025-07-30T19:58:37.081Z" },
+ { url = "https://files.pythonhosted.org/packages/30/7b/0a685684ed5322d2af0bddefed7906674f67974aa88b0fae6e82e3b766f6/triton-3.4.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:00be2964616f4c619193cb0d1b29a99bd4b001d7dc333816073f92cf2a8ccdeb", size = 155569223, upload-time = "2025-07-30T19:58:44.017Z" },
+ { url = "https://files.pythonhosted.org/packages/20/63/8cb444ad5cdb25d999b7d647abac25af0ee37d292afc009940c05b82dda0/triton-3.4.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7936b18a3499ed62059414d7df563e6c163c5e16c3773678a3ee3d417865035d", size = 155659780, upload-time = "2025-07-30T19:58:51.171Z" },
+ ]
+
  [[package]]
  name = "typer"
  version = "0.16.0"
 
  version = "0.1.0"
  source = { editable = "." }
  dependencies = [
+ { name = "accelerate" },
+ { name = "einops" },
  { name = "gradio" },
+ { name = "huggingface-hub", extra = ["cli"] },
+ { name = "pillow" },
+ { name = "pymupdf" },
+ { name = "qwen-vl-utils" },
+ { name = "requests" },
+ { name = "safetensors" },
+ { name = "torch" },
+ { name = "torchvision" },
+ { name = "transformers" },
  ]
 
  [package.metadata]
+ requires-dist = [
+ { name = "accelerate", specifier = ">=0.33.0" },
+ { name = "einops", specifier = ">=0.7.0" },
+ { name = "gradio", specifier = ">=5.41.1" },
+ { name = "huggingface-hub", extras = ["cli"], specifier = ">=0.34.3" },
+ { name = "pillow", specifier = ">=10.3.0" },
+ { name = "pymupdf", specifier = ">=1.26.3" },
+ { name = "qwen-vl-utils", specifier = ">=0.0.11" },
+ { name = "requests", specifier = ">=2.32.0" },
+ { name = "safetensors", specifier = ">=0.4.5" },
+ { name = "torch", specifier = ">=2.8.0" },
+ { name = "torchvision", specifier = ">=0.23.0" },
+ { name = "transformers", specifier = ">=4.55.0" },
+ ]
+
+ [[package]]
+ name = "wcwidth"
+ version = "0.2.13"
+ source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/6c/63/53559446a878410fc5a5974feb13d31d78d752eb18aeba59c7fef1af7598/wcwidth-0.2.13.tar.gz", hash = "sha256:72ea0c06399eb286d978fdedb6923a9eb47e1c486ce63e9b4e64fc18303972b5", size = 101301, upload-time = "2024-01-06T02:10:57.829Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/fd/84/fd2ba7aafacbad3c4201d395674fc6348826569da3c0937e75505ead3528/wcwidth-0.2.13-py2.py3-none-any.whl", hash = "sha256:3da69048e4540d84af32131829ff948f1e022c1c6bdb8d6102117aac784f6859", size = 34166, upload-time = "2024-01-06T02:10:55.763Z" },
+ ]
 
  [[package]]
  name = "websockets"