davideuler committed
Commit 9d793d0 · 1 Parent(s): 4a4fa23

WebUI, PDF Translator for Human web interface

.gitignore CHANGED
@@ -22,6 +22,7 @@ sdist/
  var/
  .idea/
  .idea
+ .cached
  wheels/
  *.egg-info/
  .installed.cfg
PDF-Translator-for-Human.jpg ADDED
README.md ADDED
@@ -0,0 +1,81 @@
+ # PDF Translator for Human: A PDF Reader/Translator with Local LLM/ChatGPT or Google
+
+ ## Use Case
+
+ There are tons of AI-powered PDF readers/translators. However, none of them meets my needs. I want one that can run entirely locally with local LLMs.
+
+ I want to read the original PDF and the translated pages side by side.
+ I also don't want to translate a 1000-page PDF file all at once.
+
+ ## Features in PDF Translator for Human
+ You can read both the original PDF file and the translated content side by side.
+
+ The local/remote translation API is invoked on a per-page basis as needed, triggered by page turns during reading.
+
+
+ ## Supported translators and LLMs:
+ * Google Translator (no API key needed, it is totally free)
+ * ChatGPT
+ * DeepSeek (use the OpenAI-compatible endpoint at https://api.deepseek.com/v1)
+ * Qwen (use the OpenAI-compatible endpoint)
+ * Locally deployed LLMs served by ollama, llama.cpp, mlx_lm
+ * Other OpenAI-compatible LLMs such as GLM, Moonshot, etc.
+
+ ## Start the Web Application for PDF Translator for Human
+
+
+ ``` bash
+ ./run_translator_web.sh
+
+ # or just start the streamlit application if you have already run the previous script:
+ streamlit run pdf_translator_web.py
+
+ ```
+
+ ## Notes on deploying and starting a local LLM
+
+ ### Option 1. Start a local LLM with mlx_lm (works on Apple Silicon Macs)
+
+ Here I download aya-expanse-8b 4bit as an example.
+
+ ``` bash
+ # download mlx models from huggingface to local folder
+ git clone https://huggingface.co/mlx-community/aya-expanse-8b-4bit
+
+ # install mlx_lm
+ pip install mlx_lm
+
+ # start the server
+ mlx_lm.server --model ./aya-expanse-8b-4bit --port 8080
+
+ ```
+
+ ### Option 2. Start a local LLM with llama.cpp (works on CPU/GPU/Mac machines)
+
+ llama.cpp works on CPU machines and on Intel/Apple Silicon Macs. You need 48 GB of memory for aya-expanse-32b-q4_k_m.gguf.
+
+ ``` bash
+ # download gguf models from huggingface to local folder
+ wget https://hf-mirror.co/bartowski/aya-expanse-32b-GGUF/resolve/main/aya-expanse-32b-Q4_K_M.gguf -O aya-expanse-32b-Q4_K_M.gguf
+
+ # download and build llama.cpp
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp
+ mkdir -p build && cmake -B build
+ cmake --build build --config Release -j 12
+
+ # start llama.cpp server (adjust the model path to where you saved the file)
+ ./build/bin/llama-server -m ~/models/aya-expanse-32b-Q4_K_M.gguf --port 8080
+
+ ```
+
+ ## Snapshot
+
+ ![PDF Translator for Human](PDF-Translator-for-Human.jpg)
+
+
+ ## Acknowledgement
+
+ https://github.com/nidhaloff/deep-translator
+
+ The project is based on the awesome deep-translator.
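
As a rough illustration of the per-page, on-demand behaviour the README describes (translation triggered by page turns), here is a minimal sketch in plain Python. `OnDemandTranslator` and `fake_translate` are hypothetical names that do not appear in the commit; the real app does this through Streamlit session state and a PDF cache on disk, so treat this only as a model of the idea.

```python
from typing import Callable, Dict, List


class OnDemandTranslator:
    """Translate pages only when the reader reaches them (hypothetical helper)."""

    def __init__(self, pages: List[str], translate: Callable[[str], str],
                 pages_per_load: int = 2):
        self.pages = pages
        self.translate = translate
        self.pages_per_load = pages_per_load
        self.current_page = 0
        self._done: Dict[int, str] = {}  # page index -> translated text

    def visible(self) -> List[str]:
        """Return translations for the current window, translating on first view."""
        end = min(self.current_page + self.pages_per_load, len(self.pages))
        for n in range(self.current_page, end):
            if n not in self._done:
                self._done[n] = self.translate(self.pages[n])
        return [self._done[n] for n in range(self.current_page, end)]

    def next_pages(self) -> None:
        # Only advance if there is at least one page beyond the current window.
        if self.current_page + self.pages_per_load < len(self.pages):
            self.current_page += self.pages_per_load

    def previous_pages(self) -> None:
        self.current_page = max(0, self.current_page - self.pages_per_load)


calls: List[str] = []


def fake_translate(text: str) -> str:
    """Stand-in for a real translator; records each call (assumption for the demo)."""
    calls.append(text)
    return text.upper()


reader = OnDemandTranslator(["one", "two", "three", "four", "five"], fake_translate)
print(reader.visible())   # ['ONE', 'TWO']
reader.next_pages()
print(reader.visible())   # ['THREE', 'FOUR']
reader.previous_pages()
reader.visible()          # served from the in-memory dict, no new translation calls
print(len(calls))         # 4
```

Pages that are never opened are never sent to the translator, which is the point of avoiding a full translation of a very long PDF up front.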
pdf_translator_web.py ADDED
@@ -0,0 +1,291 @@
+ import os
+ import json
+ import hashlib
+ from pathlib import Path
+ import streamlit as st
+ import pymupdf
+ from deep_translator import (
+     GoogleTranslator,
+     ChatGptTranslator,
+ )
+
+ # Constants
+ DEFAULT_PAGES_PER_LOAD = 2
+ DEFAULT_MODEL = "default_model"
+ DEFAULT_API_BASE = "http://localhost:8080/v1"
+
+ # Supported translators
+ TRANSLATORS = {
+     'google': GoogleTranslator,
+     'chatgpt': ChatGptTranslator,
+ }
+
+ # Color options
+ COLOR_MAP = {
+     "darkred": (0.8, 0, 0),
+     "black": (0, 0, 0),
+     "blue": (0, 0, 0.8),
+     "darkgreen": (0, 0.5, 0),
+     "purple": (0.5, 0, 0.5),
+ }
+
+ # Target language options for ChatGPT
+ LANGUAGE_OPTIONS = {
+     "简体中文": "zh-CN",
+     "繁體中文": "zh-TW",
+     "English": "en",
+     "日本語": "ja",
+     "한국어": "ko",
+     "Español": "es",
+     "Français": "fr",
+     "Deutsch": "de",
+ }
+
+ # Source language options
+ SOURCE_LANGUAGE_OPTIONS = {
+     "English": "en",
+     "简体中文": "zh-CN",
+     "繁體中文": "zh-TW",
+     "日本語": "ja",
+     "한국어": "ko",
+     "Español": "es",
+     "Français": "fr",
+     "Deutsch": "de",
+     "Auto": "auto",
+ }
+
+ def get_cache_dir():
+     """Get or create the cache directory"""
+     cache_dir = Path('.cached')
+     cache_dir.mkdir(exist_ok=True)
+     return cache_dir
+
+ def get_cache_key(file_content: bytes, page_num: int, translator_name: str, target_lang: str):
+     """Generate the cache key for a specific page translation"""
+     # Use a hash of the file content as part of the cache key
+     file_hash = hashlib.md5(file_content).hexdigest()
+     return f"{file_hash}_page{page_num}_{translator_name}_{target_lang}.pdf"
+
+ def get_cached_translation(cache_key: str) -> "pymupdf.Document | None":
+     """Return the cached translation, or None if it does not exist"""
+     cache_path = get_cache_dir() / cache_key
+     if cache_path.exists():
+         return pymupdf.open(str(cache_path))
+     return None
+
+ def save_translation_cache(doc: pymupdf.Document, cache_key: str):
+     """Save a translated page to the cache"""
+     cache_path = get_cache_dir() / cache_key
+     doc.save(str(cache_path))
+
+ def translate_pdf_pages(doc, doc_bytes, start_page, num_pages, translator, text_color, translator_name, target_lang):
+     """Translate specific pages of a PDF document with progress and caching"""
+     WHITE = pymupdf.pdfcolor["white"]
+     rgb_color = COLOR_MAP.get(text_color.lower(), COLOR_MAP["darkred"])
+
+     translated_pages = []
+
+     # Create a progress bar
+     progress_bar = st.progress(0)
+     status_text = st.empty()
+
+     for i, page_num in enumerate(range(start_page, min(start_page + num_pages, doc.page_count))):
+         status_text.text(f"Translating page {page_num + 1}...")
+
+         # Check cache first
+         cache_key = get_cache_key(doc_bytes, page_num, translator_name, target_lang)
+         cached_doc = get_cached_translation(cache_key)
+
+         if cached_doc is not None:
+             translated_pages.append(cached_doc)
+         else:
+             # Create a new single-page PDF document for this page
+             new_doc = pymupdf.open()
+             new_doc.insert_pdf(doc, from_page=page_num, to_page=page_num)
+             page = new_doc[0]
+
+             # Extract and translate text blocks
+             blocks = page.get_text("blocks", flags=pymupdf.TEXT_DEHYPHENATE)
+
+             for block in blocks:
+                 bbox = block[:4]
+                 text = block[4]
+
+                 # Translate the text
+                 translated = translator.translate(text)
+
+                 # Cover original text with white and add translation in color
+                 page.draw_rect(bbox, color=None, fill=WHITE)
+                 page.insert_htmlbox(
+                     bbox,
+                     translated,
+                     css=f"* {{font-family: sans-serif; color: rgb({int(rgb_color[0]*255)}, {int(rgb_color[1]*255)}, {int(rgb_color[2]*255)});}}"
+                 )
+
+             # Save to cache
+             save_translation_cache(new_doc, cache_key)
+             translated_pages.append(new_doc)
+
+         # Update progress
+         progress = (i + 1) / min(num_pages, doc.page_count - start_page)
+         progress_bar.progress(progress)
+
+     # Clear progress indicators
+     progress_bar.empty()
+     status_text.empty()
+
+     return translated_pages
+
+ def get_page_image(page, scale=2.0):
+     """Get a high quality image from a PDF page"""
+     # Build the scaling matrix
+     zoom = scale
+     mat = pymupdf.Matrix(zoom, zoom)
+
+     # Render the page at high resolution
+     pix = page.get_pixmap(matrix=mat, alpha=False)
+
+     return pix
+
+ def main():
+     st.set_page_config(layout="wide", page_title="PDF Translator for Human: with Local-LLM/GPT")
+     st.title("PDF Translator for Human: with Local-LLM/GPT")
+
+     # Sidebar configuration
+     with st.sidebar:
+         st.header("Settings")
+
+         uploaded_file = st.file_uploader("Choose a PDF file", type="pdf")
+
+         # Source language selection
+         source_lang_name = st.selectbox(
+             "Source Language",
+             options=list(SOURCE_LANGUAGE_OPTIONS.keys()),
+             index=0  # Default to English
+         )
+         source_lang = SOURCE_LANGUAGE_OPTIONS[source_lang_name]
+
+         translator_name = st.selectbox(
+             "Select Translator",
+             options=list(TRANSLATORS.keys()),
+             index=0
+         )
+
+         pages_per_load = st.number_input(
+             "Pages per load",
+             min_value=1,
+             max_value=5,
+             value=DEFAULT_PAGES_PER_LOAD
+         )
+
+         text_color = st.selectbox(
+             "Translation Color",
+             options=list(COLOR_MAP.keys()),
+             index=0
+         )
+
+         # ChatGPT specific settings
+         if translator_name == 'chatgpt':
+             st.subheader("ChatGPT Settings")
+             target_lang_name = st.selectbox(
+                 "Target Language",
+                 options=list(LANGUAGE_OPTIONS.keys()),
+                 index=0
+             )
+             api_key = st.text_input(
+                 "OpenAI API Key",
+                 value=os.getenv("OPENAI_API_KEY", ""),
+                 type="password"
+             )
+             api_base = st.text_input(
+                 "API Base URL",
+                 value=os.getenv("OPENAI_API_BASE", DEFAULT_API_BASE)
+             )
+             model = st.text_input(
+                 "Model Name",
+                 value=os.getenv("OPENAI_MODEL", DEFAULT_MODEL)
+             )
+
+             # Update environment variables
+             os.environ["OPENAI_API_KEY"] = api_key
+             os.environ["OPENAI_API_BASE"] = api_base
+             os.environ["OPENAI_MODEL"] = model
+             target_lang = LANGUAGE_OPTIONS[target_lang_name]
+         else:
+             # For Google Translator, also show target language selection
+             target_lang_name = st.selectbox(
+                 "Target Language",
+                 options=list(SOURCE_LANGUAGE_OPTIONS.keys())[:-1],  # Remove "Auto" option
+                 index=0  # Default to first language
+             )
+             target_lang = SOURCE_LANGUAGE_OPTIONS[target_lang_name]
+
+     # Main content area
+     if uploaded_file is not None:
+         doc_bytes = uploaded_file.read()
+         doc = pymupdf.open(stream=doc_bytes)
+
+         # Create two columns for side-by-side display
+         col1, col2 = st.columns(2)
+
+         # Initialize session state
+         if 'current_page' not in st.session_state:
+             st.session_state.current_page = 0
+             st.session_state.translation_started = True  # Start translating automatically
+
+         # Display original pages immediately
+         with col1:
+             st.header("Original")
+             for page_num in range(st.session_state.current_page,
+                                   min(st.session_state.current_page + pages_per_load, doc.page_count)):
+                 page = doc[page_num]
+                 pix = get_page_image(page)
+                 st.image(pix.tobytes(), caption=f"Page {page_num + 1}", use_container_width=True)
+
+         # Translation column
+         with col2:
+             st.header("Translated")
+
+             # Configure translator with selected source language
+             TranslatorClass = TRANSLATORS[translator_name]
+             translator = TranslatorClass(source=source_lang, target=target_lang)
+
+             # Translate current batch of pages
+             translated_pages = translate_pdf_pages(
+                 doc,
+                 doc_bytes,
+                 st.session_state.current_page,
+                 pages_per_load,
+                 translator,
+                 text_color,
+                 translator_name,
+                 target_lang
+             )
+
+             # Display translated pages
+             for i, trans_doc in enumerate(translated_pages):
+                 page = trans_doc[0]
+                 pix = get_page_image(page)
+                 st.image(pix.tobytes(), caption=f"Page {st.session_state.current_page + i + 1}", use_container_width=True)
+
+         # Navigation buttons
+         nav_col1, nav_col2 = st.columns(2)
+         with nav_col1:
+             if st.session_state.current_page > 0:
+                 if st.button("Previous Pages"):
+                     st.session_state.current_page = max(0, st.session_state.current_page - pages_per_load)
+                     st.rerun()
+
+         with nav_col2:
+             if st.session_state.current_page + pages_per_load < doc.page_count:
+                 if st.button("Next Pages"):
+                     st.session_state.current_page = min(
+                         doc.page_count - 1,
+                         st.session_state.current_page + pages_per_load
+                     )
+                     st.rerun()
+     else:
+         st.info("Please upload a PDF file to begin translation")
+
+ if __name__ == "__main__":
+     main()
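
The cache key built by `get_cache_key` above combines an MD5 hash of the file bytes with the page number, translator name and target language. One consequence is that a page translated with a different source language or text color maps to the same cached file. The sketch below shows one possible way to widen the key; `cache_key_with_settings` and its extra parameters are hypothetical and are not part of the commit.

```python
import hashlib


def cache_key_with_settings(file_content: bytes, page_num: int, translator_name: str,
                            source_lang: str, target_lang: str, text_color: str) -> str:
    """Build a cache file name that changes whenever any rendering input changes.

    Hypothetical variant of get_cache_key: source_lang and text_color are added
    here as an assumption about what should invalidate a cached page.
    """
    # Hash the file content so that different PDFs never share cache entries.
    file_hash = hashlib.md5(file_content).hexdigest()
    return (f"{file_hash}_page{page_num}_{translator_name}"
            f"_{source_lang}_{target_lang}_{text_color}.pdf")


pdf_bytes = b"%PDF-1.7 example bytes"
red = cache_key_with_settings(pdf_bytes, 0, "google", "en", "zh-CN", "darkred")
blue = cache_key_with_settings(pdf_bytes, 0, "google", "en", "zh-CN", "blue")
print(red != blue)  # True: changing the color yields a separate cache entry
```

Whether the extra fields are worth the additional cache files depends on how often readers change those settings; the commit's shorter key reuses more entries.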
pyproject.toml CHANGED
@@ -30,7 +30,7 @@ description = "A flexible free and unlimited python tool to translate between di
  license = "MIT"
  authors = ["Nidhal Baccouri <[email protected]>"]
  maintainers = ["Nidhal Baccouri <[email protected]>", "Chris Trenthem <[email protected]>"]
- readme = "docs/README.rst"
+ readme = "README.md"
  homepage = "https://github.com/nidhaloff/deep_translator"
  repository = "https://github.com/nidhaloff/deep_translator"
  documentation = "https://deep-translator.readthedocs.io/en/latest/"
run_translator_web.sh ADDED
@@ -0,0 +1,9 @@
+ # Install the project and dependencies
+
+ git clone https://github.com/davideuler/pdf-translator-for-human
+ cd pdf-translator-for-human
+ pip install -e .
+ pip install streamlit pymupdf openai
+
+ # Start the Web Application
+ streamlit run pdf_translator_web.py
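
Before starting the app it can help to confirm that the packages installed by the script above are importable. The following is a small sketch using only the standard library; `missing_modules` and `REQUIRED` are hypothetical names, and the module list is taken from the imports in `pdf_translator_web.py` plus the `openai` package the script installs.

```python
import importlib.util
from typing import Iterable, List

# Import names assumed from the script and from pdf_translator_web.py.
REQUIRED = ("streamlit", "pymupdf", "openai", "deep_translator")


def missing_modules(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```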
translator_cli.py CHANGED
@@ -118,8 +118,10 @@ def main():
  export OPENAI_API_KEY=sk-proj-xxxx
  export OPENAI_API_BASE=https://api.xxxx.com/v1
  export OPENAI_API_BASE=http://localhost:8080/v1 # for local llm api
- python translator_cli.py --source english --translator chatgpt --target zh-CN input.pdf
+ export OPENAI_MODEL=default_model
 
+ python translator_cli.py --source english --translator chatgpt --target zh-CN input.pdf
+
  # do not keep original text as an optional layer:
  python translator_cli.py --source english --translator chatgpt --target zh-CN --no-original input.pdf
 
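
The three environment variables in the help text above (`OPENAI_API_KEY`, `OPENAI_API_BASE`, `OPENAI_MODEL`) are also what the web UI reads and writes. As a hedged sketch of how such settings might be resolved with the same defaults the web app uses, the snippet below gathers them into one object. `LLMConfig` and `load_config` are hypothetical names; the `/chat/completions` suffix is the standard path on OpenAI-compatible servers, but how the fork's `ChatGptTranslator` actually consumes these variables is not shown in this commit.

```python
import os
from typing import Mapping, NamedTuple


class LLMConfig(NamedTuple):
    """Resolved settings for an OpenAI-compatible endpoint (hypothetical helper)."""
    api_key: str
    api_base: str
    model: str

    @property
    def chat_url(self) -> str:
        # Standard chat completions path on OpenAI-compatible servers.
        return self.api_base.rstrip("/") + "/chat/completions"


def load_config(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Read settings from the environment, falling back to the web app's defaults."""
    return LLMConfig(
        api_key=env.get("OPENAI_API_KEY", ""),
        api_base=env.get("OPENAI_API_BASE", "http://localhost:8080/v1"),
        model=env.get("OPENAI_MODEL", "default_model"),
    )


local = load_config({})  # nothing exported: local server defaults
print(local.chat_url)    # http://localhost:8080/v1/chat/completions
```

Passing the environment as a mapping keeps the function easy to exercise without touching the real process environment.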