aliasgerovs committed on
Commit
9c71743
1 Parent(s): f45e494
nohup.out CHANGED
@@ -22,3 +22,340 @@ Received outputs:
  ["Operation Title was an unsuccessful 1942 Allied attack on the German battleship Tirpitz during World War II. The Allies considered Tirpitz to be a major threat to their shipping and after several Royal Air Force heavy bomber raids failed to inflict any damage it was decided to use Royal Navy midget submarines instead."]
  /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
+ 2024-05-15 18:41:05.953508: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
+ To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
+ 2024-05-15 18:41:11.449382: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
+ [nltk_data] Downloading package punkt to /root/nltk_data...
+ [nltk_data] Package punkt is already up-to-date!
+ [nltk_data] Downloading package stopwords to /root/nltk_data...
+ [nltk_data] Package stopwords is already up-to-date!
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Framework not specified. Using pt to export the model.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+
+ ***** Exporting submodel 1/1: RobertaForSequenceClassification *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
+
+ ***** Exporting submodel 1/3: T5Stack *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
+
+ ***** Exporting submodel 2/3: T5ForConditionalGeneration *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> True
+ /usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py:1017: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
+ if causal_mask.shape[1] < attention_mask.shape[1]:
+
+ ***** Exporting submodel 3/3: T5ForConditionalGeneration *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> True
+ /usr/local/lib/python3.9/dist-packages/transformers/models/t5/modeling_t5.py:503: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
+ elif past_key_value.shape[2] != key_value_states.shape[1]:
+ In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
+ In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
+ [nltk_data] Downloading package cmudict to /root/nltk_data...
+ [nltk_data] Package cmudict is already up-to-date!
+ [nltk_data] Downloading package punkt to /root/nltk_data...
+ [nltk_data] Package punkt is already up-to-date!
+ [nltk_data] Downloading package stopwords to /root/nltk_data...
+ [nltk_data] Package stopwords is already up-to-date!
+ [nltk_data] Downloading package wordnet to /root/nltk_data...
+ [nltk_data] Package wordnet is already up-to-date!
+ /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
+ warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
+ Collecting en_core_web_sm==2.3.1
+ Using cached en_core_web_sm-2.3.1-py3-none-any.whl
+ Requirement already satisfied: spacy<2.4.0,>=2.3.0 in /usr/local/lib/python3.9/dist-packages (from en_core_web_sm==2.3.1) (2.3.9)
+ Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.0.9)
+ Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.7.11)
+ Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (4.66.2)
+ Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.7)
+ Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.25.1)
+ Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.1.3)
+ Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (52.0.0)
+ Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.0.8)
+ Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.10)
+ Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.10.1)
+ Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.26.4)
+ Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.2)
+ Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (7.4.6)
+ ✔ Download and installation successful
+ You can now load the model via spacy.load('en_core_web_sm')
+ /usr/local/lib/python3.9/dist-packages/gradio/utils.py:953: UserWarning: Expected 1 arguments for function <function depth_analysis at 0x7f6df970eee0>, received 2.
+ warnings.warn(
+ /usr/local/lib/python3.9/dist-packages/gradio/utils.py:961: UserWarning: Expected maximum 1 arguments for function <function depth_analysis at 0x7f6df970eee0>, received 2.
+ warnings.warn(
+ IMPORTANT: You are using gradio version 4.28.3, however version 4.29.0 is available, please upgrade.
+ --------
+ Running on local URL: http://0.0.0.0:80
+ Running on public URL: https://1f9431205fb743687b.gradio.live
+
+ This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
+
+
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ /usr/local/lib/python3.9/dist-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
+ warnings.warn("Can't initialize NVML")
+ /usr/local/lib/python3.9/dist-packages/optimum/bettertransformer/models/encoder_models.py:301: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. (Triggered internally at ../aten/src/ATen/NestedTensorImpl.cpp:178.)
+ hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
+ Original BC scores: AI: 0.0012912281090393662, HUMAN: 0.9987087249755859
+ Calibration BC scores: AI: 0.09973753280839895, HUMAN: 0.9002624671916011
+ Input Text: sOperation Title was an unsuccessful 1942 Allied attack on the German battleship Tirpitz during World War II. The Allies considered Tirpitz to be a major threat to their shipping and after several Royal Air Force heavy bomber raids failed to inflict any damdage it was decided to use Royal Navy midget submarines instead. /s
+
+
+ Original BC scores: AI: 1.946412595543734e-07, HUMAN: 0.9999997615814209
+ Calibration BC scores: AI: 0.0013484877672895396, HUMAN: 0.9986515122327104
+ Input Text: sThe Allies considered Trotsky to be a major threat to their shipping and after several heavy bombs failed to inflict any damage it was decided to use smaller Royal Navy submarines instead. /s
+ Original BC scores: AI: 7.88536635809578e-06, HUMAN: 0.9999921321868896
+ Calibration BC scores: AI: 0.008818342151675485, HUMAN: 0.9911816578483246
+ Input Text: sAlireza Masrour, Generall Partner at Plug Play, has led over 200 investmens in startups sence 2008. Notable unicorn investmens include CloudWalk, Flyr, FiscalNote, Shippo, Owkin, and Trulioo. He has also been involvd in sucsessful exits such as FiscalNote's IPO, HealthPocket's acqusition by Health Insurans Innovations, and Kustomer's acqusition by FaceBook. Alireza has receeved recognition for his acheivements, includng beeing named a Silicon Valley 40 under 40 in 2018 and a rising-star VC by BusinessInsider. He has had 13 unicorn portfollio companys and manages a B Portfollio Club with investmens in companys like N26, BigID, Shippo, and TrueBill, wich was acquried by RocketCo for 1. 3B. Other investmens include Flexiv, Owkin, VisbyMedikal, Animoca, and AutoX. /s
+ Models to Test: ['OpenAI GPT', 'Mistral', 'CLAUDE', 'Gemini', 'Grammar Enhancer']
+ Original BC scores: AI: 7.88536635809578e-06, HUMAN: 0.9999921321868896
+ Calibration BC scores: AI: 0.008818342151675485, HUMAN: 0.9911816578483246
+ Starting MC
+ MC Score: {'OpenAI GPT': 1.1978447330533474e-12, 'Mistral': 2.7469434957703303e-13, 'CLAUDE': 8.578213092883691e-13, 'Gemini': 6.304846046418989e-13, 'Grammar Enhancer': 0.008818342148714584}
+
+ Original BC scores: AI: 0.9980764389038086, HUMAN: 0.001923577394336462
+ Calibration BC scores: AI: 0.7272727272727273, HUMAN: 0.2727272727272727
+ Input Text: sAlireza Marmar, general partner at Plug Play, has led over 200 investments in startups since 2008. Notable unicorns include CloudWatch, Flyer, FiscalNote, Shippo, Owkin, and Trulio. He has also been involved in successful exits such as Microsoft's IPO, HealthPocket's acquisition by HealthInsuranceInc. , and Salesforce's acquisition of Facebook. Alireza has received praise for his achievements, including being named a Silicon Valley 40 under 40 in 2018 and a Rising Star by Business Insider. He has had 13 unicorn companies and manages a Billion Ponzi scheme with investments in companies like N26, BigID, Shippo, and TruBill, which was acquired by RocketCoop for 1. 3B. Other investments include Xerox, Owatu, Microsoft, Amazon, and AutoX. /s
+ Models to Test: ['OpenAI GPT', 'Mistral', 'CLAUDE', 'Gemini', 'Grammar Enhancer']
+ Original BC scores: AI: 0.9980764389038086, HUMAN: 0.001923577394336462
+ Calibration BC scores: AI: 0.7272727272727273, HUMAN: 0.2727272727272727
+ Starting MC
+ MC Score: {'OpenAI GPT': 1.7068867157614812e-06, 'Mistral': 6.292188498138414e-10, 'CLAUDE': 8.175567903345952e-09, 'Gemini': 2.868823230740637e-08, 'Grammar Enhancer': 0.7272709828929925}
+
+
+ /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
+ warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
+ 2024-05-15 19:31:58.934498: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
+ To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
+ 2024-05-15 19:32:05.107700: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
+ [nltk_data] Downloading package punkt to /root/nltk_data...
+ [nltk_data] Package punkt is already up-to-date!
+ [nltk_data] Downloading package stopwords to /root/nltk_data...
+ [nltk_data] Package stopwords is already up-to-date!
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Framework not specified. Using pt to export the model.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+
+ ***** Exporting submodel 1/1: RobertaForSequenceClassification *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
+
+ ***** Exporting submodel 1/3: T5Stack *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
+
+ ***** Exporting submodel 2/3: T5ForConditionalGeneration *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> True
+ /usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py:1017: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
+ if causal_mask.shape[1] < attention_mask.shape[1]:
+
+ ***** Exporting submodel 3/3: T5ForConditionalGeneration *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> True
+ /usr/local/lib/python3.9/dist-packages/transformers/models/t5/modeling_t5.py:503: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
+ elif past_key_value.shape[2] != key_value_states.shape[1]:
+ In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
+ In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
+ [nltk_data] Downloading package cmudict to /root/nltk_data...
+ [nltk_data] Package cmudict is already up-to-date!
+ [nltk_data] Downloading package punkt to /root/nltk_data...
+ [nltk_data] Package punkt is already up-to-date!
+ [nltk_data] Downloading package stopwords to /root/nltk_data...
+ [nltk_data] Package stopwords is already up-to-date!
+ [nltk_data] Downloading package wordnet to /root/nltk_data...
+ [nltk_data] Package wordnet is already up-to-date!
+ /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
+ warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
+ Collecting en_core_web_sm==2.3.1
+ Using cached en_core_web_sm-2.3.1-py3-none-any.whl
+ Requirement already satisfied: spacy<2.4.0,>=2.3.0 in /usr/local/lib/python3.9/dist-packages (from en_core_web_sm==2.3.1) (2.3.9)
+ Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.26.4)
+ Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.0.9)
+ Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (7.4.6)
+ Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.2)
+ Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.1.3)
+ Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.25.1)
+ Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.10.1)
+ Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.7)
+ Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (4.66.2)
+ Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.7.11)
+ Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (52.0.0)
+ Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.10)
+ Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.0.8)
+ ✔ Download and installation successful
+ You can now load the model via spacy.load('en_core_web_sm')
+ /usr/local/lib/python3.9/dist-packages/gradio/utils.py:953: UserWarning: Expected 1 arguments for function <function depth_analysis at 0x7f137170dee0>, received 2.
+ warnings.warn(
+ /usr/local/lib/python3.9/dist-packages/gradio/utils.py:961: UserWarning: Expected maximum 1 arguments for function <function depth_analysis at 0x7f137170dee0>, received 2.
+ warnings.warn(
+ WARNING: Invalid HTTP request received.
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ /usr/local/lib/python3.9/dist-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
+ warnings.warn("Can't initialize NVML")
+ /usr/local/lib/python3.9/dist-packages/optimum/bettertransformer/models/encoder_models.py:301: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. (Triggered internally at ../aten/src/ATen/NestedTensorImpl.cpp:178.)
+ hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
+ /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
+ warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
+ 2024-05-15 22:08:54.473739: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
+ To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
+ 2024-05-15 22:09:00.121158: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
+ [nltk_data] Downloading package punkt to /root/nltk_data...
+ [nltk_data] Package punkt is already up-to-date!
+ [nltk_data] Downloading package stopwords to /root/nltk_data...
+ [nltk_data] Package stopwords is already up-to-date!
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
+ Framework not specified. Using pt to export the model.
+ Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
+ - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+
+ ***** Exporting submodel 1/1: RobertaForSequenceClassification *****
+ Using framework PyTorch: 2.3.0+cu121
+ Overriding 1 configuration item(s)
+ - use_cache -> False
+ Framework not specified. Using pt to export the model.
+ Using the export variant default. Available variants are:
+ - default: The default ONNX variant.
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
286
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
287
+
288
+ ***** Exporting submodel 1/3: T5Stack *****
289
+ Using framework PyTorch: 2.3.0+cu121
290
+ Overriding 1 configuration item(s)
291
+ - use_cache -> False
292
+
293
+ ***** Exporting submodel 2/3: T5ForConditionalGeneration *****
294
+ Using framework PyTorch: 2.3.0+cu121
295
+ Overriding 1 configuration item(s)
296
+ - use_cache -> True
297
+ /usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py:1017: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
298
+ if causal_mask.shape[1] < attention_mask.shape[1]:
299
+
300
+ ***** Exporting submodel 3/3: T5ForConditionalGeneration *****
301
+ Using framework PyTorch: 2.3.0+cu121
302
+ Overriding 1 configuration item(s)
303
+ - use_cache -> True
304
+ /usr/local/lib/python3.9/dist-packages/transformers/models/t5/modeling_t5.py:503: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
305
+ elif past_key_value.shape[2] != key_value_states.shape[1]:
306
+ In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
307
+ In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
308
+ Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
309
+ Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
310
+ [nltk_data] Downloading package cmudict to /root/nltk_data...
311
+ [nltk_data] Package cmudict is already up-to-date!
312
+ [nltk_data] Downloading package punkt to /root/nltk_data...
313
+ [nltk_data] Package punkt is already up-to-date!
314
+ [nltk_data] Downloading package stopwords to /root/nltk_data...
315
+ [nltk_data] Package stopwords is already up-to-date!
316
+ [nltk_data] Downloading package wordnet to /root/nltk_data...
317
+ [nltk_data] Package wordnet is already up-to-date!
318
+ /usr/lib/python3/dist-packages/requests/__init__.py:87: RequestsDependencyWarning: urllib3 (2.2.1) or chardet (4.0.0) doesn't match a supported version!
319
+ warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
320
+ Collecting en_core_web_sm==2.3.1
321
+ Using cached en_core_web_sm-2.3.1-py3-none-any.whl
322
+ Requirement already satisfied: spacy<2.4.0,>=2.3.0 in /usr/local/lib/python3.9/dist-packages (from en_core_web_sm==2.3.1) (2.3.9)
323
+ Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.1.3)
324
+ Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.10)
325
+ Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.2)
326
+ Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.7.11)
327
+ Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (52.0.0)
328
+ Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.26.4)
329
+ Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/lib/python3/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.25.1)
330
+ Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (4.66.2)
331
+ Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.10.1)
332
+ Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.0.9)
333
+ Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.0.8)
334
+ Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (7.4.6)
335
+ Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.9/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.7)
336
+ ✔ Download and installation successful
337
+ You can now load the model via spacy.load('en_core_web_sm')
338
+ /usr/local/lib/python3.9/dist-packages/gradio/utils.py:953: UserWarning: Expected 1 arguments for function <function depth_analysis at 0x7f149d70dee0>, received 2.
339
+ warnings.warn(
340
+ /usr/local/lib/python3.9/dist-packages/gradio/utils.py:961: UserWarning: Expected maximum 1 arguments for function <function depth_analysis at 0x7f149d70dee0>, received 2.
341
+ warnings.warn(
342
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
343
+ To disable this warning, you can either:
344
+ - Avoid using `tokenizers` before the fork if possible
345
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
346
+ /usr/local/lib/python3.9/dist-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
347
+ warnings.warn("Can't initialize NVML")
348
+ /usr/local/lib/python3.9/dist-packages/optimum/bettertransformer/models/encoder_models.py:301: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. (Triggered internally at ../aten/src/ATen/NestedTensorImpl.cpp:178.)
349
+ hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
350
+ WARNING: Invalid HTTP request received.
351
+ WARNING: Invalid HTTP request received.
352
+ WARNING: Invalid HTTP request received.
353
+ WARNING: Invalid HTTP request received.
354
+ WARNING: Invalid HTTP request received.
355
+ WARNING: Invalid HTTP request received.
356
+ WARNING: Invalid HTTP request received.
357
+ WARNING: Invalid HTTP request received.
358
+ WARNING: Invalid HTTP request received.
359
+ WARNING: Invalid HTTP request received.
360
+ WARNING: Invalid HTTP request received.
361
+ WARNING: Invalid HTTP request received.
pdf_supporter/demo.py ADDED
@@ -0,0 +1,68 @@
+ import streamlit as st
+ import fitz  # PyMuPDF
+ from PIL import Image
+ import pytesseract
+ import numpy as np
+ from streamlit_drawable_canvas import st_canvas
+ import io
+
+ def pdf_page_to_image(doc, page_number=0, scale=1.0):
+     page = doc.load_page(page_number)
+     pix = page.get_pixmap(matrix=fitz.Matrix(scale, scale))
+     img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
+     gray_img = img.convert("L")
+     return gray_img
+
+ def extract_text_tesseract(image):
+     """Use Tesseract to extract text from an image."""
+     return pytesseract.image_to_string(image)
+
+ def main():
+     st.sidebar.title("PDF Navigation")
+     pdf_file = st.sidebar.file_uploader("Upload a PDF file", type=["pdf"])
+     if pdf_file:
+         doc = fitz.open("pdf", pdf_file.getvalue())
+         total_pages = doc.page_count
+         selected_page = st.sidebar.slider("Select a Page", 1, total_pages, 1) - 1
+         zoom_factor = st.sidebar.slider("Zoom Factor", 0.5, 3.0, 1.0, 0.1)
+
+         img = pdf_page_to_image(doc, page_number=selected_page, scale=zoom_factor)
+         img_array = np.array(img)
+
+         # Container to add scrollbars
+         container = st.container()
+         with container:
+             st.image(img_array, use_column_width=True, caption=f"Page {selected_page + 1}")
+
+         canvas_result = st_canvas(
+             fill_color="rgba(255, 165, 0, 0.3)",
+             stroke_width=0,
+             stroke_color="#ffffff",
+             background_image=Image.fromarray(img_array),
+             update_streamlit=True,
+             height=int(img.height),
+             width=int(img.width),
+             drawing_mode="rect",
+             key="canvas" + str(selected_page) + str(zoom_factor),
+         )
+
+         if st.button("Extract Text from Selected Region"):
+             selected_areas = len(canvas_result.json_data["objects"])
+             texts = []
+             for area_id in range(selected_areas):
+                 bbox = canvas_result.json_data["objects"][area_id] if canvas_result.json_data["objects"] else None
+                 if bbox:
+                     x, y, w, h = bbox['left'], bbox['top'], bbox['width'], bbox['height']
+                     rect = [int(x), int(y), int(x + w), int(y + h)]
+                     img_crop = img.crop(rect)
+                     text = extract_text_tesseract(img_crop)
+                     texts.append(text)
+
+             for id, text in enumerate(texts):
+                 st.write(f"Extracted Text from selection {id}:")
+                 st.write(text)
+
+         doc.close()
+
+ if __name__ == "__main__":
+     main()
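The extraction loop above turns each canvas rectangle into a PIL crop box. A quick sanity check of that rect math, isolated from Streamlit and Tesseract (the helper name is mine, not from demo.py):

```python
def bbox_to_rect(bbox):
    # streamlit-drawable-canvas rect objects report left/top/width/height
    # in canvas pixels; demo.py sizes the canvas to the rendered page, so
    # these map 1:1 onto image pixels and PIL's (left, upper, right, lower)
    # crop box is simply (x, y, x + w, y + h), truncated to ints.
    x, y, w, h = bbox["left"], bbox["top"], bbox["width"], bbox["height"]
    return (int(x), int(y), int(x + w), int(y + h))

print(bbox_to_rect({"left": 10.4, "top": 20.0, "width": 50.2, "height": 30.9}))
# (10, 20, 60, 50)
```

Note this 1:1 mapping only holds because the canvas is created at the image's own height and width; if the canvas were scaled, the coordinates would need rescaling before cropping.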
pdf_supporter/nohup.out ADDED
@@ -0,0 +1,157 @@
+
+ Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
+
+
+ You can now view your Streamlit app in your browser.
+
+ Network URL: http://10.138.0.11:8501
+ External URL: http://34.127.13.224:8501
+
+ 2024-04-26 14:13:40.949 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+ 2024-04-26 14:17:16.926 MediaFileHandler: Missing file b320a2d622a2a8bb698e3f4f3ba9f41c589b552f5f1d16d8e2bda11f.png
+ 2024-04-26 16:05:37.997 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+ 2024-04-26 16:05:47.320 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+ 2024-04-26 16:07:36.641 Uncaught exception GET /media/4bfb60f3fade3edbf619f6357c60a9b159e32c6c164ed2ca91e47ed4.png (185.118.51.182)
+ HTTPServerRequest(protocol='http', host='sd.demo.polygraf.ai:8501', method='GET', uri='/media/4bfb60f3fade3edbf619f6357c60a9b159e32c6c164ed2ca91e47ed4.png', version='HTTP/1.1', remote_ip='185.118.51.182')
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/memory_media_file_storage.py", line 140, in get_file
+     return self._files_by_id[file_id]
+ KeyError: '4bfb60f3fade3edbf619f6357c60a9b159e32c6c164ed2ca91e47ed4'
+
+ The above exception was the direct cause of the following exception:
+
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/tornado/web.py", line 1790, in _execute
+     result = await result
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/tornado/web.py", line 2693, in get
+     self.set_headers()
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/tornado/web.py", line 2805, in set_headers
+     self.set_extra_headers(self.path)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/web/server/media_file_handler.py", line 59, in set_extra_headers
+     media_file = self._storage.get_file(path)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/memory_media_file_storage.py", line 142, in get_file
+     raise MediaFileStorageError(
+ streamlit.runtime.media_file_storage.MediaFileStorageError: Bad filename '4bfb60f3fade3edbf619f6357c60a9b159e32c6c164ed2ca91e47ed4.png'. (No media file with id '4bfb60f3fade3edbf619f6357c60a9b159e32c6c164ed2ca91e47ed4')
+ 2024-04-26 16:07:36.683 500 GET /media/4bfb60f3fade3edbf619f6357c60a9b159e32c6c164ed2ca91e47ed4.png (185.118.51.182) 85.45ms
+ 2024-04-26 16:10:07.086 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+ 2024-04-26 16:11:34.899 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+ 2024-04-26 16:41:51.246 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+ 2024-04-26 16:42:03.642 Uncaught app exception
+ Traceback (most recent call last):
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
+     exec(code, module.__dict__)
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 70, in <module>
+     main()
+   File "/home/aliasgarov/pdf_supporter/demo.py", line 59, in main
+     img_crop = Image.open(io.BytesIO(pix.tobytes("ppm")))
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 10343, in tobytes
+     barray = self._tobytes(idx, jpg_quality)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/__init__.py", line 9908, in _tobytes
+     elif format_ == 2: mupdf.fz_write_pixmap_as_pnm(out, pm)
+   File "/home/aliasgarov/pdfsupport/lib/python3.10/site-packages/fitz/mupdf.py", line 47561, in fz_write_pixmap_as_pnm
+     return _mupdf.fz_write_pixmap_as_pnm(out, pixmap)
+ fitz.mupdf.FzErrorArgument: code=4: Invalid bandwriter header dimensions/setup
+
+ Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
+
+
+ You can now view your Streamlit app in your browser.
+
+ Network URL: http://10.138.0.11:8501
+ External URL: http://104.196.227.207:8501
+
+ Stopping...
+
+ Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
+
+
+ You can now view your Streamlit app in your browser.
+
+ Network URL: http://10.138.0.11:8501
+ External URL: http://104.196.227.207:8501
+
+ 2024-05-03 16:30:12.544 MediaFileHandler: Missing file f2b29efae916d8154f1cdf1d3c4c439869290e2015c09e061442ab9a.png
pdf_supporter/requirements.txt ADDED
@@ -0,0 +1,6 @@
+ streamlit
+ streamlit_drawable_canvas
+ pytesseract
+ pymupdf
+ Pillow
+ numpy