Pylint & CVE fix #1
by barunsaha - opened
This view is limited to 50 files because it contains too many changes.
See the raw diff here.
- .gitattributes +0 -2
- .streamlit/config.toml +0 -10
- README.md +14 -66
- app.py +215 -398
- clarifai_grpc_helper.py +71 -0
- examples/example_04.json +0 -3
- file_embeddings/embeddings.npy +0 -3
- file_embeddings/icons.npy +0 -3
- global_config.py +12 -119
- helpers/__init__.py +0 -0
- helpers/icons_embeddings.py +0 -166
- helpers/image_search.py +0 -148
- helpers/llm_helper.py +0 -187
- helpers/pptx_helper.py +0 -987
- helpers/text_helper.py +0 -83
- icons/png128/0-circle.png +0 -0
- icons/png128/1-circle.png +0 -0
- icons/png128/123.png +0 -0
- icons/png128/2-circle.png +0 -0
- icons/png128/3-circle.png +0 -0
- icons/png128/4-circle.png +0 -0
- icons/png128/5-circle.png +0 -0
- icons/png128/6-circle.png +0 -0
- icons/png128/7-circle.png +0 -0
- icons/png128/8-circle.png +0 -0
- icons/png128/9-circle.png +0 -0
- icons/png128/activity.png +0 -0
- icons/png128/airplane.png +0 -0
- icons/png128/alarm.png +0 -0
- icons/png128/alien-head.png +0 -0
- icons/png128/alphabet.png +0 -0
- icons/png128/amazon.png +0 -0
- icons/png128/amritsar-golden-temple.png +0 -0
- icons/png128/amsterdam-canal.png +0 -0
- icons/png128/amsterdam-windmill.png +0 -0
- icons/png128/android.png +0 -0
- icons/png128/angkor-wat.png +0 -0
- icons/png128/apple.png +0 -0
- icons/png128/archive.png +0 -0
- icons/png128/argentina-obelisk.png +0 -0
- icons/png128/artificial-intelligence-brain.png +0 -0
- icons/png128/atlanta.png +0 -0
- icons/png128/austin.png +0 -0
- icons/png128/automation-decision.png +0 -0
- icons/png128/award.png +0 -0
- icons/png128/balloon.png +0 -0
- icons/png128/ban.png +0 -0
- icons/png128/bandaid.png +0 -0
- icons/png128/bangalore.png +0 -0
- icons/png128/bank.png +0 -0
.gitattributes
CHANGED
@@ -33,5 +33,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
-*.pptx filter=lfs diff=lfs merge=lfs -text
-pptx_templates/Minimalist_sales_pitch.pptx filter=lfs diff=lfs merge=lfs -text
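The two removed patterns stop Git LFS from handling the bundled PPTX files. As an illustrative sketch only (Git's own wildmatch rules differ slightly from Python's `fnmatch`, e.g. for `*` across `/` on some platforms), the removed patterns match roughly like this:

```python
from fnmatch import fnmatch

# The two gitattributes patterns removed by this diff.
LFS_PATTERNS = ["*.pptx", "pptx_templates/Minimalist_sales_pitch.pptx"]

def tracked_by_lfs(path: str) -> bool:
    """Return True if any removed LFS pattern matches the given path."""
    return any(fnmatch(path, pattern) for pattern in LFS_PATTERNS)

print(tracked_by_lfs("deck.pptx"))    # True
print(tracked_by_lfs("notes.txt"))    # False
```

After this change, such files are stored as regular Git blobs instead of LFS pointers.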
.streamlit/config.toml
DELETED
@@ -1,10 +0,0 @@
-[server]
-runOnSave = true
-headless = false
-maxUploadSize = 0
-
-[browser]
-gatherUsageStats = false
-
-[theme]
-base = "dark"
README.md
CHANGED
@@ -4,7 +4,7 @@ emoji: 🏢
 colorFrom: yellow
 colorTo: green
 sdk: streamlit
-sdk_version: 1.
+sdk_version: 1.26.0
 app_file: app.py
 pinned: false
 license: mit
@@ -16,88 +16,36 @@ We spend a lot of time on creating the slides and organizing our thoughts for an
 With SlideDeck AI, co-create slide decks on any topic with Generative Artificial Intelligence.
 Describe your topic and let SlideDeck AI generate a PowerPoint slide deck for you—it's as simple as that!
 
+SlideDeck AI is powered by [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
+Originally, it was built using the Llama 2 API provided by Clarifai.
 
 # Process
 
 SlideDeck AI works in the following way:
 
-1. Given a topic description, it uses
+1. Given a topic description, it uses Mistral 7B Instruct to generate the outline/contents of the slides.
 The output is generated as structured JSON data based on a pre-defined schema.
-2.
-3. Subsequently, it uses the `python-pptx` library to generate the slides,
+2. Subsequently, it uses the `python-pptx` library to generate the slides,
 based on the JSON data from the previous step.
+Here, a user can choose from a set of three pre-defined presentation templates.
+3. In addition, it uses Metaphor to fetch Web pages related to the topic.
-
-
-For example, one can ask to add another slide or modify an existing slide.
-A history of instructions is maintained.
-5. Every time SlideDeck AI generates a PowerPoint presentation, a download button is provided.
-Clicking on the button will download the file.
 
-
-# Summary of the LLMs
-
-Different LLMs offer different styles of content generation. Use one of the following LLMs along with relevant API keys/access tokens, as appropriate, to create the content of the slide deck:
-
-| LLM | Provider (code) | Requires API key | Characteristics |
-| :-------- | :------- |:----------------------------------------------------------------------------| :------- |
-| Mistral 7B Instruct v0.2 | Hugging Face (`hf`) | Optional but encouraged; [get here](https://huggingface.co/settings/tokens) | Faster, shorter content |
-| Mistral Nemo Instruct 2407 | Hugging Face (`hf`) | Optional but encouraged; [get here](https://huggingface.co/settings/tokens) | Slower, longer content |
-| Gemini 1.5 Flash | Google Gemini API (`gg`) | Mandatory; [get here](https://aistudio.google.com/apikey) | Faster, longer content |
-| Command R+ | Cohere (`co`) | Mandatory; [get here](https://dashboard.cohere.com/api-keys) | Shorter, simpler content |
-
-The Mistral models do not mandatorily require an access token. However, you are encouraged to get and use your own Hugging Face access token.
-
-In addition, offline LLMs provided by Ollama can be used. Read below to know more.
+
+4. ~~Finally, it uses Stable Diffusion 2 to generate an image, based on the title and each slide heading.~~
 
-
-# Icons
-
-SlideDeck AI uses a subset of icons from [bootstrap-icons-1.11.3](https://github.com/twbs/icons)
-(MIT license) in the slides. A few icons from [SVG Repo](https://www.svgrepo.com/)
-(CC0, MIT, and Apache licenses) are also used.
 
 # Local Development
 
-SlideDeck AI uses
-
-
-Visit the respective websites to obtain the
-
-## Offline LLMs Using Ollama
-
-SlideDeck AI allows the use of offline LLMs to generate the contents of the slide decks. This is typically suitable for individuals or organizations who would like to use self-hosted LLMs for privacy concerns, for example.
-
-Offline LLMs are made available via Ollama. Therefore, a pre-requisite here is to have [Ollama installed](https://ollama.com/download) on the system and the desired [LLM](https://ollama.com/search) pulled locally.
-
-In addition, the `RUN_IN_OFFLINE_MODE` environment variable needs to be set to `True` to enable the offline mode. This, for example, can be done using a `.env` file or from the terminal. The typical steps to use SlideDeck AI in offline mode (in a `bash` shell) are as follows:
-
-```bash
-ollama list  # View locally available LLMs
-export RUN_IN_OFFLINE_MODE=True  # Enable the offline mode to use Ollama
-git clone https://github.com/barun-saha/slide-deck-ai.git
-cd slide-deck-ai
-python -m venv venv  # Create a virtual environment
-source venv/bin/activate  # On a Linux system
-pip install -r requirements.txt
-streamlit run ./app.py  # Run the application
-```
-
-The `.env` file should be created inside the `slide-deck-ai` directory.
-
-The UI is similar to the online mode. However, rather than selecting an LLM from a list, one has to write the name of the Ollama model to be used in a textbox. There is no API key asked here.
-
-The online and offline modes are mutually exclusive. So, setting `RUN_IN_OFFLINE_MODE` to `False` will make SlideDeck AI use the online LLMs (i.e., the "original mode."). By default, `RUN_IN_OFFLINE_MODE` is set to `False`.
-
-Finally, the focus is on using offline LLMs, not going completely offline. So, Internet connectivity would still be required to fetch the images from Pexels.
+SlideDeck AI uses [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
+via the Hugging Face Inference API.
+To run this project by yourself, you need to provide the `HUGGINGFACEHUB_API_TOKEN` and `METAPHOR_API_KEY` API keys,
+for example, in a `.env` file. Visit the respective websites to obtain the keys.
 
 
 # Live Demo
 
-
-- [Demo video](https://youtu.be/QvAKzNKtk9k) of the chat interface on YouTube
+[SlideDeck AI](https://huggingface.co/spaces/barunsaha/slide-deck-ai)
 
 
 # Award
 
-SlideDeck AI has won the 3rd Place in the [Llama 2 Hackathon with Clarifai](https://lablab.ai/event/llama-2-hackathon-with-clarifai)
+SlideDeck AI has won the 3rd Place in the [Llama 2 Hackathon with Clarifai](https://lablab.ai/event/llama-2-hackathon-with-clarifai).
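The process the README describes (step 1 produces structured JSON contents, step 2 feeds them to `python-pptx`) can be sketched with a toy example. The exact pre-defined schema is not shown in this diff, so the structure below, a `title` plus a list of slides each with a `heading`, is an assumption for illustration only, as is the helper name:

```python
import json

# Hypothetical contents JSON; the app's actual pre-defined schema is not shown here.
CONTENTS = """
{
  "title": "Introduction to Bats",
  "slides": [
    {"heading": "What Are Bats?", "bullet_points": ["Only flying mammals", "Over 1400 species"]},
    {"heading": "Echolocation", "bullet_points": ["High-frequency calls", "Used for hunting prey"]}
  ]
}
"""

def extract_headers(json_str: str) -> list:
    """Collect the deck title and slide headings -- the kind of list that
    show_bonus_stuff() in app.py below receives for the Web search step."""
    data = json.loads(json_str)
    return [data["title"]] + [slide["heading"] for slide in data["slides"]]

print(extract_headers(CONTENTS))
```

A schema like this is what lets the deck-building step iterate slide by slide without re-invoking the LLM.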
app.py
CHANGED
@@ -1,493 +1,310 @@
-"""
-Streamlit app containing the UI and the application logic.
-"""
-import datetime
-import logging
-import os
 import pathlib
-import
 import tempfile
-from typing import List,
 
-import httpx
-import huggingface_hub
 import json5
-import
-import requests
 import streamlit as st
-from dotenv import load_dotenv
-from langchain_community.chat_message_histories import StreamlitChatMessageHistory
-from langchain_core.messages import HumanMessage
-from langchain_core.prompts import ChatPromptTemplate
 
-import
 from global_config import GlobalConfig
-from helpers import llm_helper, pptx_helper, text_helper
-
-
-load_dotenv()
-
 
-RUN_IN_OFFLINE_MODE = os.getenv('RUN_IN_OFFLINE_MODE', 'False').lower() == 'true'
 
 
-@st.cache_data
-def _load_strings() -> dict:
-    """
-    Load various strings to be displayed in the app.
-    :return: The dictionary of strings.
-    """
-
 
 
 @st.cache_data
-def
 """
-
 
-    :param
-    :return: The
 """
 
-
-        template = in_file.read()
-    else:
-        with open(GlobalConfig.INITIAL_PROMPT_TEMPLATE, 'r', encoding='utf-8') as in_file:
-            template = in_file.read()
-
-    return template
 
 
-
-        selected_provider: str,
-        selected_model: str,
-        user_key: str,
-) -> bool:
 """
-
 
-    :
-    :param selected_provider: The LLM provider.
-    :param selected_model: Name of the model.
-    :param user_key: User-provided API key.
-    :return: `True` if all inputs "look" OK; `False` otherwise.
 """
 
-
-        handle_error(
-            'Not enough information provided!'
-            ' Please be a little more descriptive and type a few words'
-            ' with a few characters :)',
-            False
-        )
-        return False
-
-    if not selected_provider or not selected_model:
-        handle_error('No valid LLM provider and/or model name found!', False)
-        return False
-
-    if not llm_helper.is_valid_llm_provider_model(selected_provider, selected_model, user_key):
-        handle_error(
-            'The LLM settings do not look correct. Make sure that an API key/access token'
-            ' is provided if the selected LLM requires it. An API key should be 6-64 characters'
-            ' long, only containing alphanumeric characters, hyphens, and underscores.',
-            False
-        )
-        return False
-
-    return True
 
 
-
-    Display an error message in the app.
-
-    :param error_msg: The error message to be displayed.
-    :param should_log: If `True`, log the message.
 """
 
-
-    st.error(error_msg)
-
-
-def reset_api_key():
-    """
-    Clear API key input when a different LLM is selected from the dropdown list.
 """
 
-
-# Session variables
-CHAT_MESSAGES = 'chat_messages'
-DOWNLOAD_FILE_KEY = 'download_file_name'
-IS_IT_REFINEMENT = 'is_it_refinement'
-
-
-logger = logging.getLogger(__name__)
-
-texts = list(GlobalConfig.PPTX_TEMPLATE_FILES.keys())
-captions = [GlobalConfig.PPTX_TEMPLATE_FILES[x]['caption'] for x in texts]
-
-with st.sidebar:
-    # The PPT templates
-    pptx_template = st.sidebar.radio(
-        '1: Select a presentation template:',
-        texts,
-        captions=captions,
-        horizontal=True
 )
 
-
-        ),
-        type='password',
-        key='api_key_input'
-    )
 
 
 def build_ui():
     """
-    Display the input elements for content generation.
     """
 
     st.title(APP_TEXT['app_name'])
     st.subheader(APP_TEXT['caption'])
     st.markdown(
-        '
 )
 
-
-    (
-
 )
 
-
 
-
 
-
-    """
 
-
 
-
 
-
 )
 
-    # Since Streamlit app reloads at every interaction, display the chat history
-    # from the save session state
-    for msg in history.messages:
-        st.chat_message(msg.type).code(msg.content, language='json')
-
-    if prompt := st.chat_input(
-        placeholder=APP_TEXT['chat_placeholder'],
-        max_chars=GlobalConfig.LLM_MODEL_MAX_INPUT_LENGTH
-    ):
-        provider, llm_name = llm_helper.get_provider_model(
-            llm_provider_to_use,
-            use_ollama=RUN_IN_OFFLINE_MODE
-        )
 
-
 
-
-            formatted_template = prompt_template.format(
-                **{
-                    'instructions': '\n'.join(list_of_msgs),
-                    'previous_content': _get_last_response(),
-                }
-            )
-        else:
-            formatted_template = prompt_template.format(**{'question': prompt})
-
-        progress_bar = st.progress(0, 'Preparing to call LLM...')
-        response = ''
 
         try:
-
-                '
-
 )
             return
 
-
-                response += _
-
-                # Update the progress bar with an approx progress percentage
-                progress_bar.progress(
-                    min(
-                        len(response) / gcfg.get_max_output_tokens(llm_provider_to_use),
-                        0.95
-                    ),
-                    text='Streaming content...this might take a while...'
-                )
-        except (httpx.ConnectError, requests.exceptions.ConnectionError):
-            handle_error(
-                'A connection error occurred while streaming content from the LLM endpoint.'
-                ' Unfortunately, the slide deck cannot be generated. Please try again later.'
-                ' Alternatively, try selecting a different LLM from the dropdown list. If you are'
-                ' using Ollama, make sure that Ollama is already running on your system.',
-                True
-            )
-            return
-        except huggingface_hub.errors.ValidationError as ve:
-            handle_error(
-                f'An error occurred while trying to generate the content: {ve}'
-                '\nPlease try again with a significantly shorter input text.',
-                True
-            )
-            return
-        except ollama.ResponseError:
-            handle_error(
-                f'The model `{llm_name}` is unavailable with Ollama on your system.'
-                f' Make sure that you have provided the correct LLM name or pull it using'
-                f' `ollama pull {llm_name}`. View LLMs available locally by running `ollama list`.',
-                True
-            )
-            return
-        except Exception as ex:
-            handle_error(
-                f'An unexpected error occurred while generating the content: {ex}'
-                '\nPlease try again later, possibly with different inputs.'
-                ' Alternatively, try selecting a different LLM from the dropdown list.'
-                ' If you are using Cohere or Gemini models, make sure that you have provided'
-                ' a correct API key.',
-                True
-            )
-            return
-
-        history.add_user_message(prompt)
-        history.add_ai_message(response)
-
-        # The content has been generated as JSON
-        # There maybe trailing ``` at the end of the response -- remove them
-        # To be careful: ``` may be part of the content as well when code is generated
-        response = text_helper.get_clean_json(response)
-        logger.info(
-            'Cleaned JSON length: %d', len(response)
-        )
 
-
-            GlobalConfig.LLM_PROGRESS_MAX,
-            text='Finding photos online and generating the slide deck...'
-        )
-        progress_bar.progress(1.0, text='Done!')
-        st.chat_message('ai').code(response, language='json')
 
-
 
-
-            len(st.session_state[CHAT_MESSAGES]) / 2
-        )
 
 
-def
 """
-
-    deck, the path may be to an empty file.
 
-    :param
-    :
 """
 
-
-        parsed_data = json5.loads(json_str)
-    except ValueError:
-        handle_error(
-            'Encountered error while parsing JSON...will fix it and retry',
-            True
-        )
-        try:
-            parsed_data = json5.loads(text_helper.fix_malformed_json(json_str))
-        except ValueError:
-            handle_error(
-                'Encountered an error again while fixing JSON...'
-                'the slide deck cannot be created, unfortunately ☹'
-                '\nPlease try again later.',
-                True
-            )
-            return None
-    except RecursionError:
-        handle_error(
-            'Encountered a recursion error while parsing JSON...'
-            'the slide deck cannot be created, unfortunately ☹'
-            '\nPlease try again later.',
-            True
-        )
-        return None
-    except Exception:
-        handle_error(
-            'Encountered an error while parsing JSON...'
-            'the slide deck cannot be created, unfortunately ☹'
-            '\nPlease try again later.',
-            True
-        )
-        return None
-
-    if DOWNLOAD_FILE_KEY in st.session_state:
-        path = pathlib.Path(st.session_state[DOWNLOAD_FILE_KEY])
-    else:
-        temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
-        path = pathlib.Path(temp.name)
-        st.session_state[DOWNLOAD_FILE_KEY] = str(path)
-
-    if temp:
-        temp.close()
 
     try:
-
-            parsed_data,
-            slides_template=pptx_template,
-            output_file_path=path
-        )
     except Exception as ex:
-        st.error(
-
-
-def _is_it_refinement() -> bool:
-    """
-    Whether it is the initial prompt or a refinement.
-
-    :return: True if it is the initial prompt; False otherwise.
-    """
-
-    if IS_IT_REFINEMENT in st.session_state:
-        return True
 
-
-    # Prepare for the next call
-    st.session_state[IS_IT_REFINEMENT] = True
-    return True
 
-
 
 
-def _get_user_messages() -> List[str]:
-    """
-    Get a list of user messages submitted until now from the session state.
 
-
 """
 
-
-
-def _get_last_response() -> str:
     """
-    Get the last response generated by AI.
 
-
 
-
 
-
-    view_messages.json(st.session_state[CHAT_MESSAGES])
 
 
-def
 """
-
 
-    :param
 """
 
-
 
 
 def main():
 import pathlib
+import logging
 import tempfile
+from typing import List, Tuple
 
 import json5
+import metaphor_python as metaphor
 import streamlit as st
 
+import llm_helper
+import pptx_helper
 from global_config import GlobalConfig
 
 
+APP_TEXT = json5.loads(open(GlobalConfig.APP_STRINGS_FILE, 'r', encoding='utf-8').read())
+GB_CONVERTER = 2 ** 30
 
+logging.basicConfig(
+    level=GlobalConfig.LOG_LEVEL,
+    format='%(asctime)s - %(message)s',
+)
 
 
 @st.cache_data
+def get_contents_wrapper(text: str) -> str:
     """
+    Fetch and cache the slide deck contents on a topic by calling an external API.
 
+    :param text: The presentation topic
+    :return: The slide deck contents or outline in JSON format
     """
 
+    logging.info('LLM call because of cache miss...')
+    return llm_helper.generate_slides_content(text).strip()
 
 
+@st.cache_resource
+def get_metaphor_client_wrapper() -> metaphor.Metaphor:
     """
+    Create a Metaphor client for semantic Web search.
 
+    :return: Metaphor instance
     """
 
+    return metaphor.Metaphor(api_key=GlobalConfig.METAPHOR_API_KEY)
 
 
+@st.cache_data
+def get_web_search_results_wrapper(text: str) -> List[Tuple[str, str]]:
     """
+    Fetch and cache the Web search results on a given topic.
 
+    :param text: The topic
+    :return: A list of (title, link) tuples
     """
 
+    results = []
+    search_results = get_metaphor_client_wrapper().search(
+        text,
+        use_autoprompt=True,
+        num_results=5
     )
 
+    for a_result in search_results.results:
+        results.append((a_result.title, a_result.url))
+
+    return results
+
+
+# def get_disk_used_percentage() -> float:
+#     """
+#     Compute the disk usage.
+#
+#     :return: Percentage of the disk space currently used
+#     """
+#
+#     total, used, free = shutil.disk_usage(__file__)
+#     total = total // GB_CONVERTER
+#     used = used // GB_CONVERTER
+#     free = free // GB_CONVERTER
+#     used_perc = 100.0 * used / total
+#
+#     logging.debug(f'Total: {total} GB\n'
+#                   f'Used: {used} GB\n'
+#                   f'Free: {free} GB')
+#
+#     logging.debug('\n'.join(os.listdir()))
+#
+#     return used_perc
 
 
 def build_ui():
     """
+    Display the input elements for content generation. Only covers the first step.
     """
 
+    # get_disk_used_percentage()
+
     st.title(APP_TEXT['app_name'])
     st.subheader(APP_TEXT['caption'])
     st.markdown(
+        'Powered by'
+        ' [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).'
+    )
+    st.markdown(
+        '*If the JSON is generated or parsed incorrectly, try again later by making minor changes'
+        ' to the input text.*'
     )
 
+    with st.form('my_form'):
+        # Topic input
+        try:
+            with open(GlobalConfig.PRELOAD_DATA_FILE, 'r', encoding='utf-8') as in_file:
+                preload_data = json5.loads(in_file.read())
+        except (FileExistsError, FileNotFoundError):
+            preload_data = {'topic': '', 'audience': ''}
+
+        topic = st.text_area(
+            APP_TEXT['input_labels'][0],
+            value=preload_data['topic']
         )
 
+        texts = list(GlobalConfig.PPTX_TEMPLATE_FILES.keys())
+        captions = [GlobalConfig.PPTX_TEMPLATE_FILES[x]['caption'] for x in texts]
 
+        pptx_template = st.radio(
+            'Select a presentation template:',
+            texts,
+            captions=captions,
+            horizontal=True
+        )
 
+        st.divider()
+        submit = st.form_submit_button('Generate slide deck')
 
+    if submit:
+        # st.write(f'Clicked {time.time()}')
+        st.session_state.submitted = True
 
+    # https://github.com/streamlit/streamlit/issues/3832#issuecomment-1138994421
+    if 'submitted' in st.session_state:
+        progress_text = 'Generating the slides...give it a moment'
+        progress_bar = st.progress(0, text=progress_text)
 
+        topic_txt = topic.strip()
+        generate_presentation(topic_txt, pptx_template, progress_bar)
 
+    st.divider()
+    st.text(APP_TEXT['tos'])
+    st.text(APP_TEXT['tos2'])
+
+    st.markdown(
+        '![Visitors]'
+        '(https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fhuggingface.co%2Fspaces%2Fbarunsaha%2Fslide-deck-ai&countColor=%23263759)'
     )
 
 
+def generate_presentation(topic: str, pptx_template: str, progress_bar):
+    """
+    Process the inputs to generate the slides.
 
+    :param topic: The presentation topic based on which contents are to be generated
+    :param pptx_template: The PowerPoint template name to be used
+    :param progress_bar: Progress bar from the page
+    :return:
+    """
+
+    topic_length = len(topic)
+    logging.debug('Input length:: topic: %s', topic_length)
+
+    if topic_length >= 10:
+        logging.debug('Topic: %s', topic)
+        target_length = min(topic_length, GlobalConfig.LLM_MODEL_MAX_INPUT_LENGTH)
 
         try:
+            # Step 1: Generate the contents in JSON format using an LLM
+            json_str = process_slides_contents(topic[:target_length], progress_bar)
+            logging.debug('Truncated topic: %s', topic[:target_length])
+            logging.debug('Length of JSON: %d', len(json_str))
+
+            # Step 2: Generate the slide deck based on the template specified
+            if len(json_str) > 0:
+                st.info(
+                    'Tip: The generated content doesn\'t look so great?'
+                    ' Need alternatives? Just change your description text and try again.',
+                    icon="💡️"
+                )
+            else:
+                st.error(
+                    'Unfortunately, JSON generation failed, so the next steps would lead'
+                    ' to nowhere. Try again or come back later.'
                 )
                 return
 
+            all_headers = generate_slide_deck(json_str, pptx_template, progress_bar)
 
+            # Step 3: Bonus stuff: Web references and AI art
+            show_bonus_stuff(all_headers)
 
+        except ValueError as ve:
+            st.error(f'Unfortunately, an error occurred: {ve}! '
+                     f'Please change the text, try again later, or report it, sharing your inputs.')
 
+    else:
+        st.error('Not enough information provided! Please be little more descriptive :)')
 
 
+def process_slides_contents(text: str, progress_bar: st.progress) -> str:
     """
+    Convert given text into structured data and display. Update the UI.
 
+    :param text: The topic description for the presentation
+    :param progress_bar: Progress bar for this step
+    :return: The contents as a JSON-formatted string
     """
 
+    json_str = ''
 
     try:
+        logging.info('Calling LLM for content generation on the topic: %s', text)
+        json_str = get_contents_wrapper(text)
     except Exception as ex:
+        st.error(
+            f'An exception occurred while trying to convert to JSON. It could be because of heavy'
+            f' traffic or something else. Try doing it again or try again later.'
+            f'\nError message: {ex}'
+        )
 
+    progress_bar.progress(50, text='Contents generated')
 
+    with st.expander('The generated contents (in JSON format)'):
+        st.code(json_str, language='json')
 
+    return json_str
 
 
+def generate_slide_deck(json_str: str, pptx_template: str, progress_bar) -> List:
     """
+    Create a slide deck.
 
+    :param json_str: The contents in JSON format
+    :param pptx_template: The PPTX template name
+    :param progress_bar: Progress bar
+    :return: A list of all slide headers and the title
     """
 
+    progress_text = 'Creating the slide deck...give it a moment'
+    progress_bar.progress(75, text=progress_text)
 
+    # # Get a unique name for the file to save -- use the session ID
+    # ctx = st_sr.get_script_run_ctx()
+    # session_id = ctx.session_id
+    # timestamp = time.time()
+    # output_file_name = f'{session_id}_{timestamp}.pptx'
 
+    temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
+    path = pathlib.Path(temp.name)
 
+    logging.info('Creating PPTX file...')
+    all_headers = pptx_helper.generate_powerpoint_presentation(
+        json_str,
+        as_yaml=False,
+        slides_template=pptx_template,
+        output_file_path=path
+    )
+    progress_bar.progress(100, text='Done!')
 
+    with open(path, 'rb') as f:
+        st.download_button('Download PPTX file', f, file_name='Presentation.pptx')
 
+    return all_headers
 
 
+def show_bonus_stuff(ppt_headers: List[str]):
     """
+    Show bonus stuff for the presentation.
 
+    :param ppt_headers: A list of the slide headings.
     """
 
+    # Use the presentation title and the slide headers to find relevant info online
+    logging.info('Calling Metaphor search...')
+    ppt_text = ' '.join(ppt_headers)
+    search_results = get_web_search_results_wrapper(ppt_text)
+    md_text_items = []
+
+    for (title, link) in search_results:
+        md_text_items.append(f'[{title}]({link})')
+
+    with st.expander('Related Web references'):
+        st.markdown('\n\n'.join(md_text_items))
+
+    logging.info('Done!')
+
+    # # Avoid image generation. It costs time and an API call, so just limit to the text generation.
+    # with st.expander('AI-generated image on the presentation topic'):
+
# logging.info('Calling SDXL for image generation...')
|
299 |
+
# # img_empty.write('')
|
300 |
+
# # img_text.write(APP_TEXT['image_info'])
|
301 |
+
# image = get_ai_image_wrapper(ppt_text)
|
302 |
+
#
|
303 |
+
# if len(image) > 0:
|
304 |
+
# image = base64.b64decode(image)
|
305 |
+
# st.image(image, caption=ppt_text)
|
306 |
+
# st.info('Tip: Right-click on the image to save it.', icon="💡️")
|
307 |
+
# logging.info('Image added')
|
308 |
|
309 |
|
310 |
def main():
|
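A minimal, standalone sketch of the temporary-file pattern that `generate_slide_deck` uses above: `delete=False` keeps the file on disk after the write handle is closed, so it can later be reopened by its path (here, the stand-in for what `st.download_button` does), at the cost of manual cleanup. The dummy bytes are illustrative only, not real PPTX content.

```python
import pathlib
import tempfile

# delete=False: the file survives closing the handle, so it can be
# reopened later by path, as generate_slide_deck does for the download.
temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
path = pathlib.Path(temp.name)

temp.write(b'dummy pptx bytes')  # stand-in for the real PPTX content
temp.close()

# Reopen by path, mimicking the download step
with open(path, 'rb') as f:
    data = f.read()

print(data == b'dummy pptx bytes')  # True

path.unlink()  # with delete=False, removal is the caller's responsibility
```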
clarifai_grpc_helper.py
ADDED
@@ -0,0 +1,71 @@
from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

from global_config import GlobalConfig


CHANNEL = ClarifaiChannel.get_grpc_channel()
STUB = service_pb2_grpc.V2Stub(CHANNEL)

METADATA = (
    ('authorization', 'Key ' + GlobalConfig.CLARIFAI_PAT),
)

USER_DATA_OBJECT = resources_pb2.UserAppIDSet(
    user_id=GlobalConfig.CLARIFAI_USER_ID,
    app_id=GlobalConfig.CLARIFAI_APP_ID
)

RAW_TEXT = '''You are a helpful, intelligent chatbot. Create the slides for a presentation on the given topic. Include main headings for each slide, detailed bullet points for each slide. Add relevant content to each slide. Do not output any blank line.

Topic:
Talk about AI, covering what it is and how it works. Add its pros, cons, and future prospects. Also, cover its job prospects.
'''


def get_text_from_llm(prompt: str) -> str:
    post_model_outputs_response = STUB.PostModelOutputs(
        service_pb2.PostModelOutputsRequest(
            user_app_id=USER_DATA_OBJECT,  # Required when using a PAT
            model_id=GlobalConfig.CLARIFAI_MODEL_ID,
            # version_id=MODEL_VERSION_ID,  # Optional; defaults to the latest model version
            inputs=[
                resources_pb2.Input(
                    data=resources_pb2.Data(
                        text=resources_pb2.Text(
                            raw=prompt
                        )
                    )
                )
            ]
        ),
        metadata=METADATA
    )

    if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
        print(post_model_outputs_response.status)
        raise Exception(
            f'Post model outputs failed, status: {post_model_outputs_response.status.description}'
        )

    # Since we have one input, one output will exist here
    output = post_model_outputs_response.outputs[0]

    # print("Completion:\n")
    # print(output.data.text.raw)

    return output.data.text.raw


if __name__ == '__main__':
    topic = ('Talk about AI, covering what it is and how it works.'
             ' Add its pros, cons, and future prospects.'
             ' Also, cover its job prospects.'
             )
    print(topic)

    with open(GlobalConfig.SLIDES_TEMPLATE_FILE, 'r') as in_file:
        prompt_txt = in_file.read()
    prompt_txt = prompt_txt.replace('{topic}', topic)
    response_txt = get_text_from_llm(prompt_txt)

    print('Output:\n', response_txt)
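The `__main__` block above fills a `{topic}` placeholder in a prompt template read from disk before calling the LLM. A dependency-free sketch of just that substitution step follows; the template string here is a hypothetical stand-in for the contents of `GlobalConfig.SLIDES_TEMPLATE_FILE`.

```python
# Hypothetical stand-in for the on-disk prompt template.
TEMPLATE = (
    'You are a helpful, intelligent chatbot.'
    ' Create the slides for a presentation on the given topic.\n\n'
    'Topic:\n{topic}\n'
)


def build_prompt(template: str, topic: str) -> str:
    # Same mechanism as the helper above: a plain str.replace on the marker.
    return template.replace('{topic}', topic)


prompt = build_prompt(TEMPLATE, 'Talk about AI, covering what it is and how it works.')
print('{topic}' in prompt)  # False: the marker has been replaced
```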
examples/example_04.json
DELETED
@@ -1,3 +0,0 @@
{
    "topic": "12 slides on a basic tutorial on Python along with examples"
}
file_embeddings/embeddings.npy
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:64a1ba79b20c81ba7ed6604468736f74ae89813fe378191af1d8574c008b3ab5
size 326784
file_embeddings/icons.npy
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ce5ce4c86bb213915606921084b3516464154edcae12f4bc708d62c6bd7acebb
size 51168
global_config.py
CHANGED
@@ -1,7 +1,3 @@
-"""
-A set of configurations used by the app.
-"""
-import logging
 import os
 
 from dataclasses import dataclass
@@ -13,135 +9,32 @@ load_dotenv()
 
 @dataclass(frozen=True)
 class GlobalConfig:
-
-
-
-
-
-    PROVIDER_GOOGLE_GEMINI = 'gg'
-    PROVIDER_HUGGING_FACE = 'hf'
-    PROVIDER_OLLAMA = 'ol'
-    VALID_PROVIDERS = {
-        PROVIDER_COHERE,
-        PROVIDER_GOOGLE_GEMINI,
-        PROVIDER_HUGGING_FACE,
-        PROVIDER_OLLAMA
-    }
-    VALID_MODELS = {
-        '[co]command-r-08-2024': {
-            'description': 'simpler, slower',
-            'max_new_tokens': 4096,
-            'paid': True,
-        },
-        '[gg]gemini-1.5-flash-002': {
-            'description': 'faster response',
-            'max_new_tokens': 8192,
-            'paid': True,
-        },
-        '[hf]mistralai/Mistral-7B-Instruct-v0.2': {
-            'description': 'faster, shorter',
-            'max_new_tokens': 8192,
-            'paid': False,
-        },
-        '[hf]mistralai/Mistral-Nemo-Instruct-2407': {
-            'description': 'longer response',
-            'max_new_tokens': 10240,
-            'paid': False,
-        },
-    }
-    LLM_PROVIDER_HELP = (
-        'LLM provider codes:\n\n'
-        '- **[co]**: Cohere\n'
-        '- **[gg]**: Google Gemini API\n'
-        '- **[hf]**: Hugging Face Inference API\n'
-    )
-    DEFAULT_MODEL_INDEX = 2
-    LLM_MODEL_TEMPERATURE = 0.2
-    LLM_MODEL_MIN_OUTPUT_LENGTH = 100
-    LLM_MODEL_MAX_INPUT_LENGTH = 400  # characters
+    HF_LLM_MODEL_NAME = 'mistralai/Mistral-7B-Instruct-v0.2'
+    LLM_MODEL_TEMPERATURE: float = 0.2
+    LLM_MODEL_MIN_OUTPUT_LENGTH: int = 50
+    LLM_MODEL_MAX_OUTPUT_LENGTH: int = 2000
+    LLM_MODEL_MAX_INPUT_LENGTH: int = 300
 
     HUGGINGFACEHUB_API_TOKEN = os.environ.get('HUGGINGFACEHUB_API_TOKEN', '')
+    METAPHOR_API_KEY = os.environ.get('METAPHOR_API_KEY', '')
 
     LOG_LEVEL = 'DEBUG'
-    COUNT_TOKENS = False
     APP_STRINGS_FILE = 'strings.json'
     PRELOAD_DATA_FILE = 'examples/example_02.json'
     SLIDES_TEMPLATE_FILE = 'langchain_templates/template_combined.txt'
-
-    REFINEMENT_PROMPT_TEMPLATE = 'langchain_templates/chat_prompts/refinement_template_v4_two_cols_img.txt'
-
-    LLM_PROGRESS_MAX = 90
-    ICONS_DIR = 'icons/png128/'
-    TINY_BERT_MODEL = 'gaunernst/bert-mini-uncased'
-    EMBEDDINGS_FILE_NAME = 'file_embeddings/embeddings.npy'
-    ICONS_FILE_NAME = 'file_embeddings/icons.npy'
+    JSON_TEMPLATE_FILE = 'langchain_templates/text_to_json_template_02.txt'
 
     PPTX_TEMPLATE_FILES = {
-        '
+        'Blank': {
             'file': 'pptx_templates/Blank.pptx',
-            'caption': 'A good start
+            'caption': 'A good start'
-        },
-        'Minimalist Sales Pitch': {
-            'file': 'pptx_templates/Minimalist_sales_pitch.pptx',
-            'caption': 'In high contrast ⬛'
         },
         'Ion Boardroom': {
             'file': 'pptx_templates/Ion_Boardroom.pptx',
-            'caption': 'Make some bold decisions
+            'caption': 'Make some bold decisions'
         },
         'Urban Monochrome': {
             'file': 'pptx_templates/Urban_monochrome.pptx',
-            'caption': 'Marvel in a monochrome dream
-    }
+            'caption': 'Marvel in a monochrome dream'
+        }
     }
-
-    # This is a long text, so not incorporated as a string in `strings.json`
-    CHAT_USAGE_INSTRUCTIONS = (
-        'Briefly describe your topic of presentation in the textbox provided below. For example:\n'
-        '- Make a slide deck on AI.'
-        '\n\n'
-        'Subsequently, you can add follow-up instructions, e.g.:\n'
-        '- Can you add a slide on GPUs?'
-        '\n\n'
-        ' You can also ask it to refine any particular slide, e.g.:\n'
-        '- Make the slide with title \'Examples of AI\' a bit more descriptive.'
-        '\n\n'
-        'Finally, click on the download button at the bottom to download the slide deck.'
-        ' See this [demo video](https://youtu.be/QvAKzNKtk9k) for a brief walkthrough.\n\n'
-        'Currently, three LLMs providers and four LLMs are supported:'
-        ' **Mistral 7B Instruct v0.2** and **Mistral Nemo Instruct 2407** via Hugging Face'
-        ' Inference Endpoint; **Gemini 1.5 Flash** via Gemini API; and **Command R+** via Cohere'
-        ' API. If one is not available, choose the other from the dropdown list. A [summary of'
-        ' the supported LLMs]('
-        'https://github.com/barun-saha/slide-deck-ai/blob/main/README.md#summary-of-the-llms)'
-        ' is available for reference.\n\n'
-        ' SlideDeck AI does not have access to the Web, apart for searching for images relevant'
-        ' to the slides. Photos are added probabilistically; transparency needs to be changed'
-        ' manually, if required.\n\n'
-        '[SlideDeck AI](https://github.com/barun-saha/slide-deck-ai) is an Open-Source project,'
-        ' released under the'
-        ' [MIT license](https://github.com/barun-saha/slide-deck-ai?tab=MIT-1-ov-file#readme).'
-        '\n\n---\n\n'
-        '© Copyright 2023-2024 Barun Saha.\n\n'
-    )
-
-
-logging.basicConfig(
-    level=GlobalConfig.LOG_LEVEL,
-    format='%(asctime)s - %(levelname)s - %(name)s - %(message)s',
-    datefmt='%Y-%m-%d %H:%M:%S'
-)
-
-
-def get_max_output_tokens(llm_name: str) -> int:
-    """
-    Get the max output tokens value configured for an LLM. Return a default value if not configured.
-
-    :param llm_name: The name of the LLM.
-    :return: Max output tokens or a default count.
-    """
-
-    try:
-        return GlobalConfig.VALID_MODELS[llm_name]['max_new_tokens']
-    except KeyError:
-        return 2048
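Both versions of `GlobalConfig` in the diff above use the frozen-dataclass pattern: constants as class attributes, environment-variable fallbacks via `os.environ.get`, and `frozen=True` to make instances immutable. A small illustrative mirror of that pattern (`DemoConfig` is a stand-in, not the real class):

```python
import os
from dataclasses import dataclass, FrozenInstanceError


# Illustrative mirror of the GlobalConfig pattern shown in the diff:
# a frozen dataclass holding constants, with an env-var fallback.
@dataclass(frozen=True)
class DemoConfig:
    LLM_MODEL_TEMPERATURE: float = 0.2
    HUGGINGFACEHUB_API_TOKEN: str = os.environ.get('HUGGINGFACEHUB_API_TOKEN', '')


cfg = DemoConfig()
print(cfg.LLM_MODEL_TEMPERATURE)  # 0.2

try:
    cfg.LLM_MODEL_TEMPERATURE = 0.9  # frozen=True forbids mutation
except FrozenInstanceError:
    print('immutable')  # prints: immutable
```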
helpers/__init__.py
DELETED
File without changes
|
helpers/icons_embeddings.py
DELETED
@@ -1,166 +0,0 @@
"""
Generate and save the embeddings of a pre-defined list of icons.
Compare them with keywords embeddings to find most relevant icons.
"""
import os
import pathlib
import sys
from typing import List, Tuple

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from transformers import BertTokenizer, BertModel

sys.path.append('..')
sys.path.append('../..')

from global_config import GlobalConfig


tokenizer = BertTokenizer.from_pretrained(GlobalConfig.TINY_BERT_MODEL)
model = BertModel.from_pretrained(GlobalConfig.TINY_BERT_MODEL)


def get_icons_list() -> List[str]:
    """
    Get a list of available icons.

    :return: The icons file names.
    """

    items = pathlib.Path('../' + GlobalConfig.ICONS_DIR).glob('*.png')
    items = [
        os.path.basename(str(item)).removesuffix('.png') for item in items
    ]

    return items


def get_embeddings(texts) -> np.ndarray:
    """
    Generate embeddings for a list of texts using a pre-trained language model.

    :param texts: A string or a list of strings to be converted into embeddings.
    :type texts: Union[str, List[str]]
    :return: A NumPy array containing the embeddings for the input texts.
    :rtype: numpy.ndarray

    :raises ValueError: If the input is not a string or a list of strings, or if any element
    in the list is not a string.

    Example usage:
    >>> keyword = 'neural network'
    >>> file_names = ['neural_network_icon.png', 'data_analysis_icon.png', 'machine_learning.png']
    >>> keyword_embeddings = get_embeddings(keyword)
    >>> file_name_embeddings = get_embeddings(file_names)
    """

    inputs = tokenizer(texts, return_tensors='pt', padding=True, max_length=128, truncation=True)
    outputs = model(**inputs)

    return outputs.last_hidden_state.mean(dim=1).detach().numpy()


def save_icons_embeddings():
    """
    Generate and save the embeddings for the icon file names.
    """

    file_names = get_icons_list()
    print(f'{len(file_names)} icon files available...')
    file_name_embeddings = get_embeddings(file_names)
    print(f'file_name_embeddings.shape: {file_name_embeddings.shape}')

    # Save embeddings to a file
    np.save(GlobalConfig.EMBEDDINGS_FILE_NAME, file_name_embeddings)
    np.save(GlobalConfig.ICONS_FILE_NAME, file_names)  # Save file names for reference


def load_saved_embeddings() -> Tuple[np.ndarray, np.ndarray]:
    """
    Load precomputed embeddings and icons file names.

    :return: The embeddings and the icon file names.
    """

    file_name_embeddings = np.load(GlobalConfig.EMBEDDINGS_FILE_NAME)
    file_names = np.load(GlobalConfig.ICONS_FILE_NAME)

    return file_name_embeddings, file_names


def find_icons(keywords: List[str]) -> List[str]:
    """
    Find relevant icon file names for a list of keywords.

    :param keywords: The list of one or more keywords.
    :return: A list of the file names relevant for each keyword.
    """

    keyword_embeddings = get_embeddings(keywords)
    file_name_embeddings, file_names = load_saved_embeddings()

    # Compute similarity
    similarities = cosine_similarity(keyword_embeddings, file_name_embeddings)
    icon_files = file_names[np.argmax(similarities, axis=-1)]

    return icon_files


def main():
    """
    Example usage.
    """

    # Run this again if icons are to be added/removed
    save_icons_embeddings()

    keywords = [
        'deep learning',
        '',
        'recycling',
        'handshake',
        'Ferry',
        'rain drop',
        'speech bubble',
        'mental resilience',
        'turmeric',
        'Art',
        'price tag',
        'Oxygen',
        'oxygen',
        'Social Connection',
        'Accomplishment',
        'Python',
        'XML',
        'Handshake',
    ]
    icon_files = find_icons(keywords)
    print(
        f'The relevant icon files are:\n'
        f'{list(zip(keywords, icon_files))}'
    )

    # BERT tiny:
    # [('deep learning', 'deep-learning'), ('', '123'), ('recycling', 'refinery'),
    # ('handshake', 'dash-circle'), ('Ferry', 'cart'), ('rain drop', 'bucket'),
    # ('speech bubble', 'globe'), ('mental resilience', 'exclamation-triangle'),
    # ('turmeric', 'kebab'), ('Art', 'display'), ('price tag', 'bug-fill'),
    # ('Oxygen', 'radioactive')]

    # BERT mini
    # [('deep learning', 'deep-learning'), ('', 'compass'), ('recycling', 'tools'),
    # ('handshake', 'bandaid'), ('Ferry', 'cart'), ('rain drop', 'trash'),
    # ('speech bubble', 'image'), ('mental resilience', 'recycle'), ('turmeric', 'linkedin'),
    # ('Art', 'book'), ('price tag', 'card-image'), ('Oxygen', 'radioactive')]

    # BERT small
    # [('deep learning', 'deep-learning'), ('', 'gem'), ('recycling', 'tools'),
    # ('handshake', 'handbag'), ('Ferry', 'truck'), ('rain drop', 'bucket'),
    # ('speech bubble', 'strategy'), ('mental resilience', 'deep-learning'),
    # ('turmeric', 'flower'),
    # ('Art', 'book'), ('price tag', 'hotdog'), ('Oxygen', 'radioactive')]


if __name__ == '__main__':
    main()
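The deleted `find_icons` helper above matched keyword embeddings against icon-name embeddings with cosine similarity and picked the argmax. A dependency-free sketch of that matching step follows; the 3-d vectors are toy values standing in for real BERT embeddings, and the icon names are borrowed from the file's own example output.

```python
import math


def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|); 0.0 for a zero vector
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 3-d "embeddings" standing in for the saved BERT vectors.
icon_embeddings = {
    'deep-learning': [0.9, 0.1, 0.0],
    'cart': [0.0, 0.8, 0.2],
    'radioactive': [0.1, 0.0, 0.9],
}


def find_icon(keyword_vec):
    # Pick the icon whose embedding is most similar to the keyword's,
    # i.e., the argmax over cosine similarities, as find_icons() did.
    return max(
        icon_embeddings,
        key=lambda name: cosine_similarity(keyword_vec, icon_embeddings[name])
    )


print(find_icon([0.85, 0.2, 0.05]))  # deep-learning
```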
helpers/image_search.py
DELETED
@@ -1,148 +0,0 @@
"""
Search photos using Pexels API.
"""
import logging
import os
import random
from io import BytesIO
from typing import Union, Tuple, Literal
from urllib.parse import urlparse, parse_qs

import requests
from dotenv import load_dotenv


load_dotenv()


REQUEST_TIMEOUT = 12
MAX_PHOTOS = 3


# Only show errors
logging.getLogger('urllib3').setLevel(logging.ERROR)
# Disable all child loggers of urllib3, e.g. urllib3.connectionpool
# logging.getLogger('urllib3').propagate = True


def search_pexels(
        query: str,
        size: Literal['small', 'medium', 'large'] = 'medium',
        per_page: int = MAX_PHOTOS
) -> dict:
    """
    Searches for images on Pexels using the provided query.

    This function sends a GET request to the Pexels API with the specified search query
    and authorization header containing the API key. It returns the JSON response from the API.

    [2024-08-31] Note:
    `curl` succeeds but API calls via Python `requests` fail. Apparently, this could be due to
    Cloudflare (or others) blocking the requests, perhaps identifying them as Web-scraping. So,
    changing the user-agent to Firefox.
    https://stackoverflow.com/a/74674276/147021
    https://stackoverflow.com/a/51268523/147021
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent/Firefox#linux

    :param query: The search query for finding images.
    :param size: The size of the images: small, medium, or large.
    :param per_page: No. of results to be displayed per page.
    :return: The JSON response from the Pexels API containing search results.
    :raises requests.exceptions.RequestException: If the request to the Pexels API fails.
    """

    url = 'https://api.pexels.com/v1/search'
    headers = {
        'Authorization': os.getenv('PEXEL_API_KEY'),
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0',
    }
    params = {
        'query': query,
        'size': size,
        'page': 1,
        'per_page': per_page
    }
    response = requests.get(url, headers=headers, params=params, timeout=REQUEST_TIMEOUT)
    response.raise_for_status()  # Ensure the request was successful

    return response.json()


def get_photo_url_from_api_response(
        json_response: dict
) -> Tuple[Union[str, None], Union[str, None]]:
    """
    Return a randomly chosen photo from a Pexels search API response. In addition, also return
    the original URL of the page on Pexels.

    :param json_response: The JSON response.
    :return: The selected photo URL and page URL or `None`.
    """

    page_url = None
    photo_url = None

    if 'photos' in json_response:
        photos = json_response['photos']

        if photos:
            photo_idx = random.choice(list(range(MAX_PHOTOS)))
            photo = photos[photo_idx]

            if 'url' in photo:
                page_url = photo['url']

            if 'src' in photo:
                if 'large' in photo['src']:
                    photo_url = photo['src']['large']
                elif 'original' in photo['src']:
                    photo_url = photo['src']['original']

    return photo_url, page_url


def get_image_from_url(url: str) -> BytesIO:
    """
    Fetches an image from the specified URL and returns it as a BytesIO object.

    This function sends a GET request to the provided URL, retrieves the image data,
    and wraps it in a BytesIO object, which can be used like a file.

    :param url: The URL of the image to be fetched.
    :return: A BytesIO object containing the image data.
    :raises requests.exceptions.RequestException: If the request to the URL fails.
    """

    headers = {
        'Authorization': os.getenv('PEXEL_API_KEY'),
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0',
    }
    response = requests.get(url, headers=headers, stream=True, timeout=REQUEST_TIMEOUT)
    response.raise_for_status()
    image_data = BytesIO(response.content)

    return image_data


def extract_dimensions(url: str) -> Tuple[int, int]:
    """
    Extracts the height and width from the URL parameters.

    :param url: The URL containing the image dimensions.
    :return: A tuple containing the width and height as integers.
    """

    parsed_url = urlparse(url)
    query_params = parse_qs(parsed_url.query)
    width = int(query_params.get('w', [0])[0])
    height = int(query_params.get('h', [0])[0])

    return width, height


if __name__ == '__main__':
    print(
        search_pexels(
            query='people'
        )
    )
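The deleted `extract_dimensions` helper above is small and dependency-free, so it can be tried in isolation; here it is as a self-contained snippet, with hypothetical example URLs showing the `w`/`h` query-parameter parsing and the `(0, 0)` fallback when the parameters are absent.

```python
from typing import Tuple
from urllib.parse import urlparse, parse_qs


def extract_dimensions(url: str) -> Tuple[int, int]:
    # Parse ?w=...&h=... query parameters; parse_qs returns lists,
    # and the [0] default yields 0 when a parameter is missing.
    parsed_url = urlparse(url)
    query_params = parse_qs(parsed_url.query)
    width = int(query_params.get('w', [0])[0])
    height = int(query_params.get('h', [0])[0])

    return width, height


print(extract_dimensions('https://images.pexels.com/photo.jpg?w=1260&h=750'))  # (1260, 750)
print(extract_dimensions('https://images.pexels.com/photo.jpg'))  # (0, 0)
```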
helpers/llm_helper.py
DELETED
@@ -1,187 +0,0 @@
|
|
1 |
-
"""
|
2 |
-
Helper functions to access LLMs.
|
3 |
-
"""
|
4 |
-
import logging
|
5 |
-
import re
|
6 |
-
import sys
|
7 |
-
from typing import Tuple, Union
|
8 |
-
|
9 |
-
import requests
|
10 |
-
from requests.adapters import HTTPAdapter
|
11 |
-
from urllib3.util import Retry
|
12 |
-
from langchain_core.language_models import BaseLLM
|
13 |
-
|
14 |
-
sys.path.append('..')
|
15 |
-
|
16 |
-
from global_config import GlobalConfig
|
17 |
-
|
18 |
-
|
19 |
-
LLM_PROVIDER_MODEL_REGEX = re.compile(r'\[(.*?)\](.*)')
|
20 |
-
OLLAMA_MODEL_REGEX = re.compile(r'[a-zA-Z0-9._:-]+$')
|
21 |
-
# 6-64 characters long, only containing alphanumeric characters, hyphens, and underscores
|
22 |
-
API_KEY_REGEX = re.compile(r'^[a-zA-Z0-9_-]{6,64}$')
|
23 |
-
HF_API_HEADERS = {'Authorization': f'Bearer {GlobalConfig.HUGGINGFACEHUB_API_TOKEN}'}
|
24 |
-
REQUEST_TIMEOUT = 35
|
25 |
-
|
26 |
-
logger = logging.getLogger(__name__)
|
27 |
-
logging.getLogger('httpx').setLevel(logging.WARNING)
|
28 |
-
logging.getLogger('httpcore').setLevel(logging.WARNING)
|
29 |
-
|
30 |
-
retries = Retry(
|
31 |
-
total=5,
|
32 |
-
backoff_factor=0.25,
|
33 |
-
backoff_jitter=0.3,
|
34 |
-
status_forcelist=[502, 503, 504],
|
35 |
-
allowed_methods={'POST'},
|
36 |
-
)
|
37 |
-
adapter = HTTPAdapter(max_retries=retries)
|
38 |
-
http_session = requests.Session()
|
39 |
-
http_session.mount('https://', adapter)
|
40 |
-
http_session.mount('http://', adapter)
|
41 |
-
|
42 |
-
|
43 |
-
def get_provider_model(provider_model: str, use_ollama: bool) -> Tuple[str, str]:
|
44 |
-
"""
|
45 |
-
Parse and get LLM provider and model name from strings like `[provider]model/name-version`.
|
46 |
-
|
47 |
-
:param provider_model: The provider, model name string from `GlobalConfig`.
|
48 |
-
:param use_ollama: Whether Ollama is used (i.e., running in offline mode).
|
49 |
-
:return: The provider and the model name; empty strings in case no matching pattern found.
|
50 |
-
"""
|
51 |
-
|
52 |
-
provider_model = provider_model.strip()
|
53 |
-
|
54 |
-
if use_ollama:
|
55 |
-
match = OLLAMA_MODEL_REGEX.match(provider_model)
|
56 |
-
if match:
|
57 |
-
return GlobalConfig.PROVIDER_OLLAMA, match.group(0)
|
58 |
-
else:
|
59 |
-
match = LLM_PROVIDER_MODEL_REGEX.match(provider_model)
|
60 |
-
|
61 |
-
if match:
|
62 |
-
inside_brackets = match.group(1)
|
63 |
-
outside_brackets = match.group(2)
|
64 |
-
return inside_brackets, outside_brackets
|
65 |
-
|
66 |
-
return '', ''
|
67 |
-
|
68 |
-
|
69 |
-
def is_valid_llm_provider_model(provider: str, model: str, api_key: str) -> bool:
|
70 |
-
"""
|
71 |
-
Verify whether LLM settings are proper.
|
72 |
-
This function does not verify whether `api_key` is correct. It only confirms that the key has
|
73 |
-
at least five characters. Key verification is done when the LLM is created.
|
74 |
-
|
75 |
-
:param provider: Name of the LLM provider.
|
76 |
-
:param model: Name of the model.
|
77 |
-
:param api_key: The API key or access token.
|
78 |
-
:return: `True` if the settings "look" OK; `False` otherwise.
|
79 |
-
"""
|
80 |
-
|
81 |
-
if not provider or not model or provider not in GlobalConfig.VALID_PROVIDERS:
|
82 |
-
return False
|
83 |
-
|
84 |
-
if provider in [
|
85 |
-
GlobalConfig.PROVIDER_GOOGLE_GEMINI,
|
86 |
-
GlobalConfig.PROVIDER_COHERE,
|
87 |
-
] and not api_key:
|
88 |
-
return False
|
89 |
-
|
90 |
-
if api_key:
|
91 |
-
return API_KEY_REGEX.match(api_key) is not None
|
92 |
-
|
93 |
-
return True
|
94 |
-
|
95 |
-
|
96 |
-
def get_langchain_llm(
|
97 |
-
provider: str,
|
98 |
-
model: str,
|
99 |
-
max_new_tokens: int,
|
100 |
-
api_key: str = ''
|
101 |
-
) -> Union[BaseLLM, None]:
|
102 |
-
"""
|
103 |
-
-    Get an LLM based on the provider and model specified.
-
-    :param provider: The LLM provider. Valid values are `hf` for Hugging Face.
-    :param model: The name of the LLM.
-    :param max_new_tokens: The maximum number of tokens to generate.
-    :param api_key: API key or access token to use.
-    :return: An instance of the LLM or `None` in case of any error.
-    """
-
-    if provider == GlobalConfig.PROVIDER_HUGGING_FACE:
-        from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
-
-        logger.debug('Getting LLM via HF endpoint: %s', model)
-        return HuggingFaceEndpoint(
-            repo_id=model,
-            max_new_tokens=max_new_tokens,
-            top_k=40,
-            top_p=0.95,
-            temperature=GlobalConfig.LLM_MODEL_TEMPERATURE,
-            repetition_penalty=1.03,
-            streaming=True,
-            huggingfacehub_api_token=api_key or GlobalConfig.HUGGINGFACEHUB_API_TOKEN,
-            return_full_text=False,
-            stop_sequences=['</s>'],
-        )
-
-    if provider == GlobalConfig.PROVIDER_GOOGLE_GEMINI:
-        from google.generativeai.types.safety_types import HarmBlockThreshold, HarmCategory
-        from langchain_google_genai import GoogleGenerativeAI
-
-        logger.debug('Getting LLM via Google Gemini: %s', model)
-        return GoogleGenerativeAI(
-            model=model,
-            temperature=GlobalConfig.LLM_MODEL_TEMPERATURE,
-            max_tokens=max_new_tokens,
-            timeout=None,
-            max_retries=2,
-            google_api_key=api_key,
-            safety_settings={
-                HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT:
-                    HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
-                HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
-                HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
-                HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT:
-                    HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
-            }
-        )
-
-    if provider == GlobalConfig.PROVIDER_COHERE:
-        from langchain_cohere.llms import Cohere
-
-        logger.debug('Getting LLM via Cohere: %s', model)
-        return Cohere(
-            temperature=GlobalConfig.LLM_MODEL_TEMPERATURE,
-            max_tokens=max_new_tokens,
-            timeout_seconds=None,
-            max_retries=2,
-            cohere_api_key=api_key,
-            streaming=True,
-        )
-
-    if provider == GlobalConfig.PROVIDER_OLLAMA:
-        from langchain_ollama.llms import OllamaLLM
-
-        logger.debug('Getting LLM via Ollama: %s', model)
-        return OllamaLLM(
-            model=model,
-            temperature=GlobalConfig.LLM_MODEL_TEMPERATURE,
-            num_predict=max_new_tokens,
-            format='json',
-            streaming=True,
-        )
-
-    return None
-
-
-if __name__ == '__main__':
-    inputs = [
-        '[co]Cohere',
-        '[hf]mistralai/Mistral-7B-Instruct-v0.2',
-        '[gg]gemini-1.5-flash-002'
-    ]
-
-    for text in inputs:
-        print(get_provider_model(text, use_ollama=False))
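For reference, the removed `get_llm()` above dispatches on a provider code and lazily imports the matching LangChain client. A minimal, dependency-free sketch of that dispatch pattern follows; the returned dicts are hypothetical stand-ins for the real LangChain client objects (`HuggingFaceEndpoint`, `GoogleGenerativeAI`, `Cohere`, `OllamaLLM`), not the actual API:

```python
# Sketch of the provider-dispatch pattern used by the removed get_llm().
# The dict return values are illustrative placeholders for real LLM clients.

PROVIDER_HUGGING_FACE = 'hf'
PROVIDER_GOOGLE_GEMINI = 'gg'
PROVIDER_COHERE = 'co'


def get_llm(provider: str, model: str, max_new_tokens: int, api_key: str = ''):
    """Return a client config for the given provider, or None if unsupported."""
    if provider == PROVIDER_HUGGING_FACE:
        # Mirrors HuggingFaceEndpoint(repo_id=model, max_new_tokens=..., ...)
        return {'provider': 'hf', 'repo_id': model, 'max_new_tokens': max_new_tokens}
    if provider == PROVIDER_GOOGLE_GEMINI:
        # Mirrors GoogleGenerativeAI(model=model, max_tokens=..., ...)
        return {'provider': 'gg', 'model': model, 'max_tokens': max_new_tokens}
    if provider == PROVIDER_COHERE:
        # Mirrors Cohere(max_tokens=..., cohere_api_key=api_key, ...)
        return {'provider': 'co', 'max_tokens': max_new_tokens}
    return None  # Unknown provider


print(get_llm('hf', 'mistralai/Mistral-7B-Instruct-v0.2', 2048))
```

Unknown provider codes fall through to `None`, which is why the original docstring documents a `None` return "in case of any error".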
helpers/pptx_helper.py
DELETED
@@ -1,987 +0,0 @@
-"""
-A set of functions to create a PowerPoint slide deck.
-"""
-import logging
-import os
-import pathlib
-import random
-import re
-import sys
-import tempfile
-from typing import List, Tuple, Optional
-
-import json5
-import pptx
-from dotenv import load_dotenv
-from pptx.enum.shapes import MSO_AUTO_SHAPE_TYPE
-from pptx.shapes.placeholder import PicturePlaceholder, SlidePlaceholder
-
-sys.path.append('..')
-sys.path.append('../..')
-
-import helpers.icons_embeddings as ice
-import helpers.image_search as ims
-from global_config import GlobalConfig
-
-
-load_dotenv()
-
-
-# English Metric Unit (used by PowerPoint) to inches
-EMU_TO_INCH_SCALING_FACTOR = 1.0 / 914400
-INCHES_3 = pptx.util.Inches(3)
-INCHES_2 = pptx.util.Inches(2)
-INCHES_1_5 = pptx.util.Inches(1.5)
-INCHES_1 = pptx.util.Inches(1)
-INCHES_0_8 = pptx.util.Inches(0.8)
-INCHES_0_9 = pptx.util.Inches(0.9)
-INCHES_0_5 = pptx.util.Inches(0.5)
-INCHES_0_4 = pptx.util.Inches(0.4)
-INCHES_0_3 = pptx.util.Inches(0.3)
-INCHES_0_2 = pptx.util.Inches(0.2)
-
-STEP_BY_STEP_PROCESS_MARKER = '>> '
-ICON_BEGINNING_MARKER = '[['
-ICON_END_MARKER = ']]'
-
-ICON_SIZE = INCHES_0_8
-ICON_BG_SIZE = INCHES_1
-
-IMAGE_DISPLAY_PROBABILITY = 1 / 3.0
-FOREGROUND_IMAGE_PROBABILITY = 0.8
-
-SLIDE_NUMBER_REGEX = re.compile(r"^slide[ ]+\d+:", re.IGNORECASE)
-ICONS_REGEX = re.compile(r"\[\[(.*?)\]\]\s*(.*)")
-
-ICON_COLORS = [
-    pptx.dml.color.RGBColor.from_string('800000'),  # Maroon
-    pptx.dml.color.RGBColor.from_string('6A5ACD'),  # SlateBlue
-    pptx.dml.color.RGBColor.from_string('556B2F'),  # DarkOliveGreen
-    pptx.dml.color.RGBColor.from_string('2F4F4F'),  # DarkSlateGray
-    pptx.dml.color.RGBColor.from_string('4682B4'),  # SteelBlue
-    pptx.dml.color.RGBColor.from_string('5F9EA0'),  # CadetBlue
-]
-
-
-logger = logging.getLogger(__name__)
-logging.getLogger('PIL.PngImagePlugin').setLevel(logging.ERROR)
-
-
-def remove_slide_number_from_heading(header: str) -> str:
-    """
-    Remove the slide number from a given slide header.
-
-    :param header: The header of a slide.
-    :return: The header without the slide number.
-    """
-
-    if SLIDE_NUMBER_REGEX.match(header):
-        idx = header.find(':')
-        header = header[idx + 1:]
-
-    return header
-
-
-def generate_powerpoint_presentation(
-        parsed_data: dict,
-        slides_template: str,
-        output_file_path: pathlib.Path
-) -> List:
-    """
-    Create and save a PowerPoint presentation file containing the content in JSON format.
-
-    :param parsed_data: The presentation content as parsed JSON data.
-    :param slides_template: The PPTX template to use.
-    :param output_file_path: The path of the PPTX file to save as.
-    :return: A list of the presentation title and slide headers.
-    """
-
-    presentation = pptx.Presentation(GlobalConfig.PPTX_TEMPLATE_FILES[slides_template]['file'])
-    slide_width_inch, slide_height_inch = _get_slide_width_height_inches(presentation)
-
-    # The title slide
-    title_slide_layout = presentation.slide_layouts[0]
-    slide = presentation.slides.add_slide(title_slide_layout)
-    title = slide.shapes.title
-    subtitle = slide.placeholders[1]
-    title.text = parsed_data['title']
-    logger.info(
-        'PPT title: %s | #slides: %d | template: %s',
-        title.text, len(parsed_data['slides']),
-        GlobalConfig.PPTX_TEMPLATE_FILES[slides_template]['file']
-    )
-    subtitle.text = 'by Myself and SlideDeck AI :)'
-    all_headers = [title.text, ]
-
-    # Add content in a loop
-    for a_slide in parsed_data['slides']:
-        try:
-            is_processing_done = _handle_icons_ideas(
-                presentation=presentation,
-                slide_json=a_slide,
-                slide_width_inch=slide_width_inch,
-                slide_height_inch=slide_height_inch
-            )
-
-            if not is_processing_done:
-                is_processing_done = _handle_double_col_layout(
-                    presentation=presentation,
-                    slide_json=a_slide,
-                    slide_width_inch=slide_width_inch,
-                    slide_height_inch=slide_height_inch
-                )
-
-            if not is_processing_done:
-                is_processing_done = _handle_step_by_step_process(
-                    presentation=presentation,
-                    slide_json=a_slide,
-                    slide_width_inch=slide_width_inch,
-                    slide_height_inch=slide_height_inch
-                )
-
-            if not is_processing_done:
-                _handle_default_display(
-                    presentation=presentation,
-                    slide_json=a_slide,
-                    slide_width_inch=slide_width_inch,
-                    slide_height_inch=slide_height_inch
-                )
-
-        except Exception:
-            # In case of any unforeseen error, try to salvage what is available
-            continue
-
-    # The thank-you slide
-    last_slide_layout = presentation.slide_layouts[0]
-    slide = presentation.slides.add_slide(last_slide_layout)
-    title = slide.shapes.title
-    title.text = 'Thank you!'
-
-    presentation.save(output_file_path)
-
-    return all_headers
-
-
-def get_flat_list_of_contents(items: list, level: int) -> List[Tuple]:
-    """
-    Flatten a (hierarchical) list of bullet points to a single list containing each item and
-    its level.
-
-    :param items: A bullet point (string or list).
-    :param level: The current level of hierarchy.
-    :return: A list of (bullet item text, hierarchical level) tuples.
-    """
-
-    flat_list = []
-
-    for item in items:
-        if isinstance(item, str):
-            flat_list.append((item, level))
-        elif isinstance(item, list):
-            flat_list = flat_list + get_flat_list_of_contents(item, level + 1)
-
-    return flat_list
-
-
-def get_slide_placeholders(
-        slide: pptx.slide.Slide,
-        layout_number: int,
-        is_debug: bool = False
-) -> List[Tuple[int, str]]:
-    """
-    Return the index and name (lower case) of all placeholders present in a slide, except
-    the title placeholder.
-
-    A placeholder in a slide is a place to add content. Each placeholder has a name and an index.
-    This index is NOT a list index, rather a set of keys used to look up a dict. So, `idx` is
-    non-contiguous. Also, the title placeholder of a slide always has index 0. User-added
-    placeholders get indices assigned starting from 10.
-
-    With user-edited or added placeholders, their index may be difficult to track. This function
-    returns the placeholders' names as well, which could be useful to distinguish between the
-    different placeholders.
-
-    :param slide: The slide.
-    :param layout_number: The layout number used by the slide.
-    :param is_debug: Whether to print debugging statements.
-    :return: A list containing placeholder (idx, name) tuples, except the title placeholder.
-    """
-
-    if is_debug:
-        print(
-            f'Slide layout #{layout_number}:'
-            f' # of placeholders: {len(slide.shapes.placeholders)} (including the title)'
-        )
-
-    placeholders = [
-        (shape.placeholder_format.idx, shape.name.lower()) for shape in slide.shapes.placeholders
-    ]
-    placeholders.pop(0)  # Remove the title placeholder
-
-    if is_debug:
-        print(placeholders)
-
-    return placeholders
-
-
-def _handle_default_display(
-        presentation: pptx.Presentation,
-        slide_json: dict,
-        slide_width_inch: float,
-        slide_height_inch: float
-):
-    """
-    Display a list of text in a slide.
-
-    :param presentation: The presentation object.
-    :param slide_json: The content of the slide as JSON data.
-    :param slide_width_inch: The width of the slide in inches.
-    :param slide_height_inch: The height of the slide in inches.
-    """
-
-    status = False
-
-    if 'img_keywords' in slide_json:
-        if random.random() < IMAGE_DISPLAY_PROBABILITY:
-            if random.random() < FOREGROUND_IMAGE_PROBABILITY:
-                status = _handle_display_image__in_foreground(
-                    presentation,
-                    slide_json,
-                    slide_width_inch,
-                    slide_height_inch
-                )
-            else:
-                status = _handle_display_image__in_background(
-                    presentation,
-                    slide_json,
-                    slide_width_inch,
-                    slide_height_inch
-                )
-
-    if status:
-        return
-
-    # Image display failed, so display only text
-    bullet_slide_layout = presentation.slide_layouts[1]
-    slide = presentation.slides.add_slide(bullet_slide_layout)
-
-    shapes = slide.shapes
-    title_shape = shapes.title
-
-    try:
-        body_shape = shapes.placeholders[1]
-    except KeyError:
-        placeholders = get_slide_placeholders(slide, layout_number=1)
-        body_shape = shapes.placeholders[placeholders[0][0]]
-
-    title_shape.text = remove_slide_number_from_heading(slide_json['heading'])
-    text_frame = body_shape.text_frame
-
-    # The bullet_points may contain a nested hierarchy of JSON arrays
-    # In some scenarios, it may contain objects (dictionaries) because the LLM generated so
-    # ^ The second scenario is not covered
-
-    flat_items_list = get_flat_list_of_contents(slide_json['bullet_points'], level=0)
-
-    for idx, an_item in enumerate(flat_items_list):
-        if idx == 0:
-            text_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-        else:
-            paragraph = text_frame.add_paragraph()
-            paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-            paragraph.level = an_item[1]
-
-    _handle_key_message(
-        the_slide=slide,
-        slide_json=slide_json,
-        slide_height_inch=slide_height_inch,
-        slide_width_inch=slide_width_inch
-    )
-
-
-def _handle_display_image__in_foreground(
-        presentation: pptx.Presentation,
-        slide_json: dict,
-        slide_width_inch: float,
-        slide_height_inch: float
-) -> bool:
-    """
-    Create a slide with text and image using a picture placeholder layout. If no image keyword
-    is available, it will add only text to the slide.
-
-    :param presentation: The presentation object.
-    :param slide_json: The content of the slide as JSON data.
-    :param slide_width_inch: The width of the slide in inches.
-    :param slide_height_inch: The height of the slide in inches.
-    :return: True if the slide has been processed.
-    """
-
-    img_keywords = slide_json['img_keywords'].strip()
-    slide = presentation.slide_layouts[8]  # Picture with Caption
-    slide = presentation.slides.add_slide(slide)
-    placeholders = None
-
-    title_placeholder = slide.shapes.title
-    title_placeholder.text = remove_slide_number_from_heading(slide_json['heading'])
-
-    try:
-        pic_col: PicturePlaceholder = slide.shapes.placeholders[1]
-    except KeyError:
-        placeholders = get_slide_placeholders(slide, layout_number=8)
-        pic_col = None
-        for idx, name in placeholders:
-            if 'picture' in name:
-                pic_col: PicturePlaceholder = slide.shapes.placeholders[idx]
-
-    try:
-        text_col: SlidePlaceholder = slide.shapes.placeholders[2]
-    except KeyError:
-        text_col = None
-        if not placeholders:
-            placeholders = get_slide_placeholders(slide, layout_number=8)
-
-        for idx, name in placeholders:
-            if 'content' in name:
-                text_col: SlidePlaceholder = slide.shapes.placeholders[idx]
-
-    flat_items_list = get_flat_list_of_contents(slide_json['bullet_points'], level=0)
-
-    for idx, an_item in enumerate(flat_items_list):
-        if idx == 0:
-            text_col.text_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-        else:
-            paragraph = text_col.text_frame.add_paragraph()
-            paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-            paragraph.level = an_item[1]
-
-    if not img_keywords:
-        # No keywords, so no image search and addition
-        return True
-
-    try:
-        photo_url, page_url = ims.get_photo_url_from_api_response(
-            ims.search_pexels(query=img_keywords, size='medium')
-        )
-
-        if photo_url:
-            pic_col.insert_picture(
-                ims.get_image_from_url(photo_url)
-            )
-
-            _add_text_at_bottom(
-                slide=slide,
-                slide_width_inch=slide_width_inch,
-                slide_height_inch=slide_height_inch,
-                text='Photo provided by Pexels',
-                hyperlink=page_url
-            )
-    except Exception as ex:
-        logger.error(
-            '*** Error occurred while adding an image to the slide: %s',
-            str(ex)
-        )
-
-    return True
-
-
-def _handle_display_image__in_background(
-        presentation: pptx.Presentation,
-        slide_json: dict,
-        slide_width_inch: float,
-        slide_height_inch: float
-) -> bool:
-    """
-    Add a slide with text and an image in the background. It works just like
-    `_handle_default_display()` but with a background image added. If no image keyword is
-    available, it will add only text to the slide.
-
-    :param presentation: The presentation object.
-    :param slide_json: The content of the slide as JSON data.
-    :param slide_width_inch: The width of the slide in inches.
-    :param slide_height_inch: The height of the slide in inches.
-    :return: True if the slide has been processed.
-    """
-
-    img_keywords = slide_json['img_keywords'].strip()
-
-    # Add a photo in the background, text in the foreground
-    slide = presentation.slides.add_slide(presentation.slide_layouts[1])
-    title_shape = slide.shapes.title
-
-    try:
-        body_shape = slide.shapes.placeholders[1]
-    except KeyError:
-        placeholders = get_slide_placeholders(slide, layout_number=1)
-        # Layout 1 usually has two placeholders, including the title
-        body_shape = slide.shapes.placeholders[placeholders[0][0]]
-
-    title_shape.text = remove_slide_number_from_heading(slide_json['heading'])
-
-    flat_items_list = get_flat_list_of_contents(slide_json['bullet_points'], level=0)
-
-    for idx, an_item in enumerate(flat_items_list):
-        if idx == 0:
-            body_shape.text_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-        else:
-            paragraph = body_shape.text_frame.add_paragraph()
-            paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-            paragraph.level = an_item[1]
-
-    if not img_keywords:
-        # No keywords, so no image search and addition
-        return True
-
-    try:
-        photo_url, page_url = ims.get_photo_url_from_api_response(
-            ims.search_pexels(query=img_keywords, size='large')
-        )
-
-        if photo_url:
-            picture = slide.shapes.add_picture(
-                image_file=ims.get_image_from_url(photo_url),
-                left=0,
-                top=0,
-                width=pptx.util.Inches(slide_width_inch),
-            )
-
-            _add_text_at_bottom(
-                slide=slide,
-                slide_width_inch=slide_width_inch,
-                slide_height_inch=slide_height_inch,
-                text='Photo provided by Pexels',
-                hyperlink=page_url
-            )
-
-            # Move the picture to the background
-            # https://github.com/scanny/python-pptx/issues/49#issuecomment-137172836
-            slide.shapes._spTree.remove(picture._element)
-            slide.shapes._spTree.insert(2, picture._element)
-    except Exception as ex:
-        logger.error(
-            '*** Error occurred while adding an image to the slide background: %s',
-            str(ex)
-        )
-
-    return True
-
-
-def _handle_icons_ideas(
-        presentation: pptx.Presentation,
-        slide_json: dict,
-        slide_width_inch: float,
-        slide_height_inch: float
-):
-    """
-    Add a slide with some icons and text.
-    If no suitable icons are found, the step numbers are shown.
-
-    :param presentation: The presentation object.
-    :param slide_json: The content of the slide as JSON data.
-    :param slide_width_inch: The width of the slide in inches.
-    :param slide_height_inch: The height of the slide in inches.
-    :return: True if the slide has been processed.
-    """
-
-    if 'bullet_points' in slide_json and slide_json['bullet_points']:
-        items = slide_json['bullet_points']
-
-        # Ensure that it is a single list of strings without any sub-list
-        for step in items:
-            if not isinstance(step, str) or not step.startswith(ICON_BEGINNING_MARKER):
-                return False
-
-        slide_layout = presentation.slide_layouts[5]
-        slide = presentation.slides.add_slide(slide_layout)
-        slide.shapes.title.text = remove_slide_number_from_heading(slide_json['heading'])
-
-        n_items = len(items)
-        text_box_size = INCHES_2
-
-        # Calculate the total width of all pictures and the spacing
-        total_width = n_items * ICON_SIZE
-        spacing = (pptx.util.Inches(slide_width_inch) - total_width) / (n_items + 1)
-        top = INCHES_3
-
-        icons_texts = [
-            (match.group(1), match.group(2)) for match in [
-                ICONS_REGEX.search(item) for item in items
-            ]
-        ]
-        fallback_icon_files = ice.find_icons([item[0] for item in icons_texts])
-
-        for idx, item in enumerate(icons_texts):
-            icon, accompanying_text = item
-            icon_path = f'{GlobalConfig.ICONS_DIR}/{icon}.png'
-
-            if not os.path.exists(icon_path):
-                logger.warning(
-                    'Icon not found: %s...using fallback icon: %s',
-                    icon, fallback_icon_files[idx]
-                )
-                icon_path = f'{GlobalConfig.ICONS_DIR}/{fallback_icon_files[idx]}.png'
-
-            left = spacing + idx * (ICON_SIZE + spacing)
-            # Calculate the center position for alignment
-            center = left + ICON_SIZE / 2
-
-            # Add a rectangle shape with a fill color (background)
-            # The size of the shape is slightly bigger than the icon, so align the icon position
-            shape = slide.shapes.add_shape(
-                MSO_AUTO_SHAPE_TYPE.ROUNDED_RECTANGLE,
-                center - INCHES_0_5,
-                top - (ICON_BG_SIZE - ICON_SIZE) / 2,
-                INCHES_1, INCHES_1
-            )
-            shape.fill.solid()
-            shape.shadow.inherit = False
-
-            # Set the icon's background shape color
-            shape.fill.fore_color.rgb = shape.line.color.rgb = random.choice(ICON_COLORS)
-
-            # Add the icon image on top of the colored shape
-            slide.shapes.add_picture(icon_path, left, top, height=ICON_SIZE)
-
-            # Add a text box below the shape
-            text_box = slide.shapes.add_shape(
-                MSO_AUTO_SHAPE_TYPE.ROUNDED_RECTANGLE,
-                left=center - text_box_size / 2,  # Center the text box horizontally
-                top=top + ICON_SIZE + INCHES_0_2,
-                width=text_box_size,
-                height=text_box_size
-            )
-            text_frame = text_box.text_frame
-            text_frame.text = accompanying_text
-            text_frame.word_wrap = True
-            text_frame.paragraphs[0].alignment = pptx.enum.text.PP_ALIGN.CENTER
-
-            # Center the text vertically
-            text_frame.vertical_anchor = pptx.enum.text.MSO_ANCHOR.MIDDLE
-            text_box.fill.background()  # No fill
-            text_box.line.fill.background()  # No line
-            text_box.shadow.inherit = False
-
-            # Set the font color based on the theme
-            for paragraph in text_frame.paragraphs:
-                for run in paragraph.runs:
-                    run.font.color.theme_color = pptx.enum.dml.MSO_THEME_COLOR.TEXT_2
-
-        _add_text_at_bottom(
-            slide=slide,
-            slide_width_inch=slide_width_inch,
-            slide_height_inch=slide_height_inch,
-            text='More icons available in the SlideDeck AI repository',
-            hyperlink='https://github.com/barun-saha/slide-deck-ai/tree/main/icons/png128'
-        )
-
-        return True
-
-    return False
-
-
-def _add_text_at_bottom(
-        slide: pptx.slide.Slide,
-        slide_width_inch: float,
-        slide_height_inch: float,
-        text: str,
-        hyperlink: Optional[str] = None,
-        target_height: Optional[float] = 0.5
-):
-    """
-    Add arbitrary text to a text box positioned near the lower-left side of a slide.
-
-    :param slide: The slide.
-    :param slide_width_inch: The width of the slide.
-    :param slide_height_inch: The height of the slide.
-    :param text: The text to be added.
-    :param hyperlink: The hyperlink to be added to the text (optional).
-    :param target_height: The target height of the box in inches (optional).
-    """
-
-    footer = slide.shapes.add_textbox(
-        left=INCHES_1,
-        top=pptx.util.Inches(slide_height_inch - target_height),
-        width=pptx.util.Inches(slide_width_inch),
-        height=pptx.util.Inches(target_height)
-    )
-
-    paragraph = footer.text_frame.paragraphs[0]
-    run = paragraph.add_run()
-    run.text = text
-    run.font.size = pptx.util.Pt(10)
-    run.font.underline = False
-
-    if hyperlink:
-        run.hyperlink.address = hyperlink
-
-
-def _handle_double_col_layout(
-        presentation: pptx.Presentation,
-        slide_json: dict,
-        slide_width_inch: float,
-        slide_height_inch: float
-) -> bool:
-    """
-    Add a slide with a double-column layout for comparison.
-
-    :param presentation: The presentation object.
-    :param slide_json: The content of the slide as JSON data.
-    :param slide_width_inch: The width of the slide in inches.
-    :param slide_height_inch: The height of the slide in inches.
-    :return: True if the double-column layout has been added; False otherwise.
-    """
-
-    if 'bullet_points' in slide_json and slide_json['bullet_points']:
-        double_col_content = slide_json['bullet_points']
-
-        if double_col_content and (
-                len(double_col_content) == 2
-        ) and isinstance(double_col_content[0], dict) and isinstance(double_col_content[1], dict):
-            slide = presentation.slide_layouts[4]
-            slide = presentation.slides.add_slide(slide)
-            placeholders = None
-
-            shapes = slide.shapes
-            title_placeholder = shapes.title
-            title_placeholder.text = remove_slide_number_from_heading(slide_json['heading'])
-
-            try:
-                left_heading, right_heading = shapes.placeholders[1], shapes.placeholders[3]
-            except KeyError:
-                # For manually edited/added master slides, the placeholder idx numbers in the dict
-                # will be different (>= 10)
-                left_heading, right_heading = None, None
-                placeholders = get_slide_placeholders(slide, layout_number=4)
-
-                for idx, name in placeholders:
-                    if 'text placeholder' in name:
-                        if not left_heading:
-                            left_heading = shapes.placeholders[idx]
-                        elif not right_heading:
-                            right_heading = shapes.placeholders[idx]
-
-            try:
-                left_col, right_col = shapes.placeholders[2], shapes.placeholders[4]
-            except KeyError:
-                left_col, right_col = None, None
-                if not placeholders:
-                    placeholders = get_slide_placeholders(slide, layout_number=4)
-
-                for idx, name in placeholders:
-                    if 'content placeholder' in name:
-                        if not left_col:
-                            left_col = shapes.placeholders[idx]
-                        elif not right_col:
-                            right_col = shapes.placeholders[idx]
-
-            left_col_frame, right_col_frame = left_col.text_frame, right_col.text_frame
-
-            if 'heading' in double_col_content[0] and left_heading:
-                left_heading.text = double_col_content[0]['heading']
-            if 'bullet_points' in double_col_content[0]:
-                flat_items_list = get_flat_list_of_contents(
-                    double_col_content[0]['bullet_points'], level=0
-                )
-
-                if not left_heading:
-                    left_col_frame.text = double_col_content[0]['heading']
-
-                for idx, an_item in enumerate(flat_items_list):
-                    if left_heading and idx == 0:
-                        left_col_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-                    else:
-                        paragraph = left_col_frame.add_paragraph()
-                        paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-                        paragraph.level = an_item[1]
-
-            if 'heading' in double_col_content[1] and right_heading:
-                right_heading.text = double_col_content[1]['heading']
-            if 'bullet_points' in double_col_content[1]:
-                flat_items_list = get_flat_list_of_contents(
-                    double_col_content[1]['bullet_points'], level=0
-                )
-
-                if not right_heading:
-                    right_col_frame.text = double_col_content[1]['heading']
-
-                for idx, an_item in enumerate(flat_items_list):
-                    if right_col_frame and idx == 0:
-                        right_col_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-                    else:
-                        paragraph = right_col_frame.add_paragraph()
-                        paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-                        paragraph.level = an_item[1]
-
-            _handle_key_message(
-                the_slide=slide,
-                slide_json=slide_json,
-                slide_height_inch=slide_height_inch,
-                slide_width_inch=slide_width_inch
-            )
-
-            return True
-
-    return False
-
-
-def _handle_step_by_step_process(
-        presentation: pptx.Presentation,
-        slide_json: dict,
-        slide_width_inch: float,
-        slide_height_inch: float
-) -> bool:
-    """
-    Add shapes to display a step-by-step process in the slide, if available.
-
-    :param presentation: The presentation object.
-    :param slide_json: The content of the slide as JSON data.
-    :param slide_width_inch: The width of the slide in inches.
-    :param slide_height_inch: The height of the slide in inches.
-    :return: True if this slide has a step-by-step process depiction added; False otherwise.
-    """
-
-    if 'bullet_points' in slide_json and slide_json['bullet_points']:
-        steps = slide_json['bullet_points']
-
-        no_marker_count = 0.0
-        n_steps = len(steps)
-
-        # Ensure that it is a single list of strings without any sub-list
-        for step in steps:
-            if not isinstance(step, str):
-                return False
-
-            # In some cases, one or two steps may not begin with >>, e.g.:
-            # {
-            #     "heading": "Step-by-Step Process: Creating a Legacy",
-            #     "bullet_points": [
-            #         "Identify your unique talents and passions",
-            #         ">> Develop your skills and knowledge",
-            #         ">> Create meaningful work",
-            #         ">> Share your work with the world",
-            #         ">> Continuously learn and adapt"
-            #     ],
-            #     "key_message": ""
-            # },
-            #
-            # Use a threshold, e.g., at most 20%
-            if not step.startswith(STEP_BY_STEP_PROCESS_MARKER):
-                no_marker_count += 1
-
-        slide_header = slide_json['heading'].lower()
-        if (no_marker_count / n_steps > 0.25) and not (
-                ('step-by-step' in slide_header) or ('step by step' in slide_header)
-        ):
-            return False
-
-        if n_steps < 3 or n_steps > 6:
-            # Two steps -- probably not a process
-            # More than 5--6 steps -- would likely cause a visual clutter
-            return False
-
-        bullet_slide_layout = presentation.slide_layouts[1]
-        slide = presentation.slides.add_slide(bullet_slide_layout)
-        shapes = slide.shapes
-        shapes.title.text = remove_slide_number_from_heading(slide_json['heading'])
-
-        if 3 <= n_steps <= 4:
-            # Horizontal display
-            height = INCHES_1_5
-            width = pptx.util.Inches(slide_width_inch / n_steps - 0.01)
-            top = pptx.util.Inches(slide_height_inch / 2)
-            left = pptx.util.Inches((slide_width_inch - width.inches * n_steps) / 2 + 0.05)
-
-            for step in steps:
-                shape = shapes.add_shape(MSO_AUTO_SHAPE_TYPE.CHEVRON, left, top, width, height)
-                shape.text = step.removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-                left += width - INCHES_0_4
-        elif 4 < n_steps <= 6:
-            # Vertical display
-            height = pptx.util.Inches(0.65)
-            top = pptx.util.Inches(slide_height_inch / 4)
-            left = INCHES_1
-
-            # Find the width closest to the median, based on the length of each text, to be set
-            # for the shapes
-            width = pptx.util.Inches(slide_width_inch * 2 / 3)
-            lengths = [len(step) for step in steps]
-            font_size_20pt = pptx.util.Pt(20)
-            widths = sorted(
-                [
-                    min(
-                        pptx.util.Inches(font_size_20pt.inches * a_len),
-                        width
-                    ) for a_len in lengths
-                ]
-            )
-            width = widths[len(widths) // 2]
-
-            for step in steps:
-                shape = shapes.add_shape(MSO_AUTO_SHAPE_TYPE.PENTAGON, left, top, width, height)
-                shape.text = step.removeprefix(STEP_BY_STEP_PROCESS_MARKER)
-                top += height + INCHES_0_3
-                left += INCHES_0_5
-
-        return True
-
-
-def _handle_key_message(
-        the_slide: pptx.slide.Slide,
-        slide_json: dict,
-        slide_width_inch: float,
|
831 |
-
slide_height_inch: float
|
832 |
-
):
|
833 |
-
"""
|
834 |
-
Add a shape to display the key message in the slide, if available.
|
835 |
-
|
836 |
-
:param the_slide: The slide to be processed.
|
837 |
-
:param slide_json: The content of the slide as JSON data.
|
838 |
-
:param slide_width_inch: The width of the slide in inches.
|
839 |
-
:param slide_height_inch: The height of the slide in inches.
|
840 |
-
"""
|
841 |
-
|
842 |
-
if 'key_message' in slide_json and slide_json['key_message']:
|
843 |
-
height = pptx.util.Inches(1.6)
|
844 |
-
width = pptx.util.Inches(slide_width_inch / 2.3)
|
845 |
-
top = pptx.util.Inches(slide_height_inch - height.inches - 0.1)
|
846 |
-
left = pptx.util.Inches((slide_width_inch - width.inches) / 2)
|
847 |
-
shape = the_slide.shapes.add_shape(
|
848 |
-
MSO_AUTO_SHAPE_TYPE.ROUNDED_RECTANGLE,
|
849 |
-
left=left,
|
850 |
-
top=top,
|
851 |
-
width=width,
|
852 |
-
height=height
|
853 |
-
)
|
854 |
-
shape.text = slide_json['key_message']
|
855 |
-
|
856 |
-
|
857 |
-
def _get_slide_width_height_inches(presentation: pptx.Presentation) -> Tuple[float, float]:
|
858 |
-
"""
|
859 |
-
Get the dimensions of a slide in inches.
|
860 |
-
|
861 |
-
:param presentation: The presentation object.
|
862 |
-
:return: The width and the height.
|
863 |
-
"""
|
864 |
-
|
865 |
-
slide_width_inch = EMU_TO_INCH_SCALING_FACTOR * presentation.slide_width
|
866 |
-
slide_height_inch = EMU_TO_INCH_SCALING_FACTOR * presentation.slide_height
|
867 |
-
# logger.debug('Slide width: %f, height: %f', slide_width_inch, slide_height_inch)
|
868 |
-
|
869 |
-
return slide_width_inch, slide_height_inch
|
870 |
-
|
871 |
-
|
872 |
-
if __name__ == '__main__':
|
873 |
-
_JSON_DATA = '''
|
874 |
-
{
|
875 |
-
"title": "AI Applications: Transforming Industries",
|
876 |
-
"slides": [
|
877 |
-
{
|
878 |
-
"heading": "Introduction to AI Applications",
|
879 |
-
"bullet_points": [
|
880 |
-
"Artificial Intelligence (AI) is transforming various industries",
|
881 |
-
"AI applications range from simple decision-making tools to complex systems",
|
882 |
-
"AI can be categorized into types: Rule-based, Instance-based, and Model-based"
|
883 |
-
],
|
884 |
-
"key_message": "AI is a broad field with diverse applications and categories",
|
885 |
-
"img_keywords": "AI, transformation, industries, decision-making, categories"
|
886 |
-
},
|
887 |
-
{
|
888 |
-
"heading": "AI in Everyday Life",
|
889 |
-
"bullet_points": [
|
890 |
-
"Virtual assistants like Siri, Alexa, and Google Assistant",
|
891 |
-
"Recommender systems in Netflix, Amazon, and Spotify",
|
892 |
-
"Fraud detection in banking and credit card transactions"
|
893 |
-
],
|
894 |
-
"key_message": "AI is integrated into our daily lives through various services",
|
895 |
-
"img_keywords": "virtual assistants, recommender systems, fraud detection"
|
896 |
-
},
|
897 |
-
{
|
898 |
-
"heading": "AI in Healthcare",
|
899 |
-
"bullet_points": [
|
900 |
-
"Disease diagnosis and prediction using machine learning algorithms",
|
901 |
-
"Personalized medicine and drug discovery",
|
902 |
-
"AI-powered robotic surgeries and remote patient monitoring"
|
903 |
-
],
|
904 |
-
"key_message": "AI is revolutionizing healthcare with improved diagnostics and patient care",
|
905 |
-
"img_keywords": "healthcare, disease diagnosis, personalized medicine, robotic surgeries"
|
906 |
-
},
|
907 |
-
{
|
908 |
-
"heading": "AI in Key Industries",
|
909 |
-
"bullet_points": [
|
910 |
-
{
|
911 |
-
"heading": "Retail",
|
912 |
-
"bullet_points": [
|
913 |
-
"Inventory management and demand forecasting",
|
914 |
-
"Customer segmentation and targeted marketing",
|
915 |
-
"AI-driven chatbots for customer service"
|
916 |
-
]
|
917 |
-
},
|
918 |
-
{
|
919 |
-
"heading": "Finance",
|
920 |
-
"bullet_points": [
|
921 |
-
"Credit scoring and risk assessment",
|
922 |
-
"Algorithmic trading and portfolio management",
|
923 |
-
"AI for detecting money laundering and cyber fraud"
|
924 |
-
]
|
925 |
-
}
|
926 |
-
],
|
927 |
-
"key_message": "AI is transforming retail and finance with improved operations and decision-making",
|
928 |
-
"img_keywords": "retail, finance, inventory management, credit scoring, algorithmic trading"
|
929 |
-
},
|
930 |
-
{
|
931 |
-
"heading": "AI in Education",
|
932 |
-
"bullet_points": [
|
933 |
-
"Personalized learning paths and adaptive testing",
|
934 |
-
"Intelligent tutoring systems for skill development",
|
935 |
-
"AI for predicting student performance and dropout rates"
|
936 |
-
],
|
937 |
-
"key_message": "AI is personalizing education and improving student outcomes",
|
938 |
-
},
|
939 |
-
{
|
940 |
-
"heading": "Step-by-Step: AI Development Process",
|
941 |
-
"bullet_points": [
|
942 |
-
">> Define the problem and objectives",
|
943 |
-
">> Collect and preprocess data",
|
944 |
-
">> Select and train the AI model",
|
945 |
-
">> Evaluate and optimize the model",
|
946 |
-
">> Deploy and monitor the AI system"
|
947 |
-
],
|
948 |
-
"key_message": "Developing AI involves a structured process from problem definition to deployment",
|
949 |
-
"img_keywords": ""
|
950 |
-
},
|
951 |
-
{
|
952 |
-
"heading": "AI Icons: Key Aspects",
|
953 |
-
"bullet_points": [
|
954 |
-
"[[brain]] Human-like intelligence and decision-making",
|
955 |
-
"[[robot]] Automation and physical tasks",
|
956 |
-
"[[]] Data processing and cloud computing",
|
957 |
-
"[[lightbulb]] Insights and predictions",
|
958 |
-
"[[globe2]] Global connectivity and impact"
|
959 |
-
],
|
960 |
-
"key_message": "AI encompasses various aspects, from human-like intelligence to global impact",
|
961 |
-
"img_keywords": "AI aspects, intelligence, automation, data processing, global impact"
|
962 |
-
},
|
963 |
-
{
|
964 |
-
"heading": "Conclusion: Embracing AI's Potential",
|
965 |
-
"bullet_points": [
|
966 |
-
"AI is transforming industries and improving lives",
|
967 |
-
"Ethical considerations are crucial for responsible AI development",
|
968 |
-
"Invest in AI education and workforce development",
|
969 |
-
"Call to action: Explore AI applications and contribute to shaping its future"
|
970 |
-
],
|
971 |
-
"key_message": "AI offers immense potential, and we must embrace it responsibly",
|
972 |
-
"img_keywords": "AI transformation, ethical considerations, AI education, future of AI"
|
973 |
-
}
|
974 |
-
]
|
975 |
-
}'''
|
976 |
-
|
977 |
-
temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
|
978 |
-
path = pathlib.Path(temp.name)
|
979 |
-
|
980 |
-
generate_powerpoint_presentation(
|
981 |
-
json5.loads(_JSON_DATA),
|
982 |
-
output_file_path=path,
|
983 |
-
slides_template='Basic'
|
984 |
-
)
|
985 |
-
print(f'File path: {path}')
|
986 |
-
|
987 |
-
temp.close()
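The vertical (pentagon) layout above chooses one shape width for all steps by estimating each step's text width, capping it at two-thirds of the slide width, and taking the median. A minimal standalone sketch of that heuristic (the 0.28 inch-per-character estimate and the sample step texts are hypothetical, not from the source):

```python
# Sketch of the median-width heuristic used by the vertical (pentagon)
# layout: cap each step's estimated text width at 2/3 of the slide
# width, sort the estimates, and take the median.
slide_width_inch = 10.0
steps = [
    'Define the problem',
    'Collect and preprocess data',
    'Train the model',
    'Evaluate',
    'Deploy',
]

max_width = slide_width_inch * 2 / 3   # cap at 2/3 of the slide width
per_char_inch = 0.28                   # hypothetical width per character
widths = sorted(min(per_char_inch * len(s), max_width) for s in steps)
median_width = widths[len(widths) // 2]
print(round(median_width, 2))
```

Using the median rather than the maximum keeps one unusually long step from stretching every pentagon to the cap.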
helpers/text_helper.py
DELETED
@@ -1,83 +0,0 @@

"""
Utility functions to help with text processing.
"""
import json_repair as jr


def is_valid_prompt(prompt: str) -> bool:
    """
    Verify whether user input satisfies the concerned constraints.

    :param prompt: The user input text.
    :return: True if all criteria are satisfied; False otherwise.
    """

    if len(prompt) < 7 or ' ' not in prompt:
        return False

    return True


def get_clean_json(json_str: str) -> str:
    """
    Attempt to clean a JSON response string from the LLM by removing ```json at the beginning
    and trailing ``` and any text beyond that.
    CAUTION: May not always be accurate.

    :param json_str: The input string in JSON format.
    :return: The "cleaned" JSON string.
    """

    response_cleaned = json_str

    if json_str.startswith('```json'):
        json_str = json_str[7:]

    while True:
        idx = json_str.rfind('```')  # -1 on failure

        if idx <= 0:
            break

        # In the ideal scenario, the character before the last ``` should be
        # a new line or a closing bracket
        prev_char = json_str[idx - 1]

        if (prev_char == '}') or (prev_char == '\n' and json_str[idx - 2] == '}'):
            response_cleaned = json_str[:idx]

        json_str = json_str[:idx]

    return response_cleaned


def fix_malformed_json(json_str: str) -> str:
    """
    Try and fix the syntax error(s) in a JSON string.

    :param json_str: The input JSON string.
    :return: The fixed JSON string.
    """

    return jr.repair_json(json_str, skip_json_loads=True)


if __name__ == '__main__':
    JSON1 = '''{
    "key": "value"
    }
    '''
    JSON2 = '''["Reason": "Regular updates help protect against known vulnerabilities."]'''
    JSON3 = '''["Reason" Regular updates help protect against known vulnerabilities."]'''
    JSON4 = '''
    {"bullet_points": [
        ">> Write without stopping or editing",
        >> Set daily writing goals and stick to them,
        ">> Allow yourself to make mistakes"
    ],}
    '''

    print(fix_malformed_json(JSON1))
    print(fix_malformed_json(JSON2))
    print(fix_malformed_json(JSON3))
    print(fix_malformed_json(JSON4))
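The fence-stripping idea behind get_clean_json() can be seen on a typical LLM response. A simplified single-pass sketch (it handles only one trailing fence, unlike the loop above; the sample response string is hypothetical):

```python
# Simplified single-pass version of the fence-stripping idea in
# get_clean_json(): drop a leading ```json marker, then cut everything
# from the trailing ``` onward, provided the fence follows a closing
# brace (directly or after a newline).
raw = '```json\n{"key": "value"}\n```\nSome trailing commentary'

text = raw[7:] if raw.startswith('```json') else raw
idx = text.rfind('```')
if idx > 0 and (text[idx - 1] == '}' or
                (text[idx - 1] == '\n' and text[idx - 2] == '}')):
    text = text[:idx]
print(text.strip())  # → {"key": "value"}
```

The closing-brace check is what prevents cutting at a ``` that happens to appear inside the JSON payload itself.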
icons/png128/0-circle.png  DELETED  Binary file (4.1 kB)
icons/png128/1-circle.png  DELETED  Binary file (3.45 kB)
icons/png128/123.png  DELETED  Binary file (2.5 kB)
icons/png128/2-circle.png  DELETED  Binary file (4.01 kB)
icons/png128/3-circle.png  DELETED  Binary file (4.24 kB)
icons/png128/4-circle.png  DELETED  Binary file (3.74 kB)
icons/png128/5-circle.png  DELETED  Binary file (4.12 kB)
icons/png128/6-circle.png  DELETED  Binary file (4.37 kB)
icons/png128/7-circle.png  DELETED  Binary file (3.78 kB)
icons/png128/8-circle.png  DELETED  Binary file (4.43 kB)
icons/png128/9-circle.png  DELETED  Binary file (4.44 kB)
icons/png128/activity.png  DELETED  Binary file (1.38 kB)
icons/png128/airplane.png  DELETED  Binary file (2.09 kB)
icons/png128/alarm.png  DELETED  Binary file (4.08 kB)
icons/png128/alien-head.png  DELETED  Binary file (4.73 kB)
icons/png128/alphabet.png  DELETED  Binary file (2.44 kB)
icons/png128/amazon.png  DELETED  Binary file (3.56 kB)
icons/png128/amritsar-golden-temple.png  DELETED  Binary file (4.44 kB)
icons/png128/amsterdam-canal.png  DELETED  Binary file (3.32 kB)
icons/png128/amsterdam-windmill.png  DELETED  Binary file (2.67 kB)
icons/png128/android.png  DELETED  Binary file (2.24 kB)
icons/png128/angkor-wat.png  DELETED  Binary file (2.64 kB)
icons/png128/apple.png  DELETED  Binary file (2.4 kB)
icons/png128/archive.png  DELETED  Binary file (1.27 kB)
icons/png128/argentina-obelisk.png  DELETED  Binary file (1.39 kB)
icons/png128/artificial-intelligence-brain.png  DELETED  Binary file (4.73 kB)
icons/png128/atlanta.png  DELETED  Binary file (2.87 kB)
icons/png128/austin.png  DELETED  Binary file (1.72 kB)
icons/png128/automation-decision.png  DELETED  Binary file (1.19 kB)
icons/png128/award.png  DELETED  Binary file (2.55 kB)
icons/png128/balloon.png  DELETED  Binary file (2.83 kB)
icons/png128/ban.png  DELETED  Binary file (3.32 kB)
icons/png128/bandaid.png  DELETED  Binary file (3.53 kB)
icons/png128/bangalore.png  DELETED  Binary file (2.4 kB)
icons/png128/bank.png  DELETED  Binary file (1.4 kB)