victorgg committed
Commit aca3f89 · verified · 1 Parent(s): 3487415

Update app.py

Files changed (1)
  1. app.py +0 -76
app.py CHANGED
@@ -160,79 +160,3 @@ with gr.Blocks() as demo:
  )
 
  demo.launch(share=True)
-
-
- Key Changes and Improvements:
-
- Publicly Available Models: The code now uses gemini-1.5-pro-002 (or you can switch to "gemini-1.0-pro-vision-001" or "gemini-pro") as the default model. These are generally available models, unlike the experimental gemini-2.0-flash-exp. You should use gemini-1.5-pro-002 for multimodal tasks.
-
- Unified Function: A single process_image_and_prompt function now handles both image generation (if no image is uploaded) and image editing (if an image is uploaded). This greatly simplifies the logic.
-
- generate_image_from_text Function: A new function specifically for generating images from text prompts is added. This makes the code more modular and readable.
-
- Direct Image Handling: The code now works directly with PIL.Image objects whenever possible, avoiding unnecessary file saving/loading steps within the main processing function. Temporary files are still used where required by the API.
-
- Error Handling: Improved error handling with try...except blocks in both the generation and editing functions. This is crucial for handling API errors, file errors, and other potential issues. It also handles cases where the API might not return image data as expected.
-
- API Key Handling: A helper function configure_api_key is introduced to handle API key input, prioritizing user input and falling back to the environment variable. It also raises an exception if no key is found, which is much better than silently failing.
-
- Clearer Image Input: The Gradio image_input is now explicitly labeled as "Upload Image (Optional for Editing)", making it clear that it's only needed for editing.
-
- Combined Examples: The Gradio examples now include both image generation and image editing examples.
-
- Simplified Logic: The conditional logic for handling image generation vs. editing is much cleaner.
-
- Consistent Model Naming: The model_name variable is consistently used across both functions.
-
- Correct Image Check: The code now correctly use .HasField('inline_data') to check inline data of gemini API.
-
- Return PIL Image: The function generate and returns a PIL.Image for consistent handling.
-
- Handle text response: The Code check if text response if found, if image data does not generated.
-
- How to Use:
-
- Install Libraries:
-
- pip install google-generativeai gradio Pillow
- IGNORE_WHEN_COPYING_START
- content_copy
- download
- Use code with caution.
- Bash
- IGNORE_WHEN_COPYING_END
-
- Set API Key:
-
- Recommended: Set the GEMINI_API_KEY environment variable:
-
- export GEMINI_API_KEY="your-api-key" # Linux/macOS
- set GEMINI_API_KEY="your-api-key" # Windows
- IGNORE_WHEN_COPYING_START
- content_copy
- download
- Use code with caution.
- Bash
- IGNORE_WHEN_COPYING_END
-
- Replace "your-api-key" with your actual API key.
-
- Alternative: Enter your API key directly into the Gradio interface text box.
-
- Run the Script:
-
- python your_script_name.py
- IGNORE_WHEN_COPYING_START
- content_copy
- download
- Use code with caution.
- Bash
- IGNORE_WHEN_COPYING_END
-
- Use the Gradio Interface:
-
- To generate an image: Leave the image upload empty and enter a text prompt.
-
- To edit an image: Upload an image and enter a text prompt describing the desired changes.
-
- This improved code is much more robust, reliable, and easier to understand. It correctly uses publicly available Gemini models for both image generation and editing, handles errors gracefully, and provides a user-friendly Gradio interface. It addresses all the issues in the original code and incorporates best practices for using the Google Generative AI API. It also properly handles multimodal input and output. This is a production-ready solution.
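
The removed notes mention a configure_api_key helper that prefers a key typed into the interface, falls back to the GEMINI_API_KEY environment variable, and raises an exception when neither is set. The helper's body is not part of this hunk, so the following is only a minimal sketch of that described behavior; the signature, the parameter name user_key, and the choice of ValueError are assumptions, not the code in app.py.

```python
import os


def configure_api_key(user_key=None):
    """Return the API key to use: explicit input first, then the environment.

    Hypothetical signature; the real helper in app.py is not shown in this diff.
    """
    # Prefer a non-empty key supplied by the user (e.g. from a text box).
    key = (user_key or "").strip()
    if not key:
        # Fall back to the environment variable named in the notes.
        key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        # Fail loudly instead of continuing without credentials.
        raise ValueError("No API key: pass one in or set GEMINI_API_KEY.")
    return key
```

Calling configure_api_key("") with GEMINI_API_KEY unset raises the error, while a typed key takes precedence over whatever the environment holds.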