Noo88ear commited on
Commit
ae9c474
Β·
verified Β·
1 Parent(s): 265fa55

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +65 -23
README.md CHANGED
@@ -4,19 +4,21 @@ emoji: 🎨
4
  colorFrom: blue
5
  colorTo: purple
6
  sdk: gradio
7
- sdk_version: 5.39.0
8
  app_file: app.py
9
  pinned: false
 
 
10
  ---
11
 
12
  # Marketing Image Generator with Agent Review
13
 
14
- A sophisticated AI-powered image generation system that creates high-quality marketing images with automated quality review and refinement. Built on modern AI technologies including Google's Imagen3 and advanced agent orchestration.
15
 
16
  ## Features
17
 
18
  - **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen4 via MCP server
19
- - **Automated Quality Review**: Intelligent Gemini agent (2.5-Pro) automatically reviews and refines generated images
20
  - **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
21
  - **Real-time Feedback**: Get instant quality scores and improvement suggestions
22
  - **Professional Workflow**: Streamlined process from concept to final image
@@ -56,11 +58,11 @@ A sophisticated AI-powered image generation system that creates high-quality mar
56
 
57
  ### Core Components
58
 
59
- - **Agent 1 (Image Generator)**: Creates images using Google's Imagen3 via MCP server integration
60
  - **Agent 2 (Marketing Reviewer)**: Analyzes image quality and provides marketing-focused feedback using Gemini Vision
61
  - **Orchestrator**: Manages workflow between agents and handles handover
62
  - **Web Interface**: Gradio-based user interface optimized for Hugging Face
63
- - **MCP Server Integration**: Model Context Protocol for seamless Imagen3 access
64
 
65
  ### System Architecture and Workflow
66
 
@@ -73,18 +75,18 @@ A sophisticated AI-powered image generation system that creates high-quality mar
73
  β”‚Reviewer │───▢│ │───▢│ Agent 2 (Gemini) Marketing β”‚
74
  β”‚Prompt β”‚ β”‚ β”‚ β”‚ Reviewer β”‚
75
  β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
76
- β”‚ β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
77
- β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Ag1: Imagen4 (via MCP) β”‚β”‚
78
- β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚β”‚
79
- β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Draft Image Creation β”‚β”‚
80
- β”‚ β”‚ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
81
  β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
82
- β”‚ β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
83
- β”‚ β”‚ β”‚ β”‚ β”‚ β”‚Ag2;Draft Image Reviewed β”‚β”‚
84
- β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ & Changes Suggested β”‚β”‚
85
- β”‚ β”‚ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
86
  β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
87
- β”‚ Image │◀───│ │◀───│ Final Image Response β”‚
88
  β”‚ Response β”‚ β”‚ β”‚ β”‚ β”‚
89
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
90
  ```
@@ -104,12 +106,12 @@ A sophisticated AI-powered image generation system that creates high-quality mar
104
 
105
  3. **Image Generation and Drafting (Top Right)**:
106
  - **Agent 1 (Gemini) Drafter**: Receives Image Prompt, orchestrates image generation
107
- - **Imagen3 (via MCP)**: Agent 1 interacts with Imagen4 through MCP server to create initial image draft
108
 
109
  4. **Marketing Review and Refinement (Bottom Right)**:
110
  - **Agent 2 (Gemini) Marketing Reviewer**: Receives Reviewer Prompt, evaluates generated image against marketing criteria
111
  - **Draft Image Reviewed and Changes Suggested**: Agent 2's review process output
112
- - **Iterative Refinement Loop**: Bidirectional feedback between Agent 2 and Imagen3 (via Agent 1) to refine image until it meets marketing standards
113
  - Final **Image Response** sent back to Gradio UI
114
 
115
  ### Summary of Flow:
@@ -117,12 +119,12 @@ User provides prompts β†’ Gradio UI β†’ Agent 1 drafts image with Imagen4 β†’ Ag
117
 
118
  ### Technology Stack
119
 
120
- - **AI Models**: Google Imagen4 (via MCP), Gemini Vision
121
  - **Framework**: Gradio (Web Interface)
122
  - **Orchestration**: Custom agent handover system
123
  - **Deployment**: Hugging Face Spaces
124
  - **Authentication**: Google Cloud API Keys
125
- - **Protocol**: MCP (Model Context Protocol) for Imagen3 integration
126
 
127
  ### Why A2A Was Not Applied
128
 
@@ -179,7 +181,7 @@ quality_score = result["data"]["review"]["quality_score"]
179
  - **Quality Threshold**: Minimum quality score for auto-approval
180
  - **Max Iterations**: Maximum refinement attempts
181
  - **Review Settings**: Customize review criteria
182
- - **MCP Configuration**: Imagen3 server settings
183
 
184
  ## Development
185
 
@@ -268,12 +270,52 @@ Access monitoring dashboards:
268
  1. **API Key Errors**: Ensure your Google API keys are valid and configured as HF secrets
269
  2. **Image Generation Fails**: Check your internet connection and API quotas
270
  3. **Review Not Working**: Verify the Gemini agent is running and configured correctly
271
- 4. **MCP Connection Issues**: Check Imagen3 server connectivity and configuration
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
272
 
273
  ### Debug Mode
274
 
275
  Enable debug logging by setting `LOG_LEVEL=DEBUG` in your environment variables.
276
 
 
 
 
 
 
 
 
277
  ### Support
278
 
279
  For issues and questions:
@@ -287,7 +329,7 @@ This project is licensed under the MIT License - see the LICENSE file for detail
287
 
288
  ## Acknowledgments
289
 
290
- - Google AI for Imagen4 and Gemini technologies
291
  - Hugging Face for the deployment platform
292
  - Gradio for the web interface framework
293
- - The open-source community for various dependencies
 
4
  colorFrom: blue
5
  colorTo: purple
6
  sdk: gradio
7
+ sdk_version: 5.38.2
8
  app_file: app.py
9
  pinned: false
10
+ license: mit
11
+ short_description: AI marketing image generator with GCP Imagen4 + Gemini 2.5
12
  ---
13
 
14
  # Marketing Image Generator with Agent Review
15
 
16
+ A sophisticated AI-powered image generation system that creates high-quality marketing images with automated quality review and refinement. Built on modern AI technologies including Google's Imagen4 and Gemini 2.5 Pro with advanced agent orchestration.
17
 
18
  ## Features
19
 
20
  - **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen4 via MCP server
21
+ - **Automated Quality Review**: Intelligent Gemini agent automatically reviews and refines generated images
22
  - **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
23
  - **Real-time Feedback**: Get instant quality scores and improvement suggestions
24
  - **Professional Workflow**: Streamlined process from concept to final image
 
58
 
59
  ### Core Components
60
 
61
+ - **Agent 1 (Image Generator)**: Creates images using Google's Imagen4 via MCP server integration
62
  - **Agent 2 (Marketing Reviewer)**: Analyzes image quality and provides marketing-focused feedback using Gemini Vision
63
  - **Orchestrator**: Manages workflow between agents and handles handover
64
  - **Web Interface**: Gradio-based user interface optimized for Hugging Face
65
+ - **MCP Server Integration**: Model Context Protocol for seamless Imagen4 access
66
 
67
  ### System Architecture and Workflow
68
 
 
75
  β”‚Reviewer │───▢│ │───▢│ Agent 2 (Gemini) Marketing β”‚
76
  β”‚Prompt β”‚ β”‚ β”‚ β”‚ Reviewer β”‚
77
  β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
78
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
79
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Imagen4 (via MCP) β”‚ β”‚
80
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
81
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Draft Image Creation β”‚ β”‚
82
+ β”‚ β”‚ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
83
  β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
84
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
85
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ Draft Image Reviewed β”‚ β”‚
86
+ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ & Changes Suggested β”‚ β”‚
87
+ β”‚ β”‚ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
88
  β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
89
+ β”‚ Image │◀───│ │◀───│ Final Image Response β”‚
90
  β”‚ Response β”‚ β”‚ β”‚ β”‚ β”‚
91
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
92
  ```
 
106
 
107
  3. **Image Generation and Drafting (Top Right)**:
108
  - **Agent 1 (Gemini) Drafter**: Receives Image Prompt, orchestrates image generation
109
+ - **Imagen4 (via MCP)**: Agent 1 interacts with Imagen4 through MCP server to create initial image draft
110
 
111
  4. **Marketing Review and Refinement (Bottom Right)**:
112
  - **Agent 2 (Gemini) Marketing Reviewer**: Receives Reviewer Prompt, evaluates generated image against marketing criteria
113
  - **Draft Image Reviewed and Changes Suggested**: Agent 2's review process output
114
+ - **Iterative Refinement Loop**: Bidirectional feedback between Agent 2 and Imagen4 (via Agent 1) to refine image until it meets marketing standards
115
  - Final **Image Response** sent back to Gradio UI
116
 
117
  ### Summary of Flow:
 
119
 
120
  ### Technology Stack
121
 
122
+ - **AI Models**: Google Imagen4 (via MCP), Gemini 2.5 Pro Vision
123
  - **Framework**: Gradio (Web Interface)
124
  - **Orchestration**: Custom agent handover system
125
  - **Deployment**: Hugging Face Spaces
126
  - **Authentication**: Google Cloud API Keys
127
+ - **Protocol**: MCP (Model Context Protocol) for Imagen4 integration
128
 
129
  ### Why A2A Was Not Applied
130
 
 
181
  - **Quality Threshold**: Minimum quality score for auto-approval
182
  - **Max Iterations**: Maximum refinement attempts
183
  - **Review Settings**: Customize review criteria
184
+ - **MCP Configuration**: Imagen4 server settings
185
 
186
  ## Development
187
 
 
270
  1. **API Key Errors**: Ensure your Google API keys are valid and configured as HF secrets
271
  2. **Image Generation Fails**: Check your internet connection and API quotas
272
  3. **Review Not Working**: Verify the Gemini agent is running and configured correctly
273
+ 4. **MCP Connection Issues**: Check Imagen4 server connectivity and configuration
274
+
275
+ ### Content Policy & Brand Restrictions
276
+
277
+ Google's AI models have built-in safety guardrails that may cause timeouts or rejections for certain content types:
278
+
279
+ #### 🚫 **Highly Restricted Content** (Likely to cause stalls/timeouts):
280
+ - **Political Figures**: Named world leaders, politicians (e.g., "Putin", "Zelensky", "Biden")
281
+ - **Political Buildings**: Government buildings like "10 Downing Street", "White House"
282
+ - **Geopolitical Content**: War, conflict, or sensitive international relations
283
+ - **Financial Institution Brands**: Major banks like "HSBC", "Bank of America", "JPMorgan"
284
+
285
+ #### ⚠️ **Moderately Restricted Content** (May cause delays):
286
+ - **Regulated Industries**: Healthcare, pharmaceutical, financial services
287
+ - **Some Corporate Brands**: Varies by sector and brand sensitivity
288
+
289
+ #### βœ… **Generally Permitted Content**:
290
+ - **Technology Brands**: "Cognizant", "Microsoft", "IBM", "Accenture"
291
+ - **Generic Business**: "Professional office", "corporate environment"
292
+ - **Non-branded Content**: Generic descriptions without specific brand names
293
+
294
+ #### πŸ”§ **Workarounds for Restricted Content**:
295
+
296
+ **Instead of**: `"Professional boardroom with HSBC signage"`
297
+ **Use**: `"Professional boardroom with international banking corporation signage in red and white colors"`
298
+
299
+ **Instead of**: `"Meeting with political leaders"`
300
+ **Use**: `"Meeting with business executives in government-style building"`
301
+
302
+ **Strategy**: Move brand-specific requirements to **Review Guidelines** instead of the main prompt:
303
+ - **Main Prompt**: `"Professional corporate environment"`
304
+ - **Review Guidelines**: `"Ensure branding reflects HSBC corporate colors (red and white)"`
305
+
306
+ This approach bypasses content filters while still providing guidance for review.
307
 
308
  ### Debug Mode
309
 
310
  Enable debug logging by setting `LOG_LEVEL=DEBUG` in your environment variables.
311
 
312
+ ### Content Policy Testing
313
+
314
+ Use the included diagnostic scripts to test content restrictions:
315
+ - `debug_hsbc_prompt.py` - Test financial brand restrictions
316
+ - `test_cognizant_brand.py` - Test tech brand accessibility
317
+ - `test_brand_workaround.py` - Test workaround strategies
318
+
319
  ### Support
320
 
321
  For issues and questions:
 
329
 
330
  ## Acknowledgments
331
 
332
+ - Google AI for Imagen4 and Gemini 2.5 Pro technologies
333
  - Hugging Face for the deployment platform
334
  - Gradio for the web interface framework
335
+ - The open-source community for various dependencies