Space: Agents-MCP-Hackathon/ImageAlfred

mahan_ym committed 17 days ago

Commit 8b98658 (1 parent: 79b337b)

privacy docs update. update readme

Files changed (3)

README.md +8 -11
src/modal_app.py +1 -1
src/tools.py +4 -2

README.md

@@ -14,8 +14,12 @@ short_description: 'Alfred of Images: An MCP server to handle your image edits.'
 ---
 <div align="center">
 <img src="./src/assets/icons/ImageAlfredIcon.png" alt="ImageAlfred" width=200 height=200>
 <h1>Image Alfred</h1>
 ImageAlfred is an image Model Context Protocol (MCP) tool designed to streamline image processing workflows
@@ -26,16 +30,8 @@ ImageAlfred is an image Model Context Protocol (MCP) tool designed to streamline
 <a href=https://huggingface.co> <img src="src/assets/icons/hf-logo.svg" alt="huggingface" height=40> </a>
 <a href="https://www.python.org"><img src="src/assets/icons/python-logo-only.svg" alt="python" height=40></a>
-<!-- <a href="https://www.gradio.app" heigh=40><img src="src/assets/icons/gradio-color.svg"></a> -->
 </div>
-<!-- It provides a user-friendly interface for interacting with image models, leveraging the power of Gradio for the frontend and Modal for scalable backend deployment. -->
-<!-- ## Features
-- Intuitive web interface for image processing
-- Powered by Gradio for rapid prototyping and UI
-- Scalable and serverless execution with Modal
-- Easily extendable for custom image models and workflows -->
 ## Maintainers
@@ -45,10 +41,11 @@ ImageAlfred is an image Model Context Protocol (MCP) tool designed to streamline
 ## Tools
-- [Gradio](https://www.gradio.app/): Serving user interface and MCP server
 - [Modal.com](https://modal.com/): AI infrastructure making all the magic 🔮 possible.
-- [SAM](https://segment-anything.com/): Segment Anything model by meta for image segmentation and mask generation
-- [CLIPSeg](https://github.com/timojl/clipseg): Image Segmentation using CLIP. We used it as a more precise object detection model
 - [HuggingFace](https://huggingface.co/): Downloading SAM and using Space for hosting.
 ## Getting Started

 ---
 <div align="center">
+<a href="https://github.com/mahan-ym/ImageAlfred">
 <img src="./src/assets/icons/ImageAlfredIcon.png" alt="ImageAlfred" width=200 height=200>
+<span><img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white"></span>
+</a>
 <h1>Image Alfred</h1>
 ImageAlfred is an image Model Context Protocol (MCP) tool designed to streamline image processing workflows
 <a href=https://huggingface.co> <img src="src/assets/icons/hf-logo.svg" alt="huggingface" height=40> </a>
 <a href="https://www.python.org"><img src="src/assets/icons/python-logo-only.svg" alt="python" height=40></a>
 </div>
 ## Maintainers
 ## Tools
+- [Gradio](https://www.gradio.app/): Serving user interface and MCP server.
 - [Modal.com](https://modal.com/): AI infrastructure making all the magic 🔮 possible.
+- [SAM](https://segment-anything.com/): Segment Anything model by Meta for image segmentation and mask generation.
+- [CLIPSeg](https://github.com/timojl/clipseg): Image Segmentation using CLIP. We used it as a more precise object detection model.
+- [OWLv2](https://huggingface.co/google/owlv2-large-patch14-ensemble): Zero-shot object detection, with better performance on license plate detection and other privacy-preserving use cases.
 - [HuggingFace](https://huggingface.co/): Downloading SAM and using Space for hosting.
 ## Getting Started

src/modal_app.py

@@ -530,7 +530,7 @@ def apply_mosaic_with_bool_mask(
 )
 def preserve_privacy(
     image_pil: Image.Image,
-    prompts: str,
     privacy_strength: int = 15,
     threshold: float = 0.2,
 ) -> Image.Image:

 )
 def preserve_privacy(
     image_pil: Image.Image,
+    prompts: list[str],
     privacy_strength: int = 15,
     threshold: float = 0.2,
 ) -> Image.Image:

src/tools.py

@@ -52,7 +52,7 @@ def privacy_preserve_image(
     Args:
         input_img: Input image or can be URL string of the image or base64 string. Cannot be None.
-        input_prompt (str): Object to obscure in the image. It can be a single word or multiple words, e.g., "left person face", "license plate".
         privacy_strength (int): Strength of the privacy preservation effect. Higher values result in stronger blurring. Default is 15.
         threshold (float): Model threshold for detecting objects. It should be between 0.01 and 0.99. Default is 0.2. for detecting smaller objects, small regions or faces a lower threshold is recommended.
     Returns:
@@ -67,11 +67,13 @@ def privacy_preserve_image(
         raise gr.Error("Input prompt cannot be None or empty.")
     if threshold < 0.01 or threshold > 0.99:
         raise gr.Error("Threshold must be between 0.01 and 0.99.")
     func = modal.Function.from_name(modal_app_name, "preserve_privacy")
     output_pil = func.remote(
         image_pil=input_img,
-        prompts=input_prompt,
         privacy_strength=privacy_strength,
         threshold=threshold,
     )

     Args:
         input_img: Input image or can be URL string of the image or base64 string. Cannot be None.
+        input_prompt (str): Objects to obscure in the image, given as a dot-separated string. Each item can be a single word or several words, e.g., "left person face" or "license plate", but should be as short as possible and contain no symbols or punctuation. Example: input_prompt = "face. right car. blue shirt."
         privacy_strength (int): Strength of the privacy preservation effect. Higher values result in stronger blurring. Default is 15.
         threshold (float): Model threshold for detecting objects. It should be between 0.01 and 0.99. Default is 0.2. for detecting smaller objects, small regions or faces a lower threshold is recommended.
     Returns:
         raise gr.Error("Input prompt cannot be None or empty.")
     if threshold < 0.01 or threshold > 0.99:
         raise gr.Error("Threshold must be between 0.01 and 0.99.")
+    if isinstance(input_prompt, str):
+        prompts = [prompt.strip() for prompt in input_prompt.split(".") if prompt.strip()]
     func = modal.Function.from_name(modal_app_name, "preserve_privacy")
     output_pil = func.remote(
         image_pil=input_img,
+        prompts=prompts,
         privacy_strength=privacy_strength,
         threshold=threshold,
     )
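
Note on the prompt handling in `src/tools.py`: the dot-separated parsing can be exercised in isolation. Below is a minimal sketch, assuming a hypothetical helper named `parse_prompts` that is not part of this commit. It drops empty segments (such as the one left by a trailing dot), also accepts an existing list, and raises `ValueError` instead of `gr.Error`; all three are assumptions made to keep the sketch self-contained.

```python
def parse_prompts(input_prompt):
    """Turn a dot-separated prompt string (or a list of strings) into a clean list.

    Hypothetical helper for illustration only; not part of the commit.
    """
    if isinstance(input_prompt, str):
        parts = input_prompt.split(".")
    elif isinstance(input_prompt, (list, tuple)):
        parts = list(input_prompt)
    else:
        raise TypeError("input_prompt must be a string or a list of strings")

    # Strip surrounding whitespace, then drop segments that end up empty,
    # e.g. the segment after a trailing dot in "face. right car."
    stripped = [str(part).strip() for part in parts]
    prompts = [part for part in stripped if part]

    if not prompts:
        raise ValueError("Input prompt cannot be None or empty.")
    return prompts


if __name__ == "__main__":
    print(parse_prompts("face. right car. blue shirt."))
```

With a helper like this, `prompts` is always defined before the remote call, whether the caller passes a string or a list.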