Spaces: Yiming-M/ZIP

Yiming-M committed 23 days ago

Commit 8994c36 (1 parent: 3b32643)

2025-07-31 23:04 🚀

Files changed (1)

app.py +19 -16

@@ -665,16 +665,18 @@ select option[value*="━━━━━━"] {
 with gr.Blocks(css=css, theme=gr.themes.Soft(), title="ZIP Crowd Counting") as demo:
     gr.Markdown("""
     # 🎯 Crowd Counting by ZIP
-    ### Upload an image and get precise crowd density predictions with advanced zero-inflated models
     """)
     # Add the info panel
-    with gr.Accordion("ℹ️ About ZIP Models", open=False):
         gr.Markdown("""
-        **ZIP (Zero-Inflated Poisson)** models are designed to handle crowd counting with:
-        - **Structural Zeros**: Areas where people cannot exist (walls, sky, etc.)
-        - **Sampling Zeros**: Areas where people could exist but don't
-        - **Advanced Metrics**: MAE (Mean Absolute Error) and NAE (Normalized Absolute Error)
         Choose from different model variants: **ZIP-B** (Base), **ZIP-S** (Small), **ZIP-T** (Tiny), **ZIP-N** (Nano), **ZIP-P** (Pico)
         """)
@@ -785,7 +787,7 @@ with gr.Blocks(css=css, theme=gr.themes.Soft(), title="ZIP Crowd Counting") as d
         gr.Markdown("""
         ### Step-by-step Guide:
-        1. **🎛️ Select Model**: Choose your preferred model variant, dataset, and metric from the dropdown
         2. **📸 Upload Image**: Click the image area to upload your crowd photo or use clipboard
         3. **🚀 Analyze**: Click the "Analyze Crowd" button to start processing
         4. **📊 View Results**: Examine the density maps and crowd count in the output panels
@@ -793,16 +795,16 @@ with gr.Blocks(css=css, theme=gr.themes.Soft(), title="ZIP Crowd Counting") as d
         ### Understanding the Outputs:
         **📊 Main Results:**
-        - **🎯 Density Map**: Shows where people are located with color intensity
         - **Predicted Count**: Total number of people detected in the image
         **🔍 Zero Analysis:**
-        - **🏗️ Structural Zero Map**: Areas where people cannot exist (buildings, sky, walls)
-        - **📊 Sampling Zero Map**: Areas where people could be but aren't currently present
-        - **🎯 Complete Zero Map**: Combined zero probability map showing all non-crowd areas
         **🎯 Hotspots:**
-        - **📈 Lambda Map**: Highlights crowd density hotspots and expected count per pixel
         """)
     # Add technical information
@@ -810,15 +812,16 @@ with gr.Blocks(css=css, theme=gr.themes.Soft(), title="ZIP Crowd Counting") as d
         gr.Markdown("""
         ### Model Variants:
         - **ZIP-B**: Base model with best performance
-        - **ZIP-S**: Smaller model for faster inference
         - **ZIP-T**: Tiny model for resource-constrained environments
         - **ZIP-N**: Nano model for mobile applications
         - **ZIP-P**: Pico model for edge devices
         ### Datasets:
-        - **ShanghaiTech A/B**: Dense crowd scenes
-        - **UCF-QNRF**: Ultra high-resolution crowd images
-        - **NWPU-Crowd**: Large-scale crowd counting dataset
         ### Metrics:
         - **MAE**: Mean Absolute Error - average counting error

 with gr.Blocks(css=css, theme=gr.themes.Soft(), title="ZIP Crowd Counting") as demo:
     gr.Markdown("""
     # 🎯 Crowd Counting by ZIP
+    ### Upload an image and get precise crowd density predictions with ZIP models!
     """)
     # Add the info panel
+    with gr.Accordion("ℹ️ About ZIP", open=False):
         gr.Markdown("""
+        **ZIP (Zero-Inflated Poisson)** is a framework designed for crowd counting, a task where the goal is to estimate how many people are present in an image. It was introduced in the paper [ZIP: Scalable Crowd Counting via Zero-Inflated Poisson Modeling](https://arxiv.org/abs/2506.19955).
+        ZIP is based on a simple idea: not all empty areas in an image mean the same thing. Some regions are empty because there are truly no people there (like walls or sky), while others are places where people could appear but just happen not to in this particular image. ZIP separates these two cases using two prediction heads:
+        - **Structural Zeros**: These are regions that naturally never contain people (e.g., the background or torso areas). These are handled by the π head.
+        - **Sampling Zeros**: These are regions where people could appear but don't in this image. These are modeled by the λ head.
+        By separating *where* people are likely to be from *how many* are present, ZIP produces more accurate and interpretable crowd estimates, especially in scenes with large empty spaces or varied crowd densities.
         Choose from different model variants: **ZIP-B** (Base), **ZIP-S** (Small), **ZIP-T** (Tiny), **ZIP-N** (Nano), **ZIP-P** (Pico)
         """)
         gr.Markdown("""
         ### Step-by-step Guide:
+        1. **🎛️ Select Model**: Choose your preferred model variant, pre-trained dataset, and evaluation metric from the dropdown
         2. **📸 Upload Image**: Click the image area to upload your crowd photo or use clipboard
         3. **🚀 Analyze**: Click the "Analyze Crowd" button to start processing
         4. **📊 View Results**: Examine the density maps and crowd count in the output panels
         ### Understanding the Outputs:
         **📊 Main Results:**
+        - **🎯 Density Map**: Shows where people are located with color intensity, modeled by (1-π) * λ
         - **Predicted Count**: Total number of people detected in the image
         **🔍 Zero Analysis:**
+        - **🏗️ Structural Zero Map**: Indicates regions that structurally cannot contain head annotations (e.g., walls, sky, torso, or background). These are governed by the π head, which estimates the probability that a region never contains people.
+        - **📊 Sampling Zero Map**: Shows areas where people could be present but happen not to appear in the current image. These zeros are modeled by (1-π) * exp(-λ), where the expected count λ is near zero.
+        - **🎯 Complete Zero Map**: A combined visualization of zero probabilities, capturing both structural and sampling zeros. This map reflects overall non-crowd likelihood per region.
         **🎯 Hotspots:**
+        - **📈 Lambda Map**: Highlights areas with high expected crowd density. Each value represents the expected number of people in that region, modeled by the Poisson intensity (λ). This map focuses on *how many* people are likely to be present, assuming people could appear there.
         """)
     # Add technical information
         gr.Markdown("""
         ### Model Variants:
         - **ZIP-B**: Base model with best performance
+        - **ZIP-S**: Small model for faster inference
         - **ZIP-T**: Tiny model for resource-constrained environments
         - **ZIP-N**: Nano model for mobile applications
         - **ZIP-P**: Pico model for edge devices
         ### Datasets:
+        - **ShanghaiTech A**: Dense, low-resolution crowd scenes
+        - **ShanghaiTech B**: Sparse, high-resolution crowd scenes
+        - **UCF-QNRF**: Dense, ultra high-resolution crowd images
+        - **NWPU-Crowd**: Largest ultra high-resolution crowd counting dataset
         ### Metrics:
         - **MAE**: Mean Absolute Error - average counting error
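
The updated help text describes each output map in terms of π and λ: the density map as (1-π) * λ and the sampling zero map as (1-π) * exp(-λ), with π itself as the structural zero probability. The sketch below shows how these maps might be derived from the two head outputs. It is an illustration only, not code from `app.py`: the names `zip_cell` and `zip_maps` are hypothetical, the maps are plain nested lists to keep the example dependency-free, the complete zero map is assumed to be the standard zero-inflated Poisson zero probability π + (1-π) * exp(-λ), and the predicted count is assumed to be the sum of the density map.

```python
import math


def zip_cell(pi, lam):
    """Per-region ZIP quantities from pi (structural-zero probability) and lam (Poisson rate)."""
    structural_zero = pi                          # region can never contain a head annotation
    sampling_zero = (1.0 - pi) * math.exp(-lam)   # region could contain people but has none
    return {
        "structural_zero": structural_zero,
        "sampling_zero": sampling_zero,
        # Assumption: combined zero probability of a zero-inflated Poisson variable.
        "complete_zero": structural_zero + sampling_zero,
        "density": (1.0 - pi) * lam,              # expected count in the region
    }


def zip_maps(pi_map, lam_map):
    """Apply zip_cell over two equally shaped 2D lists and collect one map per quantity."""
    cells = [
        [zip_cell(p, l) for p, l in zip(pi_row, lam_row)]
        for pi_row, lam_row in zip(pi_map, lam_map)
    ]
    keys = ("structural_zero", "sampling_zero", "complete_zero", "density")
    maps = {key: [[cell[key] for cell in row] for row in cells] for key in keys}
    # Assumption: the predicted count is the sum of the expected counts over all regions.
    maps["count"] = sum(sum(row) for row in maps["density"])
    return maps


if __name__ == "__main__":
    # Toy 2x2 example with made-up values for the pi and lambda heads.
    pi_map = [[0.9, 0.1], [0.0, 0.5]]
    lam_map = [[0.0, 2.0], [3.0, 1.0]]
    result = zip_maps(pi_map, lam_map)
    print("density map:", result["density"])
    print("predicted count:", round(result["count"], 3))
```

Keeping π and λ separate is what lets the interface show a region with a high λ but also a high π as a low-density area: the λ map alone answers "how many, if people can be here", while the density map weights that answer by 1-π.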