ariakang committed on
Commit
9c0b319
·
1 Parent(s): e93c3dd

update model weights and scripts, delete mp4

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ *.mp4 filter=lfs diff=lfs merge=lfs -text
CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,80 @@
1
+ # Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to make participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, sex characteristics, gender identity and expression,
9
+ level of experience, education, socio-economic status, nationality, personal
10
+ appearance, race, religion, or sexual identity and orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies within all project spaces, and it also applies when
49
+ an individual is representing the project or its community in public spaces.
50
+ Examples of representing a project or community include using an official
51
+ project e-mail address, posting via an official social media account, or acting
52
+ as an appointed representative at an online or offline event. Representation of
53
+ a project may be further defined and clarified by project maintainers.
54
+
55
+ This Code of Conduct also applies outside the project spaces when there is a
56
+ reasonable belief that an individual's behavior may have a negative impact on
57
+ the project or its community.
58
+
59
+ ## Enforcement
60
+
61
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
62
+ reported by contacting the project team at <[email protected]>. All
63
+ complaints will be reviewed and investigated and will result in a response that
64
+ is deemed necessary and appropriate to the circumstances. The project team is
65
+ obligated to maintain confidentiality with regard to the reporter of an incident.
66
+ Further details of specific enforcement policies may be posted separately.
67
+
68
+ Project maintainers who do not follow or enforce the Code of Conduct in good
69
+ faith may face temporary or permanent repercussions as determined by other
70
+ members of the project's leadership.
71
+
72
+ ## Attribution
73
+
74
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
75
+ available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
76
+
77
+ [homepage]: https://www.contributor-covenant.org
78
+
79
+ For answers to common questions about this code of conduct, see
80
+ https://www.contributor-covenant.org/faq
CONTRIBUTING.md ADDED
@@ -0,0 +1,31 @@
1
+ # Contributing to EgoBlur
2
+ We want to make contributing to this project as easy and transparent as
3
+ possible.
4
+
5
+ ## Pull Requests
6
+ We actively welcome your pull requests.
7
+
8
+ 1. Fork the repo and create your branch from `main`.
9
+ 2. If you've added code that should be tested, add tests.
10
+ 3. If you've changed APIs, update the documentation.
11
+ 4. Ensure the test suite passes.
12
+ 5. Make sure your code lints.
13
+ 6. If you haven't already, complete the Contributor License Agreement ("CLA").
14
+
15
+ ## Contributor License Agreement ("CLA")
16
+ In order to accept your pull request, we need you to submit a CLA. You only need
17
+ to do this once to work on any of Meta's open source projects.
18
+
19
+ Complete your CLA here: <https://code.facebook.com/cla>
20
+
21
+ ## Issues
22
+ We use GitHub issues to track public bugs. Please ensure your description is
23
+ clear and has sufficient instructions to be able to reproduce the issue.
24
+
25
+ Meta has a [bounty program](https://www.facebook.com/whitehat/) for the safe
26
+ disclosure of security bugs. In those cases, please go through the process
27
+ outlined on that page and do not file a public issue.
28
+
29
+ ## License
30
+ By contributing to EgoBlur, you agree that your contributions will be licensed
31
+ under the LICENSE file in the root directory of this source tree.
demo_assets/test_image.jpg ADDED
ego_blur_face.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2240fd030194fb8b0021856c51d5e28abc864503beb81f22ee7646c866205889
3
+ size 389722007
ego_blur_lp.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d4811e08e11bcc7f35207a4412a27e6db0fe9fed6a42c78a565f8fabb45d1963
3
+ size 389684490
environment.yaml ADDED
@@ -0,0 +1,16 @@
1
+ name: ego_blur
2
+ channels:
3
+ - pytorch
4
+ - nvidia
5
+ - conda-forge
6
+ - defaults
7
+ dependencies:
8
+ - python=3.10
9
+ - pytorch=1.12.1
10
+ - torchvision=0.13.1
11
+ - moviepy=1.0.3
12
+ - numpy=1.24.3
13
+ - pip=23.1.1
14
+ - ffmpeg=4.4.1
15
+ - pip:
16
+ - opencv-python-headless==4.7.0.72
script/demo_ego_blur.py ADDED
@@ -0,0 +1,511 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+
4
+ # This source code is licensed under the license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ import argparse
8
+ import os
9
+ from functools import lru_cache
10
+ from typing import List
11
+
12
+ import cv2
13
+ import numpy as np
14
+ import torch
15
+ import torchvision
16
+ from moviepy.editor import ImageSequenceClip
17
+ from moviepy.video.io.VideoFileClip import VideoFileClip
18
+
19
+
20
+ def parse_args():
21
+ parser = argparse.ArgumentParser()
22
+ parser.add_argument(
23
+ "--face_model_path",
24
+ required=False,
25
+ type=str,
26
+ default=None,
27
+ help="Absolute EgoBlur face model file path",
28
+ )
29
+
30
+ parser.add_argument(
31
+ "--face_model_score_threshold",
32
+ required=False,
33
+ type=float,
34
+ default=0.9,
35
+ help="Face model score threshold to filter out low confidence detections",
36
+ )
37
+
38
+ parser.add_argument(
39
+ "--lp_model_path",
40
+ required=False,
41
+ type=str,
42
+ default=None,
43
+ help="Absolute EgoBlur license plate model file path",
44
+ )
45
+
46
+ parser.add_argument(
47
+ "--lp_model_score_threshold",
48
+ required=False,
49
+ type=float,
50
+ default=0.9,
51
+ help="License plate model score threshold to filter out low confidence detections",
52
+ )
53
+
54
+ parser.add_argument(
55
+ "--nms_iou_threshold",
56
+ required=False,
57
+ type=float,
58
+ default=0.3,
59
+ help="NMS iou threshold to filter out low confidence overlapping boxes",
60
+ )
61
+
62
+ parser.add_argument(
63
+ "--scale_factor_detections",
64
+ required=False,
65
+ type=float,
66
+ default=1,
67
+ help="Scale detections by the given factor to allow blurring more area, 1.15 would mean 15% scaling",
68
+ )
69
+
70
+ parser.add_argument(
71
+ "--input_image_path",
72
+ required=False,
73
+ type=str,
74
+ default=None,
75
+ help="Absolute path for the given image on which we want to make detections",
76
+ )
77
+
78
+ parser.add_argument(
79
+ "--output_image_path",
80
+ required=False,
81
+ type=str,
82
+ default=None,
83
+ help="Absolute path where we want to store the visualized image",
84
+ )
85
+
86
+ parser.add_argument(
87
+ "--input_video_path",
88
+ required=False,
89
+ type=str,
90
+ default=None,
91
+ help="Absolute path for the given video on which we want to make detections",
92
+ )
93
+
94
+ parser.add_argument(
95
+ "--output_video_path",
96
+ required=False,
97
+ type=str,
98
+ default=None,
99
+ help="Absolute path where we want to store the visualized video",
100
+ )
101
+
102
+ parser.add_argument(
103
+ "--output_video_fps",
104
+ required=False,
105
+ type=int,
106
+ default=30,
107
+ help="FPS for the output video",
108
+ )
109
+
110
+ return parser.parse_args()
111
+
112
+
113
+ def create_output_directory(file_path: str) -> None:
114
+ """
115
+ parameter file_path: absolute path of the output file whose parent directory we want to create
116
+ Simple logic to create output directories if they don't exist.
117
+ """
118
+ print(
119
+ f"Directory {os.path.dirname(file_path)} does not exist. Attempting to create it..."
120
+ )
121
+ os.makedirs(os.path.dirname(file_path))
122
+ if not os.path.exists(os.path.dirname(file_path)):
123
+ raise ValueError(
124
+ f"Directory {os.path.dirname(file_path)} didn't exist. Attempt to create also failed. Please provide another path."
125
+ )
126
+
127
+
128
+ def validate_inputs(args: argparse.Namespace) -> argparse.Namespace:
129
+ """
130
+ parameter args: parsed arguments
131
+ Run some basic checks on the input arguments
132
+ """
133
+ # input args value checks
134
+ if not 0.0 <= args.face_model_score_threshold <= 1.0:
135
+ raise ValueError(
136
+ f"Invalid face_model_score_threshold {args.face_model_score_threshold}"
137
+ )
138
+ if not 0.0 <= args.lp_model_score_threshold <= 1.0:
139
+ raise ValueError(
140
+ f"Invalid lp_model_score_threshold {args.lp_model_score_threshold}"
141
+ )
142
+ if not 0.0 <= args.nms_iou_threshold <= 1.0:
143
+ raise ValueError(f"Invalid nms_iou_threshold {args.nms_iou_threshold}")
144
+ if not 0 <= args.scale_factor_detections:
145
+ raise ValueError(
146
+ f"Invalid scale_factor_detections {args.scale_factor_detections}"
147
+ )
148
+ if not 1 <= args.output_video_fps or not (
149
+ isinstance(args.output_video_fps, int) and args.output_video_fps % 1 == 0
150
+ ):
151
+ raise ValueError(
152
+ f"Invalid output_video_fps {args.output_video_fps}, should be a positive integer"
153
+ )
154
+
155
+ # input/output paths checks
156
+ if args.face_model_path is None and args.lp_model_path is None:
157
+ raise ValueError(
158
+ "Please provide either face_model_path or lp_model_path or both"
159
+ )
160
+ if args.input_image_path is None and args.input_video_path is None:
161
+ raise ValueError("Please provide either input_image_path or input_video_path")
162
+ if args.input_image_path is not None and args.output_image_path is None:
163
+ raise ValueError(
164
+ "Please provide output_image_path for the visualized image to save."
165
+ )
166
+ if args.input_video_path is not None and args.output_video_path is None:
167
+ raise ValueError(
168
+ "Please provide output_video_path for the visualized video to save."
169
+ )
170
+ if args.input_image_path is not None and not os.path.exists(args.input_image_path):
171
+ raise ValueError(f"{args.input_image_path} does not exist.")
172
+ if args.input_video_path is not None and not os.path.exists(args.input_video_path):
173
+ raise ValueError(f"{args.input_video_path} does not exist.")
174
+ if args.face_model_path is not None and not os.path.exists(args.face_model_path):
175
+ raise ValueError(f"{args.face_model_path} does not exist.")
176
+ if args.lp_model_path is not None and not os.path.exists(args.lp_model_path):
177
+ raise ValueError(f"{args.lp_model_path} does not exist.")
178
+ if args.output_image_path is not None and not os.path.exists(
179
+ os.path.dirname(args.output_image_path)
180
+ ):
181
+ create_output_directory(args.output_image_path)
182
+ if args.output_video_path is not None and not os.path.exists(
183
+ os.path.dirname(args.output_video_path)
184
+ ):
185
+ create_output_directory(args.output_video_path)
186
+
187
+ # check we have write permissions on output paths
188
+ if args.output_image_path is not None and not os.access(
189
+ os.path.dirname(args.output_image_path), os.W_OK
190
+ ):
191
+ raise ValueError(
192
+ f"You don't have permissions to write to {args.output_image_path}. Please grant adequate permissions, or provide a different output path."
193
+ )
194
+ if args.output_video_path is not None and not os.access(
195
+ os.path.dirname(args.output_video_path), os.W_OK
196
+ ):
197
+ raise ValueError(
198
+ f"You don't have permissions to write to {args.output_video_path}. Please grant adequate permissions, or provide a different output path."
199
+ )
200
+
201
+ return args
202
+
203
+
204
+ @lru_cache
205
+ def get_device() -> str:
206
+ """
207
+ Return the device type
208
+ """
209
+ return (
210
+ "cpu"
211
+ if not torch.cuda.is_available()
212
+ else f"cuda:{torch.cuda.current_device()}"
213
+ )
214
+
215
+
216
+ def read_image(image_path: str) -> np.ndarray:
217
+ """
218
+ parameter image_path: absolute path to an image
219
+ Return an image in BGR format
220
+ """
221
+ bgr_image = cv2.imread(image_path)
222
+ if len(bgr_image.shape) == 2:
223
+ bgr_image = cv2.cvtColor(bgr_image, cv2.COLOR_GRAY2BGR)
224
+ return bgr_image
225
+
226
+
227
+ def write_image(image: np.ndarray, image_path: str) -> None:
228
+ """
229
+ parameter image: np.ndarray in BGR format
230
+ parameter image_path: absolute path where we want to save the visualized image
231
+ """
232
+ cv2.imwrite(image_path, image)
233
+
234
+
235
+ def get_image_tensor(bgr_image: np.ndarray) -> torch.Tensor:
236
+ """
237
+ parameter bgr_image: image on which we want to make detections
238
+
239
+ Return the image tensor
240
+ """
241
+ bgr_image_transposed = np.transpose(bgr_image, (2, 0, 1))
242
+ image_tensor = torch.from_numpy(bgr_image_transposed).to(get_device())
243
+
244
+ return image_tensor
245
+
246
+
247
+ def get_detections(
248
+ detector: torch.jit._script.RecursiveScriptModule,
249
+ image_tensor: torch.Tensor,
250
+ model_score_threshold: float,
251
+ nms_iou_threshold: float,
252
+ ) -> List[List[float]]:
253
+ """
254
+ parameter detector: Torchscript module to perform detections
255
+ parameter image_tensor: image tensor on which we want to make detections
256
+ parameter model_score_threshold: model score threshold to filter out low confidence detection
257
+ parameter nms_iou_threshold: NMS iou threshold to filter out low confidence overlapping boxes
258
+
259
+ Returns the list of detections
260
+ """
261
+ with torch.no_grad():
262
+ detections = detector(image_tensor)
263
+
264
+ boxes, _, scores, _ = detections # returns boxes, labels, scores, dims
265
+
266
+ nms_keep_idx = torchvision.ops.nms(boxes, scores, nms_iou_threshold)
267
+ boxes = boxes[nms_keep_idx]
268
+ scores = scores[nms_keep_idx]
269
+
270
+ boxes = boxes.cpu().numpy()
271
+ scores = scores.cpu().numpy()
272
+
273
+ score_keep_idx = np.where(scores > model_score_threshold)[0]
274
+ boxes = boxes[score_keep_idx]
275
+ return boxes.tolist()
276
+
277
+
278
+ def scale_box(
279
+ box: List[float], max_width: int, max_height: int, scale: float
280
+ ) -> List[float]:
281
+ """
282
+ parameter box: detection box in format (x1, y1, x2, y2)
283
+ parameter scale: scaling factor
284
+
285
+ Returns a scaled bbox as (x1, y1, x2, y2)
286
+ """
287
+ x1, y1, x2, y2 = box[0], box[1], box[2], box[3]
288
+ w = x2 - x1
289
+ h = y2 - y1
290
+
291
+ xc = x1 + w / 2
292
+ yc = y1 + h / 2
293
+
294
+ w = scale * w
295
+ h = scale * h
296
+
297
+ x1 = max(xc - w / 2, 0)
298
+ y1 = max(yc - h / 2, 0)
299
+
300
+ x2 = min(xc + w / 2, max_width)
301
+ y2 = min(yc + h / 2, max_height)
302
+
303
+ return [x1, y1, x2, y2]
304
+
305
+
306
+ def visualize(
307
+ image: np.ndarray,
308
+ detections: List[List[float]],
309
+ scale_factor_detections: float,
310
+ ) -> np.ndarray:
311
+ """
312
+ parameter image: image on which we want to make detections
313
+ parameter detections: list of bounding boxes in format [x1, y1, x2, y2]
314
+ parameter scale_factor_detections: scale detections by the given factor to allow blurring more area, 1.15 would mean 15% scaling
315
+
316
+ Blur the detected regions in the input image and return the resulting image
317
+ """
318
+ image_fg = image.copy()
319
+ mask_shape = (image.shape[0], image.shape[1], 1)
320
+ mask = np.full(mask_shape, 0, dtype=np.uint8)
321
+
322
+ for box in detections:
323
+ if scale_factor_detections != 1.0:
324
+ box = scale_box(
325
+ box, image.shape[1], image.shape[0], scale_factor_detections
326
+ )
327
+ x1, y1, x2, y2 = int(box[0]), int(box[1]), int(box[2]), int(box[3])
328
+ w = x2 - x1
329
+ h = y2 - y1
330
+
331
+ ksize = (image.shape[0] // 2, image.shape[1] // 2)
332
+ image_fg[y1:y2, x1:x2] = cv2.blur(image_fg[y1:y2, x1:x2], ksize)
333
+ cv2.ellipse(mask, (((x1 + x2) // 2, (y1 + y2) // 2), (w, h), 0), 255, -1)
334
+
335
+ inverse_mask = cv2.bitwise_not(mask)
336
+ image_bg = cv2.bitwise_and(image, image, mask=inverse_mask)
337
+ image_fg = cv2.bitwise_and(image_fg, image_fg, mask=mask)
338
+ image = cv2.add(image_bg, image_fg)
339
+
340
+ return image
341
+
342
+
343
+ def visualize_image(
344
+ input_image_path: str,
345
+ face_detector: torch.jit._script.RecursiveScriptModule,
346
+ lp_detector: torch.jit._script.RecursiveScriptModule,
347
+ face_model_score_threshold: float,
348
+ lp_model_score_threshold: float,
349
+ nms_iou_threshold: float,
350
+ output_image_path: str,
351
+ scale_factor_detections: float,
352
+ ):
353
+ """
354
+ parameter input_image_path: absolute path to the input image
355
+ parameter face_detector: face detector model to perform face detections
356
+ parameter lp_detector: license plate detector model to perform license plate detections
357
+ parameter face_model_score_threshold: face model score threshold to filter out low confidence detection
358
+ parameter lp_model_score_threshold: license plate model score threshold to filter out low confidence detection
359
+ parameter nms_iou_threshold: NMS iou threshold
360
+ parameter output_image_path: absolute path where the visualized image will be saved
361
+ parameter scale_factor_detections: scale detections by the given factor to allow blurring more area
362
+
363
+ Perform detections on the input image and save the output image at the given path.
364
+ """
365
+ bgr_image = read_image(input_image_path)
366
+ image = bgr_image.copy()
367
+
368
+ image_tensor = get_image_tensor(bgr_image)
369
+ image_tensor_copy = image_tensor.clone()
370
+ detections = []
371
+ # get face detections
372
+ if face_detector is not None:
373
+ detections.extend(
374
+ get_detections(
375
+ face_detector,
376
+ image_tensor,
377
+ face_model_score_threshold,
378
+ nms_iou_threshold,
379
+ )
380
+ )
381
+
382
+ # get license plate detections
383
+ if lp_detector is not None:
384
+ detections.extend(
385
+ get_detections(
386
+ lp_detector,
387
+ image_tensor_copy,
388
+ lp_model_score_threshold,
389
+ nms_iou_threshold,
390
+ )
391
+ )
392
+ image = visualize(
393
+ image,
394
+ detections,
395
+ scale_factor_detections,
396
+ )
397
+ write_image(image, output_image_path)
398
+
399
+
400
+ def visualize_video(
401
+ input_video_path: str,
402
+ face_detector: torch.jit._script.RecursiveScriptModule,
403
+ lp_detector: torch.jit._script.RecursiveScriptModule,
404
+ face_model_score_threshold: float,
405
+ lp_model_score_threshold: float,
406
+ nms_iou_threshold: float,
407
+ output_video_path: str,
408
+ scale_factor_detections: float,
409
+ output_video_fps: int,
410
+ ):
411
+ """
412
+ parameter input_video_path: absolute path to the input video
413
+ parameter face_detector: face detector model to perform face detections
414
+ parameter lp_detector: license plate detector model to perform license plate detections
415
+ parameter face_model_score_threshold: face model score threshold to filter out low confidence detection
416
+ parameter lp_model_score_threshold: license plate model score threshold to filter out low confidence detection
417
+ parameter nms_iou_threshold: NMS iou threshold
418
+ parameter output_video_path: absolute path where the visualized video will be saved
419
+ parameter scale_factor_detections: scale detections by the given factor to allow blurring more area
420
+ parameter output_video_fps: fps of the visualized video
421
+
422
+ Perform detections on the input video and save the output video at the given path.
423
+ """
424
+ visualized_images = []
425
+ video_reader_clip = VideoFileClip(input_video_path)
426
+ for frame in video_reader_clip.iter_frames():
427
+ if len(frame.shape) == 2:
428
+ frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2RGB)
429
+ image = frame.copy()
430
+ bgr_image = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
431
+ image_tensor = get_image_tensor(bgr_image)
432
+ image_tensor_copy = image_tensor.clone()
433
+ detections = []
434
+ # get face detections
435
+ if face_detector is not None:
436
+ detections.extend(
437
+ get_detections(
438
+ face_detector,
439
+ image_tensor,
440
+ face_model_score_threshold,
441
+ nms_iou_threshold,
442
+ )
443
+ )
444
+ # get license plate detections
445
+ if lp_detector is not None:
446
+ detections.extend(
447
+ get_detections(
448
+ lp_detector,
449
+ image_tensor_copy,
450
+ lp_model_score_threshold,
451
+ nms_iou_threshold,
452
+ )
453
+ )
454
+ visualized_images.append(
455
+ visualize(
456
+ image,
457
+ detections,
458
+ scale_factor_detections,
459
+ )
460
+ )
461
+
462
+ video_reader_clip.close()
463
+
464
+ if visualized_images:
465
+ video_writer_clip = ImageSequenceClip(visualized_images, fps=output_video_fps)
466
+ video_writer_clip.write_videofile(output_video_path)
467
+ video_writer_clip.close()
468
+
469
+
470
+ if __name__ == "__main__":
471
+ args = validate_inputs(parse_args())
472
+ if args.face_model_path is not None:
473
+ face_detector = torch.jit.load(args.face_model_path, map_location="cpu").to(
474
+ get_device()
475
+ )
476
+ face_detector.eval()
477
+ else:
478
+ face_detector = None
479
+
480
+ if args.lp_model_path is not None:
481
+ lp_detector = torch.jit.load(args.lp_model_path, map_location="cpu").to(
482
+ get_device()
483
+ )
484
+ lp_detector.eval()
485
+ else:
486
+ lp_detector = None
487
+
488
+ if args.input_image_path is not None:
489
+ visualize_image(
490
+ args.input_image_path,
491
+ face_detector,
492
+ lp_detector,
493
+ args.face_model_score_threshold,
494
+ args.lp_model_score_threshold,
495
+ args.nms_iou_threshold,
496
+ args.output_image_path,
497
+ args.scale_factor_detections,
498
+ )
499
+
500
+ if args.input_video_path is not None:
501
+ visualize_video(
502
+ args.input_video_path,
503
+ face_detector,
504
+ lp_detector,
505
+ args.face_model_score_threshold,
506
+ args.lp_model_score_threshold,
507
+ args.nms_iou_threshold,
508
+ args.output_video_path,
509
+ args.scale_factor_detections,
510
+ args.output_video_fps,
511
+ )
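
`get_detections` post-processes the raw TorchScript output in two steps: torchvision NMS first, then the score threshold. A minimal sketch with made-up boxes and scores (not real model output) showing the same order of operations:

```python
import numpy as np
import torch
import torchvision

# Two heavily overlapping boxes and one separate box, in (x1, y1, x2, y2) format.
boxes = torch.tensor([[10.0, 10.0, 50.0, 50.0],
                      [12.0, 12.0, 52.0, 52.0],
                      [100.0, 100.0, 140.0, 140.0]])
scores = torch.tensor([0.95, 0.60, 0.40])

# Step 1: NMS drops the lower-scoring of the two overlapping boxes (IoU > 0.3).
keep = torchvision.ops.nms(boxes, scores, 0.3)
boxes, scores = boxes[keep].numpy(), scores[keep].numpy()

# Step 2: the score threshold removes the remaining low-confidence detection.
keep = np.where(scores > 0.9)[0]
print(boxes[keep].tolist())  # only the 0.95-score box survives
```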
tools/README.md ADDED
@@ -0,0 +1,311 @@
1
+ # EgoBlur VRS Utilities
2
+
3
+ Meta has developed [EgoBlur](https://www.projectaria.com/tools/egoblur/), a sophisticated face and license plate anonymization system, as part of its ongoing commitment to responsible innovation. Since 2020, EgoBlur has been employed internally within the [Project Aria](https://www.projectaria.com/) program. The VRS file format is designed to record and play back streams of sensor data, including images, audio samples, and other discrete sensors (e.g., IMU, temperature), stored as time-stamped records within per-device streams. This library enables users to process VRS formatted videos using anonymization techniques and generate anonymized output videos in the same efficient VRS format.
4
+
5
+
6
+ # Getting Started
7
+
8
+ To use the anonymization models with VRS files, you need to download the model files from our website and install the necessary software as described below.
9
+
10
+
11
+ # Instructions to retrieve the ML models
12
+
13
+ Models can be retrieved from the [EgoBlur download](https://www.projectaria.com/tools/egoblur/) section. We will use the download link and fetch the models using `wget`.
14
+
15
+ We begin by creating a directory to store the models:
16
+
17
+ ```
18
+ mkdir ~/models && cd ~/models
19
+ ```
20
+
21
+ Run the following commands to fetch and unpack the face model:
22
+ ```
23
+ wget -O face.zip "<downloadable_link_fetched_from_website>"
24
+ unzip face.zip
25
+ ```
26
+
27
+ Repeat the same process for the license plate model.
28
+
29
+
30
+ # Installation (Ubuntu 22.04 with libtorch 2.1 and CUDA toolkit 12.1)
31
+
32
+ This installation guide assumes that the NVIDIA driver, CUDA, OpenCV, make, and gcc are already installed. If not, please follow the instructions at the end of this document to install these utilities/drivers.
33
+
34
+
35
+ ## Download CMake
36
+
37
+ Download CMake binary
38
+
39
+ ```
40
+ mkdir ~/cmake && cd ~/cmake
41
+ wget https://github.com/Kitware/CMake/releases/download/v3.28.0-rc4/cmake-3.28.0-rc4-linux-x86_64.sh
42
+ ```
43
+
44
+ Unpack CMake
45
+
46
+ ```
47
+ chmod 555 cmake-3.28.0-rc4-linux-x86_64.sh && ./cmake-3.28.0-rc4-linux-x86_64.sh --skip-license
48
+ ```
49
+
50
+
51
+ ## Download libtorch
52
+
53
+ We are working with libtorch 2.1 built with CUDA toolkit 12.1. It can be downloaded using:
54
+
55
+ ```
56
+ cd ~/ && \
57
+ wget https://download.pytorch.org/libtorch/cu121/libtorch-cxx11-abi-shared-with-deps-2.1.0%2Bcu121.zip && \
58
+ unzip libtorch-cxx11-abi-shared-with-deps-2.1.0+cu121.zip
59
+ ```
60
+
61
+ ## Install VRS dependencies
62
+
63
+ ```
64
+ sudo apt install libfmt-dev libturbojpeg-dev libpng-dev && \
65
+ sudo apt install liblz4-dev libzstd-dev libxxhash-dev && \
66
+ sudo apt install libboost-system-dev libboost-iostreams-dev libboost-filesystem-dev libboost-thread-dev libboost-chrono-dev libboost-date-time-dev
67
+ ```
68
+
69
+ ## Install ninja-build (required by projectaria_tools)
70
+
71
+ ```
72
+ sudo apt install ninja-build
73
+ ```
74
+
75
+ ## Download github repositories
76
+
77
+ Make directory to hold repos
78
+
79
+ ```
80
+ mkdir ~/repos && cd ~/repos
81
+ ```
82
+
83
+
84
+ ### Torchvision
85
+
86
+ #### Download torchvision
87
+
88
+ ```
89
+ cd ~/repos && \
90
+ git clone --branch v0.16.0 https://github.com/pytorch/vision/
91
+ ```
92
+
93
+ #### Build torchvision
94
+
95
+ ```
96
+ cd ~/repos && \
97
+ rm -rf vision/build && \
98
+ mkdir vision/build && \
99
+ cd vision/build && \
100
+ ~/cmake/bin/cmake .. -DCMAKE_BUILD_TYPE=Release -DTORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST -DWITH_CUDA=on -DTorch_DIR=~/libtorch/share/cmake/Torch && \
101
+ make -j && \
102
+ sudo make install
103
+ ```
104
+
105
+ ### EgoBlur
106
+
107
+
108
+ #### Download EgoBlur repo
109
+
110
+ ```
111
+ cd ~/repos && \
112
+ git clone https://github.com/facebookresearch/EgoBlur.git
113
+ ```
114
+
115
+
116
+ #### Build ego_blur_vrs_mutation
117
+
118
+ ```
119
+ cd ~/repos/EgoBlur/tools/vrs_mutation && \
120
+ rm -rf build && \
121
+ mkdir build && \
122
+ cd build && \
123
+ ~/cmake/bin/cmake .. -DTorch_DIR=/home/$USER/libtorch/share/cmake/Torch -DTorchVision_DIR=~/repos/vision/cmake && \
124
+ make -j ego_blur_vrs_mutation
125
+ ```
126
+
127
+ # Usage:
128
+
129
+ ## CLI Arguments
130
+
131
+ ```
132
+ -i,--in
133
+ ```
134
+ use this argument to provide an absolute path for the given input VRS file on which we want to make detections and perform blurring. You MUST provide this value.
135
+
136
+ ```
137
+ -o,--out
138
+ ```
139
+ use this argument to provide an absolute path where we want to store the blurred VRS file. You MUST provide this value.
140
+
141
+ ```
142
+ -f, --faceModelPath
143
+ ```
144
+ use this argument to provide an absolute EgoBlur face model file path. You SHOULD provide either --faceModelPath or --licensePlateModelPath or both. If neither is provided, the code will not blur any data and the input VRS will be written out unchanged.
145
+
146
+ ```
147
+ --face-model-confidence-threshold
148
+ ```
149
+ use this argument to provide a face model score threshold to filter out low-confidence face detections. The value must be between 0.0 and 1.0; if not provided, this defaults to 0.1.
150
+
151
+
152
+ ```
153
+ -l, --licensePlateModelPath
154
+ ```
155
+ use this argument to provide an absolute EgoBlur license plate model file path. You SHOULD provide either --faceModelPath or --licensePlateModelPath or both. If neither is provided, the code will not blur any data and the input VRS will be written out unchanged.
156
+
157
+ ```
158
+ --license-plate-model-confidence-threshold
159
+ ```
160
+ use this argument to provide a license plate model score threshold to filter out low-confidence license plate detections. The value must be between 0.0 and 1.0; if not provided, this defaults to 0.1.
161
+
162
+ ```
163
+ --scale-factor-detections
164
+ ```
165
+ use this argument to scale detections by the given factor to allow blurring a larger area. The value must be a positive real number, e.g. 0.9 (values < 1) would mean scaling DOWN the predicted blurred region by 10%, whereas 1.1 (values > 1) would mean scaling UP the predicted blurred region by 10%. If not provided, this defaults to 1.15 (see the worked example at the end of this README).
166
+
167
+ ```
168
+ --nms-threshold
169
+ ```
170
+ use this argument to provide the NMS IoU threshold used to filter out overlapping boxes. The value must be between 0.0 and 1.0; if not provided, this defaults to 0.3.
171
+
172
+ ```
173
+ --use-gpu
174
+ ```
175
+ flag to indicate whether you want to use the GPU. It is highly recommended that you use a GPU.
176
+
177
+ A sample command using mandatory args only:
178
+ ```
179
+ cd ~/repos/EgoBlur/tools/vrs_mutation/build && \
180
+ ./ego_blur_vrs_mutation --in your_vrs_file --out your_output_vrs_file -f ~/models/ego_blur_face.jit -l ~/models/ego_blur_lp.jit --use-gpu
181
+ ```
182
+
183
+ A sample command using all args:
184
+ ```
185
+ cd ~/repos/EgoBlur/tools/vrs_mutation/build && \
186
+ ./ego_blur_vrs_mutation --in your_vrs_file --out your_output_vrs_file -f ~/models/ego_blur_face.jit --face-model-confidence-threshold 0.75 -l ~/models/ego_blur_lp.jit --license-plate-model-confidence-threshold 0.99 --scale-factor-detections 1.15 --nms-threshold 0.3 --use-gpu
187
+ ```
188
+
189
+ # Additional Installation Instructions
190
+
191
+ In this section we will cover additional installation instructions.
192
+
193
+
194
+ ## Check OS version
195
+
196
+ ```
197
+ hostnamectl
198
+ ```
199
+
200
+ This should give you the OS version which will be helpful in selecting the drivers and CUDA toolkit in the steps below.
201
+
202
+
203
+ ## Install make
204
+ ```
205
+ sudo apt install make
206
+ ```
207
+
208
+ ## Install gcc
209
+ ```
210
+ sudo apt install gcc
211
+ ```
212
+
213
+ ## Install OpenCV
214
+ ```
215
+ sudo apt install libopencv-dev
216
+ ```
217
+
218
+ ## Install utility unzip
219
+ ```
220
+ sudo apt install unzip
221
+ ```
222
+
223
+ ## Check if you have GPU
224
+
225
+ The following command should report the type of GPU installed on the machine:
226
+
227
+ ```
228
+ lspci | grep nvidia -i
229
+ ```
230
+
231
+ ## Decide GPU Driver
232
+
233
+ Based on the GPU type and OS version obtained previously, search for an appropriate driver at: [https://www.nvidia.com/Download/index.aspx?lang=en-us](https://www.nvidia.com/Download/index.aspx?lang=en-us)
234
+
235
+
236
+ ## Update package manager
237
+ ```
238
+ sudo apt update
239
+ ```
240
+
241
+ ## Install GPU drivers
242
+ ```
243
+ sudo apt install ubuntu-drivers-common
244
+ ```
245
+
246
+ Confirm that the package manager identifies your device (GPU) and recommends appropriate drivers:
247
+ ```
248
+ sudo ubuntu-drivers devices
249
+ ```
250
+
251
+ Finally, install the driver:
252
+ ```
253
+ sudo apt install nvidia-driver-535
254
+ ```
255
+
256
+ Reboot the system
257
+ ```
258
+ sudo reboot
259
+ ```
260
+
261
+ Check that the driver installation went correctly by running nvidia-smi:
262
+ ```
263
+ nvidia-smi
264
+ ```
265
+
266
+ ## Install CUDA Toolkit
267
+
268
+ Go to the PyTorch website and find the specific CUDA toolkit version you want to install (it should match the version your libtorch build supports). Since we are using libtorch v2.1.0 ([Previous PyTorch Versions | PyTorch](https://fburl.com/himtgbgc)), we will install libtorch v2.1.0 ([Previous PyTorch Versions | PyTorch](https://fburl.com/z2m3p81z)) with CUDA toolkit 12.1.
269
+
270
+
271
+ ### CUDA
272
+
273
+ Since we will be using libtorch v2.1.0 with CUDA toolkit 12.1, visit [https://developer.nvidia.com/cuda-toolkit-archive](https://developer.nvidia.com/cuda-toolkit-archive) for the installation instructions.
274
+
275
+ Get the CUDA runfile:
276
+
277
+ ```
278
+ wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
279
+ ```
280
+
281
+ Execute the CUDA runfile:
282
+
283
+ ```
284
+ sudo sh cuda_12.1.1_530.30.02_linux.run
285
+ ```
286
+
287
+ Since we have already installed the driver, we don't need to reinstall it; we can simply continue with the CUDA toolkit installation.
288
+
289
+ Export paths
290
+
291
+ ```
292
+ vi ~/.bashrc
293
+ ```
294
+
295
+ And add these lines:
296
+
297
+ ```
298
+ export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
299
+ export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
300
+ ```
301
+
302
+ Run
303
+ ```
304
+ source ~/.bashrc
305
+ ```
306
+
307
+
308
+ To verify the CUDA toolkit installation, run:
309
+ ```
310
+ nvcc --version
311
+ ```
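
Worked example for `--scale-factor-detections`: the detection box is scaled about its centre (width and height are multiplied by the factor) and the result is clipped to the image bounds. A small Python sketch mirroring the `scale_box` helper in `script/demo_ego_blur.py` (the box and image dimensions are made up):

```python
def scale_box(box, max_width, max_height, scale):
    """Scale an (x1, y1, x2, y2) box about its centre, then clip to the image."""
    x1, y1, x2, y2 = box
    w, h = (x2 - x1) * scale, (y2 - y1) * scale
    xc, yc = (x1 + x2) / 2, (y1 + y2) / 2
    return [max(xc - w / 2, 0), max(yc - h / 2, 0),
            min(xc + w / 2, max_width), min(yc + h / 2, max_height)]

# A 100x50 box scaled by the default factor 1.15 grows to 115x57.5 around the same centre.
print(scale_box([100, 100, 200, 150], 1408, 1408, 1.15))
# -> [92.5, 96.25, 207.5, 153.75]
```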
tools/vrs_mutation/CMakeLists.txt ADDED
@@ -0,0 +1,52 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+
15
+ cmake_minimum_required(VERSION 3.12)
16
+ project(ego_blur
17
+ LANGUAGES CXX)
18
+
19
+ set(CMAKE_CXX_STANDARD 17)
20
+ set(CMAKE_POSITION_INDEPENDENT_CODE ON)
21
+
22
+ set(PROJECTARIA_TOOLS_BUILD_TOOLS ON CACHE BOOL "")
23
+
24
+
25
+ include(FetchContent)
26
+ FetchContent_Declare(
27
+ projectaria_tools
28
+ GIT_REPOSITORY https://github.com/facebookresearch/projectaria_tools.git
29
+ GIT_TAG origin/main
30
+ SOURCE_DIR "${CMAKE_BINARY_DIR}/_deps/projectaria_tools-src/projectaria_tools"
31
+ )
32
+ FetchContent_MakeAvailable(projectaria_tools)
33
+ include_directories("${CMAKE_BINARY_DIR}/_deps/projectaria_tools-src")
34
+
35
+
36
+
37
+ find_package(TorchVision REQUIRED)
38
+ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
39
+
40
+ find_package(OpenCV REQUIRED)
41
+
42
+
43
+ add_executable( ego_blur_vrs_mutation
44
+ EgoBlurImageMutator.h
45
+ main.cpp
46
+ )
47
+ target_link_libraries(ego_blur_vrs_mutation
48
+ vrs_image_mutation_interface
49
+ TorchVision::TorchVision
50
+ CLI11::CLI11
51
+ "${TORCH_LIBRARIES}"
52
+ ${OpenCV_LIBS})
tools/vrs_mutation/EgoBlurImageMutator.h ADDED
@@ -0,0 +1,521 @@
1
+ /*
2
+ * Copyright (c) Meta Platforms, Inc. and affiliates.
3
+ *
4
+ * Licensed under the Apache License, Version 2.0 (the "License");
5
+ * you may not use this file except in compliance with the License.
6
+ * You may obtain a copy of the License at
7
+ *
8
+ * http://www.apache.org/licenses/LICENSE-2.0
9
+ *
10
+ * Unless required by applicable law or agreed to in writing, software
11
+ * distributed under the License is distributed on an "AS IS" BASIS,
12
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ * See the License for the specific language governing permissions and
14
+ * limitations under the License.
15
+ */
16
+
17
+ #pragma once
18
+
19
+ #include <c10/core/ScalarType.h>
20
+ #include <c10/util/Exception.h>
21
+ #include <torch/types.h>
22
+ #include <vrs/RecordFormat.h> // @manual
23
+
24
+ #include <projectaria_tools/tools/samples/vrs_mutation/ImageMutationFilterCopier.h> // @manual
25
+
26
+ #include <torch/script.h>
27
+ #include <torch/serialize.h> // @manual
28
+ #include <torch/torch.h>
29
+ #include <cstdint>
30
+ #include <iostream>
31
+ #include <memory>
32
+ #include <string>
33
+
34
+ #include <c10/cuda/CUDACachingAllocator.h>
35
+ #include <opencv2/core.hpp>
36
+ #include <opencv2/imgproc.hpp>
37
+
38
+ namespace EgoBlur {
39
+
40
+ struct EgoBlurImageMutator : public vrs::utils::UserDefinedImageMutator {
41
+ // Inherit from UserDefinedImageMutator as defined in projectaria_tools to
42
+ // blur detected faces/license plates. This class implements the logic to run
43
+ // model inference and performs blurring on VRS frame by frame and saves the
44
+ // output as a VRS file.
45
+ std::shared_ptr<torch::jit::script::Module> faceModel_;
46
+ std::shared_ptr<torch::jit::script::Module> licensePlateModel_;
47
+ float faceModelConfidenceThreshold_;
48
+ float licensePlateModelConfidenceThreshold_;
49
+ float scaleFactorDetections_;
50
+ float nmsThreshold_;
51
+ bool useGPU_;
52
+ bool clockwise90Rotation_;
53
+ std::unordered_map<std::string, std::unordered_map<std::string, int>> stats_;
54
+ torch::Device device_ = torch::kCPU;
55
+
56
+ explicit EgoBlurImageMutator(
57
+ const std::string& faceModelPath = "",
58
+ const float faceModelConfidenceThreshold = 0.1,
59
+ const std::string& licensePlateModelPath = "",
60
+ const float licensePlateModelConfidenceThreshold = 0.1,
61
+ const float scaleFactorDetections = 1.15,
62
+ const float nmsThreshold = 0.3,
63
+ const bool useGPU = true,
64
+ const bool clockwise90Rotation = true)
65
+ : faceModelConfidenceThreshold_(faceModelConfidenceThreshold),
66
+ licensePlateModelConfidenceThreshold_(
67
+ licensePlateModelConfidenceThreshold),
68
+ scaleFactorDetections_(scaleFactorDetections),
69
+ nmsThreshold_(nmsThreshold),
70
+ useGPU_(useGPU),
71
+ clockwise90Rotation_(clockwise90Rotation) {
72
+ device_ = getDevice();
73
+ std::cout << "attempting to load ego blur face model: " << faceModelPath
74
+ << std::endl;
75
+
76
+ if (!faceModelPath.empty()) {
77
+ faceModel_ = loadModel(faceModelPath);
78
+ }
79
+
80
+ std::cout << "attempting to load ego blur license plate model: "
81
+ << licensePlateModelPath << std::endl;
82
+
83
+ if (!licensePlateModelPath.empty()) {
84
+ licensePlateModel_ = loadModel(licensePlateModelPath);
85
+ }
86
+ }
87
+
88
+ std::shared_ptr<torch::jit::script::Module> loadModel(
89
+ const std::string& path) {
90
+ std::shared_ptr<torch::jit::script::Module> model;
91
+ try {
92
+ model = std::make_shared<torch::jit::script::Module>();
93
+ // patternlint-disable-next-line no-torch-low-level-api
94
+ *model = torch::jit::load(path);
95
+ std::cout << "Loaded model: " << path << std::endl;
96
+ model->to(device_);
97
+ model->eval();
98
+ } catch (const c10::Error&) {
99
+ std::cout << "Failed to load model: " << path << std::endl;
100
+ throw;
101
+ }
102
+ return model;
103
+ }
104
+
105
+ at::DeviceType getDevice() const {
106
+ if (useGPU_ && torch::cuda::is_available()) {
107
+ // using GPU
108
+ return torch::kCUDA;
109
+ } else {
110
+ // using CPU
111
+ return torch::kCPU;
112
+ }
113
+ }
114
+
115
+ torch::Tensor filterDetections(
116
+ c10::intrusive_ptr<c10::ivalue::Tuple> detections,
117
+ float scoreThreshold) const {
118
+ // filter predictions based on confidence scores; scores are at index 2
119
+ torch::Tensor scoreThresholdMask =
120
+ torch::gt(
121
+ detections->elements().at(2).toTensor(),
122
+ torch::tensor(scoreThreshold))
123
+ .detach();
124
+ // we have boxes at index 0
125
+ torch::Tensor filteredBoundingBoxes = detections->elements()
126
+ .at(0)
127
+ .toTensor()
128
+ .index({scoreThresholdMask})
129
+ .detach();
130
+ torch::Tensor filteredBoundingBoxesScores = detections->elements()
131
+ .at(2)
132
+ .toTensor()
133
+ .index({scoreThresholdMask})
134
+ .detach();
135
+
136
+ // filter out overlapping detections by performing NMS
137
+ torch::Tensor filteredBoundingBoxesPostNMS =
138
+ performNMS(
139
+ filteredBoundingBoxes, filteredBoundingBoxesScores, nmsThreshold_)
140
+ .detach();
141
+ scoreThresholdMask.reset();
142
+ filteredBoundingBoxes.reset();
143
+ filteredBoundingBoxesScores.reset();
144
+ return filteredBoundingBoxesPostNMS;
145
+ }
146
+
147
+ // Define a custom NMS function
148
+ torch::Tensor performNMS(
149
+ const torch::Tensor& boxes,
150
+ const torch::Tensor& scores,
151
+ float overlapThreshold) const {
152
+ // Convert tensors to CPU
153
+ torch::Tensor boxesCPU = boxes.to(torch::kCPU).detach();
154
+ torch::Tensor scoresCPU = scores.to(torch::kCPU).detach();
155
+
156
+ // Get the number of bounding boxes
157
+ int numBoxes = boxesCPU.size(0);
158
+
159
+ // Extract bounding box coordinates
160
+ auto boxesAccessor = boxesCPU.accessor<float, 2>();
161
+ auto scoresAccessor = scoresCPU.accessor<float, 1>();
162
+
163
+ std::vector<bool> picked(numBoxes, false);
164
+
165
+ for (int i = 0; i < numBoxes; ++i) {
166
+ if (!picked[i]) {
167
+ for (int j = i + 1; j < numBoxes; ++j) {
168
+ if (!picked[j]) {
169
+ float x1 = std::max(boxesAccessor[i][0], boxesAccessor[j][0]);
170
+ float y1 = std::max(boxesAccessor[i][1], boxesAccessor[j][1]);
171
+ float x2 = std::min(boxesAccessor[i][2], boxesAccessor[j][2]);
172
+ float y2 = std::min(boxesAccessor[i][3], boxesAccessor[j][3]);
173
+
174
+ float intersection =
175
+ std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
176
+ float iou = intersection /
177
+ ((boxesAccessor[i][2] - boxesAccessor[i][0]) *
178
+ (boxesAccessor[i][3] - boxesAccessor[i][1]) +
179
+ (boxesAccessor[j][2] - boxesAccessor[j][0]) *
180
+ (boxesAccessor[j][3] - boxesAccessor[j][1]) -
181
+ intersection);
182
+
183
+ if (iou > overlapThreshold) {
184
+ if (scoresAccessor[i] > scoresAccessor[j]) {
185
+ picked[j] = true;
186
+ } else {
187
+ picked[i] = true;
188
+ }
189
+ }
190
+ }
191
+ }
192
+ }
193
+ }
194
+
195
+ std::vector<int> selectedIndices;
196
+ for (int i = 0; i < numBoxes; ++i) {
197
+ if (!picked[i]) {
198
+ selectedIndices.push_back(i);
199
+ }
200
+ }
201
+
202
+ torch::Tensor filteredBoundingBoxes =
203
+ torch::index_select(
204
+ boxes.to(torch::kCPU),
205
+ 0,
206
+ torch::from_blob(
207
+ selectedIndices.data(),
208
+ {static_cast<long>(selectedIndices.size())},
209
+ torch::kInt))
210
+ .detach();
211
+
212
+ boxesCPU.reset();
213
+ scoresCPU.reset();
214
+ return filteredBoundingBoxes;
215
+ }
216
+
217
+ static std::vector<float> scaleBox(
218
+ const std::vector<float>& box,
219
+ int maxWidth,
220
+ int maxHeight,
221
+ float scale) {
222
+ // Extract x1, y1, x2, and y2 from the input box.
223
+ float x1 = box[0];
224
+ float y1 = box[1];
225
+ float x2 = box[2];
226
+ float y2 = box[3];
227
+ float w = x2 - x1;
228
+ float h = y2 - y1;
229
+
230
+ // Calculate the center point of the box.
231
+ float xc = x1 + (w / 2);
232
+ float yc = y1 + (h / 2);
233
+ // Scale the width and height of the box.
234
+ w = scale * w;
235
+ h = scale * h;
236
+ // Update the coordinates of the box to fit within the maximum dimensions.
237
+ x1 = std::max(xc - (w / 2), 0.0f);
238
+ y1 = std::max(yc - (h / 2), 0.0f);
239
+ x2 = std::min(xc + (w / 2), static_cast<float>(maxWidth));
240
+ y2 = std::min(yc + (h / 2), static_cast<float>(maxHeight));
241
+ // Return the scaled box as a vector of vectors.
242
+ return {x1, y1, x2, y2};
243
+ }
244
+
245
+ cv::Mat blurImage(
246
+ const cv::Mat& image,
247
+ const std::vector<torch::Tensor>& detections,
248
+ float scale) {
249
+ // Use the mask to combine the original and blurred images
250
+ cv::Mat response = image.clone();
251
+ cv::Mat mask;
252
+ if (image.channels() == 3) {
253
+ mask = cv::Mat::zeros(image.size(), CV_8UC3);
254
+ } else {
255
+ mask = cv::Mat::zeros(image.size(), CV_8UC1);
256
+ }
257
+ for (const auto& detection : detections) {
258
+ for (auto& box : detection.unbind()) {
259
+ std::vector<float> boxVector(
260
+ box.data_ptr<float>(), box.data_ptr<float>() + box.numel());
261
+ if (scale != 1.0f) {
262
+ boxVector = scaleBox(boxVector, image.cols, image.rows, scale);
263
+ }
264
+ int x1 = static_cast<int>(boxVector[0]);
265
+ int y1 = static_cast<int>(boxVector[1]);
266
+ int x2 = static_cast<int>(boxVector[2]);
267
+ int y2 = static_cast<int>(boxVector[3]);
268
+ int w = x2 - x1;
269
+ int h = y2 - y1;
270
+
271
+ // Blur region inside ellipse
272
+ cv::Scalar color;
273
+ if (image.channels() == 3) {
274
+ color = cv::Scalar(255, 255, 255);
275
+ } else {
276
+ color = cv::Scalar(255);
277
+ }
278
+
279
+ cv::ellipse(
280
+ mask,
281
+ cv::Point((x1 + x2) / 2, (y1 + y2) / 2),
282
+ cv::Size(w / 2, h / 2),
283
+ 0,
284
+ 0,
285
+ 360,
286
+ color,
287
+ -1);
288
+ // Apply blur effect to the whole image
289
+ cv::Size ksize = cv::Size(image.rows / 8, image.cols / 8);
290
+ cv::Mat blurredImage;
291
+ cv::blur(image(cv::Rect({x1, y1, w, h})), blurredImage, ksize);
292
+ blurredImage.copyTo(
293
+ response(cv::Rect({x1, y1, w, h})), mask(cv::Rect({x1, y1, w, h})));
294
+ blurredImage.release();
295
+ }
296
+ }
297
+ mask.release();
298
+ return response;
299
+ }
300
+
301
+ cv::Mat detectAndBlur(
302
+ vrs::utils::PixelFrame* frame,
303
+ const std::string& frameId) {
304
+ // Convert PixelFrame to cv::Mat
305
+ const int width = frame->getWidth();
306
+ const int height = frame->getHeight();
307
+ // Deduce type of the Array (can be either GRAY or RGB)
308
+ const int channels =
309
+ frame->getPixelFormat() == vrs::PixelFormat::RGB8 ? 3 : 1;
310
+
311
+ cv::Mat img = cv::Mat(
312
+ height,
313
+ width,
314
+ CV_8UC(channels),
315
+ static_cast<void*>(frame->getBuffer().data()))
316
+ .clone();
317
+
318
+ // Rotate image if needed
319
+ if (clockwise90Rotation_) {
320
+ cv::rotate(img, img, cv::ROTATE_90_CLOCKWISE);
321
+ }
322
+
323
+ torch::NoGradGuard no_grad;
324
+
325
+ // Convert image to tensor
326
+ torch::Tensor imgTensor = torch::from_blob(
327
+ (void*)frame->rdata(), {height, width, channels}, torch::kUInt8);
328
+ // torch::Tensor imgTensor = getImageTensor(frame);
329
+ torch::Tensor imgTensorFloat = imgTensor.to(torch::kFloat);
330
+
331
+ // If you need to move to GPU
332
+ torch::Tensor imgTensorFloatOnDevice = imgTensorFloat.to(device_);
333
+
334
+ torch::Tensor imgTensorFloatOnDevicePostRotation;
335
+ // rotate the image clockwise
336
+ if (clockwise90Rotation_) {
337
+ imgTensorFloatOnDevicePostRotation =
338
+ torch::rot90(imgTensorFloatOnDevice, -1);
339
+ } else {
340
+ imgTensorFloatOnDevicePostRotation = imgTensorFloatOnDevice;
341
+ }
342
+ // convert from HWC to CHW
343
+ torch::Tensor imgTensorFloatOnDevicePostRotationCHW =
344
+ imgTensorFloatOnDevicePostRotation.permute({2, 0, 1});
345
+
346
+ // Create input tensor for model inference
347
+ std::vector<torch::jit::IValue> inputs = {
348
+ imgTensorFloatOnDevicePostRotationCHW};
349
+
350
+ // Create output vector to store results
351
+ std::vector<torch::Tensor> boundingBoxes;
352
+
353
+ cv::Mat finalImage;
354
+
355
+ torch::Tensor faceBoundingBoxes;
356
+ torch::Tensor licensePlateBoundingBoxes;
357
+
358
+ // Begin making detections
359
+ // use face model to find faces
360
+ if (faceModel_) {
361
+ c10::intrusive_ptr<c10::ivalue::Tuple> faceDetections =
362
+ faceModel_->forward(inputs)
363
+ .toTuple(); // returns boxes, labels, scores, dims
364
+ faceBoundingBoxes =
365
+ filterDetections(faceDetections, faceModelConfidenceThreshold_);
366
+ int totalFaceDetectionsForCurrentFrame = faceBoundingBoxes.sizes()[0];
367
+ stats_[frameId]["faces"] += totalFaceDetectionsForCurrentFrame;
368
+ if (faceBoundingBoxes.sizes()[0] > 0) {
369
+ boundingBoxes.push_back(faceBoundingBoxes);
370
+ }
371
+ faceDetections.reset();
372
+ }
373
+
374
+ // use LP model to find LP
375
+ if (licensePlateModel_) {
376
+ c10::intrusive_ptr<c10::ivalue::Tuple> licensePlateDetections =
377
+ licensePlateModel_->forward(inputs)
378
+ .toTuple(); // returns boxes, labels, scores, dims
379
+ licensePlateBoundingBoxes = filterDetections(
380
+ licensePlateDetections, licensePlateModelConfidenceThreshold_);
381
+ int totalLicensePlateDetectionsForCurrentFrame =
382
+ licensePlateBoundingBoxes.sizes()[0];
383
+ stats_[frameId]["licensePlate"] +=
384
+ totalLicensePlateDetectionsForCurrentFrame;
385
+ if (licensePlateBoundingBoxes.sizes()[0] > 0) {
386
+ boundingBoxes.push_back(licensePlateBoundingBoxes);
387
+ }
388
+ licensePlateDetections.reset();
389
+ }
390
+
391
+ if (!boundingBoxes.empty()) {
392
+ // Blur the image
393
+ finalImage = blurImage(img, boundingBoxes, scaleFactorDetections_);
394
+
395
+ // Rotate image back if needed
396
+ if (clockwise90Rotation_) {
397
+ cv::rotate(finalImage, finalImage, cv::ROTATE_90_COUNTERCLOCKWISE);
398
+ }
399
+ // Force Cleanup
400
+ boundingBoxes.clear();
401
+ }
402
+ // Force Cleanup
403
+ inputs.clear();
404
+ imgTensor.reset();
405
+ imgTensorFloat.reset();
406
+ imgTensorFloatOnDevice.reset();
407
+ imgTensorFloatOnDevicePostRotation.reset();
408
+ imgTensorFloatOnDevicePostRotationCHW.reset();
409
+ faceBoundingBoxes.reset();
410
+ licensePlateBoundingBoxes.reset();
411
+ img.release();
412
+ return finalImage;
413
+ }
414
+
415
+ bool operator()(
416
+ double timestamp,
417
+ const vrs::StreamId& streamId,
418
+ vrs::utils::PixelFrame* frame) override {
419
+ // Handle the case where we have no image data
420
+ if (!frame) {
421
+ return false;
422
+ }
423
+
424
+ cv::Mat blurredImage;
425
+ // Only process RGB (214) and SLAM (1201) streams, i.e. skip eye tracking images
426
+ if (streamId.getNumericName().find("214") != std::string::npos ||
427
+ streamId.getNumericName().find("1201") != std::string::npos) {
428
+ // Get predictions and blur
429
+ std::string frameId =
430
+ streamId.getNumericName() + "_" + std::to_string(timestamp);
431
+ stats_[frameId]["faces"] = 0;
432
+ stats_[frameId]["licensePlate"] = 0;
433
+ blurredImage = detectAndBlur(frame, frameId);
434
+ }
435
+ // Copy back results into the frame
436
+ if (!blurredImage.empty()) {
437
+ // RGB
438
+ if (streamId.getNumericName().find("214") != std::string::npos) {
439
+ std::memcpy(
440
+ frame->wdata(),
441
+ blurredImage.data,
442
+ frame->getWidth() * frame->getStride());
443
+ }
444
+ // Gray
445
+ else if (streamId.getNumericName().find("1201") != std::string::npos) {
446
+ std::memcpy(
447
+ frame->wdata(),
448
+ blurredImage.data,
449
+ frame->getWidth() * frame->getHeight());
450
+ }
451
+ }
452
+ blurredImage.release();
453
+ c10::cuda::CUDACachingAllocator::emptyCache();
454
+ return true;
455
+ }
456
+
457
+ std::string logStatistics() const {
458
+ std::string statsString;
459
+ int totalFrames = 0;
460
+ int totalRGBFramesWithFaces = 0;
461
+ int totalRGBFaces = 0;
462
+ int totalSLAMFramesWithFaces = 0;
463
+ int totalSLAMFaces = 0;
464
+ int totalRGBFramesWithLicensePlate = 0;
465
+ int totalRGBLicensePlate = 0;
466
+ int totalSLAMFramesWithLicensePlate = 0;
467
+ int totalSLAMLicensePlate = 0;
468
+
469
+ for (const auto& outer : stats_) {
470
+ const std::string& frameId = outer.first;
471
+ const std::unordered_map<std::string, int>& categoryBoxCountMapping =
472
+ outer.second;
473
+
474
+ // Do something with the outer key and inner map
475
+ for (const auto& innerPair : categoryBoxCountMapping) {
476
+ const std::string& category = innerPair.first;
477
+ int boxCount = innerPair.second;
478
+
479
+ if (boxCount > 0) {
480
+ if (category == "faces") {
481
+ if (frameId.find("214") != std::string::npos) {
482
+ totalRGBFramesWithFaces++;
483
+ totalRGBFaces += boxCount;
484
+ } else if (frameId.find("1201") != std::string::npos) {
485
+ totalSLAMFramesWithFaces++;
486
+ totalSLAMFaces += boxCount;
487
+ }
488
+ }
489
+ if (category == "licensePlate") {
490
+ if (frameId.find("214") != std::string::npos) {
491
+ totalRGBFramesWithLicensePlate++;
492
+ totalRGBLicensePlate += boxCount;
493
+ } else if (frameId.find("1201") != std::string::npos) {
494
+ totalSLAMFramesWithLicensePlate++;
495
+ totalSLAMLicensePlate += boxCount;
496
+ }
497
+ }
498
+ }
499
+ }
500
+ totalFrames++;
501
+ }
502
+
503
+ std::ostringstream summary;
504
+ summary << " ----------------" << "\n| Summary |"
505
+ << "\n ----------------" << "\nTotal frames: " << totalFrames
506
+ << "\n Faces:" << "\n RGB - Total detected frame: "
507
+ << totalRGBFramesWithFaces
508
+ << "\n RGB - Total detections: " << totalRGBFaces
509
+ << "\n SLAM - Total detected frame: " << totalSLAMFramesWithFaces
510
+ << "\n SLAM - Total detections: " << totalSLAMFaces
511
+ << "\n License Plates:" << "\n RGB - Total detected frame: "
512
+ << totalRGBFramesWithLicensePlate
513
+ << "\n RGB - Total detections: " << totalRGBLicensePlate
514
+ << "\n SLAM - Total detected frame: "
515
+ << totalSLAMFramesWithLicensePlate
516
+ << "\n SLAM - Total detections: " << totalSLAMLicensePlate;
517
+ return summary.str();
518
+ }
519
+ };
520
+
521
+ } // namespace EgoBlur
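
Unlike the Python demo, which calls `torchvision.ops.nms`, the `performNMS` helper above implements the suppression rule directly: for every pair of boxes whose IoU exceeds the threshold, the lower-scoring box is suppressed. An illustrative Python sketch of the same pairwise rule (the boxes and scores are made up):

```python
def pairwise_nms(boxes, scores, overlap_threshold):
    """Suppress the lower-scoring box of any pair whose IoU exceeds the threshold."""
    n = len(boxes)
    suppressed = [False] * n
    for i in range(n):
        for j in range(i + 1, n):
            if suppressed[i] or suppressed[j]:
                continue
            # Intersection rectangle and IoU of boxes i and j.
            x1 = max(boxes[i][0], boxes[j][0]); y1 = max(boxes[i][1], boxes[j][1])
            x2 = min(boxes[i][2], boxes[j][2]); y2 = min(boxes[i][3], boxes[j][3])
            inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
            area_i = (boxes[i][2] - boxes[i][0]) * (boxes[i][3] - boxes[i][1])
            area_j = (boxes[j][2] - boxes[j][0]) * (boxes[j][3] - boxes[j][1])
            iou = inter / (area_i + area_j - inter)
            if iou > overlap_threshold:
                if scores[i] > scores[j]:
                    suppressed[j] = True
                else:
                    suppressed[i] = True
    return [i for i in range(n) if not suppressed[i]]

boxes = [[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]]
scores = [0.95, 0.60, 0.40]
print(pairwise_nms(boxes, scores, 0.3))  # -> [0, 2]
```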
tools/vrs_mutation/main.cpp ADDED
@@ -0,0 +1,145 @@
1
+ /*
2
+ * Copyright (c) Meta Platforms, Inc. and affiliates.
3
+ *
4
+ * Licensed under the Apache License, Version 2.0 (the "License");
5
+ * you may not use this file except in compliance with the License.
6
+ * You may obtain a copy of the License at
7
+ *
8
+ * http://www.apache.org/licenses/LICENSE-2.0
9
+ *
10
+ * Unless required by applicable law or agreed to in writing, software
11
+ * distributed under the License is distributed on an "AS IS" BASIS,
12
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ * See the License for the specific language governing permissions and
14
+ * limitations under the License.
15
+ */
16
+
17
+ #include <cstdlib>
18
+ #include <memory>
19
+ #include <string>
20
+
21
+ #include <fmt/core.h>
22
+ #include <vrs/utils/FilterCopy.h> // @manual
23
+ #include <vrs/utils/RecordFileInfo.h> // @manual
24
+
25
+ #include <projectaria_tools/tools/samples/vrs_mutation/ImageMutationFilterCopier.h> // @manual
26
+ #include "EgoBlurImageMutator.h"
27
+
28
+ #include <CLI/CLI.hpp>
29
+
30
+ int main(int argc, const char* argv[]) {
31
+ // std::string mutationType;
32
+ std::string vrsPathIn;
33
+ std::string vrsPathOut;
34
+ std::string vrsExportPath;
35
+ std::string faceModelPath;
36
+ std::string licensePlateModelPath;
37
+ float faceModelConfidenceThreshold;
38
+ float licensePlateModelConfidenceThreshold;
39
+ float scaleFactorDetections;
40
+ float nmsThreshold;
41
+ bool useGPU = false;
42
+
43
+ CLI::App app{
44
+ "VRS file Mutation example by using VRS Copy + Filter mechanism"};
45
+
46
+ app.add_option("-i,--in", vrsPathIn, "VRS input")->required();
47
+ app.add_option("-o,--out", vrsPathOut, "VRS output")->required();
48
+ app.add_option("-f, --faceModelPath", faceModelPath, "Face model path");
49
+ app.add_option(
50
+ "--face-model-confidence-threshold",
51
+ faceModelConfidenceThreshold,
52
+ "Face model confidence threshold")
53
+ ->default_val(0.1);
54
+ app.add_option(
55
+ "-l, --licensePlateModelPath",
56
+ licensePlateModelPath,
57
+ "License Plate model path");
58
+ app.add_option(
59
+ "--license-plate-model-confidence-threshold",
60
+ licensePlateModelConfidenceThreshold,
61
+ "License plate model confidence threshold")
62
+ ->default_val(0.1);
63
+ app.add_option(
64
+ "--scale-factor-detections",
65
+ scaleFactorDetections,
66
+ "scale factor for scaling detections in dimensions")
67
+ ->default_val(1.15);
68
+ app.add_option(
69
+ "--nms-threshold",
70
+ nmsThreshold,
71
+ "NMS threshold for filtering overlapping detections")
72
+ ->default_val(0.3);
73
+ app.add_flag("--use-gpu", useGPU, "Use GPU for inference");
74
+ app.add_option("-e,--exportPath", vrsExportPath, "VRS export output path");
75
+
76
+ CLI11_PARSE(app, argc, argv);
77
+
78
+ if (vrsPathIn == vrsPathOut) {
79
+ std::cerr << " <VRS_IN> <VRS_OUT> paths must be different." << std::endl;
80
+ return EXIT_FAILURE;
81
+ }
82
+
83
+ vrs::utils::FilteredFileReader filteredReader;
84
+ // Initialize VRS Reader and filters
85
+ filteredReader.setSource(vrsPathIn);
86
+ filteredReader.openFile();
87
+ filteredReader.applyFilters({});
88
+
89
+ // Configure Copy Filter and initialize the copy
90
+ const std::string targetPath = vrsPathOut;
91
+ vrs::utils::CopyOptions copyOptions;
92
+ copyOptions.setCompressionPreset(vrs::CompressionPreset::Default);
93
+
94
+ // Functor to perform image processing(blurring PII faces/license plates)
95
+ try {
96
+ std::shared_ptr<vrs::utils::UserDefinedImageMutator> imageMutator;
97
+ if (setenv("ONEDNN_PRIMITIVE_CACHE_CAPACITY", "1", 1) == 0) {
98
+ // See github issue https://github.com/pytorch/pytorch/issues/29893 for
99
+ // details
100
+ std::cout << "Successfully Set ONEDNN_PRIMITIVE_CACHE_CAPACITY to 1"
101
+ << std::endl;
102
+ }
103
+ if (setenv("TORCH_CUDNN_V8_API_DISABLED", "1", 1) == 0) {
104
+ std::cout << "Successfully Set TORCH_CUDNN_V8_API_DISABLED to 1"
105
+ << std::endl;
106
+ }
107
+ imageMutator = std::make_shared<EgoBlur::EgoBlurImageMutator>(
108
+ faceModelPath,
109
+ faceModelConfidenceThreshold,
110
+ licensePlateModelPath,
111
+ licensePlateModelConfidenceThreshold,
112
+ scaleFactorDetections,
113
+ nmsThreshold,
114
+ useGPU);
115
+
116
+ auto copyMakeStreamFilterFunction =
117
+ [&imageMutator](
118
+ vrs::RecordFileReader& fileReader,
119
+ vrs::RecordFileWriter& fileWriter,
120
+ vrs::StreamId streamId,
121
+ const vrs::utils::CopyOptions& copyOptions)
122
+ -> std::unique_ptr<vrs::utils::RecordFilterCopier> {
123
+ auto imageMutatorFilter =
124
+ std::make_unique<vrs::utils::ImageMutationFilter>(
125
+ fileReader,
126
+ fileWriter,
127
+ streamId,
128
+ copyOptions,
129
+ imageMutator.get());
130
+ return imageMutatorFilter;
131
+ };
132
+
133
+ const int statusCode = filterCopy(
134
+ filteredReader, targetPath, copyOptions, copyMakeStreamFilterFunction);
135
+ auto* const egoBlurMutator =
136
+ dynamic_cast<EgoBlur::EgoBlurImageMutator*>(imageMutator.get());
137
+ std::cout << egoBlurMutator->logStatistics() << std::endl;
138
+ return statusCode;
139
+ } catch (const std::exception& ex) {
140
+ std::cerr << "Error while applying EgoBlur mutation to: "
141
+ << vrsPathIn << "\nError :\n"
142
+ << ex.what() << std::endl;
143
+ return EXIT_FAILURE;
144
+ }
145
+ }
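
Both the Python demo (`visualize`) and the C++ mutator (`blurImage`) apply the same masking idea: blur the detection rectangle, then composite the blurred pixels back only inside an ellipse inscribed in the box, so the anonymized region gets soft elliptical edges. A compact OpenCV sketch of that composition on a synthetic image (the box coordinates are made up):

```python
import cv2
import numpy as np

# Synthetic 200x200 BGR image and one hypothetical detection box.
image = np.random.randint(0, 255, (200, 200, 3), dtype=np.uint8)
x1, y1, x2, y2 = 40, 40, 160, 160

# Blur only the detection rectangle in a copy of the image.
blurred = image.copy()
ksize = (image.shape[0] // 2, image.shape[1] // 2)
blurred[y1:y2, x1:x2] = cv2.blur(blurred[y1:y2, x1:x2], ksize)

# Single-channel mask with a filled ellipse inscribed in the box.
mask = np.zeros(image.shape[:2], dtype=np.uint8)
cv2.ellipse(mask, (((x1 + x2) // 2, (y1 + y2) // 2), (x2 - x1, y2 - y1), 0), 255, -1)

# Keep the original outside the ellipse and the blurred pixels inside it.
background = cv2.bitwise_and(image, image, mask=cv2.bitwise_not(mask))
foreground = cv2.bitwise_and(blurred, blurred, mask=mask)
anonymized = cv2.add(background, foreground)
```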