{ "cells": [ { "cell_type": "markdown", "id": "75257e56-c228-4d99-808f-2d6cf7576087", "metadata": {}, "source": [ "# Person Counting System using YOLOv8 and OpenVINO™" ] }, { "cell_type": "markdown", "id": "719e91cb-62ae-4049-b976-a89c7ebc0a59", "metadata": {}, "source": [ "This notebook uses the Ultralytics `ObjectCounter` solution to build a real-time people counting system on top of the YOLOv8 object detection model, converted to OpenVINO Intermediate Representation to speed up inference. The system monitors the number of individuals entering and exiting a room by counting the people that the optimized YOLOv8 model detects.\n", "\n", "Running inference with OpenVINO Runtime on Intel hardware can improve processing speed, which makes the system suited to applications that need real-time data, such as occupancy management and traffic flow control in public spaces and commercial settings.\n", "\n", "> **NOTE**: To use this notebook with a webcam, you need to run the notebook on a computer with a webcam. If you run the notebook on a server, the webcam will not work. 
However, you can still run inference on a video file.\n", "\n", "\n", "#### Table of contents:\n", "\n", "- [Install pre-requisites](#Install-pre-requisites)\n", "- [Download Model](#Download-Model)\n", "- [Inference function](#Inference-function)\n", "- [Run live demo](#Run-live-demo)\n", "- [Select inference device](#Select-inference-device)\n" ] }, { "cell_type": "markdown", "id": "86727720-ccae-471b-a8b3-b1b6e41f9e94", "metadata": {}, "source": [ "## Install pre-requisites\n", "[back to top ⬆️](#Table-of-contents:)" ] }, { "cell_type": "code", "execution_count": null, "id": "74ee9d6e-9025-4179-8b0a-c8517168e5be", "metadata": {}, "outputs": [], "source": [ "%pip install -q \"openvino>=2024.0.0\" \"ultralytics==8.2.24\" \"torch>=2.1\" --extra-index-url https://download.pytorch.org/whl/cpu" ] }, { "cell_type": "markdown", "id": "85b9a300-05e9-4dfb-bfb1-ceba64ef5acc", "metadata": {}, "source": [ "## Download Model\n", "Download YOLOv8 and convert it to OpenVINO Intermediate Representation (IR)." ] }, { "cell_type": "markdown", "id": "67d375ac", "metadata": {}, "source": [ "[back to top ⬆️](#Table-of-contents:)" ] }, { "cell_type": "code", "execution_count": null, "id": "06461f51-72ab-401c-8c50-bb04c2f6ae88", "metadata": { "scrolled": true }, "outputs": [], "source": [ "from pathlib import Path\n", "from ultralytics import YOLO\n", "\n", "models_dir = Path(\"./models\")\n", "models_dir.mkdir(exist_ok=True)\n", "\n", "DET_MODEL_NAME = \"yolov8n\"\n", "\n", "det_model = YOLO(models_dir / f\"{DET_MODEL_NAME}.pt\")\n", "label_map = det_model.model.names\n", "\n", "# Need to make an empty call to initialize the model\n", "res = det_model()\n", "det_model_path = models_dir / f\"{DET_MODEL_NAME}_openvino_model/{DET_MODEL_NAME}.xml\"\n", "if not det_model_path.exists():\n", "    det_model.export(format=\"openvino\", dynamic=True, half=True)" ] }, { "cell_type": "markdown", "id": "a40c8848", "metadata": {}, "source": [ "## Inference function\n", "[back to top ⬆️](#Table-of-contents:)" ] }, { 
"cell_type": "code", "execution_count": 3, "id": "2ecf6009-cd02-4fa3-9707-a883c5877dc8", "metadata": {}, "outputs": [], "source": [ "from ultralytics import YOLO, solutions\n", "import cv2\n", "import time\n", "import collections\n", "import numpy as np\n", "from IPython import display\n", "import torch\n", "import openvino as ov\n", "import ipywidgets as widgets\n", "\n", "\n", "def run_inference(source, device):\n", "    core = ov.Core()\n", "\n", "    det_ov_model = core.read_model(det_model_path)\n", "    ov_config = {}\n", "\n", "    if device.value != \"CPU\":\n", "        det_ov_model.reshape({0: [1, 3, 640, 640]})\n", "    if \"GPU\" in device.value or (\"AUTO\" in device.value and \"GPU\" in core.available_devices):\n", "        ov_config = {\"GPU_DISABLE_WINOGRAD_CONVOLUTION\": \"YES\"}\n", "    compiled_model = core.compile_model(det_ov_model, device.value, ov_config)\n", "\n", "    def infer(*args):\n", "        result = compiled_model(args)\n", "        return torch.from_numpy(result[0])\n", "\n", "    # Use OpenVINO as the inference engine\n", "    det_model.predictor.inference = infer\n", "    det_model.predictor.model.pt = False\n", "\n", "    try:\n", "        cap = cv2.VideoCapture(source)\n", "        assert cap.isOpened(), \"Error reading video file\"\n", "\n", "        line_points = [(0, 300), (1080, 300)]  # line or region points\n", "        classes_to_count = [0]  # person is class 0 in the COCO dataset\n", "\n", "        # Init Object Counter\n", "        counter = solutions.ObjectCounter(\n", "            view_img=False, reg_pts=line_points, classes_names=det_model.names, draw_tracks=True, line_thickness=2, view_in_counts=False, view_out_counts=False\n", "        )\n", "        # Processing time\n", "        processing_times = collections.deque(maxlen=200)\n", "\n", "        while cap.isOpened():\n", "            success, frame = cap.read()\n", "            if not success:\n", "                print(\"Video frame is empty or video processing has been successfully completed.\")\n", "                break\n", "\n", "            start_time = time.time()\n", "            tracks = det_model.track(frame, persist=True, show=False, classes=classes_to_count, verbose=False)\n", "            frame = counter.start_counting(frame, tracks)\n", "            stop_time = time.time()\n", "\n", "            processing_times.append(stop_time - start_time)\n", "\n", "            # Mean processing time [ms].\n", "            _, f_width = frame.shape[:2]\n", "            processing_time = np.mean(processing_times) * 1000\n", "            fps = 1000 / processing_time\n", "            cv2.putText(\n", "                img=frame,\n", "                text=f\"Inference time: {processing_time:.1f}ms ({fps:.1f} FPS)\",\n", "                org=(20, 40),\n", "                fontFace=cv2.FONT_HERSHEY_COMPLEX,\n", "                fontScale=f_width / 1000,\n", "                color=(0, 0, 255),\n", "                thickness=2,\n", "                lineType=cv2.LINE_AA,\n", "            )\n", "\n", "            # Get the counts. Only the 'OUT' count is read here.\n", "            # Modify this logic as needed.\n", "            counts = counter.out_counts\n", "\n", "            # Define the text to display\n", "            text = f\"Count: {counts}\"\n", "            fontFace = cv2.FONT_HERSHEY_COMPLEX\n", "            fontScale = 0.75  # Adjust scale as needed\n", "            thickness = 2\n", "\n", "            # Calculate the size of the text box\n", "            (text_width, text_height), _ = cv2.getTextSize(text, fontFace, fontScale, thickness)\n", "\n", "            # Define the upper right corner for the text\n", "            top_right_corner = (frame.shape[1] - text_width - 20, 40)\n", "            # Draw the count of \"OUT\" on the frame\n", "            cv2.putText(\n", "                img=frame,\n", "                text=text,\n", "                org=(top_right_corner[0], top_right_corner[1]),\n", "                fontFace=fontFace,\n", "                fontScale=fontScale,\n", "                color=(0, 0, 255),\n", "                thickness=thickness,\n", "                lineType=cv2.LINE_AA,\n", "            )\n", "\n", "            # Show the frame\n", "            _, encoded_img = cv2.imencode(ext=\".jpg\", img=frame, params=[cv2.IMWRITE_JPEG_QUALITY, 100])\n", "            # Create an IPython image.\n", "            i = display.Image(data=encoded_img)\n", "            # Display the image in this notebook.\n", "            display.clear_output(wait=True)\n", "            display.display(i)\n", "    except KeyboardInterrupt:\n", "        print(\"Interrupted\")\n", "\n", "    cap.release()\n", "    cv2.destroyAllWindows()" ] }, { "cell_type": "markdown", "id": "e542d587-7da3-4477-bfad-f9858fd752ae", "metadata": 
{}, "source": [ "## Run live demo\n", "[back to top ⬆️](#Table-of-contents:)" ] }, { "cell_type": "code", "execution_count": 4, "id": "e1c479d9-a401-43a4-af69-e4130513c5b5", "metadata": {}, "outputs": [], "source": [ "WEBCAM_INFERENCE = False\n", "\n", "if WEBCAM_INFERENCE:\n", "    VIDEO_SOURCE = 0  # Webcam\n", "else:\n", "    VIDEO_SOURCE = \"https://github.com/intel-iot-devkit/sample-videos/raw/master/people-detection.mp4\"" ] }, { "cell_type": "markdown", "id": "c1fe0dbe-fdcb-468f-804f-d214a6f2257b", "metadata": {}, "source": [ "> **NOTE**: Make sure to restart the kernel and run all cells when switching between video and webcam input to avoid errors." ] }, { "cell_type": "markdown", "id": "7f673ce4-a237-4ef5-adb8-f51c0fe5b3a9", "metadata": {}, "source": [ "## Select inference device\n", "[back to top ⬆️](#Table-of-contents:)" ] }, { "cell_type": "code", "execution_count": null, "id": "312fe227-b67e-47f6-8742-5c5fe01edc41", "metadata": {}, "outputs": [], "source": [ "core = ov.Core()\n", "\n", "device = widgets.Dropdown(\n", "    options=core.available_devices + [\"AUTO\"],\n", "    value=\"AUTO\",\n", "    description=\"Device:\",\n", "    disabled=False,\n", ")\n", "\n", "device" ] }, { "cell_type": "code", "execution_count": null, "id": "45bf1064-8ae5-4036-ba8d-7f330c164dfd", "metadata": {}, "outputs": [], "source": [ "run_inference(\n", "    source=VIDEO_SOURCE,\n", "    device=device,\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" }, "openvino_notebooks": { "imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/e0525f8a-4578-4c56-a0a5-ce68e30d2d45", "tags": { "categories": [ "Live Demos" ], "libraries": [], "other": [], "tasks": [ "Object Detection" ] } 
} }, "nbformat": 4, "nbformat_minor": 5 }