Engineering · 2024

Live RTSP Camera Motion Capture & AI Object Recognition

Overview

A Python system that pulls motion events from TP-Link Tapo ONVIF cameras, records the RTSP stream on motion, and runs AI object recognition to produce annotated snapshots with labelled bounding boxes. It is built around the open-source peterstamps project and was evaluated and run on a Raspberry Pi 4, including a higher-performance multiprocessing variant.

The Challenge

Reliable smart-camera capture on constrained edge hardware is hard: you must subscribe to ONVIF motion events, keep a rolling buffer of pre-motion frames, record RTSP without running out of memory, and optionally call an object-detection model, all on a fanless Raspberry Pi without dropping frames.

What We Built

Several runnable Python programs explore the design space. myMPTapoDetectCaptureVideo.py (the preferred multiprocessing version) and myTapoDetectCaptureVideo.py combine ONVIF motion detection (myTapoMotionDetection.py) with RTSP capture (myTapoVideoCapture.py), and both are driven by config modules (myTapoMotionConfig.py / myMPTapoMotionConfig.py). The capture loop keeps a deque of recent frames as a pre-record buffer, discards the oldest frames first (first in, first out), and watches a memory-full percentage to avoid running out of memory. When AI recognition is enabled, frames are sent to an AI object server that returns compact JPEGs with detected objects marked and labelled. Motion sensitivity adapts automatically across dusk and dawn, moving between separate day and night thresholds. ONVIF is handled via onvif-zeep-async/zeep with downloaded WSDL files.
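The pre-record buffer described above can be illustrated with a minimal sketch. This is not the project's code: the class name PreRecordBuffer, its parameters, and the injected memory_percent callable are hypothetical, chosen so the example runs with the standard library alone. On a real system the callable could be something like psutil.virtual_memory().percent, but that is an assumption about how memory would be measured.

```python
from collections import deque
from typing import Callable, Deque, List


class PreRecordBuffer:
    """Rolling pre-motion frame buffer (hypothetical helper, not from the project)."""

    def __init__(
        self,
        max_frames: int,
        memory_percent: Callable[[], float],
        memory_full_percent: float = 90.0,
    ) -> None:
        # deque(maxlen=...) silently discards the oldest frame once full (FIFO).
        self.frames: Deque[bytes] = deque(maxlen=max_frames)
        self._memory_percent = memory_percent
        self.memory_full_percent = memory_full_percent

    def push(self, frame: bytes) -> None:
        """Add a frame, shedding the oldest frames first under memory pressure."""
        while self.frames and self._memory_percent() >= self.memory_full_percent:
            self.frames.popleft()
        self.frames.append(frame)

    def drain(self) -> List[bytes]:
        """Return the buffered frames oldest-first and empty the buffer."""
        frames = list(self.frames)
        self.frames.clear()
        return frames


# Usage: keep the last 3 frames while memory use is reported as low.
buffer = PreRecordBuffer(max_frames=3, memory_percent=lambda: 10.0)
for name in (b"f1", b"f2", b"f3", b"f4"):
    buffer.push(name)
pre_motion_frames = buffer.drain()  # oldest frame b"f1" has been discarded
```

When motion is detected, the drained frames would be written ahead of the live frames so the recording starts slightly before the event. The bounded deque caps the frame count, and the percentage check is a second guard for when individual frames are large.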

Technologies & Approach

Python with asyncio and the ONVIF SOAP/zeep stack for event subscription, RTSP for live video, and an external AI object-recognition server for detection. A multiprocessing variant offloads work for smoother capture on the Pi. Configuration-first design makes thresholds, paths, and the AI toggle easy to tune per hardware.
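One way to picture the multiprocessing split is a producer/consumer pair connected by queues: the capture side only enqueues frames, while a separate process does the slower work. The sketch below is an assumption about the general pattern, not the structure of myMPTapoDetectCaptureVideo.py. The function names are hypothetical, the "work" is a stand-in that measures frame size, and the explicit "fork" start method assumes a Linux host such as Raspberry Pi OS.

```python
import multiprocessing as mp
from typing import List

# Assumption: Linux (e.g. Raspberry Pi OS), where the "fork" start method exists.
_CTX = mp.get_context("fork")


def process_frames(frame_queue, result_queue) -> None:
    """Worker process: consume frames until the None sentinel arrives."""
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        # Stand-in for expensive work (encoding, disk writes, AI server calls).
        result_queue.put(len(frame))
    result_queue.put(None)  # tell the parent that no more results will follow


def run_pipeline(frames: List[bytes]) -> List[int]:
    """Feed frames to a worker process and collect its results in order."""
    frame_queue = _CTX.Queue(maxsize=64)  # bounded, so a slow worker applies back-pressure
    result_queue = _CTX.Queue()
    worker = _CTX.Process(target=process_frames, args=(frame_queue, result_queue))
    worker.start()

    for frame in frames:
        frame_queue.put(frame)
    frame_queue.put(None)  # sentinel: end of stream

    results: List[int] = []
    while True:
        item = result_queue.get()
        if item is None:
            break
        results.append(item)
    worker.join()
    return results
```

Calling run_pipeline([b"ab", b"cde"]) returns the per-frame results from the worker. Keeping the frame queue bounded means the capture side blocks instead of growing memory without limit when the worker falls behind, which matters on a device with limited RAM.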

Outcome / Impact

A working, tested smart-camera pipeline (validated on a Tapo C225 and Raspberry Pi 4) that shows the studio can integrate IP cameras end to end: ONVIF events, RTSP recording, memory-safe buffering, and edge AI object detection. The work is positioned as the evaluation and operation of an open-source project, adapted to real hardware constraints.

Capabilities Demonstrated

  • ONVIF motion-event subscription and camera integration
  • RTSP live-stream capture with memory-safe pre-record buffering
  • Motion-triggered recording and snapshot generation
  • AI object detection with labelled bounding boxes at the edge
  • Adaptive day/night sensitivity and multiprocessing performance tuning
  • Deployment on constrained edge hardware (Raspberry Pi)
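
The adaptive day/night sensitivity listed above could work in several ways. As a minimal sketch, assuming a simple linear blend between a day threshold and a night threshold across fixed dawn and dusk windows, it might look like the following. The function name, parameters, and example times are all hypothetical and do not come from the project's config modules.

```python
def blended_threshold(
    minute_of_day: int,
    day_threshold: float,
    night_threshold: float,
    dawn: tuple = (360, 420),    # assumed dawn window, minutes since midnight
    dusk: tuple = (1080, 1140),  # assumed dusk window, minutes since midnight
) -> float:
    """Return a motion threshold that ramps linearly through dawn and dusk."""
    dawn_start, dawn_end = dawn
    dusk_start, dusk_end = dusk

    if dawn_start <= minute_of_day < dawn_end:
        # Ramp from the night value to the day value during dawn.
        fraction = (minute_of_day - dawn_start) / (dawn_end - dawn_start)
        return night_threshold + (day_threshold - night_threshold) * fraction
    if dusk_start <= minute_of_day < dusk_end:
        # Ramp from the day value back to the night value during dusk.
        fraction = (minute_of_day - dusk_start) / (dusk_end - dusk_start)
        return day_threshold + (night_threshold - day_threshold) * fraction
    if dawn_end <= minute_of_day < dusk_start:
        return day_threshold
    return night_threshold


# Usage: halfway through the assumed dawn window (06:30).
mid_dawn = blended_threshold(390, day_threshold=20.0, night_threshold=60.0)
```

A gradual ramp avoids a sudden jump in sensitivity at a single switch-over time, which could otherwise trigger or suppress recordings right at the boundary.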