If you’re working with computer vision, you know that tracking objects in a video stream can be a challenging task. Kalman Filters can be an effective solution to this problem, and when combined with OpenCV and Python, they become even more powerful. In this blog post, we will walk through a Kalman Filter OpenCV Python example to track the movement of people in a video stream. By the end of this tutorial, you’ll have a deeper understanding of how Kalman Filters work, and you’ll be equipped with the knowledge needed to use them in your own computer vision projects. So, let’s dive in and learn how to use Kalman Filters with OpenCV and Python via example!
What is a Kalman Filter?
A Kalman filter is an algorithm that is used to estimate the state of a time-varying system in the presence of noise and uncertainty. It was developed by Rudolf Kalman in the 1960s and has since become one of the most widely used and influential algorithms in the fields of control theory and signal processing.
The basic idea behind the Kalman filter is to use a probabilistic model to estimate the true state of a system at each time step, based on measurements that are subject to noise and other forms of uncertainty. The filter works by recursively estimating the state of the system at each time step, using a combination of the previous estimate and the most recent measurement.
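To make the recursion concrete, here is a minimal one-dimensional sketch in plain Python. It is not part of the tracker built later in this post; the function name `kalman_1d`, the noise values `q` and `r`, and the sample measurements are all illustrative assumptions. Each step blends the previous estimate with the newest measurement, weighted by the Kalman gain.

```python
def kalman_1d(measurements, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Estimate a roughly constant scalar from noisy measurements.

    q is the process noise variance and r the measurement noise variance
    (both are assumed values chosen for illustration).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, so only the uncertainty grows.
        p = p + q
        # Correct: the gain k weighs the new measurement against the prediction.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Made-up noisy readings of a quantity whose true value is about 5.
noisy = [5.3, 4.6, 5.1, 4.9, 5.4, 4.8, 5.0, 5.2]
estimates = kalman_1d(noisy)
print(estimates[-1])
```

The estimate starts at the initial guess of 0 and moves toward the measurements, taking smaller corrections as the filter's own uncertainty `p` shrinks.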
The Kalman filter is particularly useful in applications where the measurements are subject to noise and the underlying system dynamics are uncertain or complex. It has been applied in numerous fields, including aerospace, robotics, economics, and communication systems, to name just a few.
Using Kalman Filters in OpenCV
The OpenCV library provides a KalmanFilter class that holds the matrices we need to define. The filter estimates the state of an object at each time step by combining measurements of that object’s position with predictions from a mathematical motion model. To illustrate how the Kalman Filter works for object tracking, let’s look at some code that will be part of a class we construct. In particular, we will examine how histogram back projection is used to locate an object in each frame of a sequence, which provides the measurement that updates the filter state.
The code first initializes a histogram, which counts how many pixels fall into each range (bin) of values. To do so, it extracts the region of interest (ROI) from the current frame by selecting the rectangle defined by the track_window coordinates.
x, y, w, h = track_window
roi = hsv_frame[y:y+h, x:x+w]
Next, it generates an HSV histogram based on the selected region using
roi_hist = cv2.calcHist([roi], [0, 2], None, [15, 16],[0, 180, 0, 256])
The second parameter, [0, 2], specifies which channels go into the histogram. Channels are zero-indexed, so channel 0 is hue (range 0 to 179 in OpenCV) and channel 2 is value (range 0 to 255). Saturation, which is channel 1, is not used.
The third parameter is the mask that limits which pixels contribute to the histogram. None means we want every pixel in the selected region included in the computation.
The fourth parameter, [15, 16], gives the number of bins along each dimension: 15 bins along the hue dimension and 16 bins along the value dimension.
Finally, the fifth parameter, [0, 180, 0, 256], gives the range of each dimension. Hue covers [0, 180) in OpenCV’s HSV representation, whereas saturation and value both cover [0, 256), hence the upper bound of 256 for the value channel.
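The bin layout can be checked without OpenCV. The sketch below is a NumPy analogue of the histogram call, not the OpenCV function itself: `np.histogram2d` is given the same bin counts (15 and 16) and the same ranges for hue and value. The random `roi` array is an assumption that stands in for a real HSV patch.

```python
import numpy as np

# Synthetic 40x30 HSV patch (assumption: stands in for a real region of interest).
rng = np.random.default_rng(0)
roi = np.stack([
    rng.integers(0, 180, size=(40, 30)),  # hue, 0..179 in OpenCV
    rng.integers(0, 256, size=(40, 30)),  # saturation, 0..255
    rng.integers(0, 256, size=(40, 30)),  # value, 0..255
], axis=-1).astype(np.uint8)

hue = roi[..., 0].ravel()    # channel 0
value = roi[..., 2].ravel()  # channel 2

# 15 hue bins over the hue range and 16 value bins over the value range.
hist, hue_edges, value_edges = np.histogram2d(
    hue, value, bins=[15, 16], range=[[0, 180], [0, 256]])

print(hist.shape)                        # one row per hue bin, one column per value bin
print(hue_edges[1] - hue_edges[0])       # width of one hue bin
print(value_edges[1] - value_edges[0])   # width of one value bin
```

With these settings each hue bin spans 12 hue units and each value bin spans 16 intensity levels, so similar colors land in the same bin and small lighting changes matter less.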
This histogram is then normalized using
self.roi_hist = cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
The final parameter, cv2.NORM_MINMAX, specifies min-max normalization: the histogram is rescaled linearly so that its smallest value maps to 0 and its largest value maps to 255. The normalized histogram is then used at each step of the tracking process to compute the back projection, a per-pixel likelihood map, for each new frame. The object location found in that map is what updates the Kalman Filter state.
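Min-max normalization is a simple linear rescaling. As a rough NumPy sketch of the idea (the helper name `minmax_normalize` is hypothetical, and the handling of a constant input is this sketch's own choice rather than a statement about OpenCV):

```python
import numpy as np


def minmax_normalize(hist, low=0.0, high=255.0):
    """Linearly rescale hist so its minimum maps to low and its maximum to high."""
    hist = np.asarray(hist, dtype=np.float32)
    span = hist.max() - hist.min()
    if span == 0:
        # Constant input: nothing to stretch, so return the lower bound everywhere.
        return np.full_like(hist, low)
    return (hist - hist.min()) / span * (high - low) + low


# Made-up bin counts for a tiny 2x2 histogram.
counts = np.array([[2, 10], [6, 4]], dtype=np.float32)
normalized = minmax_normalize(counts)
print(normalized)
```

The fullest bin ends up at 255 and the emptiest at 0, which keeps the back projection on the same 0 to 255 scale regardless of how many pixels the original region contained.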
Kalman Filter Matrices
To use a Kalman Filter for object tracking, we need to set up some matrices that define the process model and measurement model of our system. Specifically, we need:
- A state vector x that represents the position and velocity of the object to be tracked. In this case, our state vector has four variables: x position, y position, x velocity, and y velocity.
- A transition matrix A that maps the current state vector to its next state vector based on a linear motion model. In other words, it predicts how the object will move given its current velocity.
- A measurement matrix H that maps the state vector to the observed measurements (e.g. x and y positions).
- A process noise covariance matrix Q that models the uncertainty in our motion model.
Let’s explore how to set up these parameters for a 2D object tracking scenario.
First, we create a Kalman filter object with 4 state variables and 2 measurement variables using cv2.KalmanFilter(4, 2). Then we set up the measurement matrix self.kalman.measurementMatrix as a 2×4 matrix that maps our 4-dimensional state vector to the measured x and y coordinates (i.e. only positions are directly observable).
self.kalman = cv2.KalmanFilter(4, 2)
self.kalman.measurementMatrix = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0]], np.float32)
The transition matrix self.kalman.transitionMatrix defines how the state vector evolves from time step t to t+1, based on a simple linear motion model in which objects move at constant velocity. The first two rows map the current position onto the next position (the position advances by the current velocity), while the last two rows keep the velocity estimates unchanged.
self.kalman.transitionMatrix = np.array(
    [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 1]], np.float32)
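As a quick check of this motion model, multiplying the transition matrix by a state vector should advance the position by the velocity and leave the velocity unchanged. The state values below are made-up numbers for illustration.

```python
import numpy as np

A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], np.float32)

# Example state: position (100, 50), velocity (3, -2) pixels per frame.
state = np.array([[100], [50], [3], [-2]], np.float32)

# One time step of the constant-velocity model.
next_state = A @ state
print(next_state.ravel())  # position becomes (103, 48), velocity stays (3, -2)
```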
The process noise covariance matrix self.kalman.processNoiseCov represents the uncertainty in our motion model and affects how the Kalman filter predicts the next state. In this particular case, it is an identity matrix scaled by 0.03, meaning that we assume a small, independent amount of noise in each of our 4 state variables.
self.kalman.processNoiseCov = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]], np.float32) * 0.03
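To see what this matrix does during prediction, the standard Kalman covariance prediction, P' = A P Aᵀ + Q, can be evaluated by hand. The sketch below uses NumPy rather than the OpenCV class, and the starting covariance `P` is an assumed value chosen only for illustration.

```python
import numpy as np

A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], np.float32)
Q = np.eye(4, dtype=np.float32) * 0.03

# Assumed starting error covariance; illustrative only.
P = np.eye(4, dtype=np.float32)

history = []
for step in range(3):
    # Covariance prediction: propagate through the motion model, then add Q.
    P = A @ P @ A.T + Q
    history.append(np.diag(P).copy())
    print(step + 1, history[-1])
```

Without a measurement to correct it, the variance of every state variable grows at each step, and the position variances grow fastest because they also absorb the velocity uncertainty. A larger scale factor than 0.03 would make the filter trust its motion model less and follow the measurements more closely.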
Next, we initialize the predicted state self.kalman.statePre with a 4×1 column vector that holds the initial position estimate in x and y (i.e., the center of the tracked window) and zero velocities. Lastly, we set self.kalman.statePost to the same values as statePre, since no measurements have been taken yet.
cx = x+w/2
cy = y+h/2
self.kalman.statePre = np.array([[cx], [cy], [0], [0]], np.float32)
self.kalman.statePost = np.array([[cx], [cy], [0], [0]], np.float32)
After defining our Kalman filter, we can update it at every time step and estimate the location and velocity of our tracked object based on newly observed measurements.
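The per-step update consists of a prediction followed by a correction. The sketch below writes both out in NumPy so the arithmetic is visible; it mirrors the textbook Kalman equations rather than calling the OpenCV class. The measurement noise covariance `R`, the starting covariance `P`, and the simulated measurements are assumptions made for this example.

```python
import numpy as np

A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], np.float32)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], np.float32)
Q = np.eye(4, dtype=np.float32) * 0.03
R = np.eye(2, dtype=np.float32)  # measurement noise covariance (assumed value)

x = np.zeros((4, 1), np.float32)  # state: x, y, vx, vy
P = np.eye(4, dtype=np.float32)   # error covariance (assumed starting value)


def predict(x, P):
    """Project the state and its uncertainty one time step ahead."""
    return A @ x, A @ P @ A.T + Q


def correct(x, P, z):
    """Blend the prediction with the measurement z using the Kalman gain."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4, dtype=np.float32) - K @ H) @ P
    return x, P


# Simulated object moving 2 px right and 1 px down per frame.
for t in range(1, 21):
    z = np.array([[2.0 * t], [1.0 * t]], np.float32)
    x, P = predict(x, P)
    x, P = correct(x, P, z)

print(x.ravel())  # position near (40, 20), velocity near (2, 1)
```

Only positions are ever measured here, yet the filter recovers the velocity as well, because the transition matrix ties position changes to the velocity entries of the state.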
Tracker Class for Kalman Filter OpenCV Python Example
Now let’s combine the code to create a Tracker class for our Kalman filter OpenCV Python example, which we can easily use in an overall script for tracking people from frame to frame in a video. Here is the Tracker() class code:
import cv2
import numpy as np


class Tracker():
    """
    This class represents a tracker object that uses OpenCV and Kalman Filters.
    """

    def __init__(self, id, hsv_frame, track_window):
        """
        Initializes the Tracker object.

        Args:
            id (int): Identifier for the tracker.
            hsv_frame (numpy.ndarray): HSV frame.
            track_window (tuple): Tuple containing the initial position of the
                tracked object (x, y, width, height).
        """
        self.id = id
        self.track_window = track_window
        self.term_crit = (cv2.TERM_CRITERIA_COUNT | cv2.TERM_CRITERIA_EPS, 10, 1)

        # Initialize the histogram.
        x, y, w, h = track_window
        roi = hsv_frame[y:y+h, x:x+w]
        roi_hist = cv2.calcHist([roi], [0, 2], None, [15, 16], [0, 180, 0, 256])
        self.roi_hist = cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

        # Create a Kalman filter object with 4 state variables and 2 measurement variables.
        self.kalman = cv2.KalmanFilter(4, 2)

        # Set the measurement matrix of the Kalman filter.
        # It defines how the state variables are mapped to the measurement variables.
        # In this case, it is a 2x4 matrix that maps the state variables to the
        # x and y position measurements.
        self.kalman.measurementMatrix = np.array(
            [[1, 0, 0, 0],
             [0, 1, 0, 0]], np.float32)

        # Set the transition matrix of the Kalman filter.
        # It defines how the state variables evolve over time.
        # In this case, it is a 4x4 matrix that represents a simple linear motion model.
        self.kalman.transitionMatrix = np.array(
            [[1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0],
             [0, 0, 0, 1]], np.float32)

        # Set the process noise covariance matrix of the Kalman filter.
        # It represents the uncertainty in the process model and affects how the
        # Kalman filter predicts the next state.
        # In this case, it is a diagonal matrix scaled by 0.03.
        self.kalman.processNoiseCov = np.array(
            [[1, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]], np.float32) * 0.03

        cx = x+w/2
        cy = y+h/2

        # Set the initial predicted state of the Kalman filter.
        # It is a 4x1 column vector that represents the initial estimate of the
        # tracked object's state. The first two elements are the predicted x and y
        # positions, initialized to the center of the tracked window. The last two
        # are the velocities, initialized to zero.
        self.kalman.statePre = np.array([[cx], [cy], [0], [0]], np.float32)

        # Set the corrected state of the Kalman filter.
        # It is a 4x1 column vector that represents the current estimated state of
        # the tracked object. Initially, it is set to the same value as the
        # predicted state.
        self.kalman.statePost = np.array([[cx], [cy], [0], [0]], np.float32)
Example Video for Tracking
We’ll need a video in which most people stay fairly still while a few move around, so that there is movement for the Kalman Filter to capture. We also need this video to be copyright free, so we’ll dive into some C-SPAN classics.
Tracking and Drawing Contours
To actually use the tracker in our Kalman filter OpenCV Python example, we need to set up the script that detects and tracks people in the video:
import cv2

# Assumes the Tracker class defined above is available in this script
# (or imported from the module where it was saved).

# Open the video file.
cap = cv2.VideoCapture('video.mp4')

# Create the KNN background subtractor.
bg_subtractor = cv2.createBackgroundSubtractorKNN()

# Set the history length for the background subtractor.
history_length = 20
bg_subtractor.setHistory(history_length)

# Create kernels for erode and dilate operations.
erode_kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
dilate_kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 7))

# Create an empty list to store the tracked senators.
senators = []

# Counter to keep track of the number of history frames populated.
num_history_frames_populated = 0

# Start processing each frame of the video.
while True:
    # Read the current frame from the video.
    grabbed, frame = cap.read()

    # If there are no more frames to read, break out of the loop.
    if not grabbed:
        break

    # Apply the KNN background subtractor to get the foreground mask.
    fg_mask = bg_subtractor.apply(frame)

    # Let the background subtractor build up a history before further processing.
    if num_history_frames_populated < history_length:
        num_history_frames_populated += 1
        continue

    # Create the thresholded image using the foreground mask.
    _, thresh = cv2.threshold(fg_mask, 127, 255, cv2.THRESH_BINARY)

    # Perform erosion and dilation to improve the thresholded image.
    cv2.erode(thresh, erode_kernel, thresh, iterations=2)
    cv2.dilate(thresh, dilate_kernel, thresh, iterations=2)

    # Find contours in the thresholded image.
    contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Convert the frame to HSV color space for tracking.
    hsv_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Draw red rectangles around large contours.
    # If there are no senators being tracked yet, create new trackers.
    should_initialize_senators = len(senators) == 0
    id = 0
    for c in contours:
        # Check if the contour area is larger than a threshold.
        if cv2.contourArea(c) > 500:
            # Get the bounding rectangle coordinates.
            (x, y, w, h) = cv2.boundingRect(c)

            # Draw a rectangle around the contour (OpenCV colors are in BGR order, so this is red).
            cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 1)

            # If no senators are being tracked yet, create a new tracker for each contour.
            if should_initialize_senators:
                senators.append(Tracker(id, hsv_frame, (x, y, w, h)))
                id += 1

    # Update the tracking of each senator.
    for senator in senators:
        senator.update(frame, hsv_frame)

    # Display the frame with senators being tracked.
    cv2.imshow('Senators Tracked', frame)

    # Wait for the user to press a key (110ms delay).
    k = cv2.waitKey(110)

    # If the user presses the Escape key (key code 27), exit the loop.
    if k == 27:
        break

# Release the video and close the display window.
cap.release()
cv2.destroyAllWindows()
This Python code implements object tracking using Kalman Filter on a video of senators in a hall. The code processes each frame of the input video through the following steps:
- Read the next frame of the video with cap.read(), where cap is the cv2.VideoCapture object. If there are no more frames to read, exit the loop.
- Apply the KNN background subtractor to get the foreground mask (fg_mask). This separates moving objects from the background by comparing the current frame against a learned background model.
- Let the background subtractor build up a history before further processing, as this gives a more accurate foreground mask.
- Create a thresholded image from fg_mask. The threshold level is 127, so any pixel value above 127 in fg_mask is set to 255 (fully white) and every other pixel is set to 0 (fully black).
- Perform erosion and dilation on the thresholded image to improve its quality. These operations help remove noise and smooth out contour boundaries.
- Find contours in the thresholded image using cv2.findContours().
- Convert the input frame to the HSV color space for tracking. HSV separates color information into three channels (hue, saturation, and value), which makes it more suitable for color-based object tracking than RGB.
- Draw red rectangles around contours with an area larger than 500 (assumed to be objects of interest), and create a new tracker for each one if the list of trackers is empty.
- Update each senator’s tracker with the current frame and hsv_frame by calling senator.update(frame, hsv_frame).
- Display the frame with the tracked senators using cv2.imshow().
- If the user presses the Escape key (key code 27), exit the loop.
Note that the Tracker class is instantiated in the line senators.append(Tracker(id, hsv_frame, (x, y, w, h))) to track each identified senator.
Let’s take a look at some of the object detection and tracking that occurred in our Kalman Filter OpenCV Python Example once we run the code:
In conclusion, Kalman Filters are a powerful tool for object tracking and detection in Computer Vision. By incorporating assumptions about the underlying dynamics of an object’s motion and its measurement noise, we can perform accurate and efficient tracking even in challenging scenarios like crowded corridors or constant occlusions. In this blog post, we have demonstrated how to use Kalman Filters to track the positions of senators in a C-SPAN video using open-source Python libraries like OpenCV. We hope you have found this tutorial useful and feel confident in your ability to apply Kalman Filters to your own object tracking tasks. With some creativity and ingenuity, you can use these techniques to analyze a wide range of visual data and gain new insights into the behavior of objects and people.