# 008 How to detect faces, eyes and smiles using Haar Cascade Classifiers with OpenCV in Python
Highlights: If you have any type of camera that does face detection, it is probably using a Haar feature-based cascade classifier for object detection. In this post we are going to learn what Haar cascade classifiers are and how to use them to detect faces, eyes and smiles.
Tutorial Overview:
- Understanding Haar cascade classifiers
- How to detect faces, eyes and smiles with Haar cascade classifiers?
1. Understanding Haar Cascade Classifiers
Two decades ago, face detection was a tricky job. It was an area of ongoing research, and you had to be an expert programmer to perform object detection on images. Then, in 2001, Paul Viola and Michael Jones published the research article entitled “Rapid Object Detection using a Boosted Cascade of Simple Features”. This work revolutionized image processing, and it is still one of the most commonly used methods for face detection.
There are several problems with face detection that we need to solve. We are often dealing with a high-resolution image, we do not know the size of the face in the image, and we do not know how many faces there are in the image. Moreover, we need to account for people of different ethnic or age groups, people with beards, and people wearing glasses. So, when it comes to face detection, it is difficult to obtain results that are both accurate and quick.
But thanks to Viola and Jones, this is not a big problem anymore. They came up with the Haar cascade (the Viola-Jones algorithm), a machine learning object detection algorithm that can be used to identify objects in images or videos. It uses many simple features, called Haar features, to determine whether the object (face, eyes) is present in the image or video.
The Viola-Jones algorithm consists of the following steps:
- Training Haar classifier
- Haar feature selection
- Creating an integral image
- Applying Adaboost algorithm
- Cascade classifiers
Let us have a look at the following example of face detection with the Viola-Jones algorithm.
Training Haar classifier
The first thing that we need to do is to train the Haar classifier using a large number of images. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. Positive images contain faces, and negative images do not contain any face. We use these training images to train Haar classifiers, which will then be used to detect faces in test images or videos.
Haar feature selection
Once the positive and negative images have been collected, the next step is to extract Haar features from these images using sliding windows of simple rectangular blocks. A Haar feature is calculated by subtracting the sum of the pixel intensities under the white rectangles from the sum of the pixel intensities under the black rectangles. Basically, we are looking for parts of an image where one region is brighter or darker than a neighboring region.
However, summing large groups of pixels for a large number of sliding windows is quite a slow process. It results in hundreds of thousands of calculations. Therefore, to reduce this cost, Viola and Jones introduced the concept of an integral image, which makes this process much faster.
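As a rough illustration of a single Haar feature, here is a minimal sketch in plain Python. The helper names `region_sum` and `edge_feature` and the tiny 4x4 patches are assumptions made up for this example. The sketch sums each rectangle by brute force and follows the sign convention described above (white sum subtracted from black sum).

```python
# Minimal sketch of a two-rectangle (edge) Haar feature.
# All names and pixel values here are illustrative assumptions.

def region_sum(image, top, left, height, width):
    """Sum the pixel intensities inside a rectangle by brute force."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )

def edge_feature(image, top, left, height, width):
    """White rectangle on the upper half, black rectangle on the lower half."""
    half = height // 2
    white = region_sum(image, top, left, half, width)
    black = region_sum(image, top + half, left, half, width)
    # Subtract the white sum from the black sum.
    return black - white

# A patch with a dark upper band and a bright lower band.
edge_patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
# A patch with no structure at all.
flat_patch = [[50] * 4 for _ in range(4)]

print(edge_feature(edge_patch, 0, 0, 4, 4))
print(edge_feature(flat_patch, 0, 0, 4, 4))
```

The feature responds strongly where the two halves differ in brightness and gives zero on a uniform patch, which is what makes it useful for spotting edges such as the boundary between the eyes and the cheeks.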
Creating an integral image
To create the integral image, we first precompute running sums of the pixel intensities in our input image and store them in an intermediate form. That intermediate form (the integral image) lets us add or subtract rectangular areas of the image with only a few lookups. You can better understand this with the help of the following example.
Every pixel in the integral image is the sum of all pixels that are above and to the left of that pixel, including that pixel itself. Using this method, we can quickly calculate the sum over any rectangular area of the image.
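The idea can be sketched in a few lines of plain Python. The function names `integral_image` and `rect_sum` are assumptions for this example, and the table is padded with an extra row and column of zeros so that no special cases are needed at the image border (OpenCV's `cv2.integral` uses the same padded layout).

```python
# Minimal sketch of an integral image, assuming a small list-of-lists image.

def integral_image(image):
    """table[r + 1][c + 1] holds the sum of all pixels in rows 0..r and columns 0..c."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            # Sum above (previous table row) plus the running sum of this row.
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table

def rect_sum(table, top, left, height, width):
    """Sum of any rectangle using only four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
print(table[3][3])                  # sum of the whole image: 45
print(rect_sum(table, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```

Once the table is built, the cost of a rectangle sum no longer depends on the size of the rectangle, which is why Haar features can be evaluated so quickly at any scale.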
Applying Adaboost algorithm
Although we now have a way to calculate features quickly using integral images, we end up with more than 160,000 features, and most of them are irrelevant. For example, the feature in the image below is relevant for the region of the eyes because the eyes are darker than the bridge of the nose. But the same feature is irrelevant when applied to the cheeks or any other area of the image.
So, how can we solve this? We need to select the best features among them. This can be done using an algorithm known as AdaBoost, which selects the most relevant features and trains the classifiers that use them. This algorithm combines many weak classifiers into a strong classifier by giving more weight to the training examples that were classified incorrectly. In that way we can reduce the number of features from 160,000 to about 6,000.
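To make the selection step more concrete, here is a small sketch of standard discrete AdaBoost with threshold "stumps" as weak classifiers. It is not the exact variant from the Viola-Jones paper, and the toy feature table, the labels and all function names are assumptions invented for this example. Each row holds three feature values for one training window, and only the feature at index 1 actually separates faces (+1) from non-faces (-1).

```python
import math

def stump_predict(row, j, threshold, polarity):
    """Weak classifier: compare a single feature value against a threshold."""
    return 1 if polarity * row[j] >= polarity * threshold else -1

def best_stump(features, labels, weights):
    """Find the stump with the lowest weighted error over all features."""
    best = None
    for j in range(len(features[0])):
        for threshold in sorted({row[j] for row in features}):
            for polarity in (1, -1):
                error = sum(
                    w for row, y, w in zip(features, labels, weights)
                    if stump_predict(row, j, threshold, polarity) != y
                )
                if best is None or error < best[0]:
                    best = (error, j, threshold, polarity)
    return best

def adaboost(features, labels, rounds):
    n = len(labels)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        error, j, threshold, polarity = best_stump(features, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        chosen.append((alpha, j, threshold, polarity))
        # Misclassified samples get a larger weight for the next round.
        weights = [
            w * math.exp(-alpha * y * stump_predict(row, j, threshold, polarity))
            for row, y, w in zip(features, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen

def strong_predict(chosen, row):
    """Strong classifier: weighted vote of the selected weak classifiers."""
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in chosen)
    return 1 if score >= 0 else -1

# Toy data: feature 1 is informative, features 0 and 2 are noise.
features = [[5, 9, 1], [3, 8, 7], [4, 7, 2], [4, 2, 6], [5, 3, 2], [3, 1, 7]]
labels = [1, 1, 1, -1, -1, -1]

chosen = adaboost(features, labels, rounds=3)
selected = {j for _, j, _, _ in chosen}
print(selected)
```

On this toy data the procedure keeps returning to the one informative feature and ignores the noisy ones, which is the same effect that shrinks the full feature pool down to a few thousand useful features.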
Cascade classifiers
But even with these 6,000 features selected using AdaBoost, we still have a lot of processing to do. To solve this problem, the researchers developed a system called cascade classifiers. A cascade classifier consists of a collection of stages, as can be seen in the image below.
At the first location of the sliding window we start with only one feature. That is stage one. If there is nothing in that region, the detector immediately slides the window to the next location. So, for most of the image, there will be very little computation. On the other hand, if there is anything that looks like a face in that region, the window is passed on to stage two. We keep going like this until a stage fails. Finally, if the window passes all 38 stages and nothing fails, we have detected a face.
Now, let us see how we can use Haar cascade classifiers in Python.
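The early-rejection logic can be sketched as follows. This is only an illustration: the three stages, their scoring functions and thresholds, and the dictionary-based "windows" are hypothetical stand-ins for the real stages, which each evaluate a group of Haar features.

```python
# Minimal sketch of a cascade: a window is rejected at the first failing stage.
# Stages, scores and thresholds below are hypothetical.

def run_cascade(stages, window):
    """Return (is_face, number_of_stages_evaluated)."""
    for index, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, index  # rejected early, later stages are skipped
    return True, len(stages)

stages = [
    (lambda w: w["contrast"], 0.2),   # cheap first stage
    (lambda w: w["eye_band"], 0.5),   # more selective
    (lambda w: w["symmetry"], 0.7),   # most selective
]

background = {"contrast": 0.05, "eye_band": 0.9, "symmetry": 0.9}
face_like = {"contrast": 0.6, "eye_band": 0.8, "symmetry": 0.75}

print(run_cascade(stages, background))  # (False, 1)
print(run_cascade(stages, face_like))   # (True, 3)
```

Because most windows in a typical image are background, most of them leave the cascade after the first cheap stage, and only the few face-like windows pay for all stages.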
2. How to detect faces, eyes and smiles with Haar cascade classifiers?
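Before you start programming, be sure to download the three cascade files used below (haarcascade_frontalface_default.xml, haarcascade_eye.xml and haarcascade_smile.xml) from the GitHub directory of Haar cascades, and place them where your Python script can load them.
Now let us import the necessary libraries and load our image.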
# Necessary imports
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
# Loading the image
img = cv2.imread("emily_clark.jpg")
cv2_imshow(img)
Output:
In the following lines of code we load the face_cascade, eye_cascade and smile_cascade classifiers.
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
smile_cascade = cv2.CascadeClassifier('haarcascade_smile.xml')
Next, we need to convert our image into grayscale because Haar cascades work on gray images. So, we are going to detect faces, eyes and smiles in the grayscale image, but we will draw rectangles around the detections on the color image.
In the first step we will detect the face. To get the coordinates of the rectangle that we are going to draw around each detected face, we create an object called faces, in which we store the detections. The function detectMultiScale() returns one rectangle per detection, each made of four elements: \(x\) and \(y\) are the coordinates of the top left corner, and \(w\) and \(h\) are the width and height of the rectangle. This method takes several arguments. The first one is the gray image, the input image on which we will detect faces. The second argument is the scale factor, which specifies how much the image size is reduced at each image scale. The third argument is the minimal number of neighbors, which specifies how many neighboring detections each candidate rectangle needs in order to be retained.
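To get a feeling for the scale factor, the sketch below lists the window sizes that would be tried for a given image size. It is an approximation of the idea and not OpenCV's exact internal schedule. The function name `window_sizes`, the 480-pixel image size and the 24-pixel base window are assumptions for this example.

```python
# Rough sketch: how the scale factor controls the number of scales searched.
# base_window=24 and image_size=480 are assumed values for illustration.

def window_sizes(image_size, base_window=24, scale_factor=1.1):
    """List the detection window sizes, growing by scale_factor each step."""
    sizes = []
    scale = 1.0
    while base_window * scale <= image_size:
        sizes.append(round(base_window * scale))
        scale *= scale_factor
    return sizes

fine = window_sizes(480, scale_factor=1.1)
coarse = window_sizes(480, scale_factor=1.8)

print(len(fine), len(coarse))
print(coarse)
```

A scale factor close to 1 searches many more scales, so it is slower but less likely to miss a face of an in-between size. A large scale factor is faster but coarser.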
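# Converting the image to grayscale (cv2.imread loads images in BGR order)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Creating an object faces; each detection is an (x, y, w, h) rectangle
faces = face_cascade.detectMultiScale(gray, 1.1, 10)
# Drawing a rectangle around each face
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 3)
cv2_imshow(img)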
Output:
Now we will detect the eyes. To do that, we first need to create two regions of interest located inside the face rectangle. We need the first region in the gray image, where we are going to detect the eyes, and the second region in the color image, where we are going to draw the rectangles.
# Creating two regions of interest (x, y, w, h come from the last detected face)
roi_gray = gray[y:(y + h), x:(x + w)]
roi_color = img[y:(y + h), x:(x + w)]
Now we can apply the same method for eye and smile detection.
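One detail worth keeping in mind: anything detected inside a region of interest is reported relative to the top left corner of that region, not of the full image. Drawing on the color region works directly because it is a view into the original image, but if you need full-image coordinates you have to add the face offset. A minimal sketch, where the helper name `to_image_coords` and the example boxes are made up for illustration:

```python
# Minimal sketch: convert a box found inside a face ROI to full-image coordinates.
# The helper name and the numbers are illustrative assumptions.

def to_image_coords(face_box, inner_box):
    """Shift an (x, y, w, h) box by the top left corner of the face box."""
    face_x, face_y, _, _ = face_box
    x, y, w, h = inner_box
    return (face_x + x, face_y + y, w, h)

face = (120, 80, 200, 200)     # face rectangle in the full image
eye_in_roi = (40, 55, 45, 45)  # eye rectangle relative to the face ROI

print(to_image_coords(face, eye_in_roi))  # (160, 135, 45, 45)
```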
# Detecting eyes inside the face region and drawing them on the color region
eyes = eye_cascade.detectMultiScale(roi_gray, 1.1, 10)
for (x_eye, y_eye, w_eye, h_eye) in eyes:
    cv2.rectangle(roi_color, (x_eye, y_eye), (x_eye + w_eye, y_eye + h_eye), (0, 0, 255), 3)
cv2_imshow(img)
Output:
# Detecting smiles inside the face region and drawing them on the color region
smile = smile_cascade.detectMultiScale(roi_gray, 1.8, 20)
for (x_smile, y_smile, w_smile, h_smile) in smile:
    cv2.rectangle(roi_color, (x_smile, y_smile), (x_smile + w_smile, y_smile + h_smile), (255, 0, 130), 3)
cv2_imshow(img)
Output:
Summary
In this tutorial we talked about one of the most commonly used methods for face detection, Haar cascade classifiers. This method is based on the well-known Viola-Jones machine learning algorithm for object detection in images or videos. It uses a large number of simple features, called Haar features, to determine whether a face is present in the image or video. In the next tutorial we will talk about facial landmark detection.