Google I/O 2017

Google is now moving from mobile-first to AI-first, so many of the talks at this year’s event center on TensorFlow, deep learning, and related topics. Definitely check out the videos from Google Developers on YouTube! https://www.youtube.com/user/GoogleDevelopers

Second Step of Face Recognition: Align

Align the photos: this means scaling, rotating, or otherwise transforming all the samples so that the eyes, nose and mouth are found in the same place in every picture. The easiest way is to find the locations of the eyes and the mouth using the Viola-Jones algorithm (or another detection method, such as a HOG-based detector), then scale the image up or down so that the distance between the eyes is the same across all images, and then rotate the images so that the mouth is always in the middle.
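To make the scale-and-rotate idea concrete, here is a minimal Python sketch. It assumes the eye positions have already been detected as (x, y) pixel coordinates. The function names, the target eye distance of 60 pixels, and the choice to rotate until the eye line is horizontal are all assumptions made for illustration; they are not taken from any particular library.

```python
import math


def eye_alignment_transform(left_eye, right_eye, target_eye_distance=60.0):
    """Return (scale, angle_degrees) for a 2D alignment.

    scale makes the inter-eye distance equal to target_eye_distance;
    angle_degrees is the current tilt of the line joining the eyes.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        raise ValueError("eye positions must differ")
    scale = target_eye_distance / distance
    angle_degrees = math.degrees(math.atan2(dy, dx))
    return scale, angle_degrees


def apply_transform(point, center, scale, angle_degrees):
    """Rotate point by -angle_degrees about center, then scale about center."""
    theta = math.radians(-angle_degrees)
    x = point[0] - center[0]
    y = point[1] - center[1]
    x_rot = x * math.cos(theta) - y * math.sin(theta)
    y_rot = x * math.sin(theta) + y * math.cos(theta)
    return (center[0] + scale * x_rot, center[1] + scale * y_rot)


# Example: a tilted face with eyes 50 pixels apart.
left, right = (30.0, 40.0), (60.0, 80.0)
scale, angle = eye_alignment_transform(left, right)
aligned_right = apply_transform(right, left, scale, angle)
```

After the transform, the right eye sits on the same row as the left eye and exactly the target distance away. A real pipeline would apply the same transform to every pixel of the image, not just to the landmark points.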

One of the drawbacks is that the transformation is flat (2D). What if the face is slanted or looking down? Then the face in the aligned picture will be distorted.

Facebook’s DeepFace (https://research.fb.com/wp-content/uploads/2016/11/deepface-closing-the-gap-to-human-level-performance-in-face-verification.pdf?) does a more advanced alignment job. Their alignment includes 3D alignment and approximation, and they claim that this special alignment has improved their face recognition accuracy a lot.

First Step of Face Recognition: Detect

The first stage of face recognition is detecting faces in movies. Here are the steps:

  1. Find a video you like
  2. Use one of the online YouTube-to-mp4 converter sites to convert the video to a local mp4 file. You can google to find many free converters online
  3. I have developed a routine in MATLAB that crops out the faces that appear in each video frame (or in every x-th frame, since you don’t really need to save so many faces that look very similar)

In the Computer Vision toolbox of MATLAB, there is an object called vision.CascadeObjectDetector. It uses the Viola-Jones algorithm to detect people’s faces, noses, eyes, mouths, or upper bodies. It should be similar to the Haar cascade classifier available in OpenCV (http://docs.opencv.org/trunk/d7/d8b/tutorial_py_face_detection.html).
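The "detect in every x-th frame and crop" idea can be sketched in Python as follows. This is not the MATLAB routine mentioned above, just a minimal illustration. The function name `crop_faces`, the `every_n` parameter, and the representation of a frame as a list of pixel rows are all assumptions. The detector is passed in as a plain function that returns (x, y, width, height) boxes, which is the same rectangle layout that OpenCV’s `CascadeClassifier.detectMultiScale` returns.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x, y, width, height)
Frame = Sequence[Sequence[int]]      # rows of pixel values (hypothetical layout)


def crop_faces(
    frames: Iterable[Frame],
    detect: Callable[[Frame], List[Box]],
    every_n: int = 10,
) -> List[List[List[int]]]:
    """Run the detector on every n-th frame and return the cropped regions."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    crops = []
    for index, frame in enumerate(frames):
        if index % every_n != 0:
            continue  # skip near-duplicate frames
        for (x, y, w, h) in detect(frame):
            # Slice rows y..y+h, then columns x..x+w within each row.
            crops.append([list(row[x:x + w]) for row in frame[y:y + h]])
    return crops


# Example with a stand-in detector that always reports one 2x2 box.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
fake_detector = lambda f: [(1, 1, 2, 2)]
faces = crop_faces([frame] * 25, fake_detector, every_n=10)
```

With 25 frames and `every_n=10`, only frames 0, 10 and 20 are examined, so three crops come back. Keeping the detector as a parameter means the same loop would work with any detector that returns boxes in this layout.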

The routine gives me a lot of faces (note that the face needs to be fully frontal; a side view will not work), but it also gives me some non-face images.

Here are some images collected:

In general, I believe it can collect the faces of basically all the characters in a selected movie, but it will also give you some pictures that do not look like faces at all, and it doesn’t work too well on people wearing glasses.

MATLAB

Since OpenBR is quite difficult to use and understand, I tried to find other packages that can do similar things. After some browsing on the web, I decided to use MATLAB to learn face recognition, for the following reasons:

  • There are a lot of freely available learning materials from the makers of MATLAB: webinars, white papers, and sample code.
  • There is code submitted by third parties on their File Exchange website (and it’s free).
  • They have toolboxes for image processing, neural networks, etc. Essentially these contain libraries and better user interfaces for the respective fields. It seems easier to learn and visualize with these ‘toolboxes’.
  • MATLAB is not free, and there are other alternatives like R and Octave, but they do offer a 30-day free trial.

The common complaint about MATLAB is that it is not fast and not meant for production use. But I guess this is not something I need to worry about at the moment.