In her TED talk about ImageNet, Fei-Fei Li, Director of the Stanford AI Lab and Chief Scientist of AI/ML at Google Cloud, says that a typical neural network they use for ImageNet has 24 million nodes, 140 million parameters and 15 billion connections!
An interview with Geoffrey Hinton, a University of Toronto professor who has been in the Artificial Intelligence field for more than 30 years:
“Because he’s been in AI in the dark days, when he looked like a mad scientist and people never thought this was going to happen. … Now all these things that were talked about for 20, 30 years are happening…”
I came across this post, “Lemurs don’t all look the same and facial recognition technology can prove it”, on my Facebook feed one day. It talks about scientists using an open source face recognition package called OpenBR to identify lemurs in the wild.
So I thought maybe I could use this package for my task. However, getting it installed on my Mac turned out to be a lot harder than the installation documentation suggested. OpenBR is built on top of three packages: Qt (apparently a UI framework that promises write once, deploy on multiple platforms: Windows, Mac, Linux and embedded devices), OpenCV and CMake (a cross-platform build system for C++ projects). I was able to install these three components on my Mac. But in the end I failed to compile OpenBR, as some of the library file locations have moved with the newer Xcode and the build seemed to detect the wrong macOS version. I spent a few hours looking at the makefiles and then gave up. I am not too familiar with CMake and C++ anyway, so I should not spend too much time on it.
Then I got access to a Debian machine, but the installation page only covers Ubuntu. After a few more days, I got access to an Ubuntu server. Finally I was able to install OpenBR properly!
Then came the next problem. The first example in the OpenBR page required a webcam. The Ubuntu server was not next to me so there was no way I could connect a webcam to that server to try out stuff.
Of course there were other examples that did not require a webcam. There was also a section on face recognition that used their command line API. The first simple example was about comparing two faces and it looked fine to me. But the examples after this one were more complicated and I didn’t really know what to put in the arguments (there are some .csv files and .gal files; I simply had no idea what should go inside those files, and I couldn’t find their usage documented anywhere).
I tried to find more documentation on the web, but there wasn’t really any good information. So maybe I should learn OpenCV as well, since OpenBR is a layer above OpenCV? And lastly, the framework uses C++. So I should brush up my C++ skills as well? Frankly, I have never written anything meaningful in C++. The hurdle to mastering OpenBR (and OpenCV) is quite high for me, so I began to look for other options. I should not be constrained by the programming language used; besides, the more popular languages in the machine learning community are Python, R, and Matlab-like languages such as Octave and Matlab itself.
To the admins of OpenBR, it would be great if you could provide more examples and more detailed documentation online. As a beginner, I definitely find it hard to understand how to use OpenBR properly by just reading the tutorial section.
I took an online Machine Learning course on Coursera early last year. The course is great for beginners like me, and I was able to grasp the basic concepts of Machine Learning after taking it. There is not a lot of math or proofs in the lectures; instead it focuses on how you can run the algorithms and apply them to the real world. Its assignments are also quite fun to do and show you how you can build systems for spam filtering, film recommendation, digit recognition, server cluster monitoring, etc.
The assignments are done in Octave or Matlab, both of which are very good for dealing with matrices.
This course is taught by Andrew Ng, a prominent figure in the Artificial Intelligence field. He is a professor at Stanford, a co-founder of Coursera and until recently was the head of Artificial Intelligence at Baidu in Silicon Valley. I wonder what he is up to next after leaving Baidu!
Back to the course: I highly recommend it to anyone who wants to learn the basics of Machine Learning!
The objective of my first assignment is to group the major characters in a movie by face recognition. So, I am given a movie file, and by running my ‘module’ I should be able to say that characters A, B & C appear the most in the movie and show the images of each of these characters (grouped by character) captured from the movie.
Here is how I think I will tackle the problem step-by-step:
i) Face Detection
I will run through the movie frame by frame. Every 10 to 20 frames, I will run a face detector program to capture the faces appearing in the frame. The faces in each frame will be cropped and saved on the computer. At the end of this process I will have a collection of faces from the movie.
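Here is a rough sketch of how this step might look in Python. I have not settled on a face detector yet, so everything specific here is an assumption: it uses OpenCV's Python bindings (`cv2`) with the frontal-face Haar cascade that ships with the `opencv-python` package, samples every 15th frame, and the function names `should_sample` and `crop_faces_from_movie` are just ones I made up for illustration.

```python
from pathlib import Path


def should_sample(frame_index, step):
    """True for frames 0, step, 2*step, ... (the frames we run the detector on)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return frame_index % step == 0


def crop_faces_from_movie(movie_path, out_dir, step=15):
    """Save a cropped image for every face found on every `step`-th frame.

    Returns the number of face images written to `out_dir`.
    """
    import cv2  # imported here so should_sample can be used without OpenCV

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the frontal-face Haar cascade bundled with opencv-python.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    video = cv2.VideoCapture(str(movie_path))
    saved = 0
    frame_index = 0
    while True:
        ok, frame = video.read()
        if not ok:  # end of the movie, or the file could not be read
            break
        if should_sample(frame_index, step):
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            for (x, y, w, h) in faces:
                face = frame[y:y + h, x:x + w]  # crop the detected rectangle
                name = f"frame{frame_index:06d}_face{saved:06d}.png"
                cv2.imwrite(str(out_dir / name), face)
                saved += 1
        frame_index += 1
    video.release()
    return saved
```

Calling something like `crop_faces_from_movie("movie.mp4", "faces")` (both names are placeholders) would fill the `faces` folder with cropped images. The detector settings would probably need tuning on a real movie.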
The cropped faces may need to be processed. Some of the things I believe I need to do are:
- Resize all the images to the same size
- Align the features of the face (e.g. eyes, nose, mouth) to pretty much the same area in every image
- Rotate the faces that are tilted
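A minimal sketch of the resizing and rotating parts, again assuming OpenCV's Python bindings. It also assumes that the eye positions of each cropped face are already known from some landmark detector, which I have not chosen yet. The function names and the 100x100 output size are hypothetical. Levelling the eyes only handles tilt; it does not by itself put the nose and mouth in the same spot in every image.

```python
import math


def eye_tilt_degrees(left_eye, right_eye):
    """Angle of the line joining the eyes, in image coordinates (y points down).

    `left_eye` is the eye nearer the left edge of the image; each eye is (x, y).
    The result is 0 when the eyes are level and positive when the right eye
    sits lower in the image than the left eye.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def normalize_face(face, left_eye, right_eye, size=(100, 100)):
    """Rotate a cropped face so the eyes are level, then resize it to `size`."""
    import cv2  # imported here so eye_tilt_degrees works without OpenCV

    angle = eye_tilt_degrees(left_eye, right_eye)
    # Rotate about the midpoint between the eyes.
    center = (
        (left_eye[0] + right_eye[0]) / 2.0,
        (left_eye[1] + right_eye[1]) / 2.0,
    )
    rotation = cv2.getRotationMatrix2D(center, angle, 1.0)
    height, width = face.shape[:2]
    leveled = cv2.warpAffine(face, rotation, (width, height))
    # size is (width, height), as cv2.resize expects.
    return cv2.resize(leveled, size)
```

Rotating by the measured tilt angle about the midpoint between the eyes brings the line between them back to horizontal, after which every face is scaled to the same dimensions.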
Then I run a classification algorithm. The algorithm should hopefully classify the faces into groups. Each group should contain faces with similar characteristics, i.e. each group should contain the face of one character (if there are twins in the movie, then I will be in trouble!).
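I have not picked the algorithm yet. Since the faces carry no labels and I do not know in advance how many characters there are, the sketch below swaps in a very simple greedy grouping (closer to clustering than to classification): a face joins the nearest existing group if it is within a distance threshold, otherwise it starts a new group. It assumes each face has already been turned into a numeric feature vector somehow; the function names, the threshold and the toy vectors are all made up for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_faces(vectors, threshold):
    """Greedily group vectors; returns a list of groups of indices into `vectors`."""
    groups = []     # indices of the faces in each group
    centroids = []  # running mean vector of each group
    for index, vector in enumerate(vectors):
        best, best_dist = None, threshold
        for g, centroid in enumerate(centroids):
            d = distance(vector, centroid)
            if d <= best_dist:
                best, best_dist = g, d
        if best is None:
            # Nothing close enough: this face starts a new group.
            groups.append([index])
            centroids.append(list(vector))
        else:
            groups[best].append(index)
            n = len(groups[best])
            # Update the running mean with the newly added vector.
            centroids[best] = [c + (v - c) / n for c, v in zip(centroids[best], vector)]
    return groups


def main_characters(groups, top=3):
    """Return the `top` largest groups, biggest first."""
    return sorted(groups, key=len, reverse=True)[:top]


# Toy example with 2-D "feature vectors" standing in for real face features.
faces = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (0, 0.1), (10, 10)]
groups = group_faces(faces, threshold=1.0)
print(main_characters(groups, top=2))  # [[0, 1, 4], [2, 3]]
```

The largest groups would then correspond to the characters who appear the most. The result of this greedy approach depends on the order of the faces and on the threshold, so a proper clustering algorithm may well do better.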
I hope it will just be this easy. I am sure I will encounter a lot of problems as I progress! =p
AI (Artificial Intelligence), Neural Networks and Machine Learning are the buzzwords in the tech world these days. Even the general public often hears these words on TV, on the radio or in newspapers. Some talk about how AI could replace our workforce or even rule humankind. Some talk about how AI can streamline their businesses and increase people’s productivity. Suddenly we humans are equipped with a new technology that sounds scary yet helpful. Seeing this as a new direction in computing, I have decided to learn these technologies. So this blog is simply a place to record my findings and the stumbles I encounter along the way.