Image recognition for everyone

Warning: I’m NOT a data scientist, but I’m a huge fan of cool technology!

Today I want to write about a new feature that amazes me and that lets you literally do “magic” things you might think are reserved for super data scientists who are experts in deep learning and frameworks like TensorFlow, CNTK, Caffe, etc.

Imagine the following: someone trains huge neural networks (think of them as mini brains) for weeks or months, using a lot of GPUs, on thousands and thousands of images.

These mini brains are then used to classify images and say something like: a car is present, a human is present, a cat is present, etc. Now, one of the “bad things” about neural networks is that you usually cannot understand how they really work internally or what their “thinking process” is.


However, recent studies on neural networks have found a way to “extract” this knowledge, and Microsoft has just released, in April, this knowledge, or better, these trained models.

Now I want to show you an example of how to use them.

Let’s grab some images of the Simpsons:


and some other images of the Flintstones:


For example, 13 images from the Simpsons cartoon and 11 from the Flintstones. Let’s build a program that, given a new image that is not part of either set, can predict whether it is a Simpsons or a Flintstones image. I’ve chosen cartoons, but you can apply this to any kind of image you want to process (watches? consoles? vacation places? etc.).

The idea is the following: I take the images I have and give them to the “model” that has already been trained. The result of this process is, for each image, a collection of “numbers” that represent that image according to the neural network. An analogy to make this clearer: our DNA is a small fraction of ourselves, but it can “represent” us, so these “numbers” are the DNA of the image.
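To make the “image in, numbers out” idea concrete, here is a minimal Python sketch. Big caveat: `featurize` is a hypothetical helper I made up for illustration, and it is NOT a neural network. It only builds a tiny color histogram, and the two “images” are made-up pixel grids. What it shares with the real trained model is the contract: whatever the size of the image, you get back an array of numbers of the same fixed length.

```python
# Toy stand-in for a trained network: NOT a real deep learning model.
# An "image" here is a list of rows; each row is a list of (r, g, b) pixels (0-255).

def featurize(image, bins=4):
    """Turn an image of any size into a fixed-length list of numbers.

    Hypothetical helper: it builds a normalized histogram per color channel.
    A real trained network returns a much richer vector, but the contract is
    the same: image in, fixed-length array of numbers out.
    """
    counts = [0] * (3 * bins)
    total = 0
    for row in image:
        for pixel in row:
            for channel, value in enumerate(pixel):
                bucket = min(value * bins // 256, bins - 1)
                counts[channel * bins + bucket] += 1
            total += 1
    if total == 0:
        return [0.0] * (3 * bins)
    return [c / total for c in counts]


# Two made-up images of different sizes
yellowish = [[(250, 220, 30)] * 4 for _ in range(3)]  # 3 x 4 pixels
brownish = [[(120, 80, 40)] * 2 for _ in range(2)]    # 2 x 2 pixels

features_a = featurize(yellowish)
features_b = featurize(brownish)
print(len(features_a), len(features_b))  # both vectors have the same length
```

The important part is the shape of the output: once every image is a same-length list of numbers, the images can be compared and fed to ordinary machine learning code.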

Now that each image is represented by a simple array of numbers, we can use a “normal” machine learning technique like linear regression to leverage this simplified representation and learn how to classify the images.
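Here is a rough sketch of that second step in plain Python. Two assumptions to flag loudly: the feature vectors below are invented numbers (three per image, just to keep it readable), and I swapped in logistic regression, a simple linear model commonly used for two-class problems, trained with a basic gradient descent loop. This is not the code from the article, only an illustration of “learn from arrays of numbers”.

```python
import math


def train_logistic(features, labels, epochs=500, lr=0.5):
    """Fit one weight per feature plus a bias (labels are 0 or 1).

    Uses simple stochastic gradient descent: the weights are nudged
    after every single example.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            error = p - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(model, x):
    """Return 1 or 0 depending on which side of the linear boundary x falls."""
    weights, bias = model
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if z > 0 else 0


# Invented feature vectors: 1 = Simpsons, 0 = Flintstones
features = [
    [0.90, 0.80, 0.10], [0.80, 0.90, 0.20], [0.95, 0.70, 0.15],
    [0.30, 0.20, 0.70], [0.20, 0.30, 0.80], [0.25, 0.10, 0.90],
]
labels = [1, 1, 1, 0, 0, 0]

model = train_logistic(features, labels)
new_image_features = [0.85, 0.75, 0.20]  # a made-up, unseen image
print("Simpsons" if predict(model, new_image_features) == 1 else "Flintstones")
```

The classifier never sees pixels, only the numbers, which is why such a small and simple model can be enough.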

Applying the sample R code described in the article to only a small sample of images (13 and 11 respectively), using 80% for training and 20% for scoring, we obtained the following result:
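For readers who want to see what an 80/20 split and an accuracy score look like in code, here is a small Python sketch. The file names, the `split_train_test` and `accuracy` helpers, and the fixed random seed are all my own assumptions for illustration; the R sample may shuffle and round differently, so the exact number of scoring images can differ.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=42):
    """Shuffle the items and cut them into a training and a scoring part."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Fraction of scoring images whose predicted label matches the true one."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Placeholder file names: 13 Simpsons images and 11 Flintstones images
images = [("simpsons_%02d.jpg" % i, "Simpsons") for i in range(13)]
images += [("flintstones_%02d.jpg" % i, "Flintstones") for i in range(11)]

train, test = split_train_test(images)
print(len(train), len(test))  # 19 5

# Made-up predictions, only to show how the score is computed
example_score = accuracy(
    ["Simpsons", "Flintstones", "Simpsons", "Simpsons"],
    ["Simpsons", "Flintstones", "Flintstones", "Simpsons"],
)
print(example_score)  # 0.75
```

With only a handful of scoring images, every single image moves the score a lot: in the made-up example, one mistake out of four already means 75%.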


A good 75% on a very small number of images!