The Machine Learning Blog

How to make an image classifier without coding. Part 1

In this article, I will explain how to build an image classifier without coding, using existing tools available on the market. This tutorial is intended for beginners in the field of machine learning and for artificial intelligence enthusiasts who do not have any coding skills.

Why did I decide to write this tutorial?

There are many people in the world who could benefit from Machine Learning but do not know how to code. I am on a mission to make Artificial Intelligence available to all, and I think this could be a good way to get people on board!

In this tutorial, we will be using an incredibly powerful new service from Google Cloud called AutoML Vision. It is currently in beta, meaning that you should use it only for tests and not in production, as the product could change without warning.

In order to build an image classifier, you will need three things:

  1. A good idea of what you want to achieve
  2. An image dataset
  3. A Google Cloud account

This tutorial is divided into three parts. This first part focuses on explaining what an image classifier is and what you can do with it. Let's get it on then!

What is image classification?

As humans, we can easily tell what we are looking at in less than a second.

We learn this skill from a very young age as it is essential for our survival.

A two-month-old baby is able to classify the faces they see into three categories: "mommy", "daddy" and "other people".

By the time they are one year old, they are already able to process the information they see and decide with a high level of confidence and accuracy amongst thousands of categories and subcategories such as "animals", "fruits", "persons", "colors", etc.

baby image classification

A baby is able to distinguish a dog from a cat even if it is the first time they have seen those particular images.

This is image classification: the ability to look at an image and decide what it is or, in other words, what "label" to assign to it.

In real life, there are millions and millions of labels, organized in different categories and subcategories.

What is so difficult about that?

As humans, we are able to learn and remember millions of labels and classify objects in an instant. Computers, on the other hand, do not see the big picture. Machines don't understand concepts; they only see the pixels in an image and their corresponding values.
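
You do not need to write or run any code to follow this tutorial, but for the curious, here is a minimal Python sketch of what "pixels and their values" means. The 3x3 image and the names `image` and `average_color` are made up for illustration; real images simply have many more pixels.

```python
# A tiny, made-up 3x3 "image". Each pixel is a (red, green, blue) triple
# with values from 0 to 255. This is all a computer "sees".
image = [
    [(255, 0, 0), (255, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (255, 0, 0), (255, 255, 255)],
    [(0, 128, 0), (255, 255, 255), (255, 255, 255)],
]

def average_color(img):
    """Return the average (red, green, blue) value over all pixels."""
    pixels = [pixel for row in img for pixel in row]
    count = len(pixels)
    return tuple(sum(pixel[channel] for pixel in pixels) / count
                 for channel in range(3))

# To the machine there is no "apple" here, only numbers it can do math on.
print(average_color(image))
```

Any notion of "object" has to be derived from numbers like these.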

human vision vs computer vision

If we want to teach a machine what an object is, we need to explain it in terms it can understand.

Let's take the example of an apple. At first, one could say that it is easy to explain what an apple is. It is red and it is round, isn't it? But what happens if the apple is a different color, has a different shape, or shares the picture with other objects?

apple image classification

It is not a good idea to make rules and exceptions to the rules. It is better to try to understand what the "digital fingerprint" of an object is.
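
To see why hand-written rules break down, here is a toy Python sketch of the "red and round" rule. The function `looks_like_apple` and the example objects are hypothetical and only illustrate the problem; this is not how a real classifier is built.

```python
def looks_like_apple(color, shape):
    # Hand-written rule: "an apple is red and round".
    return color == "red" and shape == "round"

# Made-up descriptions of three objects: (color, shape).
objects = {
    "red apple": ("red", "round"),
    "green apple": ("green", "round"),
    "tomato": ("red", "round"),
}

for name, (color, shape) in objects.items():
    print(name, "->", looks_like_apple(color, shape))
```

The rule misses the green apple and accepts the tomato, and every exception we add to patch it creates new mistakes somewhere else.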

It turns out that if we have enough examples of an object, we can calculate the mathematical function that defines a concept.

Image classification in the context of machine learning is the art of processing multiple images of an object and calculating the mathematical function that defines such an object. This is more commonly known as "training a model" and is what machine learning professionals do for a living.

Once a model has been trained, it can be used to predict whether or not a new image shows a certain object.

Needless to say, a model can be trained to classify multiple labels.
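
The idea of training on examples and then predicting can be sketched in a few lines of Python. This is a deliberately simplified stand-in (a "nearest average" classifier), not what AutoML Vision or any real image model does internally. The function names `train` and `predict` and all the numbers are assumptions for illustration: each image is reduced to a made-up average (red, green, blue) color.

```python
def train(examples):
    """Learn one average feature vector per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(model, features):
    """Return the label whose learned average is closest to the new example."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], features))
    return min(model, key=squared_distance)

# Made-up training data: average color of each image, plus its label.
examples = [
    ((200, 30, 40), "apple"),
    ((180, 40, 30), "apple"),
    ((240, 220, 60), "banana"),
    ((230, 210, 80), "banana"),
    ((250, 140, 20), "orange"),
    ((240, 150, 30), "orange"),
]

model = train(examples)                 # "training a model"
print(predict(model, (238, 200, 65)))   # classify a new, unseen example
```

Real models learn far richer "fingerprints" than an average color, but the two steps are the same: learn from labeled examples, then assign a label to something new.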

image classification

What are the practical uses of image classification?

Now that we know what image classification is, we can teach a computer to recognize multiple labels on images. These are some common uses of image classification today:

Number and character recognition: 

This is the ability to recognize characters in an image (even if they are handwritten). This is used by postal services around the world, and it is also used in security systems (scanning badges, license plates, etc.).

number recognition

Facial recognition:

Security systems around the world could be improved if they learned how to classify faces in a picture. Facial recognition is currently used by Apple on the iPhone X. There is even a controversial project to help police officers find missing people by comparing faces in the street to a database of missing people.

face recognition

Product recognition:

Supermarkets and retail stores around the world could benefit from this technology. It could help companies understand their market better, manage their shelves better and keep track of their products.

product classification

Helping doctors:

A model was recently trained to look at images and tell whether or not they show cancer. The applications in the medical field are huge!

cancer recognition machine learning

Identifying human interactions and predicting the future

In 2016, researchers at MIT trained a model with images from video sequences in order to try to predict what the people in the image were about to do. They trained the model to look at an image taken before the action and work out whether the people were about to kiss, hug or give a "high five". This is an incredible example of machine learning used to predict the future.

future prediction with machine learning

Identifying abstract concepts

Computers can recognize more than simple objects such as "dogs" and "cats". We can also teach them to recognize abstract concepts such as "cubism", "expressionism" and "impressionism", and we can even combine the trained model with an existing image. The results are amazing!

machine learning art

With enough data, we can train a machine to identify whatever we want in an image. The possibilities are endless! Before you make your application, you will need to decide what you want to do with it. I suggest you start simple by defining two labels.

Are you able to look at an image and assign one of the labels in less than two seconds? If you cannot do it, your model will not be able to do it either.

In this tutorial, I will make an application that can tell whether or not my children would be safe playing around an object, and I will call it "can I hug it?".

can i hug it

With some luck, I will be able to sell my technology to manufacturers of baby surveillance cameras so that they can alert the parents when children are attempting to play with their medieval axe.

Continue reading the second part of this tutorial.
