How to make an image classifier without coding. Part 2
This is the second article in a three-part series explaining how to build an image classifier without coding. If you have not read part 1, please read it here before continuing.
Now that we have decided to build an image classifier with two classes, we need a proper, clean dataset of images. In my example, I decided to build an image classifier that can judge whether or not an object is safe for my baby to hug. My image classifier will be called “can_I_hug_it”.
I picked this classification on purpose, as it is very subjective and difficult to explain in words and rules. What makes an object or a thing safe to hug? The shape has a lot to do with it, but so do the materials, the size and some other properties. Is it safe to hug a dog? It depends. If it is a cute puppy, yeah, why not? But if we are talking about a mad Rottweiler, common sense says to get away from it as soon as possible. Is it safe to hug a spider? Only if it is a toy spider… you get the idea.
We will build our dataset with images from the internet. We will use Google Images as well as “Fatkun batch download image”, a free Chrome plugin that helps you download images in batch from any page. Start by adding the “Fatkun batch download image” plugin to your browser.
Now go to Google Images and search for something that you can hug, for example “bunny”, then click on the plugin you just installed, in the upper right corner of your browser. Click on "only this tab"; otherwise you will get the images from all your open Chrome tabs.
This will open the plugin and select all the images you got from Google.
Click on the images that you want to deselect. Your dataset has to be clean, so keep only the images of bunnies you can hug. Try to avoid drawings, blurry images, images of imaginary things or cartoons, and false positives like pictures of scary bunnies. Remember that the reliability of your image classifier depends on the reliability of the training images you use.
Before downloading the images, click on "More options" to automatically rename the downloaded pictures, starting with the prefix “bunny”.
Make sure that the Chrome setting that asks where to save each file before downloading is turned off; otherwise you will get an annoying pop-up for every image you download.
By default, the images will be downloaded to your "downloads" folder.
Now that you have downloaded a couple of images of bunnies, do the same exercise for other things that can be hugged. These are the search words I used:
- Stuffed animal
- Feather boa
Remember to change the prefix of the files every time you do a new Google Images search; otherwise the plugin will try to overwrite the files that have the same names as before, and you will get annoying pop-ups telling you that a file with that name already exists.
Repeat the process until you have 1000 images of things you can hug. Now copy all these images into a folder on your computer called "hug".
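Copying the files by hand works fine, but if you happen to have Python installed, the step above can also be sketched in a few lines. This is entirely optional and only an illustration: the function name `gather_images`, the folder paths and the prefixes are assumptions made for this example, so adjust them to whatever prefixes you actually typed into the plugin.

```python
import shutil
from pathlib import Path


def gather_images(source_dir, target_dir, prefixes):
    """Copy files whose names start with one of the prefixes into target_dir.

    Returns the number of files copied. The originals are left untouched.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Compare in lower case so "bunny" also matches "Bunny3.jpg".
    wanted = tuple(prefix.lower() for prefix in prefixes)

    copied = 0
    for path in sorted(source.iterdir()):
        if path.is_file() and path.name.lower().startswith(wanted):
            shutil.copy2(path, target / path.name)
            copied += 1
    return copied


if __name__ == "__main__":
    # Hypothetical locations and prefixes; change them to match your setup.
    downloads = Path.home() / "Downloads"
    if downloads.is_dir():
        total = gather_images(downloads, "hug", ["bunny", "stuffed", "feather"])
        print(f"Copied {total} files into the hug folder")
```

The sketch copies rather than moves, so nothing is lost if a prefix turns out to be wrong.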
When you are done, your folder should look something like this.
Now repeat the process for things you should not hug, which will go under the label “not_hug”. These are the keywords I used in my search on Google Images:
- Killer clown
- Scary dog
As before, make sure you get 1000 images of things that are not safe to hug and that you have them all in a folder called "not_hug".
At this point you should have two folders with 1000 images each.
Go over the images quickly and make sure they all represent the two labels you have.
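If you would rather not count 1000 files by eye, a small optional check can do the counting for you. The sketch below assumes the two folders are named "hug" and "not_hug" and sit in the directory you run it from; the function name and the list of image extensions are my own choices for illustration. It only counts files and lists anything that does not look like an image. It cannot tell whether a picture really belongs to its label, so you still need to look through the images yourself.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, extend as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def summarize_folder(folder):
    """Return (image_count, other_file_names) for one label folder."""
    image_count = 0
    other_files = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # ignore sub-folders
        if path.suffix.lower() in IMAGE_EXTENSIONS:
            image_count += 1
        else:
            other_files.append(path.name)
    return image_count, other_files


if __name__ == "__main__":
    for label in ("hug", "not_hug"):
        if Path(label).is_dir():
            count, others = summarize_folder(label)
            print(f"{label}: {count} images, other files: {others}")
```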
Now that we have a proper dataset, we need to rename every file inside these folders in order to manipulate them later on.
Note that if you use macOS, there is a small free app that does all of the following steps. It is called NameChanger and you can download it here.
If you have a Windows computer, here is what you need to do:
Open the “hug” folder and press Ctrl+A to select all files (the shortcut can differ on non-English versions of Windows; on Spanish-language systems it is Ctrl+E). Right-click on the first file and select the “Rename” option.
Type “sample” as the new name and press Enter.
This will change the names of all selected files.
All your files should now be called sample (1).jpg, sample (2).jpg, etc.
Now we need to remove the spaces from the file names, so that “sample (12).jpg” becomes “sample(12).jpg”. This step is necessary because the Google service we will be using does not accept spaces in file names or in label names. I know I promised there is no coding here, but unless you want to rename all the files manually, you need to use a small script I prepared for you.
If you have a Windows computer, go to the “hug” folder. Right-click on an empty area of the folder and select “New” and then “Text Document”.
Open the file in any text editor, such as Notepad, and copy these lines of text into it:
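@echo off
setlocal disableDelayedExpansion
:renameNoSpace [/R] [FolderPath]
set "forOption=" & set "inPath="
if /i "%~1"=="/R" (
  set "forOption=%~1 %2"
) else (
  if "%~1" neq "" (set "inPath=%~1\") else set "inPath="
)
for %forOption% %%F in ("%inPath%* *") do (
  if /i "%~f0" neq "%%~fF" (
    set "folder=%%~dpF" & set "file=%%~nxF"
    setlocal enableDelayedExpansion
    echo ren "!folder!!file!" "!file: =!"
    ren "!folder!!file!" "!file: =!"
    endlocal
  )
)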
Save and close the file.
Right-click on your file and change its name and extension to “renameNoSpace.bat”.
Windows will ask if you are sure you want to change the extension of the file. Click "Yes".
If the script you created is in the same folder as the pictures, you can simply double-click on it and it will automatically rename all of your files, removing the spaces from their names.
At this point, all the files in the “hug” folder should be named sample(1).jpg, sample(2).jpg, etc.
Now do the same for the files in the “not_hug” folder, but name the files “example” instead of “sample”.
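For readers on macOS or Linux, or anyone who prefers a single step, the whole renaming procedure for both folders can also be sketched in Python. This is an optional alternative, not part of the click-through workflow. The function name `rename_sequential` and the list of extensions are assumptions made for this example. The numbering follows alphabetical order, so it may differ from the numbers Windows would assign.

```python
import uuid
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, extend as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def rename_sequential(folder, prefix):
    """Rename every image in folder to prefix(1).ext, prefix(2).ext, ...

    The new names contain no spaces. Non-image files are left alone.
    Returns the list of new file names.
    """
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )

    # Pass 1: move everything to unique temporary names, so that a new name
    # can never collide with a file that has not been renamed yet.
    token = uuid.uuid4().hex
    temp_paths = []
    for index, path in enumerate(images, start=1):
        temp = folder / f"{token}_{index}{path.suffix.lower()}"
        path.rename(temp)
        temp_paths.append(temp)

    # Pass 2: give each file its final, space-free name.
    new_names = []
    for index, temp in enumerate(temp_paths, start=1):
        final = folder / f"{prefix}({index}){temp.suffix}"
        temp.rename(final)
        new_names.append(final.name)
    return new_names


if __name__ == "__main__":
    for folder, prefix in (("hug", "sample"), ("not_hug", "example")):
        if Path(folder).is_dir():
            renamed = rename_sequential(folder, prefix)
            print(f"{folder}: renamed {len(renamed)} files")
```

Because the script renames files in place, it is worth trying it on a copy of the folder first.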
At this point, your dataset of images should look like this on your computer:
Now that you have a proper dataset of images and they are named correctly, the next step is to build your image classifier with Google Cloud AutoML Vision. Read part 3 of this series to continue.