Introduction to Recurrent Neural Networks
In this article, I will explain what Recurrent Neural Networks (RNNs) are, how they work, and what you can do with them. I will also show a very cool example of music generation using artificial intelligence.
However, before discussing RNNs, we need to explain the concept of sequence data.
As the name indicates, sequence data is a collection of data points arranged in a specific order, usually through time, so that together they form something bigger. These are some examples:
- All the different notes that form a song.
- All the words that form a phrase.
- All the pictures that form a video.
Let's take, for example, all the pictures that form the following video clip.
Individually, they can only tell us that they are pictures of a dude playing golf.
But if you look at all the images in the right order (from left to right), you can see the swing that is taking place. This is a "sequence" of data that needs to be analysed as a whole if we want to make some sense of it and come to the conclusion that all these images represent a "swing".
So what are Recurrent Neural Networks?
If you have been reading my blog, you must be familiar with neural networks by now. If you are not, I recommend you read this article first.
Neural networks are very cool and fun to use, but they have one small problem: they can only take one input of fixed size and provide one output of fixed size. In other words, they cannot analyse sequences of data.
If you take the example of image classification, the neural network takes as input a vector (X) of all the pixel values, and its output is a classification vector (Y) representing a prediction of what the image is.
As amazing as they are, machine learning practitioners often refer to these models as "vanilla" neural networks, since they have a "one to one" architecture and, in their eyes, they are "regular" and "without any fancy stuff".
Recurrent Neural Networks (RNNs), on the other hand, are a fancier type of artificial neural network where each output depends not only on the current input, but also on the inputs that came before it.
This means that the prediction will take into account all the previous information the network has been given.
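For readers who want to see the idea in code, here is a minimal sketch in plain Python. It is only an illustration, not a real library: the function names and the weight values (w_x, w_h, b) are made up, and a real RNN would work with vectors and learned weights instead of single numbers.

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One recurrent step: mix the current input with the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs):
    """Feed a sequence through the cell one item at a time; return the last state."""
    h = 0.0  # the state starts empty
    for x in inputs:
        h = rnn_step(x, h)
    return h

# Both sequences end with the same input, but their history differs,
# so the final outputs differ too.
out_a = run_rnn([1.0, 0.0, 1.0])
out_b = run_rnn([-1.0, 0.0, 1.0])
print(out_a, out_b)
```

The important line is `h = rnn_step(x, h)`: the state produced by one step is fed into the next one, which is how earlier inputs influence later outputs.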
The autocomplete function on our smartphones is a great example of what a Recurrent Neural Network can do.
Every time you type a word, the software takes all the previous words you have typed to predict which word you are most likely to write next. Giving an RNN just the first word and letting it predict the rest can be very funny.
You can test for yourself that if you change any of the words typed in the middle of a sentence, your phone will propose different words. This is how you know the function takes into account the whole sequence of data in the order you provide it, and that it is not just suggesting random words.
The autocomplete function is an example of a "many to one" RNN, because the software needs to analyse many words in the right order through time to come to one conclusion: what the next word will be.
But there are other types of RNN architectures:
- One input to many outputs
- Many inputs to one output
- Many inputs to many outputs
- Many inputs to many outputs (with different lengths)
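If you like code, the four architectures in this list can be sketched with the same tiny recurrent step; they only differ in when inputs are read and when outputs are kept. This is a toy illustration in plain Python with made-up function names and fixed, untrained weights, not how a real library organises things.

```python
import math

def step(x, h, w_x=0.5, w_h=0.8):
    """One recurrent step (hypothetical fixed weights)."""
    return math.tanh(w_x * x + w_h * h)

def one_to_many(x, n_outputs):
    """Start from one input, then feed each output back in as the next input."""
    h, outputs = 0.0, []
    for _ in range(n_outputs):
        h = step(x, h)
        outputs.append(h)
        x = h
    return outputs

def many_to_one(inputs):
    """Read the whole sequence and keep only the final output."""
    h = 0.0
    for x in inputs:
        h = step(x, h)
    return h

def many_to_many(inputs):
    """Keep one output for every input."""
    h, outputs = 0.0, []
    for x in inputs:
        h = step(x, h)
        outputs.append(h)
    return outputs

def many_to_many_different_lengths(inputs, n_outputs):
    """Read everything first, then generate a sequence of another length."""
    h = many_to_one(inputs)            # the state summarises the inputs
    outputs, x = [], 0.0
    for _ in range(n_outputs):
        h = step(x, h)
        outputs.append(h)
        x = h
    return outputs

sequence = [1.0, 0.5, -1.0]
print(len(one_to_many(1.0, 5)))                          # 5 outputs from 1 input
print(len(many_to_many(sequence)))                       # 3 outputs from 3 inputs
print(len(many_to_many_different_lengths(sequence, 2)))  # 2 outputs from 3 inputs
```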
I hope the picture below gives you a better understanding of these different architectures.
The "NN" cell represents a neural network and it can be as deep as you want, even with hundreds of hidden layers.
Some academics and researchers represent RNNs with a different diagram, but the concept is exactly the same.
I personally prefer the "unrolled" diagram that I presented first as I think it is easier to understand.
Training an RNN
Training an RNN is like training a vanilla neural network.
- You start by setting the parameters (weights and biases) to random values.
- You do forward propagation to calculate the cost.
- You do back propagation to work out how each parameter should change to reduce the error of your model, and you update the parameters accordingly.
- You repeat the same process many times until your model is as accurate as possible (smallest error).
If you are new to machine learning and you did not understand any of the above statements or the picture above, don't worry.
The only thing to understand is that you need to take into account the results of the previous NN when calculating the outcome of the current NN and so on until the end.
In other words, the model takes into account all the previous data in the sequence, both when it is trained and when it provides predictions.
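For the more technical readers, the four training steps listed earlier can be sketched in plain Python. Everything here is an assumption made for illustration: the model is a single-number RNN, the toy task (recalling half of the first input after two more steps) is invented, and the learning rate and number of repetitions are arbitrary. Real projects rely on a deep learning library to do the back propagation for them.

```python
import math
import random

def forward(params, xs):
    """Forward propagation: run the sequence, keeping every state."""
    w_x, w_h, b = params
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w_x * x + w_h * hs[-1] + b))
    return hs

def gradients(params, xs, target):
    """Back propagation through time for the cost (last_state - target)**2."""
    w_x, w_h, b = params
    hs = forward(params, xs)
    g_wx = g_wh = g_b = 0.0
    dh = 2.0 * (hs[-1] - target)        # how the cost changes with the last state
    for t in range(len(xs), 0, -1):     # walk backwards through time
        da = dh * (1.0 - hs[t] ** 2)    # go back through the tanh
        g_wx += da * xs[t - 1]
        g_wh += da * hs[t - 1]
        g_b += da
        dh = da * w_h                   # pass the error to the previous step
    return [g_wx, g_wh, g_b]

def mean_cost(params, data):
    return sum((forward(params, xs)[-1] - y) ** 2 for xs, y in data) / len(data)

# Hypothetical toy task: the target is half of the FIRST input.
data = [([1.0, 0.0, 0.0], 0.5), ([-1.0, 0.0, 0.0], -0.5)]

rng = random.Random(0)
params = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # step 1: random values
initial_cost = mean_cost(params, data)

learning_rate = 0.1
for epoch in range(500):                             # step 4: repeat many times
    for xs, y in data:
        grads = gradients(params, xs, y)             # steps 2 and 3
        params = [p - learning_rate * g for p, g in zip(params, grads)]

final_cost = mean_cost(params, data)
print(initial_cost, final_cost)
```

Notice that the backward loop visits the steps in reverse order and passes the error from each step to the one before it. That is the only real difference from training a vanilla network.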
Generating music with an RNN
By now, you should understand the basic concept of Recurrent Neural Networks. As you can see, you can train a model with words so that it predicts the following ones, or you can train it with Shakespeare poems so that it generates its own poetry.
But I personally find that one of the most amazing applications of RNNs is generating music, one note at a time.
If you are new to my blog, you need to understand that my articles are intended for a non-technical audience, so I will not cover the implementation details of this project, nor provide the code that I used. There are hundreds of tutorials out there on how to achieve what I did, so if you are interested in doing the same, you can simply google it or contact me for details.
In this article I will only provide a general idea of what I did.
- I collected different clips of jazz music.
- I used them to train an RNN.
- I gave the RNN a first random note so that it could predict the following notes.
- Some jazz music was generated!
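To give a feeling for the "one note at a time" part, here is a toy sketch in plain Python. It is not the code used for this project: the five-note vocabulary, the size of the state and all the names are hypothetical, and the weights are random instead of trained, so what comes out is a random melody, not jazz. What it does show is the generation loop: pick a first note, ask the network for the next one, feed that note back in, and repeat.

```python
import math
import random

NOTES = ["C", "D", "E", "G", "A"]   # hypothetical 5-note vocabulary
HIDDEN = 4                          # size of the state (arbitrary)
rng = random.Random(42)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Untrained random weights: stand-ins for what training would learn.
W_in = rand_matrix(HIDDEN, len(NOTES))    # current note -> state
W_rec = rand_matrix(HIDDEN, HIDDEN)       # previous state -> state
W_out = rand_matrix(len(NOTES), HIDDEN)   # state -> score for each note

def step(note_index, h):
    """Update the state with one note; return next-note probabilities and the new state."""
    new_h = [
        math.tanh(W_in[i][note_index] + sum(W_rec[i][j] * h[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]
    scores = [sum(W_out[k][i] * new_h[i] for i in range(HIDDEN)) for k in range(len(NOTES))]
    exps = [math.exp(s) for s in scores]  # softmax: turn scores into probabilities
    total = sum(exps)
    return [e / total for e in exps], new_h

def generate(length):
    """Start from a random note and sample the rest one note at a time."""
    note = rng.randrange(len(NOTES))
    h = [0.0] * HIDDEN
    melody = [NOTES[note]]
    for _ in range(length - 1):
        probs, h = step(note, h)
        note = rng.choices(range(len(NOTES)), weights=probs)[0]
        melody.append(NOTES[note])
    return melody

melody = generate(8)
print(" ".join(melody))
```

With trained weights, the probabilities returned by `step` would favour notes that tend to follow the melody so far in the training clips.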
This is the final result that I uploaded to YouTube.
The images in the video were also generated with AI using a technique called "neural style transfer" that I will definitely cover in a different article.
So that's it for now. There are other interesting concepts behind RNNs like "memory" and bi-directional RNNs that I will also cover in other articles. I hope you enjoyed reading about this topic and please leave your comments below.