Deep Learning (I)

Artificial Intelligence is becoming more widespread every day, and in most places it is being introduced discreetly, almost unnoticed. We are not talking about C-3PO, HAL-9000, Skynet or any other famous robot from Science Fiction films, but about programs running behind the scenes that recommend which book to buy, identify our face to unlock our phone, or finish a sketch we are drawing. One of the most popular techniques in AI is Deep Learning, and it is behind many of the recent innovations we encounter every day. But how does it work? Well, I have found it to be as simple as it is mysterious. Allow me to take you through my first steps in the realm of neural networks.

First of all, a few definitions to set the scene:

Artificial Intelligence is a branch of computer science dealing with the simulation of intelligent behaviour in computers.
Machine Learning is the subfield of Artificial Intelligence that gives “computers the ability to learn without being explicitly programmed.”
Deep Learning is a type of Machine Learning better suited to higher-level abstractions and normally based on Neural Networks.

Clear? Not so much, right? That is what I thought at the time too… So, following my curiosity and trying to understand what this Deep Learning that many of us talk about actually is, I signed up for Andrew Ng’s Deep Learning course on Coursera (https://www.coursera.org/specializations/deep-learning), which, by the way, I strongly recommend. After a few weeks of following his videos and doing some Python programming (it was also a good chance to refresh my programming skills), I passed the initial course. While I was in the midst of it, I realised that it is a fairly simple technique (at least in “beginner’s mode”) that, very surprisingly, actually works! However, I have to confess that I couldn’t really understand what was happening in there… Let me explain what I mean.

Neural Networks get their name from their resemblance to our neurons: cells with several inputs that fire an output when certain conditions are met. Neural Networks are composed of layers of “artificial neurons”. Every neuron in a layer receives all the inputs to that layer, performs a simple calculation, and outputs a value. These outputs become the inputs of the next layer, and so on, until the final output layer. For the sake of simplicity, I will focus on Neural Networks which have a single output of a binary, true/false, nature. The following picture probably illustrates this better than my description. Further on I will try to explain how these networks of neurons operate.
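For readers who prefer code to pictures, here is a minimal sketch of that layered structure in plain Python. It is not the course's code, and it rests on a few assumptions of mine: the "simple calculation" is taken to be a weighted sum of the inputs plus a bias, passed through a sigmoid function, and the names `weights`, `bias`, `random_layer` and `forward` are made up for illustration. The weights here are random, so the answer is meaningless; the point is only how values flow from layer to layer.

```python
import math
import random

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of all its inputs, then sigmoid.
    (Assumed form of the 'simple calculation'.)"""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    """Every neuron in the layer receives the same full list of inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, network):
    """The outputs of each layer become the inputs of the next one."""
    values = inputs
    for weight_rows, biases in network:
        values = layer(values, weight_rows, biases)
    return values

def random_layer(n_inputs, n_neurons, rng):
    """Hypothetical helper: an untrained layer with random weights."""
    weight_rows = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                   for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weight_rows, biases

rng = random.Random(0)
# 3 inputs -> a layer of 4 neurons -> a single output neuron
network = [random_layer(3, 4, rng), random_layer(4, 1, rng)]
output = forward([0.5, -1.2, 3.0], network)[0]
is_true = output > 0.5  # turn the single output into a true/false answer
print(output, is_true)
```

The 0.5 threshold is also just a common convention for turning a number between 0 and 1 into a yes/no answer.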

But let me first focus on what they are able to do. A typical example (and one we programmed during the course) is a “cat identifier”. The inputs to the network are the values of every pixel in a picture, and the output is a true/false value indicating whether or not it is actually a picture of a cat. But… how can such a simple structure (the Neural Network) understand the concept of “cat” and look for it in the picture? Well, the trick here is in the concept of “understanding”. Does it understand what a cat is? No, absolutely not, at least not in the complex and abstract way we humans understand it. However, the network is capable of detecting, on its own from the examples, the patterns that are present in “cat” pictures and absent from “not-cat” pictures, and of applying them to a new picture to provide an answer. So, for the purpose of our problem, we could say it has learned how to detect cats in pictures.
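To make the "pixels in, true/false out" idea concrete, here is a small sketch of what the input and output of such a cat identifier could look like. Everything in it is an assumption for illustration: the picture is a made-up 2x2 image of (red, green, blue) values, the names `flatten_picture` and `is_cat` are hypothetical, and a single sigmoid neuron stands in for the whole network. The weights are placeholders, so it has not learned anything about cats.

```python
import math

def flatten_picture(picture):
    """Turn rows of (red, green, blue) pixels into one flat list,
    scaling each value from 0..255 down to 0..1."""
    return [channel / 255.0
            for row in picture
            for pixel in row
            for channel in pixel]

def is_cat(picture, weights, bias):
    """Hypothetical single-output classifier: True means 'cat'."""
    inputs = flatten_picture(picture)
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability > 0.5

# A made-up 2x2 picture; a real photo has many thousands of pixels.
picture = [
    [(255, 200, 150), (250, 190, 140)],
    [(30, 20, 10), (25, 15, 5)],
]
inputs = flatten_picture(picture)

# Placeholder weights: a trained network would have found these
# values on its own from labelled "cat" / "not-cat" examples.
weights = [0.0] * len(inputs)
bias = 0.0

print(len(inputs))                    # one input per colour value per pixel
print(is_cat(picture, weights, bias))
```

Note that the picture stops being a grid at the very first step: the network only ever sees a long list of numbers, and whatever it "knows" about cats lives entirely in the weights.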

For me this was amazing enough, but when I saw how it is actually done, well, I was astonished… But this post is getting too long, so “stay tuned” to the blog: in a following post I will explain how Neural Networks operate, and share a very recent explanation of how that learning process takes place behind the scenes.

See you soon!