
Deep Learning, Artificial Intelligence and Machine Learning for Dummies: Explained with an Example

Published in the Random EN group
Do you want to impress your colleagues or amaze your friends in a conversation about current technology? Mention “Artificial Intelligence” or “Machine Learning” and you're done. The term “Artificial Intelligence” is heard everywhere now. Programmers want to learn AI. Managers want to build AI into their services. But in practice, even professionals do not always understand what “AI” is. This article is intended to help you understand the terms “artificial intelligence” and “machine learning”. You will also learn how Deep Learning, the most popular type of machine learning, works. Just as importantly, the explanation is written in accessible language, and the math won't be too difficult to follow.

Basics

The first step toward understanding Deep Learning is to understand the difference between the key terms.
Picture: Datanami

Artificial Intelligence vs Machine Learning

Artificial intelligence (AI) is an attempt to reproduce the human thought process on a computer. When AI research was just beginning, scientists tried to reproduce human intelligence only under specific conditions, that is, to tailor it to particular problems such as playing games. They established a set of rules that the machine had to follow. The computer had a list of possible actions, and it made decisions based on the rules and restrictions set during the design phase.
Machine learning (ML) is the ability of a machine to learn by processing large sets of data instead of following explicitly defined rules.
ML allows computers to learn on their own. This type of learning takes advantage of modern computing hardware, which can process huge amounts of data.

Supervised learning vs Unsupervised learning

Supervised learning uses labeled data sets that consist of inputs and expected outputs. When you train an AI using supervised learning, you provide data as input and specify what the output should be. If the result the AI produces differs from what was expected, the AI must correct its calculations. The process is repeated over the data set many times, for as long as the AI keeps making mistakes. An example of supervised learning is an AI that predicts the weather. It learns to predict the weather from historical data: the inputs are pressure, humidity and wind speed, and the output we expect is temperature.
Unsupervised learning means training an AI on data that has no labels. When you train an AI using unsupervised learning, you let it work out a logical grouping of the data on its own. An example of unsupervised machine learning is a system that predicts customer behavior in an online store. It learns without known input-output pairs. Instead, it must group the input data itself. The algorithm should identify which types of users prefer which products and report that to you.
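To make the difference concrete, here is a minimal sketch in Python. It is not taken from any real system: the weather records, the customer spending figures and both helper functions are made up for illustration. The supervised part predicts by copying the label of the most similar known example, and the unsupervised part splits unlabeled numbers into two groups with a very small one-dimensional k-means.

```python
# Supervised: every example pairs inputs with the expected output (a label).
# Made-up records: (pressure hPa, humidity %, wind m/s) -> temperature in C.
labeled = [
    ((1012.0, 60.0, 3.0), 18.0),
    ((1005.0, 85.0, 7.0), 11.0),
    ((1020.0, 40.0, 2.0), 24.0),
]


def predict_nearest(inputs):
    """Predict by copying the label of the most similar known example."""
    def distance(example):
        known_inputs, _ = example
        return sum((a - b) ** 2 for a, b in zip(known_inputs, inputs))

    _, label = min(labeled, key=distance)
    return label


# Unsupervised: only inputs, no labels. The algorithm must group them itself.
# Made-up monthly spending of six shop customers.
spending = [12.0, 15.0, 14.0, 210.0, 190.0, 205.0]


def two_groups(values, rounds=10):
    """Tiny 1-D k-means with two clusters (assumes at least two distinct values)."""
    low, high = min(values), max(values)
    for _ in range(rounds):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each cluster center to the average of its members.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high
```

Note that nobody told `two_groups` which customers are “small” or “big” spenders. It finds the two groups from the numbers alone, whereas `predict_nearest` can only work because every weather record already carries its answer.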

How machine learning works

So, Deep Learning is one of the approaches to machine learning. It allows you to predict results from given input data. To train the AI, you can use either of the options above: supervised or unsupervised learning. We will see how Deep Learning works through a concrete example: suppose we need to build a service that predicts airfare. We will train our algorithm using the supervised method. We want the service to predict the price from the following input data (for simplicity, we ignore the return flight):
  • departure airport;
  • arrival airport;
  • planned departure date;
  • airline.
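A neural network works with numbers, so each of these four inputs has to be turned into a number first. The article does not say how, so the sketch below is only one simple possibility. The lookup tables, the airline names and the `encode_flight` function are hypothetical.

```python
from datetime import date

# Hypothetical lookup tables; a real service would build them from its own data.
AIRPORTS = {"JFK": 0, "LHR": 1, "CDG": 2}
AIRLINES = {"Alpha Air": 0, "Beta Jet": 1}


def encode_flight(origin, destination, departure, airline):
    """Turn one flight description into the four numbers fed to the input layer."""
    return [
        float(AIRPORTS[origin]),
        float(AIRPORTS[destination]),
        float(departure.timetuple().tm_yday),  # day of the year, 1..366
        float(AIRLINES[airline]),
    ]


features = encode_flight("JFK", "LHR", date(2024, 3, 1), "Beta Jet")
```

Plain integer codes are the crudest option, because they suggest an ordering between airports that does not really exist. They are used here only to keep the example at four inputs, like the network described in this article.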

Neural networks

Let's take a look inside the brain of our artificial intelligence. Like a biological living being, our predictor has neurons in its “head”. In the picture they are drawn as circles. Neurons are connected to each other.
In the image, the neurons are grouped into three kinds of layers:
  • input layer;
  • hidden layer 1 and hidden layer 2;
  • output layer.
Some data enters the input layer. In our case, we have four neurons on the input layer: departure airport, arrival airport, departure date, airline. The input layer passes data to the first hidden layer. Hidden layers perform mathematical calculations based on the received input data. One of the main issues when building neural networks is the choice of the number of hidden layers and the number of neurons in each layer.
The word Deep in the phrase Deep Learning indicates the presence of more than one hidden layer.
The output layer returns the result to us: in our case, the expected price of the flight.
We have skipped the most interesting part so far: how exactly is the expected price calculated? This is where the magic of Deep Learning begins. Each connection between neurons is assigned a certain weight (coefficient). This weight determines the importance of the input value. The initial weights are set randomly. When predicting the cost of air travel, the departure date affects the price the most. Therefore, in a trained network, the connections of the “departure date” neuron end up with more weight.
Each neuron has an activation function attached to it. It is difficult to understand what this function is without a mathematical background, so let's simplify: the point of the activation function is to “standardize” the output of the neuron. After the data has passed through all the layers of the neural network, the network returns the result through the output layer. So far everything is clear, right?
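The path from the input layer to the output layer can be sketched in a few lines of Python. This is an illustration under several assumptions, not the article's own network. The weights are fixed by hand instead of random so that the result is reproducible, the hidden layers are shrunk to two neurons each, and the activation function is ReLU (which replaces negative values with zero), a common choice that the article does not name.

```python
def relu(x):
    """A common activation function: negative values become zero."""
    return max(0.0, x)


def identity(x):
    """No activation; used here for the output neuron, which returns a price."""
    return x


def layer(inputs, weights, activation):
    """Each neuron sums its weighted inputs, then applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)))
        for neuron_weights in weights
    ]


def forward(inputs, network):
    """Pass the values through every layer in turn and return the output."""
    values = inputs
    for weights, activation in network:
        values = layer(values, weights, activation)
    return values


# Hand-picked weights: 4 inputs -> 2 hidden -> 2 hidden -> 1 output.
network = [
    ([[0.5, -0.5, 1.0, 0.0], [-1.0, 0.0, 0.5, 0.5]], relu),  # hidden layer 1
    ([[1.0, -2.0], [0.5, 0.5]], relu),                        # hidden layer 2
    ([[2.0, 4.0]], identity),                                 # output layer
]

prediction = forward([1.0, 2.0, 3.0, 4.0], network)
```

Each inner list holds the weights of the connections coming into one neuron, so the first hidden layer has two neurons with four incoming connections each. Changing any of these numbers changes the prediction, which is exactly what training exploits.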

Neural network training

Training a neural network is the hardest part of Deep Learning. Why? Because you need a large amount of data and a lot of computing power. For our project, we need to find historical airfare data for all possible combinations of departure and destination airports, departure dates and airlines. In other words, we need a very large data set of ticket prices. We feed the inputs from this data set into our neural network and check whether its outputs match the prices we already know. If the AI's results differ from the expected ones, it has not been trained enough yet. Once we have run the entire data set through the network, we can build a function that shows how much the AI's results differ from the actual results in the data set. Such a function is called a cost function. In the ideal case, which we strive for, the value of the cost function is zero. This means that the prices predicted by the neural network do not differ from the actual ticket prices in our data set.
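The article does not say which cost function to use. A common choice, assumed here purely for illustration, is the mean squared error: the average of the squared differences between predicted and actual prices. The prices below are made up.

```python
def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and known answers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Made-up ticket prices from a historical data set.
actual_prices = [100.0, 250.0, 400.0]

perfect_guess = [100.0, 250.0, 400.0]  # matches the data exactly
rough_guess = [110.0, 230.0, 430.0]    # off by 10, 20 and 30

perfect_cost = mean_squared_error(perfect_guess, actual_prices)
rough_cost = mean_squared_error(rough_guess, actual_prices)
```

The perfect guess gives a cost of exactly zero, the ideal case described above, while the rough guess gives a positive cost that grows quickly as the errors get larger.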

How can we reduce the value of the cost function?

We change the weights of the connections between neurons. This could be done randomly, but that approach is inefficient. Instead, we will use a method called gradient descent.
Gradient descent is a method that allows us to find the minimum of a function. In our case, we are looking for the minimum of the cost function.
The algorithm works by adjusting the weights by a small amount after each pass over our data set. By calculating the derivative (or gradient) of the cost function for the current weights, we can see in which direction the minimum lies.
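Gradient descent is easiest to see on a model with a single weight. The sketch below is a deliberately tiny stand-in for a real network. It assumes a made-up model, price = w × distance, a mean squared error cost, and invented data in which the true weight is 3. The learning rate and the number of steps are arbitrary illustrative values.

```python
# Made-up data where the true relationship is price = 3 * distance.
distances = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 6.0, 9.0, 12.0]


def cost(w):
    """Mean squared error of the model price = w * distance."""
    return sum((w * x - y) ** 2 for x, y in zip(distances, prices)) / len(prices)


def cost_gradient(w):
    """Derivative of the cost with respect to w: mean of 2 * x * (w * x - y)."""
    return sum(2 * x * (w * x - y) for x, y in zip(distances, prices)) / len(prices)


def gradient_descent(w, learning_rate=0.01, steps=200):
    """Repeatedly move w a small step against the gradient."""
    for _ in range(steps):
        w -= learning_rate * cost_gradient(w)
    return w


from_below = gradient_descent(w=0.0)   # starts too small, so the weight grows
from_above = gradient_descent(w=10.0)  # starts too large, so the weight shrinks
```

Both starting points end up very close to 3. The sign of the gradient decides whether the weight goes up or down, so the weight is adjusted toward the minimum rather than always increased.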
In the picture, “Initial weight” marks the starting value of a weight and “Global cost minimum” marks the global minimum of the cost function. To minimize the cost function, we must run the calculations over our data set many times. This is why you need a lot of computing power. The weights are updated automatically by gradient descent. This is the magic of Deep Learning! Once we have trained our flight price prediction service, we can use it to predict real prices.

Let's sum it up...

  • Deep learning uses neural networks to simulate intelligence.
  • There are three types of layers in a neural network: the input layer, hidden layers and the output layer.
  • Each connection between neurons has its own weight, indicating the importance of that input.
  • Each neuron uses an activation function to “standardize” its output.
  • To train a neural network, you need a large amount of data.
  • If we run a data set through a neural network and compare its outputs with the actual data, we get a cost function that shows how wrong the AI is.
  • After each pass over the data, the weights between neurons are adjusted by gradient descent to reduce the cost function.