Deep Learning Fundamentals - Part 6: What does it mean to train an artificial neural network?




In this sixth video of the Deep Learning Fundamentals series, we’ll discuss what it means to train an artificial neural network. In previous videos, we went over the basic architecture of a general artificial neural network. Now, after configuring the architecture of the model, the next step is to train this network.

So, what does this mean?



When we train a model, we’re basically trying to solve an optimization problem. What we’re trying to optimize are the weights within the model. Recall that we touched on this idea in the video about layers, where we talked about how each connection between neurons has an arbitrary weight assigned to it, and how, during training, these weights are repeatedly updated in an attempt to reach their optimal values.

Now, how the weights get optimized depends on the optimization algorithm, or optimizer, we choose for our model. The most widely known optimizer is called Stochastic Gradient Descent, or more simply, SGD.
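To give a rough idea of what this looks like in practice, here is a minimal sketch (an illustration, not the exact procedure from the video) of the basic SGD update rule for a single weight in Python. The gradient value here is just a placeholder:

```python
# Minimal sketch of the basic SGD update rule: nudge a weight a small
# step against the gradient of the loss, scaled by a learning rate.
learning_rate = 0.01

def sgd_update(weight, gradient):
    return weight - learning_rate * gradient

weight = 0.5       # an arbitrary starting weight
gradient = 0.2     # placeholder value; in practice this comes from backpropagation
weight = sgd_update(weight, gradient)
print(weight)      # 0.498 -- the weight has been nudged slightly
```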

Any optimization problem needs an objective. So, what is SGD’s objective in setting the model’s weights? The objective is to minimize a given loss function. In other words, SGD assigns the weights in such a way as to make this loss function as close to zero as possible. The loss function could be something like mean squared error, but there are several other loss functions we could use in its place.
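As a concrete illustration, here is a rough sketch of mean squared error in plain Python (just one possible loss; the right choice depends on the task):

```python
# Mean squared error: the average of the squared differences between
# the model's predictions and the true target values.
def mean_squared_error(predictions, targets):
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)

# A perfect model would produce a loss of 0.
print(mean_squared_error([0.9, 0.2, 0.8], [1.0, 0.0, 1.0]))  # 0.03
```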

Ok, but what is the actual loss we’re talking about?



Well, during training, we supply our model with data along with the labels for that data. For example, if we wanted to train a model to classify images as either cats or dogs, then we would supply it with images of cats and dogs along with labels stating whether each image is of a cat or of a dog.
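Conceptually, this labeled data is just a collection of inputs paired with their labels. A toy sketch (with made-up file names standing in for the actual images) might look like this:

```python
# Toy illustration of labeled training data: each example pairs an input
# (a hypothetical image file name standing in for the pixel data) with its label.
training_data = [
    ("cat_001.jpg", "cat"),
    ("dog_001.jpg", "dog"),
    ("cat_002.jpg", "cat"),
    ("dog_002.jpg", "dog"),
]
```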


So say we give one image of a cat to our model. Once the image has passed through the entire network, the model will produce an output: what it thinks the image is, either a cat or a dog. More precisely, the output consists of probabilities for cat and dog. The model may assign a 75% probability to the image being a cat and a 25% probability to it being a dog. In this example, the model considers the image more likely to be a cat than a dog.

In this case, the loss is going to be the error between what the network is predicting for the image versus what the true label of the image is. So it makes sense for SGD to try to minimize this error to make our model as accurate as possible in its predictions.
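For probability outputs like the 75%/25% example above, one common way to measure this error is cross-entropy (a different choice from the mean squared error mentioned earlier, shown here purely as an illustration):

```python
import math

# The model's output for one cat image: probabilities for each class.
predicted = {"cat": 0.75, "dog": 0.25}
true_label = "cat"

# Cross-entropy loss for a single example: the negative log of the
# probability the model assigned to the correct class.
loss = -math.log(predicted[true_label])
print(loss)  # about 0.288; it would be 0.0 if the model were 100% sure it was a cat
```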

After passing all of our data through the model, we continue passing the same data in over and over again. It is during this process of repeatedly feeding the same data into the model that the model actually learns. Through the process SGD carries out on each pass, the model learns from the data and updates its weights accordingly.
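To tie these ideas together, here is a tiny, self-contained sketch of repeated passes with SGD. It uses a one-weight "model" (prediction = weight * x) rather than a real neural network, purely to illustrate how repeatedly passing the same data through the model and applying the SGD update drives the weight toward its optimal value:

```python
# A one-weight "model" (prediction = weight * x) learning from the same
# data over and over.  Each full pass over the data is one epoch.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]    # (input, target) pairs; target is 2 * input
weight = 0.0                                    # arbitrary starting weight
learning_rate = 0.05

for epoch in range(20):                         # pass the same data in repeatedly
    for x, target in data:
        prediction = weight * x
        error = prediction - target
        gradient = 2 * error * x                # gradient of the squared error w.r.t. the weight
        weight -= learning_rate * gradient      # the SGD update
print(weight)                                   # close to 2.0, the optimal weight
```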

So, what does it actually mean for the model to learn? We’re going to pick up on that topic in our next video.



We know now generally what is happening during one pass of the data through the model. In the next video, we’ll see how the model learns through multiple passes of the data and what exactly SGD is doing to minimize the loss function.

Also, before wrapping up, I did want to point out that I only covered concepts like the optimizer and the loss at a general level in this video. We’ll definitely go into more detail about these topics in subsequent videos, but I needed to give them a general introduction here so that we could understand the basics of training.

So hopefully now you have a general understanding about what it means to train a model and how this training is implemented. Stay tuned for the next video where we’ll learn what’s happening behind the scenes of this training and how the model learns from this process.

Previous videos in this series:
