How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch IV

in technology •  last month  (edited)


In my previous post, I showed how to implement a simple neural network without using programming frameworks. I noticed a few issues in that code and explained them in that post. In brief, the drawbacks are low training/test accuracy and a slow training process. In this blog post, I am going to address those issues.

If you go through the code I presented in the previous blog, you will find several for loops. They are the main reason the training process is slow. Because training takes so long, we cannot train the network for a large number of iterations, and so we cannot reach a high training/test accuracy. Vectorization is the answer to these issues.

Let's consider the eqn (1) in the previous blog.
eqn (1) : z_1^(i) = w_1 x_1^(i) + w_2 x_2^(i) + ... + w_12288 x_12288^(i) + b
According to that equation, we take a single input image (12288 pixel values in total), multiply each pixel value by its corresponding weight, and add the products together along with the bias value. Since the equation covers only a single image, we need to repeat it for every image in the training dataset. By using matrix operations, we can do the same computation much faster.
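To make the difference concrete, here is a minimal sketch of eqn (1) for one image, computed once with an explicit for loop and once with a single NumPy dot product. The random pixel and weight values and the function names are made up for illustration; they are not taken from the code in the earlier posts.

```python
import numpy as np

# Hypothetical stand-ins for one flattened image and its weights.
rng = np.random.default_rng(0)
n_pixels = 12288
x = rng.random(n_pixels)   # pixel values of a single image
w = rng.random(n_pixels)   # one weight per pixel
b = 0.5                    # bias value


def weighted_sum_loop(w, x, b):
    """eqn (1) with an explicit Python loop over all pixels."""
    total = 0.0
    for j in range(len(w)):
        total += w[j] * x[j]
    return total + b


def weighted_sum_vectorized(w, x, b):
    """The same sum as a single dot product."""
    return float(np.dot(w, x) + b)


z_loop = weighted_sum_loop(w, x, b)
z_vec = weighted_sum_vectorized(w, x, b)
print(np.isclose(z_loop, z_vec))
```

Both functions compute the same number. The loop version runs 12288 Python-level steps per image, while the dot product hands the whole sum to NumPy in one call.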

Let's say X represents the whole training dataset, where each column is one training image and each row is one pixel position. In this case the dimensions of X are 12288*209 (there are 209 training images). Instead of treating each weight as a separate scalar, let's collect the weights into a matrix W with dimensions 1*12288. Now we can write eqn (1) for the whole input dataset as follows.

eqn (10) : Z = WX + B ---- eqn (1)
Here the dimensions of W are 1*12288, so Z = WX has dimensions 1*209, one value per training image. In the same way, we can write matrix equations that are equivalent to the other equations we derived in the previous posts.
eqn (11): Ŷ = g(Z) ---- eqn (2)
eqn (12): J = np.sum(-Ylog(Ŷ) - (1 - Y)log(1 - Ŷ))/m ---- eqn (3) and (4)
(Here np.sum means adding up all the elements of the matrix, and m in these equations is the number of training images, 209 here.)
eqn (13): ∂W = (Ŷ - Y)X^T/m ---- eqn (6)
eqn (14): ∂B = np.sum(Ŷ - Y)/m ---- eqn (7)
eqn (15): W = W -α∂W ---- eqn (8)
eqn (16): B = B -α∂B ---- eqn (9)
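As a rough illustration of how eqn (10) to eqn (16) fit together, here is a minimal NumPy sketch of the training loop. It assumes that g is the sigmoid function, and it divides the cost by m so that it matches the 1/m factor in eqn (13) and eqn (14). The function names, the toy dataset, and the values of the learning rate and iteration count are made up for this example.

```python
import numpy as np


def sigmoid(Z):
    """Assumed choice for the activation g in eqn (11)."""
    return 1.0 / (1.0 + np.exp(-Z))


def train(X, Y, alpha=0.1, num_iterations=200):
    """X has shape (n, m), one column per example; Y has shape (1, m)."""
    n, m = X.shape
    W = np.zeros((1, n))
    B = 0.0
    costs = []
    for _ in range(num_iterations):
        Z = W @ X + B                          # eqn (10)
        Y_hat = sigmoid(Z)                     # eqn (11)
        J = np.sum(-Y * np.log(Y_hat)
                   - (1 - Y) * np.log(1 - Y_hat)) / m   # eqn (12)
        dW = (Y_hat - Y) @ X.T / m             # eqn (13)
        dB = np.sum(Y_hat - Y) / m             # eqn (14)
        W = W - alpha * dW                     # eqn (15)
        B = B - alpha * dB                     # eqn (16)
        costs.append(J)
    return W, B, costs


# Toy data (hypothetical): 4 features, 50 examples, labels from a known rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 50))
true_w = np.array([[1.0, -2.0, 0.5, 0.0]])
Y = (true_w @ X > 0).astype(float)

W, B, costs = train(X, Y)
print(costs[0], costs[-1])
```

The only loop left is the one over training iterations. Everything inside it works on the whole dataset at once.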

Now we have written all the scalar equations as matrix equations, which are easy to compute. Please note that, as in one of my previous blogs, capital letters always represent vectors or matrices.
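As a quick check that the matrix form gives the same result as the scalar form, the sketch below builds X from a stack of images, computes Z with one matrix product, and compares it with a per-image loop. The 64*64*3 image layout is an assumption (it is one layout that gives 12288 pixel values), and the random images and weights are placeholders for real data.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 209                                   # number of training images
images = rng.random((m, 64, 64, 3))       # assumed layout: 64*64*3 = 12288 values

# Flatten each image into one column, so X is 12288*209.
X = images.reshape(m, -1).T
W = rng.random((1, X.shape[0])) * 0.01    # weight matrix, 1*12288
b = 0.1                                   # bias, broadcast over all columns

# Vectorized form: all 209 images in one matrix product.
Z = W @ X + b

# Scalar form: the single-image equation, repeated image by image.
Z_loop = np.zeros((1, m))
for i in range(m):
    Z_loop[0, i] = np.dot(W[0], X[:, i]) + b

print(X.shape, W.shape, Z.shape)
print(np.allclose(Z, Z_loop))
```

The bias is a single number here, and NumPy broadcasting adds it to every entry of the 1*209 result.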

In the next blog, I will show you the code written according to these equations, and you will be able to see that the new code runs very quickly and reaches a high training/test accuracy.

Please feel free to raise any concerns/suggestions on this blog post. Let's meet in the next post.

My previous blogs,
How did I learn Machine Learning : part 1 - Create the coding environment
How did I learn Machine Learning : part 2 - Setup conda environment in PyCharm
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch I
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch II
How did I learn Machine Learning : part 3 - Implement a simple neural network from scratch III
