datahacker.rs@gmail.com

# Category: Other

### #010 C Random initialization of parameters in a Neural Network

Why do we need random initialization? Suppose, for example, we have this shallow Neural Network. The parameters of this shallow neural network are $$\textbf{W}^{[1]}$$, $$\textbf{W}^{[2]}$$, $$b^{[1]}$$ and $$b^{[2]}$$. If we initialize the matrices $$\textbf{W}^{[1]}$$ and $$\textbf{W}^{[2]}$$ to zeros, then unit1 and unit2 will give the same output, so $$a_1^{[1]}$$ and $$a_2^{[1]}$$ will be equal. In other words, unit1 and unit2 are symmetric, and it can be shown by induction that these two units are computing…
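
The symmetry described above can be checked numerically. The following is a minimal sketch, assuming a hidden layer with two units, two input features and a tanh activation; the helper `hidden_activations`, the input values and the 0.01 scaling factor are illustrative assumptions, not taken from the original post.

```python
import numpy as np

def hidden_activations(W1, b1, x):
    """Hidden-layer forward pass; tanh is an assumed activation."""
    return np.tanh(W1 @ x + b1)

x = np.array([[0.5], [-1.2]])   # one example with two input features (made up)
b1 = np.zeros((2, 1))           # biases start at zero in both cases

# Zero initialization: both hidden units compute exactly the same value.
W1_zeros = np.zeros((2, 2))
a_zeros = hidden_activations(W1_zeros, b1, x)

# Small random initialization: the two units now differ.
rng = np.random.default_rng(0)
W1_random = rng.standard_normal((2, 2)) * 0.01
a_random = hidden_activations(W1_random, b1, x)

units_equal_with_zeros = bool(a_zeros[0, 0] == a_zeros[1, 0])
units_equal_with_random = bool(a_random[0, 0] == a_random[1, 0])
print(units_equal_with_zeros, units_equal_with_random)
```

With zero weights the two activations are identical, while the randomly initialized units produce different values for the same input.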

### #005B Logistic Regression: Scratch vs. Scikit-Learn

Let’s now compare Logistic Regression from scratch and Logistic Regression from scikit-learn. Our dataset consists of two classes, class 0 and class 1, which we generated randomly. The training set has 2000 examples coming from the first and second class. The test set has 1000 examples, 500 from each class. When we plot these datasets it looks like this: Python’s library scikit-learn provides LogisticRegression and we will implement it…

### #006B Vectorization and Broadcasting in Python

What is Vectorization? Vectorization is basically the art of getting rid of explicit for loops whenever possible. With the help of vectorization, operations are applied to whole arrays instead of individual elements. The rule of thumb to remember is to avoid using explicit loops in your code. Deep learning algorithms tend to shine when trained on large datasets, so it’s important that your code runs quickly. Otherwise, your code might take a long time…
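
To make the idea concrete, here is a small sketch that computes the same dot product with an explicit loop and with a single NumPy call, followed by one broadcasting example. The function names `dot_loop` and `dot_vectorized`, the array sizes and the values are assumptions chosen for illustration.

```python
import numpy as np

def dot_loop(w, x):
    """Dot product with an explicit Python for loop."""
    total = 0.0
    for i in range(len(w)):
        total += w[i] * x[i]
    return total

def dot_vectorized(w, x):
    """The same dot product as one vectorized NumPy call."""
    return float(np.dot(w, x))

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)
x = rng.standard_normal(1000)

# Broadcasting: the (3, 1) column b is stretched across the 4 columns of X.
X = np.ones((3, 4))
b = np.array([[1.0], [2.0], [3.0]])
shifted = X + b   # result has shape (3, 4)

print(np.isclose(dot_loop(w, x), dot_vectorized(w, x)), shifted.shape)
```

Both functions return the same value up to floating-point rounding; the vectorized version hands the loop over to NumPy's compiled code, which is typically much faster on large arrays.
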

### Building blocks of a Deep Neural Network

In this post we will see what the building blocks of a Deep Neural Network are. We will pick one layer, for example layer $$l$$ of a deep neural network, and we will focus on the computations for that layer. For layer $$l$$ we have parameters $$\textbf{W}^{[l]}$$ and $$b^{[l]}$$. We calculate the forward pass for layer $$l$$ by taking as input the activations from the previous layer…
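
The forward computation for a single layer can be sketched as follows. Since the excerpt is cut off, this assumes the usual convention $$Z^{[l]} = \textbf{W}^{[l]} A^{[l-1]} + b^{[l]}$$ followed by an activation; the function name `layer_forward`, the tanh default and the layer sizes are hypothetical choices for illustration.

```python
import numpy as np

def layer_forward(A_prev, W, b, activation=np.tanh):
    """Forward pass for one layer (assumed form: Z = W A_prev + b, A = g(Z)).

    A_prev: activations from the previous layer, shape (n_prev, m)
    W:      weight matrix of this layer, shape (n, n_prev)
    b:      bias column vector, shape (n, 1), broadcast over the m examples
    """
    Z = W @ A_prev + b
    A = activation(Z)
    return A, Z   # Z is returned so it can be reused in a backward pass

rng = np.random.default_rng(0)
A_prev = rng.standard_normal((3, 5))    # 3 units in layer l-1, 5 examples
W = rng.standard_normal((4, 3)) * 0.01  # 4 units in layer l
b = np.zeros((4, 1))

A, Z = layer_forward(A_prev, W, b)
print(A.shape, Z.shape)
```

Each column of `A_prev` is one example, so the output keeps the same number of columns while the number of rows changes to the number of units in layer $$l$$.
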

In this post we will see how to implement gradient descent for a one hidden layer Neural Network, as presented in the picture below.

(Figure: One hidden layer Neural Network)

The parameters for a one hidden layer Neural Network are $$\textbf{W}^{[1]}$$, $$b^{[1]}$$, $$\textbf{W}^{[2]}$$ and $$b^{[2]}$$. The number of units in each layer is as follows: the input of the Neural Network is a feature vector, so the length of the “zero” layer $$a^{[0]}$$ is the size of an input feature…