#013 CNN VGG 16 and VGG 19

In the previous posts we talked about \(LeNet-5 \) and AlexNet. Let's now look at two more examples of convolutional neural networks: the \(VGG-16 \) and \(VGG-19 \) networks.

In these networks smaller filters are used, but the networks are built to be deeper than the convolutional neural networks we have seen in the previous posts.

Architecture of \(VGG-16 \)

A remarkable thing about \(VGG-16 \) is that instead of having so many hyperparameters, it is a much simpler network. We focus on just having \(conv \) layers that use \(3\times3\) filters with a stride of \(1 \) and \(same \) padding, while all \(Max\enspace pooling \) layers use \(2 \times 2\) filters with a stride of \(2 \).
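Before going layer by layer, here is a minimal sketch of these two building blocks, assuming TensorFlow/Keras (the original post does not include code, so the API choice is mine). It shows that a \(same \)-padded \(3\times3\) convolution preserves height and width, while \(2 \times 2\) max pooling with a stride of \(2 \) halves them:

```python
import tensorflow as tf

# 3x3 convolution, stride 1, "same" padding: height and width are preserved
conv = tf.keras.layers.Conv2D(64, kernel_size=3, strides=1,
                              padding="same", activation="relu")

# 2x2 max pooling, stride 2: height and width are halved
pool = tf.keras.layers.MaxPooling2D(pool_size=2, strides=2)

x = tf.random.normal((1, 224, 224, 3))   # a dummy 224x224 RGB image
y = conv(x)
print(y.shape)        # (1, 224, 224, 64) -- same height and width
print(pool(y).shape)  # (1, 112, 112, 64) -- halved by pooling
```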

Let’s go through the architecture.

  • The first two layers are convolutional layers with \(3 \times 3 \) filters. In these first two layers we use \(64\) filters, so we end up with a \(224 \times 224 \times 64 \) volume because we're using \(same \) convolutions (height and width are preserved). The notation \((CONV\enspace 64) \times 2 \) means that we have \(2\enspace conv\) layers with \(64\) filters. The filters are always \(3 \times 3\) with a stride of \(1 \), and they are always implemented as \(same \) convolutions.
  • Then, we use a \(pooling \) layer which reduces the height and width of the volume: it goes from \(224 \times 224 \times 64\) down to \(112 \times 112 \times 64\).
  • Then we have a couple more \(conv \) layers. Here we use \(128 \) filters, and because we use \(same \) convolutions, the new dimension is \(112 \times 112 \times 128\).
  • Then, a \(pooling \) layer is added, so the new dimension is \(56 \times 56 \times 128 \).
  • \(3\enspace conv\) layers with \(256 \) filters give a \(56 \times 56 \times 256 \) volume.
  • A \(pooling \) layer reduces this to \(28 \times 28 \times 256 \).
  • \(3\enspace conv\) layers with \(512 \) filters: \(28 \times 28 \times 512 \).
  • A \(pooling \) layer: \(14 \times 14 \times 512 \).
  • \(3\) more \(conv\) layers with \(512 \) filters: \(14 \times 14 \times 512 \).
  • A final \(pooling \) layer: \(7 \times 7 \times 512 \).
  • At the end, the final \(7 \times 7 \times 512\) volume is flattened and fed into two \(Fully\enspace connected\) \((FC) \) layers with \(4096 \) units each, followed by a \(softmax \) output over \(1000 \) classes; the sketch after this list puts all of these layers together.
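Putting the walkthrough together, here is a compact sketch of the whole \(VGG-16 \) network, written directly from the description above; the `conv_block` helper and the use of Keras are my own choices, not part of the original post:

```python
from tensorflow import keras
from tensorflow.keras import layers

def conv_block(n_layers, n_filters):
    # n_layers 3x3 same-padded convolutions followed by one 2x2 max pooling
    return [layers.Conv2D(n_filters, 3, padding="same", activation="relu")
            for _ in range(n_layers)] + [layers.MaxPooling2D(2, strides=2)]

model = keras.Sequential(
    [keras.Input(shape=(224, 224, 3))]
    + conv_block(2, 64)    # -> 224x224x64,  then pool -> 112x112x64
    + conv_block(2, 128)   # -> 112x112x128, then pool -> 56x56x128
    + conv_block(3, 256)   # -> 56x56x256,   then pool -> 28x28x256
    + conv_block(3, 512)   # -> 28x28x512,   then pool -> 14x14x512
    + conv_block(3, 512)   # -> 14x14x512,   then pool -> 7x7x512
    + [layers.Flatten(),                       # 7*7*512 = 25088 units
       layers.Dense(4096, activation="relu"),
       layers.Dense(4096, activation="relu"),
       layers.Dense(1000, activation="softmax")]
)

model.summary()  # reports roughly 138 million trainable parameters
```

The `model.summary()` call is a convenient way to confirm the volume dimensions listed above, layer by layer.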

Layers of \(VGG-16 \) and \(VGG-19 \)

The number \(16 \) in the name \(VGG-16\) refers to the fact that this network has \(16\) layers with weights (\(13\enspace conv\) layers and \(3\enspace FC\) layers). This is a pretty large network, with a total of about \(138\) million parameters. That's large even by modern standards. However, the simplicity of the \(VGG-16 \) architecture made it quite appealing: the architecture is really quite uniform. There are a few \(conv \) layers followed by a \(pooling \) layer which reduces the height and width of the volume. If we look at the number of filters, we start with \(64\) filters, double to \(128 \), then to \(256\), and in the last stacks we use \(512\) filters. The number of filters roughly doubles with every stack of \(conv \) layers, and that is another simple principle used to design the architecture of this network. The main downside is that it is a pretty large network in terms of the number of parameters to be trained. There is also the \(VGG-19\) neural network, which is even bigger than \(VGG-16\); but because \(VGG-16\) does almost as well as \(VGG-19\), a lot of people use \(VGG-16\).
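Since the \(138\) million figure follows directly from the architecture, we can verify it with a few lines of arithmetic (a sketch based on the layer dimensions listed above): each \(3\times3\) \(conv \) layer contributes \(3 \cdot 3 \cdot c_{in} \cdot c_{out}\) weights plus \(c_{out}\) biases, and each \(FC \) layer contributes \(n_{in} \cdot n_{out}\) weights plus \(n_{out}\) biases.

```python
# Back-of-the-envelope check of the ~138 million parameter figure.

# (number of conv layers, input channels, output channels) per stack
conv_stacks = [(2, 3, 64), (2, 64, 128), (3, 128, 256),
               (3, 256, 512), (3, 512, 512)]

total = 0
for n, c_in, c_out in conv_stacks:
    total += 3 * 3 * c_in * c_out + c_out               # first conv of the stack
    total += (n - 1) * (3 * 3 * c_out * c_out + c_out)  # remaining convs

for n_in, n_out in [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]:
    total += n_in * n_out + n_out                       # fully connected layers

print(f"{total:,}")  # 138,357,544
```

Note that the three \(FC \) layers alone account for about \(124 \) million of the parameters, which is why the fully connected part dominates the size of the network.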

In the next post, we will talk more about the Residual Network (ResNet) architecture.
