#011 TF How to improve the model performance with Data Augmentation?

Highlights: In this post we will show the benefits of data augmentation as a way to improve the performance of a model. This method is especially useful when we do not have enough data at our disposal.

Tutorial Overview:

  1. Training without data augmentation
  2. What is data augmentation?
  3. Training with data augmentation
  4. Visualization

1. Training without data augmentation

A familiar question is “Why should we use data augmentation?” So, let’s see the answer.

In order to demonstrate this, we will create a convolutional neural network in TensorFlow and train it on the Cats-vs-Dogs dataset.

First, we will prepare our dataset for training. We will start by downloading the dataset from the online repository. Once this is completed, we will move on to unzipping and creating a path location for the training and validation set.
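The unzipping and path setup described above can be sketched roughly as follows. This is a minimal sketch, assuming the archive has already been downloaded to a local file and that it unpacks into a `cats_and_dogs_filtered` folder with `train` and `validation` subfolders; the function name `extract_dataset` and that folder layout are assumptions made for illustration, not details taken from this post.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir):
    """Unzip the archive and return the training and validation directories."""
    target_dir = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # Assumed layout: <target_dir>/cats_and_dogs_filtered/{train,validation}/{cats,dogs}
    base_dir = target_dir / "cats_and_dogs_filtered"
    return base_dir / "train", base_dir / "validation"
```

The two returned paths are the ones that the data generators will later read from.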

Let’s move on to loading the libraries that we will need for this tutorial.

We will build our model by using the “model subclassing technique”. With this approach, we define our layers in __init__ and implement the model’s forward pass in call. The input to our model is an image of size \([150, 150, 3]\). After the convolutional layers, we use two fully connected layers to make predictions. This is a binary classification problem, so the output layer has only one neuron.
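A subclassed model of this kind might look like the sketch below. The class name `CatsVsDogsNet`, the number of convolutional layers, the filter counts, the size of the first fully connected layer, and the optimizer are all assumptions chosen for illustration; only the \([150, 150, 3]\) input, the two fully connected layers and the single output neuron come from the description above.

```python
import tensorflow as tf


class CatsVsDogsNet(tf.keras.Model):
    """Small CNN for binary classification of 150x150 RGB images."""

    def __init__(self):
        super().__init__()
        # Convolutional feature extractor (layer and filter counts are assumptions).
        self.conv1 = tf.keras.layers.Conv2D(32, 3, activation="relu")
        self.pool1 = tf.keras.layers.MaxPooling2D()
        self.conv2 = tf.keras.layers.Conv2D(64, 3, activation="relu")
        self.pool2 = tf.keras.layers.MaxPooling2D()
        self.flatten = tf.keras.layers.Flatten()
        # Two fully connected layers; the single sigmoid unit outputs a probability.
        self.fc1 = tf.keras.layers.Dense(128, activation="relu")
        self.fc2 = tf.keras.layers.Dense(1, activation="sigmoid")

    def call(self, inputs):
        # Forward pass: conv/pool blocks, then flatten, then the dense head.
        x = self.pool1(self.conv1(inputs))
        x = self.pool2(self.conv2(x))
        x = self.flatten(x)
        return self.fc2(self.fc1(x))


model = CatsVsDogsNet()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

With a sigmoid output, `binary_crossentropy` is the matching loss, and the prediction can be thresholded at 0.5 to pick one of the two classes.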

Our images are not stored in a single file but in several folders. To train the network on such a dataset, we need to use Image Data Generators. After creating two generators, one for training and one for validation, we can train the network with the fit method. The only difference is that we do not pass the inputs and outputs to the network separately, but pass a data generator instead.

It is good practice to normalize the pixel values to the range between 0 and 1 so that large input values do not disrupt or slow down the learning process. So, rescaling will be the only parameter passed to the Image Data Generator. Then, we will use these data generators to loop through the directories, resize the images and create batches.
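The two generators could be set up along these lines. This is a sketch, assuming the Keras `ImageDataGenerator` class and one subfolder per class inside each directory; the helper name `make_generators` and the batch size are assumptions. Note that recent TensorFlow releases mark `ImageDataGenerator` as deprecated, so treat this as an illustration of the approach used in this post rather than a recommendation for new code.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def make_generators(train_dir, val_dir, batch_size=20):
    """Create rescaling-only generators for the training and validation folders."""
    # The only preprocessing is rescaling pixel values from [0, 255] to [0, 1].
    train_datagen = ImageDataGenerator(rescale=1.0 / 255)
    val_datagen = ImageDataGenerator(rescale=1.0 / 255)

    # flow_from_directory walks the class subfolders, resizes every image
    # to 150x150 and yields (images, labels) batches.
    train_gen = train_datagen.flow_from_directory(
        str(train_dir),
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode="binary",
    )
    val_gen = val_datagen.flow_from_directory(
        str(val_dir),
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode="binary",
    )
    return train_gen, val_gen
```

The generators are then passed straight to training, for example `model.fit(train_gen, validation_data=val_gen, epochs=...)`, with the number of epochs chosen for the experiment.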

Let’s check the classification accuracy and the loss of our model.

Here, both accuracy and loss on the training set are shown in blue, while those on the validation set are shown in orange.
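Plots like these can be produced from the history returned by the fit method. The sketch below assumes Matplotlib and a history dictionary with the keys `accuracy`, `val_accuracy`, `loss` and `val_loss`, which is what Keras records when the model is compiled with `metrics=["accuracy"]`; the helper name `plot_history` is hypothetical.

```python
import matplotlib.pyplot as plt


def plot_history(history):
    """Plot training vs. validation accuracy and loss from a Keras history dict."""
    epochs = range(1, len(history["accuracy"]) + 1)
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    for ax, metric in zip(axes, ("accuracy", "loss")):
        # Blue for the training set, orange for the validation set.
        ax.plot(epochs, history[metric], color="tab:blue", label="training")
        ax.plot(epochs, history["val_" + metric], color="tab:orange", label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(metric)
        ax.legend()
    return fig
```

A typical call would be `plot_history(history.history)`, where `history` is the object returned by `model.fit`, followed by `plt.show()`.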

From these graphs we can clearly see that the model performs a lot better on the training set than on the validation set, which means that it is overfitting. So what can we do? Use data augmentation.

2. What is data augmentation?

Most computer vision tasks require lots of data, and data augmentation is one of the techniques used to improve the performance of computer vision systems. Computer vision is a pretty complicated task: for an input image, the algorithm has to find patterns in the pixels to understand what is in the picture.

In practice, having more data helps for almost all computer vision tasks. This may not be true for every application of neural networks, but it does seem to hold for the computer vision area.

When we are training a computer vision model, data augmentation will often help. This is true whether we are using transfer learning or training the model from scratch.

So, data augmentation is a technique that can significantly increase the diversity of the data available for training, without collecting new data. You can find more about the theoretical aspects of data augmentation here.

3. Training with data augmentation

At first glance, data augmentation may sound complex, but luckily, TensorFlow allows us to implement it efficiently.

So, we will use Image Data Generators as before, but in addition to rescaling we will add rotation, shifting, shearing, zooming and flipping. It is important to note that this process is applied only to the training set, not to the validation set.

Here, we not only rescale the images, but also add rotation (range in degrees), height and width shift (a fraction of the image size for values below 1, otherwise a number of pixels), shear range (angle in counter-clockwise direction in degrees), zoom range and flipping.
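An augmenting generator with these options might be configured as in the sketch below. The specific values (40 degrees, 0.2 for the shifts, shear and zoom, horizontal flipping, `nearest` fill mode) are assumptions chosen for illustration and are not necessarily the ones used to produce the results in this post.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation is configured on the training generator only (values are assumptions).
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # same normalization as before
    rotation_range=40,       # random rotations of up to 40 degrees
    width_shift_range=0.2,   # horizontal shift, as a fraction of the width
    height_shift_range=0.2,  # vertical shift, as a fraction of the height
    shear_range=0.2,         # shear intensity
    zoom_range=0.2,          # zoom in or out by up to 20%
    horizontal_flip=True,    # randomly mirror images left to right
    fill_mode="nearest",     # fill pixels exposed by a transform
)

# Validation images are only rescaled, never augmented.
val_datagen = ImageDataGenerator(rescale=1.0 / 255)
```

Because the transformations are drawn at random for every batch, the network sees a slightly different version of each training image in every epoch.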

In addition, a Dropout layer will be added to our neural network. During training, it will randomly set \(50\%\) of its input units to zero. Adding a Dropout layer helps to prevent overfitting.
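To see what a Dropout layer with a rate of 0.5 does, it can be applied to a tensor of ones. This is a small standalone sketch, not the network from this post; where exactly the layer sits in the model is not shown here.

```python
import tensorflow as tf

dropout = tf.keras.layers.Dropout(0.5)
x = tf.ones([1, 1000])

# During training, roughly half of the values are set to zero and the
# survivors are scaled by 1 / (1 - 0.5) = 2 to keep the expected sum unchanged.
train_out = dropout(x, training=True).numpy()

# At inference time the layer does nothing and passes its input through.
infer_out = dropout(x, training=False).numpy()

zero_fraction = float((train_out == 0).mean())
print(f"fraction of dropped units during training: {zero_fraction:.2f}")
```

Keras handles the `training` flag automatically inside `fit` and `predict`, so it only needs to be passed explicitly when calling the layer by hand as above.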

Now we can train the network.

Let’s see the results. Here, again, the accuracy and loss on the training set are shown in blue, while those on the validation set are shown in orange.

The results are much better now. However, the accuracy of our model is not perfect yet.

We will solve this problem in the next post using Transfer Learning.

4. Visualization

So far, we have only talked about how to create augmented images, so let’s see what they look like.

To do this, we take one image and pass it through the generator repeatedly in a loop. Each iteration draws from the generator and applies a new random augmentation. Below we show some samples of those images.

Figure: Augmented images of a dog
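Samples like these can be generated by wrapping a single image in a batch and drawing from the generator several times. This is a sketch under the assumption that the image is already loaded as a NumPy array of shape (height, width, 3); the helper name `augmented_samples` is hypothetical.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def augmented_samples(image, datagen, n=4):
    """Return n randomly augmented versions of one image of shape (H, W, 3)."""
    # The generator expects a batch dimension, so wrap the image in a batch of 1.
    batch = np.expand_dims(image, axis=0)
    flow = datagen.flow(batch, batch_size=1)
    # Each call to next() applies a fresh random transformation.
    return [next(flow)[0] for _ in range(n)]
```

Each returned array can then be displayed with Matplotlib, for example with `plt.imshow(sample)` inside a row of subplots.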


To summarize, we have learned how to use the data augmentation technique to improve a model’s performance. We can use this method in situations where data is scarce or data collection is expensive. However, be aware that we cannot augment a dataset to extremely large proportions, because the augmented images are all derived from the same originals. This method has its limits. In the next post, we are going to show how transfer learning can be applied.

More resources on the topic: