#002 PyTorch - Tensors - The main data structure

datahacker.rs Other 09.11.2019 | 0

Highlights: Welcome everyone! In this post we will cover the main data structure of PyTorch – tensors. Before we proceed with tensors, we will first give a quick overview of what PyTorch is and why it has been popular lately. Next, in the following posts we will use this knowledge to build interesting applications.

1. What is PyTorch?

PyTorch is an open-source Python framework released by the Facebook AI Research team. Its main purpose is the development of deep learning models. It is a Python-based scientific computing package with two main goals:

Offer NumPy-like functionality while harnessing the power of GPUs for acceleration.
Provide flexibility and speed during the implementation and development of deep neural network architectures.

PyTorch Community

The PyTorch community is growing in numbers on a daily basis. It has been largely used by many scientific researchers. With its rising popularity, more people are adopting PyTorch within their research labs to develop sophisticated deep learning models.

Figure: Google Trends search interest for PyTorch and TensorFlow. Reference: https://trends.google.com/trends/explore?q=pytorch,tensorflow

If you take a look at the GitHub repository, there are 47,490 contributors working on improvements to the existing PyTorch functionality and applications.

PyTorch is currently used by tech giants such as Facebook, Tesla, Uber, and Nvidia.

In the research community, PyTorch has mainly been used for neural networks/deep learning, computer vision, image recognition, and NLP. Since the release of version 1.0, it has helped researchers tackle four major difficulties:

Large-scale reworking
Reducing training time
Steady enlargement
Python Programming language stability.

Now, let’s see how we can create tensors. We will show how to do simple operations with them, and also we will show the link between tensors and NumPy arrays.

2. What are Tensors?

In linear algebra, a tensor is a generalization of vectors and matrices. For instance, a tensor with only one dimension is a vector, and a matrix is a tensor with two dimensions. We can also have a three-dimensional tensor. An example of a three-dimensional tensor is a color image, which can be represented in a computer as three channels: red, green, and blue matrices (an RGB image). This can continue to four-dimensional tensors and so on. However, we will mainly work with tensors of up to three dimensions.

So, let’s start by creating tensors in Python. First, we import the torch library and create 1D, 2D, and 3D tensors. Because we write the values as floating-point numbers (20. instead of 20), these tensors are of float type, which is recommended and will be used throughout these posts.

Creating tensors in PyTorch

# Import necessary library
import torch

# Create a 1D tensor (vector) of size 5
vector_data = [20., 40., 60., 80., 100.]
vector = torch.tensor(vector_data)
print(vector, end='\n\n')

# Create a 2D Tensor of size 2x3
Matrix_data = [[20., 40., 60.], [80., 100., 120.]]
Matrix = torch.tensor(Matrix_data)
print(Matrix, end='\n\n')

# Create a 3D tensor of size 3x2x2.
Tensor_data = [[[20., 40.], [60., 80.]],
          [[100., 120.], [140., 160.]],
          [[180., 200.], [220., 240.]]]
Tensor = torch.tensor(Tensor_data)
print(Tensor, end='\n\n')

Output:
tensor([ 20.,  40.,  60.,  80., 100.])

tensor([[ 20.,  40.,  60.],
        [ 80., 100., 120.]])

tensor([[[ 20.,  40.],
         [ 60.,  80.]],

        [[100., 120.],
         [140., 160.]],

        [[180., 200.],
         [220., 240.]]])
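
To connect the idea of dimensions with the tensors created above, here is a minimal sketch that inspects the number of dimensions, the shape, and the data type of each one. The lowercase variable names are chosen only for this illustration, and the float32 result assumes that PyTorch's default dtype has not been changed.

```python
import torch

vector = torch.tensor([20., 40., 60., 80., 100.])
matrix = torch.tensor([[20., 40., 60.], [80., 100., 120.]])
tensor3d = torch.tensor([[[20., 40.], [60., 80.]],
                         [[100., 120.], [140., 160.]],
                         [[180., 200.], [220., 240.]]])

# ndim is the number of dimensions, shape is the size along each dimension
for name, t in [("vector", vector), ("matrix", matrix), ("tensor3d", tensor3d)]:
    print(name, t.ndim, tuple(t.shape), t.dtype)

# The dtype is inferred from the data: Python integers give an integer tensor
int_vector = torch.tensor([20, 40, 60])
print(int_vector.dtype)  # torch.int64
```

Writing the values with a trailing dot (20. instead of 20) is what makes the tensors above floating-point tensors.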

Within PyTorch we can also perform indexing. Let’s see how we can access individual elements within a vector, matrix and tensor.

Indexing using PyTorch

# Indexing into vector and get a scalar
print(vector[0])

# Get a Python number from it
print(vector[0].item())

# Indexing into Matrix and get a vector
print(Matrix[0])

# Indexing into Tensor and get a matrix
print(Tensor[0])

Output:
tensor(20.) 
20.0 
tensor([20., 40., 60.]) 
tensor([[20., 40.], 
        [60., 80.]])
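
Beyond picking out a single row, tensors also support NumPy-style slicing. The following is a small sketch of a few common patterns; the variable names are hypothetical and used only for this illustration.

```python
import torch

matrix = torch.tensor([[20., 40., 60.], [80., 100., 120.]])

# Index with (row, column) to get a single element as a 0-dimensional tensor
element = matrix[1, 2]
print(element.item())         # 120.0

# A colon selects everything along that dimension: here, the first column
first_column = matrix[:, 0]
print(first_column.tolist())  # [20.0, 80.0]

# Negative indices count from the end: last two entries of the first row
last_two = matrix[0, -2:]
print(last_two.tolist())      # [40.0, 60.0]
```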

In addition, we can generate a tensor filled with random numbers. In particular, we can sample these numbers from a standard normal (Gaussian) distribution with zero mean and unit variance. To accomplish this we will use the function torch.randn(). Now, let’s see how we can generate a matrix of size $4\times4 $ whose elements are random numbers.

Generating Data

z = torch.randn((4, 4))
print(z)

Output: 
tensor([[ 0.2887, 0.7928, 0.4076, -1.1513], 
        [ 1.1598, -0.0620, 0.3933, -1.1523], 
        [ 1.8673, -0.7979, 1.7806, -0.5694],
        [-2.6755, -0.2938, 0.1893, -0.6120]])
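
Since the values are random, running the code above will print different numbers every time. A minimal sketch of how the results could be made reproducible, assuming we fix the seed with torch.manual_seed() (the seed value 0 is arbitrary):

```python
import torch

# Fixing the seed makes the sequence of random numbers repeatable
torch.manual_seed(0)
a = torch.randn(4, 4)

torch.manual_seed(0)
b = torch.randn(4, 4)

print(torch.equal(a, b))  # True: the same seed gives the same samples

# For comparison, torch.rand() samples uniformly from the interval [0, 1)
u = torch.rand(4, 4)
print(u.min().item() >= 0.0, u.max().item() < 1.0)  # True True
```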

We can also perform basic mathematical operations on tensors. Let’s take a look at some examples: addition, subtraction, multiplication, and division. Just as in NumPy, these operations are performed element by element.

Operations with Tensors in PyTorch

x = torch.tensor([4., 2., 6.])
y = torch.tensor([3., 19., 12.])

# Addition
torch.add(x, y)

Output: tensor([ 7., 21., 18.])

# Subtraction
torch.sub(x, y)

Output: tensor([ 1., -17., -6.])

# Multiplication
torch.mul(x, y)

Output: tensor([12., 38., 72.])

# Division
torch.div(x, y)

Output: tensor([1.3333, 0.1053, 0.5000])
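
The same element-wise operations can also be written with the usual Python operators. The sketch below checks that both forms give identical results for the tensors used above.

```python
import torch

x = torch.tensor([4., 2., 6.])
y = torch.tensor([3., 19., 12.])

# The arithmetic operators are shorthand for the torch functions
print(torch.equal(x + y, torch.add(x, y)))  # True
print(torch.equal(x - y, torch.sub(x, y)))  # True
print(torch.equal(x * y, torch.mul(x, y)))  # True
print(torch.equal(x / y, torch.div(x, y)))  # True

# A Python number is applied to every element
doubled = x * 2
print(doubled.tolist())  # [8.0, 4.0, 12.0]
```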

We will proceed by showing how to concatenate tensors. This can be done in a row-wise or a column-wise manner. Note that the tensors must have the same size in every dimension except the one we concatenate along; otherwise PyTorch will raise an exception. To choose between row-wise and column-wise concatenation in the function torch.cat(), we pass the argument dim=0 or dim=1. PyTorch also accepts the NumPy-style name axis, which is what we use below.

# Concatenate row-wise (rows are stacked on top of each other); axis=0 is the default.
a_1 = torch.randn(2, 2)
b_1 = torch.randn(2, 2)
c_1 = torch.cat([a_1, b_1], axis=0)
print(c_1)

Output:
tensor([[ 1.8691, -0.2951], 
        [ 0.1459, 0.5055], 
        [-0.7777, 0.5227], 
        [-0.5544, 0.5988]])

# Concatenate column-wise (tensors are placed side by side), axis=1.
a_2 = torch.randn(2, 3)
b_2 = torch.randn(2, 1)

c_2 = torch.cat([a_2, b_2], axis=1)
print(c_2)

Output:
tensor([[-0.0278, -1.0007, -1.2034, 0.0247], 
        [ 1.1118, 0.5623, -0.6332, 0.0814]])

# a_2 has 3 columns and b_2 has 1, so concatenating them along dimension 0 raises an exception.
torch.cat([a_2, b_2])

Output:
RuntimeError: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1

How can we overcome this error? With the .view() method that PyTorch offers, we can reshape a tensor so that its dimensions match. Reshaping is crucial in general, because many neural networks expect their input to have a certain shape.

Reshaping Tensors

x = torch.randn(6, 2)
print(x)

Output:
tensor([[-0.1935,  0.0053],
        [-0.1847, -0.0165],
        [-2.5749,  0.6514],
        [ 0.8537, -2.1573],
        [ 0.5601, -0.4538],
        [-0.8082, -1.6591]])

print(x.view(3, 4))  # Reshape to 3 rows, 4 columns

Output:
tensor([[-0.1935,  0.0053, -0.1847, -0.0165], 
        [-2.5749,  0.6514,  0.8537, -2.1573], 
        [ 0.5601, -0.4538, -0.8082, -1.6591]])

# Same as above. If one of the dimensions is -1, its size can be deduced
print(x.view(3, -1))

Output:
tensor([[-0.1935,  0.0053, -0.1847, -0.0165],
        [-2.5749,  0.6514,  0.8537, -2.1573],
        [ 0.5601, -0.4538, -0.8082, -1.6591]])
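
Two details about .view() are worth a short sketch: the returned tensor shares its data with the original, and reshaping can be used to make two tensors compatible for torch.cat(). The example uses torch.arange() so that the values are easy to follow. Flattening to a single column is just one possible way to resolve the shape mismatch from the concatenation example, chosen here for illustration.

```python
import torch

x = torch.arange(12.)        # 1D float tensor with the values 0., 1., ..., 11.
m = x.view(3, 4)             # 3 rows, 4 columns
print(m.shape)               # torch.Size([3, 4])
print(x.view(-1, 6).shape)   # torch.Size([2, 6]); the -1 is inferred as 12 / 6

# A view does not copy the data: changing m also changes x
m[0, 0] = 100.
print(x[0].item())           # 100.0

# Tensors of shape (2, 3) and (2, 1) cannot be concatenated along dim 0,
# but after flattening the first one to a single column they can
a = torch.randn(2, 3)
b = torch.randn(2, 1)
stacked = torch.cat([a.view(-1, 1), b], dim=0)
print(stacked.shape)         # torch.Size([8, 1])
```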

NumPy to Torch and back

PyTorch is very flexible and allows us to convert between NumPy arrays and Torch tensors. We can create a tensor from a NumPy array using the function torch.from_numpy(). In addition, we can go back from a tensor to a NumPy array using the .numpy() method.

import numpy as np # Numerical Python

X = np.random.randn(4, 4) # Generate random numbers 
print(X)
print(type(X)) # A numpy array

Output:
[[-0.46981217  0.9257731  -0.25390769 -0.8545052 ]
 [-0.64592634 -0.53178916  0.85764578  1.78075961] 
 [ 1.40394094  0.63343998 -0.53284549  0.80508047] 
 [-0.57380618 -0.02222501 -1.70550022  1.42014203]] 
<class 'numpy.ndarray'>

Y = torch.from_numpy(X) # Create a tensor from NumPy array
print(Y)
print(type(Y)) # A tensor

Output:
tensor([[-0.4698,  0.9258, -0.2539, -0.8545],
        [-0.6459, -0.5318,  0.8576,  1.7808],
        [ 1.4039,  0.6334, -0.5328,  0.8051],
        [-0.5738, -0.0222, -1.7055,  1.4201]], dtype=torch.float64)
<class 'torch.Tensor'>

print(Y.numpy()) # Convert the tensor into NumPy array
print(type(Y.numpy()))

Output:
[[-0.46981217 0.9257731 -0.25390769 -0.8545052 ]
 [-0.64592634 -0.53178916 0.85764578 1.78075961] 
 [ 1.40394094 0.63343998 -0.53284549 0.80508047] 
 [-0.57380618 -0.02222501 -1.70550022 1.42014203]] 
<class 'numpy.ndarray'>
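
One thing to keep in mind is that torch.from_numpy() does not copy the data: the tensor and the array share the same memory, and the tensor keeps NumPy's float64 type. The sketch below illustrates this with a small array; the variable names are hypothetical.

```python
import numpy as np
import torch

arr = np.zeros(3)             # NumPy defaults to float64
t = torch.from_numpy(arr)     # shares memory with arr
copy = torch.tensor(arr)      # torch.tensor() makes an independent copy

arr[0] = 5.0                  # modify the NumPy array in place
print(t[0].item())            # 5.0: the change is visible through the tensor
print(copy[0].item())         # 0.0: the copy is unaffected

# Convert to float32, the float type used in the earlier examples
t32 = t.float()
print(t.dtype, t32.dtype)     # torch.float64 torch.float32
```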

Computation Graphs and Automatic Differentiation – Autograd

An important concept that we need to understand is how to calculate gradients, which are essential for optimizing our model. PyTorch provides a module for this called Autograd (torch.autograd). This is very helpful because we don’t need to derive complex mathematical expressions by hand. In only a few lines of code, Autograd will automatically calculate the gradients of complicated formulas for us. Let’s take an example.

First, we will create a tensor $x $ using the function torch.randn(). In our example this is a tensor with a single random value. Our goal is to calculate the gradient of some function with respect to the tensor $x $. To do that, we need to specify the argument requires_grad=True. This means that PyTorch will keep track of the operations on this tensor so that it can later calculate the gradient. If you create a tensor and don’t want to calculate its gradient, you don’t need to do anything, because by default requires_grad=False. Next, when we start to perform operations with this tensor, PyTorch will build a computational graph.

In the graph for our example, the inputs are $x $ and the constant 6. First, we do a forward pass and calculate the output $y $. Since we specified that $x $ requires the gradient, PyTorch automatically attaches to $y $ a function that will be used in back-propagation. It is stored in the attribute grad_fn of $y $. In our case this is the gradient function PowBackward, which will calculate the gradients in the backward pass.

Note: To turn gradient tracking off or back on globally, we can use the command torch.set_grad_enabled(False) or torch.set_grad_enabled(True).
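
As a small illustration of the note above, the following sketch disables gradient tracking only inside a block, using torch.set_grad_enabled() and torch.no_grad() as context managers. The tensor names are hypothetical and independent of the example that follows.

```python
import torch

w = torch.randn(1, requires_grad=True)

# With tracking enabled (the default), the result remembers how it was computed
tracked = (6 * w) ** 2
print(tracked.requires_grad)    # True

# Inside this block no computational graph is built
with torch.set_grad_enabled(False):
    untracked = (6 * w) ** 2
print(untracked.requires_grad)  # False
print(untracked.grad_fn)        # None

# torch.no_grad() has the same effect as set_grad_enabled(False)
with torch.no_grad():
    also_untracked = (6 * w) ** 2
print(also_untracked.requires_grad)  # False
```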

x = torch.randn(1,requires_grad=True)
print(x)

Output:
tensor([0.5377], requires_grad=True)

y =(6 * x) ** 2
print(y)

Output:
tensor([10.4097], grad_fn=<PowBackward0>)

# Print attribute grad_fn
print(y.grad_fn)

Output:
<PowBackward0 object at 0x7ff0aca31080>

Printing x.grad at this point returns None, since we have only done the forward pass and haven’t calculated the gradients yet.

print(x.grad)

Output: None

The gradients with respect to our tensor $x $ are computed by calling y.backward(). This does a backward pass through the operations that created $y $.

$$ f^{\prime}(x)=\frac{\partial}{\partial x}(6 x)^{2}=\frac{\partial}{\partial x} 36 x^{2}=(2 \times 36) x^{2-1}=72 x=72 \times 0.5377 \approx 38.71 $$

The value that PyTorch stores in x.grad after the backward pass matches this hand-computed derivative.

y.backward()
print(x.grad)

Output:
tensor([38.7169])
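
To check the result against the formula $72x $, here is a minimal sketch that fixes $x $ to the rounded value 0.5377 from the output above instead of sampling it. It also shows, as an aside, that gradients accumulate in x.grad across backward passes unless they are reset.

```python
import torch

# Use the rounded value printed earlier so the example is reproducible
x = torch.tensor([0.5377], requires_grad=True)
y = (6 * x) ** 2
y.backward()

# Analytic derivative of (6x)^2 = 36x^2 is 72x
analytic = 72 * x.detach()
print(x.grad, analytic)
print(torch.allclose(x.grad, analytic))      # True

# A second backward pass adds to the stored gradient rather than replacing it
y2 = (6 * x) ** 2
y2.backward()
print(torch.allclose(x.grad, 2 * analytic))  # True

# Reset the gradient before reusing x in another computation
x.grad.zero_()
print(x.grad.item())                         # 0.0
```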

Summary

In conclusion, we have gone through the main PyTorch data structure, the tensor. We have learned who developed PyTorch and why it was designed. We also explored what tensors are, where we can use them, and how to perform some basic operations with them. Finally, we explained what the Autograd module is and how it provides automatic differentiation on tensors. In the next post, we will see how to implement a single-layer neural network and perform binary classification.

Here you can download code from our GitHub repo.
