#003 Linear Algebra – Linear transformations and matrices
Highlight: Hello and welcome back! This post will be quite an interesting one. We will show how a 2D plane can be transformed into another one. Understanding these concepts is a crucial step towards some more advanced linear algebra/machine learning methods (e.g. SVD, PCA). So, let’s proceed and learn how matrix-vector multiplication is connected to a linear transformation.
Tutorial Overview:
1. Linear transformation
2. Linear transformation and basis vectors
3. A 2×2 Matrix as a linear transformation
1. Linear transformation
In this post we will introduce a linear transformation. A linear transformation can be seen as a simple function. Usually, a function takes a scalar value as its input, but much less often we encounter functions whose input is a vector. A linear transformation is exactly that: a function that maps an input vector to an output vector. In the examples in this post, the input and output vectors have the same length. One of the most famous examples of a linear transformation is the Discrete Fourier Transform. This transformation takes as an input a sequence of \(N \) signal samples, and these samples are mapped with the Fourier transform into a sequence of another \(N \) samples. These new samples are complex numbers in the Fourier domain. With complex numbers we can capture both the amplitude and the phase (time-shift) of our original input signal in the frequency domain.
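As a quick illustration of this idea, here is a minimal NumPy sketch (the sine signal is just an illustrative choice) that maps \(N \) real samples to \(N \) complex Fourier coefficients, from which we can read off amplitude and phase:

```python
import numpy as np

N = 8
n = np.arange(N)
signal = np.sin(2 * np.pi * n / N)   # N real input samples (a vector)

spectrum = np.fft.fft(signal)        # N complex output samples in the Fourier domain

print(signal.shape, spectrum.shape)  # (8,) (8,) -> input and output have the same length
print(np.abs(spectrum))              # amplitudes in the frequency domain
print(np.angle(spectrum))            # phases (time-shifts)
```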

Ok, to make things simple again, let us forget about complex numbers and observe again our 2D coordinate system. We recall that a single vector in a 2D plane is represented by a pair (x, y). If we map this vector to another one, we say that this is a transformation. Recall that sometimes we refer to a vector as a movement. Then, with a linear transformation we are moving that vector in our plane to obtain the output vector. Therefore, vectors can be seen as displacement vectors, and by transforming them we are actually moving them in some particular way.
The word “transformation” suggests an association with movement.
In this basic example we will have an input vector that is transformed into an output vector. To be more precise, we will rotate our input vector by 90 degrees clockwise.
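As a small sketch of this step (the input vector below is an illustrative choice, not necessarily the one from the figure), rotating by 90 degrees clockwise sends \(\left ( x,y \right ) \) to \(\left ( y,-x \right ) \):

```python
import numpy as np

# Rotation by 90 degrees clockwise: (x, y) maps to (y, -x).
R_cw = np.array([[ 0, 1],
                 [-1, 0]])

v_in = np.array([1, 2])   # an illustrative input vector
v_out = R_cw @ v_in
print(v_out)              # [ 2 -1]
```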

Moreover, this same transform can be applied not only to a single vector, but to a whole set of vectors. So, basically, let’s say that we want to transform the whole plane and to see where the vectors from that plane will be mapped. One way to visualize this is to represent vectors not as displacement arrows, but as points (positions). Then, we can map each of these points and observe where they land after the transformation. This gives us an idea of what our transformation actually looks like.
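A minimal sketch of this idea in NumPy (the grid spacing and the matrix are illustrative choices): we store many points of the plane as columns of a single array and transform them all at once.

```python
import numpy as np

A = np.array([[ 0, 1],
              [-1, 0]])                        # the 90-degree clockwise rotation from above

# A small grid of points in the plane, stored as columns of a 2 x 25 array.
xs, ys = np.meshgrid(np.linspace(-2, 2, 5), np.linspace(-2, 2, 5))
points = np.vstack([xs.ravel(), ys.ravel()])

transformed = A @ points                       # every point lands at its new position
print(points.shape, transformed.shape)         # (2, 25) (2, 25)
```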
2. Linear transformation and basis vectors
There are three properties that a transformation has to satisfy so that we are allowed to call it a linear transformation. First, a line should remain a line once we transform our coordinate system. Second, the origin should remain fixed in place. Third, the grid lines should remain parallel and evenly spaced. If we apply something more sophisticated, like a neural network, we obtain a nonlinear transformation, and the grid lines become curved and deformed. A small numerical check of this difference is sketched below.
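To make this difference concrete, here is a small numerical check (the matrices and vectors are randomly chosen, purely for illustration): a linear map sends a sum of vectors to the sum of their images, while a simple nonlinear layer such as \(\tanh \) does not, and this is exactly what bends the grid lines.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))        # a linear transformation
W = rng.normal(size=(2, 2))        # weights of a tiny one-layer "network" with tanh
u = rng.normal(size=2)
v = rng.normal(size=2)

# A linear transformation maps the sum of two vectors to the sum of their images...
print(np.allclose(A @ (u + v), A @ u + A @ v))                              # True

# ...while the nonlinear layer does not (in general), so the grid gets bent.
print(np.allclose(np.tanh(W @ (u + v)), np.tanh(W @ u) + np.tanh(W @ v)))   # False
```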

Suppose the \(2 \)-vector (or \(3 \)-vector) \(v \) represents a position in 2-D (or 3-D) space. Several important geometric transformations or mappings from point to point can be expressed as a matrix-vector product \(w= Av \), with \(A \) being a \(2\times 2 \) (or \(3\times 3 \)) matrix. In the examples below, we consider the mapping from \(v \) to \(w \), and focus on the 2-D case (for which some of the matrices are simpler to describe).
Scaling
Scaling is the mapping \(w = av \), where \(a \) is a scalar. This can be expressed as \(w = Av \) with \(A = aI \). This mapping stretches a vector by the factor \(\left | a \right | \) (or shrinks it when \(\left | a \right |< 1 \)), and it flips the vector (reverses its direction) if \(a< 0 \).
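For example (the scaling factor below is an arbitrary illustrative choice):

```python
import numpy as np

a = -0.5                  # |a| < 1 shrinks the vector, a < 0 also flips it
A = a * np.eye(2)         # A = aI
v = np.array([2.0, 4.0])

print(A @ v)              # [-1. -2.]
```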
Dilation
Dilation is the mapping \(w = Dv \), where \(D \) is a diagonal matrix, \(D= diag\left ( d_{1},d_{2} \right ) \). This mapping stretches the vector \(v \) by different factors along the two different axes. (Or shrinks it, if \(\left | d_{i} \right |< 1 \), and flips it, if \(d_{i}< 0 \).)
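For example (the diagonal entries are illustrative):

```python
import numpy as np

D = np.diag([2.0, 0.5])   # D = diag(d1, d2)
v = np.array([1.0, 1.0])

print(D @ v)              # [2.  0.5] -> stretched along x, shrunk along y
```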
Rotation
Suppose that \(w \) is the vector obtained by rotating \(v \) by \(\theta \) radians counterclockwise. Then we have
$$ w= \begin{bmatrix}\cos \theta & -\sin\theta \\ \sin\theta& \cos \theta \end{bmatrix}v $$
This matrix is called (for obvious reasons) a rotation matrix.
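As a quick check of this formula, rotating the basis vector \(\left ( 1,0 \right ) \) by \(90^{\circ} \) counterclockwise should land it on \(\left ( 0,1 \right ) \):

```python
import numpy as np

theta = np.deg2rad(90)    # 90 degrees counterclockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])
print(np.round(R @ v, 6))  # [0. 1.]
```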
Reflection
Suppose that \(w \) is the vector obtained by reflecting \(v \) through the line that passes through the origin, inclined \(\theta \) radians with respect to horizontal. Then we have
$$ w= \begin{bmatrix}\cos \left ( 2\theta \right ) & \sin\left ( 2\theta \right ) \\ \sin\left ( 2\theta \right )& -\cos \left ( 2\theta \right ) \end{bmatrix}v $$
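As a quick check, reflecting through the line inclined at \(45^{\circ} \) (the line \(y=x \)) should simply swap the two coordinates:

```python
import numpy as np

theta = np.deg2rad(45)    # the reflection line y = x
F = np.array([[np.cos(2 * theta),  np.sin(2 * theta)],
              [np.sin(2 * theta), -np.cos(2 * theta)]])

v = np.array([1.0, 0.0])
print(np.round(F @ v, 6))  # [0. 1.] -> the coordinates are swapped
```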
So, another way we can think about a transformation is to imagine again our 2-D plane and the vectors that sit in it. We will have an input vector with two coordinates, \(x_{in} \) and \(y_{in} \).

We have to transform it somehow to get \(x_{out} \) and \(y_{out} \). A very interesting thing is that we can observe what happens to our basis vectors, which are denoted here with \(\hat{i} \) and \(\hat{j} \).
Why is this so important? Well, let us recall that any vector is a linear combination of these two basis vectors. Following this concept, we can express the coordinates of any vector. For instance,

So, this is an example where we have the vector \(\begin{bmatrix}-2\\4\end{bmatrix} \), obtained by multiplying \(\hat{i} \) by \(-2 \) and \(\hat{j} \) by \(4 \). The vectors \(\hat{i} \) and \(\hat{j} \) are called basis vectors and they have length \(1 \). The idea is to observe how each of these two vectors is transformed by our linear transformation, which gives us a transformed vector \(\hat{i} \) and a transformed vector \(\hat{j} \). This is illustrated in the image below.
So, we will observe what happens to our original vector \(\left ( -2,4 \right ) \), and alongside it we will observe where our basis vectors are mapped for some predefined linear transformation. We will examine the transformation that maps the coordinates as illustrated in the image below. First of all, we can see that the transformed basis vectors are \(\left ( 1,-2 \right ) \) and \(\left ( 3,0 \right ) \). The transformed input vector \(\left ( -2,4 \right ) \) will be at the position \(\left ( 10,4 \right ) \).

So basically, we obtain the final vector as a linear combination of the transformed basis vectors: the vector \(\vec{v} \) is \(-2 \) times the transformed vector \(\hat{i} \) plus \(4 \) times the transformed vector \(\hat{j} \).
So, we will proceed further as follows. If we want to map an arbitrary input vector (x, y) with a linear transformation, the output vector keeps this same pair of numbers, but now they multiply the transformed basis vectors. To summarize, we started from the vector \(\begin{bmatrix}-2\\4\end{bmatrix} \) and, following this simple calculation, we obtained the vector \(\begin{bmatrix}10\\4\end{bmatrix} \). So, this is the vector to which our input vector is mapped.
This can also be expressed with the following equations and formulas:
$$ \begin{bmatrix}x_{out}\\ y_{out}\end{bmatrix}= x_{in}\begin{bmatrix}1\\ -2\end{bmatrix}+y_{in}\begin{bmatrix}3\\ 0\end{bmatrix}= \begin{bmatrix}1 & 3\\ -2 & 0\end{bmatrix}\begin{bmatrix}x_{in}\\ y_{in}\end{bmatrix} $$
$$ \begin{bmatrix}1 & 3\\ -2 & 0\end{bmatrix}\begin{bmatrix}-2\\ 4\end{bmatrix}= -2\begin{bmatrix}1\\ -2\end{bmatrix}+4\begin{bmatrix}3\\ 0\end{bmatrix}= \begin{bmatrix}10\\ 4\end{bmatrix} $$
3. A 2×2 Matrix as a linear transformation
Now, it’s interesting that the whole transformation is completely defined by the two transformed basis vectors: if we know them, we can map the whole 2-D plane. Each of these vectors is specified with just two numbers, and in this case they give \(\begin{bmatrix}1 & 3\\-2 & 0\end{bmatrix} \). That is, we put the two transformed basis vectors into a \(2\times 2 \) matrix in such a way that we stack them along the columns, and this \(2\times 2 \) matrix now represents the transformation itself and can be used for further vector processing.

Actually, matrix-vector multiplication is just scaling the two column vectors by the vector’s coordinates and then summing them, and that is what we get as the resulting output. This can be a more intuitive way to think about matrix-vector multiplication.
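Here is the same calculation in NumPy, written both as a matrix-vector product and as a sum of scaled columns; the two agree, which is the whole point:

```python
import numpy as np

# The columns of A are the transformed basis vectors i-hat and j-hat.
A = np.array([[ 1, 3],
              [-2, 0]])
v = np.array([-2, 4])

w_matrix  = A @ v                            # matrix-vector multiplication
w_columns = v[0] * A[:, 0] + v[1] * A[:, 1]  # scale the columns and sum them

print(w_matrix)    # [10  4]
print(w_columns)   # [10  4]
```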
However, we should add one more thing. If we have a matrix whose columns are linearly dependent, then the 2-D plane will be mapped onto a single line. In that case it is not possible to go back and perform the reconstruction. A small numerical check of this is sketched below, and with that, that’s all for this post!
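As a small check (the matrix below is an illustrative example whose second column is twice the first), such a matrix has rank 1 and zero determinant, so the mapping cannot be undone:

```python
import numpy as np

B = np.array([[1, 2],
              [3, 6]])              # linearly dependent columns

print(np.linalg.matrix_rank(B))     # 1 -> the whole plane is squashed onto a line
print(np.linalg.det(B))             # 0.0 (up to floating point) -> B is singular
# np.linalg.inv(B) would raise numpy.linalg.LinAlgError: Singular matrix
```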
Summary
This was quite an important post that revealed the link between a linear transformation and matrix-vector multiplication. We have learned that the basis vectors and their transformations play a crucial role. In addition, we saw that the transformed basis vectors form a matrix that we can use for linear mapping. In the next post we will talk about determinants.