#002 Linear Algebra – Linear combination of Vectors

Highlights: In this post we are going to continue our story about vectors. We will talk more about basis vectors, linear combinations of vectors, and the span of vectors. We provide code examples to demonstrate how to work with vectors in Python.

Tutorial Overview:

  1. Basis vectors \(\hat{i}\) and \(\hat{j}\)
  2. Different basis vectors
  3. Linear combination of vectors
  4. What is a span of vectors?
  5. Linearly independent and dependent vectors

1. Basis vectors \(\hat{i}\) and \(\hat{j}\)

Let’s talk about vectors in more detail. Vectors are described by pairs of numbers that we call coordinates. Below, we have two vectors with coordinates (2,3) and (-1,5).

Before we proceed, we will define two special vectors for a 2D coordinate system: \(\hat{i} = (1,0) \) and \(\hat{j} = (0,1)\). These two vectors are illustrated in the image below. They are the unit vectors of the \(x,y \)-coordinate system. The one pointing to the right with length \(1 \) is commonly called \(\hat{i} \), or the unit vector in the \(x \)-direction. The other one, pointing straight up with length \(1 \), is commonly called \(\hat{j} \), or the unit vector in the \(y \)-direction. Now, the coordinates of any vector can be seen as scalars that scale these unit vectors.

For instance, take the vector \((3,-2) \). Think of its \(x \)-coordinate as a scalar that scales \(\hat{i} \), stretching it by a factor of \(3 \), and its \(y \)-coordinate as a scalar that scales \(\hat{j} \), flipping it and stretching it by a factor of \(2 \). In this sense, the vector that these coordinates describe is the sum of two scaled vectors: \(3\hat{i} + (-2)\hat{j} \).

The vectors \(\hat{i} \) and \(\hat{j} \) have a special name. Together, they’re called the basis of a coordinate system. What this means is that when you think about coordinates as scalars, the basis vectors are what those scalars actually scale.
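The idea of coordinates as scalars on the basis vectors can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the example vector is (3, -2) as the scale factors above suggest; the variable names are our own.

```python
import numpy as np

# Standard basis vectors of the 2D coordinate system.
i_hat = np.array([1.0, 0.0])
j_hat = np.array([0.0, 1.0])

# The coordinates act as scalars on the basis vectors.
x, y = 3.0, -2.0
v = x * i_hat + y * j_hat

# v holds the same two numbers we started with: 3 and -2.
print(v)
```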

2. Different basis vectors

Now, you can ask yourself: is it possible to choose different basis vectors? Yes, it is! With different basis vectors we get a completely new coordinate system. If we choose two different vectors, for example two perpendicular vectors, we can scale each of them along its own direction and sum the results. Different pairs of scalars give different vectors, and as long as the two chosen vectors do not lie on the same line, we can reconstruct any vector in the plane this way.

3. Linear combination of vectors

Let’s think further about this concept. We have a vector represented in a 2D coordinate system as usual. Can it be represented with some other vectors? We can select two arbitrary vectors, scale them with different scalar values, and sum the two scaled vectors. With the right pair of scalar values, the sum reconstructs our original vector. In this new coordinate system we again have two coordinates (the scalar values). However, this pair of values will generally not be identical to the original pair from our xy coordinate system. Whenever we scale two vectors with scalars and then sum them, we get what is called a linear combination of the two vectors.

If we take two arbitrary vectors, scale them and add them, we will be able to reconstruct any vector in the plane (most of the time). However, if we are unfortunate and the two vectors are parallel, then no matter how we scale them, we won’t be able to reconstruct all of the vectors: every linear combination will end up on the line that the two vectors lie on.
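To make this concrete, here is a small sketch of how the two new scalar values could be computed. The vectors `b1`, `b2` and `v` are hypothetical examples, and using `np.linalg.solve` is just one possible way to find the scalars.

```python
import numpy as np

# Hypothetical pair of non-parallel vectors used as a new basis.
b1 = np.array([2.0, 1.0])
b2 = np.array([-1.0, 1.0])

# A vector given in the usual xy coordinates.
v = np.array([3.0, -2.0])

# The columns of B are the new basis vectors; solve B @ c = v for the scalars c.
B = np.column_stack([b1, b2])
c = np.linalg.solve(B, v)

# Scaling the new basis vectors by c and summing them rebuilds v.
reconstructed = c[0] * b1 + c[1] * b2

print(c)                               # coordinates of v in the new basis
print(np.allclose(reconstructed, v))   # True
```

For this choice the new coordinates are 1/3 and -7/3, which differ from the original pair (3, -2) even though both describe the same vector.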
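One simple way to detect the unfortunate parallel case in 2D is to look at the determinant of the two vectors. The helper `are_parallel` below is a hypothetical name, and the tolerance is an assumption suitable for small floating-point examples.

```python
import numpy as np

def are_parallel(v, w, tol=1e-12):
    """Return True if two 2D vectors lie on the same line through the origin."""
    # The 2x2 determinant v_x * w_y - v_y * w_x is zero exactly when v and w are parallel.
    return bool(abs(v[0] * w[1] - v[1] * w[0]) < tol)

v = np.array([1.0, 2.0])
w = np.array([-2.0, -4.0])   # w = -2 * v, so the pair is parallel
u = np.array([3.0, 1.0])     # not a multiple of v

print(are_parallel(v, w))  # True
print(are_parallel(v, u))  # False
```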

4. What is a span of vectors?

The “span” of two vectors \(\vec{v} \) and \(\vec{w} \) is the set of all of their linear combinations. It is written as follows:

$$ a\cdot \vec{v}+b\cdot \vec{w} $$

Here \(a \) and \(b \) can take any real value, and the “span” is the set of all the resulting vectors, each a linear combination of these two vectors. For most pairs of 2D vectors, the span is the whole plane. On the other hand, if the two vectors are lined up, that is, if one vector is a scaled version of the other, then the span is just the line on which the two vectors sit.

A span is sometimes difficult to visualize if we think of vectors as arrows. Instead, we can think about them as points, with each vector represented by the point at its tip. So, for our span example, the span of most pairs of vectors ends up being the entire infinite two-dimensional plane. If the vectors are on one line, then their span is just a line. The idea of a span gets a lot more interesting if we start thinking about vectors in three-dimensional space. For example, if you take two vectors in 3D space that are not pointing in the same direction, what does it mean to take their span? Well, their span is the collection of all possible linear combinations of those two vectors. This means that we first scale them with a pair of real numbers. Then we sum those two scaled vectors, and we do this for every possible pair of scalars.

One way to illustrate a span is to imagine knobs that we can turn, one per scalar. As we turn the knobs, the values of the scalars increase or decrease and we just follow the tip of the resulting vector (the red vector in the image above, which is a linear combination of the two). Two scenarios can happen. In the first and more common one, the span of the two vectors is a 2D plane in the 3D space (see the image below). The two starting vectors sit in that plane. On the other hand, if the two starting vectors are on the same line, their span is just that line in the 3D space.

And what will the span be if we add a third vector? There are two cases. First, the third vector may lie in the plane spanned by the original two, and in that case the span does not change. Second, the third vector may lie outside of this 2D plane. In that case we get a span that covers the complete 3D space. We can say that in that scenario the third vector “unlocks” the third dimension and we will be able to reconstruct any 3D vector.
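Whether a given vector belongs to a span can be checked numerically. The sketch below uses a hypothetical helper `in_span` built on least squares; this is one of several possible approaches, and the example vectors are our own.

```python
import numpy as np

def in_span(vectors, target):
    """Check whether target is a linear combination of the given vectors."""
    A = np.column_stack(vectors)
    # Least squares finds the best coefficients; target is in the span
    # exactly when those coefficients reproduce it.
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return bool(np.allclose(A @ coeffs, target))

v = np.array([1.0, 2.0])
w = np.array([3.0, -1.0])       # not parallel to v: the span is the whole plane
w_line = np.array([2.0, 4.0])   # 2 * v: the span collapses to a line

target = np.array([5.0, 0.0])
print(in_span([v, w], target))       # True
print(in_span([v, w_line], target))  # False
```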
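The effect of a third vector can be observed through the rank of the matrix whose columns are the vectors, since the rank equals the dimension of the span. The vectors below are hypothetical examples chosen for illustration.

```python
import numpy as np

v = np.array([1.0, 0.0, 1.0])
w = np.array([0.0, 1.0, 1.0])

in_plane = 2.0 * v - w                     # a linear combination of v and w
out_of_plane = np.array([0.0, 0.0, 1.0])   # not a combination of v and w

# The rank of the column matrix is the dimension of the span.
rank_two = np.linalg.matrix_rank(np.column_stack([v, w]))
rank_in = np.linalg.matrix_rank(np.column_stack([v, w, in_plane]))
rank_out = np.linalg.matrix_rank(np.column_stack([v, w, out_of_plane]))

print(rank_two, rank_in, rank_out)  # 2 2 3
```

The third vector only raises the rank from 2 to 3 when it lies outside the plane of the first two.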

5. Linearly independent and dependent vectors

If we can remove one vector from a set without shrinking the span, because it does not add any information (we can say that it is redundant), then we say that this vector (e.g. \(\vec{v} \)) and the rest of the set are “linearly dependent”. The vector can be removed because it is already a linear combination of the other vectors, that is, it lies in their span. For example, two 2D vectors that lie on the same line are linearly dependent. See the image below.

On the other hand, adding another vector to a set of vectors can enlarge the span. For instance, if we have only one vector \(\vec{v} \), and a vector \(\vec{w} \) does not lie along the same line as \(\vec{v} \), we say that they are “linearly independent”. The vector \(\vec{w} \) then contributes to the span.

For instance, in the image above, the red vector does not lie in the plane (span of blue and yellow vectors), so it adds to their span and we say that this vector is linearly independent of the remaining two.

Linear independence

A collection or list of \(n \) -vectors \(a_{1} \), . . . , \(a_{k} \) (with \(k\geq 1 \)) is called linearly dependent if

$$ \beta _{1}a_{1}+\cdots +\beta _{k}a_{k}= 0 $$

holds for some \(\beta _{1} \),…,  \(\beta _{k} \) that are not all zero. In other words, we can form the zero vector as a linear combination of the vectors, with coefficients that are not all zero.

When a collection of vectors is linearly dependent, at least one of the vectors can be expressed as a linear combination of the other vectors: If \(\beta _{i}\neq 0 \) in the equation above (and by definition, this must be true for at least one \(i \)), we can move the term \(\beta _{i}a_{i} \) to the other side of the equation and divide by \(\beta _{i} \) to get

$$ a_{i}= \left ( -\beta _{1}/\beta _{i} \right )a_{1}+\cdots +\left ( -\beta _{i-1}/\beta _{i} \right )a_{i-1}+\left ( -\beta _{i+1}/\beta _{i} \right )a_{i+1}+\cdots +\left ( -\beta _{k}/\beta _{i} \right )a_{k} $$

The converse is also true: If any vector in a collection of vectors is a linear combination of the other vectors, then the collection of vectors is linearly dependent.
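The rearrangement above translates directly into code. The function name `express_from_dependence` and the three example vectors are hypothetical; the sketch assumes we already know coefficients \(\beta \) that produce the zero vector.

```python
import numpy as np

def express_from_dependence(vectors, betas, i):
    """Rewrite vectors[i] as a combination of the others, given betas with betas[i] != 0."""
    if betas[i] == 0:
        raise ValueError("betas[i] must be nonzero")
    # The coefficient of a_j is -beta_j / beta_i for every j != i.
    coeffs = {j: -betas[j] / betas[i] for j in range(len(vectors)) if j != i}
    rebuilt = sum(c * vectors[j] for j, c in coeffs.items())
    return coeffs, rebuilt

# Hypothetical dependent collection: a1 + a2 - a3 = 0.
a = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
betas = [1.0, 1.0, -1.0]

coeffs, rebuilt = express_from_dependence(a, betas, i=2)
print(coeffs)                       # coefficients of a1 and a2
print(np.allclose(rebuilt, a[2]))   # True
```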

Linearly independent vectors

A collection of \(n \) -vectors \(a_{1} \), . . . , \(a_{k} \) (with \(k\geq 1 \)) is called linearly independent if it is not linearly dependent, which means that

$$ \beta _{1}a_{1}+\cdots +\beta _{k}a_{k}= 0 $$

only holds for \(\beta _{1}= \cdots = \beta _{k}= 0\). In other words, the only linear combination of the vectors that equals the zero vector is the linear combination with all coefficients zero.

The vectors

\( a_{1}= \begin{bmatrix}0.2\\-7.0\\8.6\end{bmatrix} \quad a_{2}= \begin{bmatrix}-0.1\\2.0\\-1.0\end{bmatrix} \quad a_{3}= \begin{bmatrix}0.0\\-1.0\\2.2\end{bmatrix} \)

are linearly dependent, since \(a_{1}+2a_{2}-3a_{3}= 0 \). We can express any of these vectors as a linear combination of the other two. For example, we have

$$ a_{2}= \left ( -1/2 \right )a_{1}+\left ( 3/2 \right )a_{3} $$
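Both relations can be checked in Python. The sketch below uses the standard-library `Fraction` type instead of floats so that decimal entries such as 0.2 are represented exactly; the helper `combine` is a hypothetical name.

```python
from fractions import Fraction as F

a1 = [F("0.2"), F("-7.0"), F("8.6")]
a2 = [F("-0.1"), F("2.0"), F("-1.0")]
a3 = [F("0.0"), F("-1.0"), F("2.2")]

def combine(coeffs, vectors):
    """Entry-wise linear combination of equally sized vectors."""
    size = len(vectors[0])
    return [sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(size)]

# a1 + 2*a2 - 3*a3 should be the zero vector.
zero = combine([1, 2, -3], [a1, a2, a3])

# a2 should equal (-1/2)*a1 + (3/2)*a3.
a2_rebuilt = combine([F(-1, 2), F(3, 2)], [a1, a3])

print(zero == [0, 0, 0])   # True
print(a2_rebuilt == a2)    # True
```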

The vectors

\( a_{1}= \begin{bmatrix}1\\0\\0\end{bmatrix} \quad a_{2}= \begin{bmatrix}0\\-1\\1\end{bmatrix} \quad a_{3}= \begin{bmatrix}-1\\1\\1\end{bmatrix} \)

are linearly independent. To see this, suppose \(\beta _{1}a_{1}+\beta _{2}a_{2}+\beta _{3}a_{3}= 0 \). This means that

$$ \beta _{1}-\beta _{3}= 0 $$

$$ -\beta _{2}+\beta _{3}= 0 $$

$$ \beta _{2}+\beta _{3}= 0 $$

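However, this is only possible if all of the values are equal to zero: adding the last two equations gives \(2\beta _{3}= 0 \), so \(\beta _{3}= 0 \); the second equation then gives \(\beta _{2}= 0 \), and the first gives \(\beta _{1}= 0 \).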
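The same conclusion can be reached numerically. As a rough sketch, we place the three vectors in the columns of a matrix `A` (our own name); a nonzero determinant means that the zero vector can only be produced with all coefficients equal to zero.

```python
import numpy as np

a1 = np.array([1.0, 0.0, 0.0])
a2 = np.array([0.0, -1.0, 1.0])
a3 = np.array([-1.0, 1.0, 1.0])

# Each row of A corresponds to one of the three scalar equations
# in beta_1, beta_2, beta_3.
A = np.column_stack([a1, a2, a3])

# A nonzero determinant means A @ betas = 0 has only the trivial solution.
det = np.linalg.det(A)
betas = np.linalg.solve(A, np.zeros(3))

print(det)    # approximately -2
print(betas)  # all zeros
```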

Independence-dimension inequality

If the \(n \) -vectors \(a_{1} \), . . . , \(a_{k} \) are linearly independent, then \(k\leq n \). In words:

A linearly independent collection of \(n \)-vectors can have at most \(n \) elements.
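A quick numerical illustration of this inequality, not a proof: any three 2-vectors form a matrix whose rank is at most 2, so they cannot be linearly independent. The random vectors and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 2, 3
# The columns of this n-by-k matrix are k random n-vectors.
vectors = rng.normal(size=(n, k))

# The rank counts the largest number of linearly independent columns.
rank = np.linalg.matrix_rank(vectors)

print(rank <= n)  # True: at most n independent vectors
print(rank < k)   # True: the k = 3 vectors must be dependent
```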


With the previous definitions we can now reflect back on the basis vectors. A collection of \(n \) linearly independent \(n \) -vectors (i.e., a collection of linearly independent vectors of the maximum possible size) is called a basis. If the \(n \) -vectors \(a_{1} \), . . . , \(a_{n} \) form a basis, then any \(n \)-vector \(b \) can be written as a linear combination of them. Moreover, this representation is unique: there is exactly one choice of coefficients that produces \(b \).
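As an illustration, the three linearly independent vectors from the earlier example form a basis of 3D space, so any 3-vector has exactly one set of coefficients with respect to them. The target vector `b` below is a hypothetical choice.

```python
import numpy as np

# The basis vectors are the columns of A.
a1 = np.array([1.0, 0.0, 0.0])
a2 = np.array([0.0, -1.0, 1.0])
a3 = np.array([-1.0, 1.0, 1.0])
A = np.column_stack([a1, a2, a3])

# Hypothetical vector to expand in this basis.
b = np.array([2.0, 1.0, 3.0])

# Because the columns are linearly independent, the solution is unique.
coeffs = np.linalg.solve(A, b)
rebuilt = coeffs[0] * a1 + coeffs[1] * a2 + coeffs[2] * a3

print(coeffs)                   # coefficients 4, 1, 2
print(np.allclose(rebuilt, b))  # True
```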


This post introduced some very important concepts of vectors. In many areas of machine learning we will be searching for features and we will be examining their independence. Also, a good understanding of the concept of a basis allows us to transform our data from one domain to another. Then, our data processing can be more efficient. In the next post, we will learn what linear transformations are and why matrices are so important.