## CamCal 009 Stereo Geometry – Epipolar lines and Essential Matrix

*Highlights*: In this post we continue our discussion of stereo geometry. We will show the relation between two calibrated cameras. The goal is to turn the geometry into algebra, so that a computer can work with it.

### Tutorial Overview:

*This post covers the following topics:*

- Intro
- Stereo Correspondence
- Stereo Geometry
- Cross Product 1
- From Geometry to Algebra
- Cross Product 2
- Essential Matrix

## 1. Intro

Before we start, we need to recall a few things. The first is the projective transformation. These are \(3\times 3 \) matrices that operate on two-dimensional homogeneous coordinates and transform an image in a variety of ways.

The second is the homography. A homography is a mapping between images (image planes) taken from the same center of projection. It is also the mapping between any two images of a planar surface. In other words, a homography is a perspective transformation of a plane: a reprojection of a plane from one camera view into a different camera view, subject to a change in the translation and rotation of the camera. The full transformation is referred to as a general projective transformation, or a homography.

In addition, it is important to remember what a line in an image plane is. It is defined by the intersection of the image plane and a plane through the center of projection. That plane is defined by its normal \(l \), and that is why both a line and a point are 3-vectors in this homogeneous (projective) space.

Equivalently, the line is represented by the normal \(l \) to that plane, and a homogeneous point \(p \) lies on the line exactly when \(l^{T}p = 0 \).

We also saw that, given two points in projective coordinates, we can find the line that connects them by taking their cross product.

Likewise, given two lines, we can find their point of intersection by taking their cross product. These are the tools that we are going to use.
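As a quick illustration of these two cross-product tools, here is a minimal NumPy sketch. The function names `line_through` and `intersection` and the sample points are made up for this example; they are not from any library.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line joining two homogeneous image points."""
    return np.cross(p, q)

def intersection(l, m):
    """Homogeneous intersection point of two homogeneous lines."""
    return np.cross(l, m)

# Two points (x, y, 1) that both lie on the line y = x.
p = np.array([0.0, 0.0, 1.0])
q = np.array([1.0, 1.0, 1.0])
l = line_through(p, q)          # (-1, 1, 0), i.e. -x + y = 0

# The horizontal line y = 2, written as 0*x + 1*y - 2 = 0.
m = np.array([0.0, 1.0, -2.0])

x = intersection(l, m)          # homogeneous intersection point
x_euclid = x[:2] / x[2]         # back to ordinary image coordinates: (2, 2)

# A point lies on a line exactly when their dot product is zero.
print(l @ p, l @ q, x_euclid)
```

Note that the same operation, `np.cross`, serves both purposes; only the interpretation of the inputs changes.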

## 2. Stereo Correspondence

Suppose we have two views of a scene, taken by two cameras whose image planes are not necessarily parallel. What is the relationship between the location of a scene point in one camera and its location in the other camera? That is what we want to know: we need to find pairs of image points that correspond to the same scene point.

The idea is that we have a point \(P \) in the scene. That point images to some location in the left image plane, along the ray from the left center of projection \(CP_{1}\) through \(P\). That ray maps to a line in the image of the second camera, the one with center of projection \(CP_{2}\). That line is what is referred to as the epipolar line with respect to that point. So, if we are looking for the matching point in the second image, it has to be on that line. Likewise, the ray from the other center of projection \(CP_{2}\) through \(P\) maps to the corresponding epipolar line in the first image. This pair of epipolar lines is defined by a plane called the epipolar plane, and that plane is defined by the baseline vector \(CP_{1} CP_{2}\) (the translation between the two centers of projection) and the world point \(P\). We can also think of it as being defined by the three points \(P \), \(CP_{1} \), \(CP_{2} \). So, in general, there is a different epipolar plane for each point out in the world.

This picture allows us to define a few terms.

**The baseline is the vector separating the two centers of projection.** The epipolar plane is the plane defined by a world point and the baseline. **The epipolar lines are the intersections of the epipolar plane with each of the image planes, which gives a pair of corresponding epipolar lines.** **The epipole of each image is the point where the baseline intersects that image plane.** Sometimes we do not actually see it: if all of the epipolar lines intersect outside the visible image, the epipole is not visible.

## 3. Stereo Geometry

The problem with everything we have just shown is that it would be very difficult to turn into code. The reason is that computers don’t do so well with geometry. What we need to do is go from geometry to algebra, because computers are fine with algebra. That is, we can convert algebraic expressions into computations, but it is very hard to go directly from geometry to computation.

So, we are going to start off with a pair of calibrated cameras. We have two camera centers \(O_{c} \) and \(O_{c}' \), both looking at some world point \(X\). The point is seen as \(x\) in the left image and as \(x'\) in the right image.

Because these are calibrated cameras, we are going to assume that we know the vector \(T \) along the baseline. **That is the shift in \(x, y, z \) needed to get from one camera to the other. We also know \(R \), the rotation matrix that would rotate one camera frame into alignment with the other. That is what it means to be calibrated.**

So the location of the point \(X \) in the primed frame can be expressed by rotating \(X_{c} \), its coordinates in the unprimed frame, and then translating by \(T \):

$$ X_{c}' = RX_{c}+T $$
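To make the rigid transform concrete, here is a small NumPy sketch. The rotation (10 degrees about the y-axis), the translation and the test point are hypothetical values chosen only for illustration; a real system would get \(R \) and \(T \) from calibration.

```python
import numpy as np

# Hypothetical calibration: a 10 degree rotation about the y-axis
# and a shift of half a unit along x.
theta = np.deg2rad(10.0)
R = np.array([
    [np.cos(theta),  0.0, np.sin(theta)],
    [0.0,            1.0, 0.0],
    [-np.sin(theta), 0.0, np.cos(theta)],
])
T = np.array([-0.5, 0.0, 0.0])

# A point expressed in the unprimed camera frame.
X_c = np.array([0.2, -0.1, 4.0])

# The same point expressed in the primed camera frame.
X_c_prime = R @ X_c + T

# Because R is a rotation, its inverse is its transpose,
# so we can map back to the unprimed frame.
X_c_back = R.T @ (X_c_prime - T)

print(X_c_prime, X_c_back)
```

Mapping the point forward and then back recovers the original coordinates, which is a handy sanity check on any \(R \) and \(T \) pair.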

## 4. Cross Product 1

Before we continue, a couple of reminders. The first one is the cross product.

The cross product of two vectors is a third vector that is perpendicular to both inputs.

$$ \vec{a}\times \vec{b}= \vec{c} $$

$$ \vec{a}\cdot \vec{c}= 0 $$

$$ \vec{b}\cdot \vec{c}= 0 $$

So \(\vec{a}\times \vec{b} \) gives us a new vector, \(\vec{c}\), which is perpendicular to both \(\vec{a} \) and \(\vec{b} \). Its magnitude is \(\|\vec{a}\|\,\|\vec{b}\|\sin\theta \), where \(\theta \) is the angle between them. We should realize right away that the cross product of any vector \(\vec{a} \) with itself is zero: the angle between a vector and itself is zero, so the sine is zero. So, \(\vec{a}\times \vec{a} \) is \(0 \) no matter what \(\vec{a} \) is.
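These properties are easy to check numerically. The following is a small sketch with arbitrary example vectors, using only `np.cross` and `np.linalg.norm`.

```python
import numpy as np

# Arbitrary example vectors.
a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 1.0])

c = np.cross(a, b)

# c is perpendicular to both inputs, so both dot products vanish.
dot_ac = a @ c
dot_bc = b @ c

# The length of c equals |a| |b| sin(theta),
# where theta is the angle between a and b.
norm_a = np.linalg.norm(a)
norm_b = np.linalg.norm(b)
cos_theta = (a @ b) / (norm_a * norm_b)
sin_theta = np.sqrt(1.0 - cos_theta**2)
expected_length = norm_a * norm_b * sin_theta

# Crossing a vector with itself gives the zero vector.
a_cross_a = np.cross(a, a)

print(dot_ac, dot_bc, np.linalg.norm(c), expected_length, a_cross_a)
```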

## 5. From Geometry to Algebra

So now we will start the process of going from geometry to algebra. We start from what we derived earlier, dropping the subscript \(c \):

$$ X' = RX+T $$

Let’s cross both sides of that equation with \(T \). So we have:

$$ T\times X' = T\times RX+T\times T $$

The left-hand side is a normal to the epipolar plane. Remember that \(T\times X' \) is perpendicular to both \(T \) and \(X' \), so it is perpendicular to the plane that contains them. More importantly, \(T\times T \) is equal to zero. So, that just means that \(T\times X' = T\times RX \). Now let’s take the dot product of both sides of that equation with \(X' \). So we get:

$$ X'\cdot ( T\times X' )= X'\cdot ( T\times RX ) $$

Let’s take a look at the left-hand side. \(T\times X' \) gives us a vector that is perpendicular to the plane containing \(T \) and \(X' \), so it is perpendicular to \(T \) and it is perpendicular to \(X' \). Dotting it with \(X' \) therefore gives zero, which leaves us with the following equation:

$$ 0= X'\cdot( T\times RX) $$
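We can check this constraint numerically. The sketch below uses a hypothetical rotation and translation (made-up values, not from a real calibration), moves a random world point into the primed frame, and evaluates \(X'\cdot (T\times RX) \).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical relative pose between the two cameras.
theta = np.deg2rad(10.0)
R = np.array([
    [np.cos(theta),  0.0, np.sin(theta)],
    [0.0,            1.0, 0.0],
    [-np.sin(theta), 0.0, np.cos(theta)],
])
T = np.array([-0.5, 0.1, 0.05])

# A random world point in front of the first camera.
X = rng.uniform(-1.0, 1.0, size=3) + np.array([0.0, 0.0, 5.0])

# The same point in the primed frame.
X_prime = R @ X + T

# T x X' and T x RX agree because T x T = 0.
lhs = np.cross(T, X_prime)
rhs = np.cross(T, R @ X)

# The coplanarity constraint: X' . (T x RX) should be zero
# up to floating-point rounding.
residual = X_prime @ rhs

print(np.allclose(lhs, rhs), residual)
```

The residual is not exactly zero in floating point, so in practice it is compared against a small tolerance.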

## 6. Cross Product 2

Time for reminder number two, which is again about cross products. There is a simple formula for computing a cross product:

$$ \vec{a}\times\vec{b}= \begin{bmatrix} 0 & -a_{3} & a_{2} \\ a_{3} & 0 & -a_{1}\\ -a_{2} & a_{1} & 0 \end{bmatrix}\begin{bmatrix}b_{1}\\ b_{2}\\ b_{3}\end{bmatrix} = \vec{c}$$

We can write the cross product as a matrix multiplication: we build this matrix from \(\vec{a} \) and multiply it by the vector \(\vec{b} \). The first row says that the first component of \(\vec{c} \) is \(-a_{3}b_{2}+a_{2}b_{3} \), which is indeed the first component of the cross product of \(\vec{a} \) and \(\vec{b} \). The same holds for the second and third components. We are going to define an operator for this matrix:

$$ [a] _{x} = \begin{bmatrix}0 & -a_{3} &a_{2} \\a_{3} & 0 &-a_{1} \\ -a_{2} &a_{1} &0 \end{bmatrix}$$

The subscript \(x \) stands for cross product. The bracket notation \([a]_{x} \) means that we substitute this matrix, the cross-product matrix of \(a \). If we want to compute \(\vec{a}\times \vec{b} \), we just multiply this matrix by \(\vec{b} \). In this notation:

$$ \vec{a}\times \vec{b} = [\vec{a}] _{x} \vec{b} $$

Now, this is a \(3\times 3 \) matrix, but its rank is \(2 \) (for any nonzero \(\vec{a} \)). We are going to make use of that later because, as you may remember, **when you multiply matrices together, the rank of the product cannot exceed the rank of either factor**. So if we multiply a rank \(2 \) matrix by some other matrix, the result has rank at most \(2 \).
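Here is a minimal sketch of the cross-product matrix in NumPy. The helper name `skew` is our own choice for this example, not a library function; the vectors are arbitrary.

```python
import numpy as np

def skew(a):
    """Return the 3x3 cross-product matrix [a]_x of a 3-vector a."""
    a1, a2, a3 = a
    return np.array([
        [0.0, -a3,  a2],
        [a3,  0.0, -a1],
        [-a2, a1,  0.0],
    ])

# Arbitrary example vectors.
a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 1.0])

# Multiplying by [a]_x is the same as taking the cross product with a.
via_matrix = skew(a) @ b
via_cross = np.cross(a, b)

# The matrix is 3x3 but only has rank 2: a itself is in its null space.
rank = np.linalg.matrix_rank(skew(a))
null_check = skew(a) @ a

print(via_matrix, via_cross, rank, null_check)
```

The null-space check is just \(\vec{a}\times \vec{a} = 0 \) written in matrix form, which is why the rank cannot be \(3 \).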

## 7. Essential Matrix

Remember this equation, \(0= X'\cdot \left ( T\times RX \right )\)? We can rewrite it using the cross-product matrix, which gets rid of the explicit cross product:

$$ X'\cdot \left ( \left [ T \right ] _{x} RX \right ) = 0 $$

So, let’s do a substitution. We are going to use \(E= \left [ T \right ] _{x} R\). Writing the dot product as a transpose, we now have this expression:

$$ X'^{T}EX= 0 $$

And this \(E \) is what is called the **essential matrix**. It relates the point \(X \) to the same point described in the other camera frame, \(X' \). Note that these are world points, expressed in the frames of the two calibrated cameras. In a little bit we will talk about the relationship between world points and image points. One easy way of thinking about this is to notice that the right-hand side is \(0 \). So what if we multiplied \(X \) by some nonzero scalar \(A \)? Would the equation still be true? Of course it would. So not only is it true for the point out in the world, as expressed in this coordinate frame, it is true for every point along that ray, expressed in that coordinate frame. Likewise for every point along the second ray, expressed in the second coordinate frame.

We said that this works for any point along that ray, so we can think of \(X \) as the homogeneous representation of all the points along that ray. The product \(EX \) in the equation above is another vector. We will just call it \(l \). Remember that \(p^{T}l = 0 \), or equivalently \(l^{T}p = 0 \), was the definition of a line in projective geometry. What this means is that if we know where some point is in one image, and therefore which ray it lies along, we can put that point into this equation and obtain \(l = EX \), which defines a line in the other frame: \(X'^{T}l= 0 \). **In other words, if we have a point in one image of a calibrated pair of cameras, there is a line in the other image on which the corresponding point must lie. That is the epipolar line.**
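Putting the pieces together, the sketch below builds \(E = [T]_{x}R \) for a hypothetical pose, checks the constraint, and computes an epipolar line. The helper names `skew` and `essential_matrix` and all numeric values are assumptions made for this illustration. Dividing each point by its depth is simply one convenient choice of scale along the ray, not the full world-to-image mapping.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]_x of a 3-vector t."""
    t1, t2, t3 = t
    return np.array([
        [0.0, -t3,  t2],
        [t3,  0.0, -t1],
        [-t2, t1,  0.0],
    ])

def essential_matrix(R, T):
    """E = [T]_x R for the convention X' = R X + T."""
    return skew(T) @ R

# Hypothetical relative pose between the two cameras.
theta = np.deg2rad(10.0)
R = np.array([
    [np.cos(theta),  0.0, np.sin(theta)],
    [0.0,            1.0, 0.0],
    [-np.sin(theta), 0.0, np.cos(theta)],
])
T = np.array([-0.5, 0.1, 0.05])
E = essential_matrix(R, T)

# A world point in each camera frame.
X = np.array([0.3, -0.2, 5.0])
X_prime = R @ X + T

# The constraint holds for the points themselves ...
residual_world = X_prime @ E @ X

# ... and for any rescaling along either ray, e.g. dividing by depth.
x = X / X[2]
x_prime = X_prime / X_prime[2]

# l = E x is the epipolar line in the second frame;
# the corresponding point x' must satisfy x'^T l = 0.
l_prime = E @ x
residual_line = x_prime @ l_prime

print(residual_world, residual_line, np.linalg.matrix_rank(E))
```

Since \(R \) is a rotation and \([T]_{x} \) has rank \(2 \), the printed rank of \(E \) is \(2 \) as well.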

### Summary

In brief, we have converted the epipolar constraint into an algebraic expression. The next step is computing the essential matrix, after which we can make use of it.