OpenCV #011 Discrete Fourier Transform, Part 1
Digital Image Processing using OpenCV (Python & C++)
Highlights: In this post, we will learn about why the Fourier transform is so important. We will also explain some fundamental properties of Fourier transform. Also, we will discuss the advantages of using frequency-domain versus time-domain representations of a signal. Finally, uses cases will be shown where it may be applied.
In Mathematics, we often use the Fourier Transform. You take a hard problem, convert it to an easier problem, solve that problem, then convert back.
Tutorial Overview:
This post covers the following topics:
- Frequency Domain Analysis
- Time and Frequency Domain
- The Linearity of Sum in 2 Dimension
- Discrete Fourier Transform
- Code for Discrete Fourier Transform in 2D
1. Frequency Domain Analysis.
When Jean Baptiste Joseph Fourier presented the results of heat propagation and diffusion to the Institut de France. He mentioned in his presentation that any periodic signal could be represented as a series of sinusoids and this thought has been used in many areas in science, mathematics, and engineering. This concept is the foundation of what we know today as the Fourier Series. In the later section below, will show how a square wave can be created by a distribution of square waves.
Now, what is the frequency-domain analysis? It is an important tool in signal processing applications and has been applied in vast areas such as communications, remote sensing and image processing. In this section, we would focus more on image processing.
When we mentioned that any signal is composed of different frequencies, this applies to 1-dimensional signals such as audio signals going to a speaker or a 2-dimensional signal such as the image. Let’s take a prism as an illustration of how a signal is a composition of signals of alternating frequencies. What happens when light passes through a prism? The prism breaks the light into its frequencies revealing a full-color spectrum.
The time-domain analysis shows how a signal changes over time, whereas frequency-domain analysis shows how the signal’s energy is distributed over a range of frequencies. Okay let’s build on this.
As an illustration, here we have just a sinusoid in the time domain with a frequency of \(10 Hz\) with a sampling rate of \(200 Hz\). Once we convert this sinusoid into the frequency domain, we may see the frequencies that are present
A frequency-domain representation also includes information on the phase shift that must be applied to each frequency component in order to recover the original time signal with a combination of all the individual frequency components.
2. Time and Frequency
The same thing can also be applied to square wave. Let’s say we have 2 signals. First, the sinusoid \(\sin(2 \pi f_{0}t) \) of frequency \(f \). Along with the second one, \(\frac{1}{3} \sin(2\pi (3f_{0}) t) \) of frequency \(3f\), scaled to the \(\frac{1}{3}\). As soon as we sum up these 2 different signals, we get the image on the left \(g (x) \). Let’s take a look at this image, you will notice a similar shape to a square wave is starting to take shape.
Next, let’s start adding more of these odd frequencies. It is important to realize when took \(x\) and added something that wiggles at 5 times more than the original frequency. If you take a closer look at the image, you will quickly notice it is gettin
Out of curiosity, let’s continue adding odd frequencies, 7, 9, 11, and 13.
In the final analysis, let’s define a square wave. For instance, let’s simply put together by adding up enough amount of non-identical sinusoids vibrating at different frequencies. As a result, once combined, we get something as close to a square wave. By taking a look at the frequency spectrum below, you may see that as the frequency increases, the amplitude decreases.
3. The Linearity of Sum in 2 Dimension
Since we have seen this in 1 dimension, it extends pretty simply in 2 dimensions. Let’s say we have a sinusoid and it’s at a particular frequency, in our case \(7 Hz\). If it is only made up of vertical stripes, what is the power spectrum of this image? Well, take a look at the image below.
As an illustration, let’s look at the image at the right titled \(Frequency Spectrum 1\). If you look at the image in detail, there are 3 bright dots. In the middle and there is a dot to the left and also to the right. Those are spikes from the frequency of this particular sinusoid.
The picture you see now only has horizontal stripes. This is simply the transposed version of the previously shown image. Let’s suppose we take two images of the sinusoids \(Sinusoid 1\) and \(Sinusoid 2\), summed them together. We are going to get some a combination of sinusoids below.
Python
C++
#include <opencv2/core/mat.hpp>
#include <opencv2/opencv.hpp>
#include <iostream>
#include <math.h>
#define CV_PI 3.1415926535897932384626433832795
using namespace std;
using namespace cv;
double A = 0.5;
int f = 7;
double t[100];
double output[100];
int length = 100;
double phi = CV_PI / 4;
double result[100];
double sinusoid[100][100];
void magnitudeSpectrum(Mat complex);
void fftShift(Mat magI);
cv::Mat computeDFT(Mat image);
int main()
{
// create a range of number from 0 - 1
for (int i = 0; i < length; i++)
{
t[i] = i/100.0;
}
// this what you need 0 -> min value after norm
// 1 -> max value after nom
// cv::NORM_MINMAX normalize for min and max values
for (int i = 0; i < length; ++i)
{
result[i] = A*cos(2*CV_PI*f*t[i] + phi);
//cout << result[i] << endl;
}
// Normalize our input image
cv::Mat img(1, 100, CV_8UC1, result), output;
cv::normalize(img, output, 0, 1, cv::NORM_MINMAX);
// Copy the result multiple times from the array
for (int i = 0; i < length; i++)
{
for (int j = 0; j < length; j++)
{
sinusoid[i][j] = result[j]*length;
}
}
// convert a matrix into Mat in opencv
cv::Mat Sinusoid1(100, 100, CV_64F, sinusoid);
// Vertical 2d signal
cv::imshow("Original image", Sinusoid1);
cv::Mat dft_1 = computeDFT(Sinusoid1);
magnitudeSpectrum(dft_1);
waitKey(0);
// Horizontal 2d signal
// Transpose the matrix
Mat Sinusoid2 = Sinusoid1.t();
cv::imshow("Original image", Sinusoid2);
cv::Mat dft_2;
dft_2 = computeDFT(Sinusoid2);
magnitudeSpectrum(dft_2);
waitKey(0);
// Linear combination of 2, 2d signal
cv::Mat combine_signal;
cv::add(Sinusoid1, Sinusoid2, combine_signal);
cv::imshow("Original image", combine_signal);
cv::Mat dft_1_2 = computeDFT(combine_signal);
magnitudeSpectrum(dft_1_2);
waitKey(0);
return 0;
}
// Compute the Discrete fourier transform
cv::Mat computeDFT(Mat image) {
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( image.rows );
int n = getOptimalDFTSize( image.cols ); // on the border add zero values
copyMakeBorder(image, padded, 0, m - image.rows, 0, n - image.cols, BORDER_CONSTANT, Scalar::all(0));
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complex;
merge(planes, 2, complex); // Add to the expanded another plane with zeros
dft(complex, complex, DFT_COMPLEX_OUTPUT); // furier transform
return complex;
}
void fftShift(Mat magI) {
// crop if it has an odd number of rows or columns
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
}
void magnitudeSpectrum(Mat complex) {
Mat magI;
Mat planes[] = {
Mat::zeros(complex.size(), CV_32F),
Mat::zeros(complex.size(), CV_32F)
};
split(complex, planes); // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], magI); // sqrt(Re(DFT(I))^2 + Im(DFT(I))^2)
// switch to logarithmic scale: log(1 + magnitude)
magI += Scalar::all(1);
log(magI, magI);
fftShift(magI); // rearrage quadrants
// Transform the magnitude matrix into a viewable image (float values 0-1)
normalize(magI, magI, 1, 0, NORM_INF);
imshow("Magnitude Spectrum", magI);
}
4. Discrete Fourier Transform
Usually, when experimenting with digital images, we must work with a finite number of discrete samples. These discrete samples are the pixels that make up an image. It is also good to know that computer analysis of images requires the discrete Fourier transform.
The question now is why are we learning about the Fourier Transform? Let’s say we have an input image, we want to understand the frequency available by transforming our signal, instead of by \(f(x) \), which is in space, to \(F( \omega) \) in the frequency domain.
\(F(\omega)= R(\omega) + I(\omega) \)
Now, what does it achieve? When we give the algorithm an input image, it will decompose an image into its sine and cosine components. In other words, it will transform an image from its spatial domain to its frequency domain.
The formula to compute the discrete Fourier transform on an \(M * N \) size image is
\(F(u,v)={\frac{1}{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y) e^{-j2\pi( ux/M + vy/N)} \)
The formula to return to the spatial domain is:
\(f(x,y)=\sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v) e^{j2\pi( ux/M + vy/N)} \)
Here \(f \) is the image value in its spatial domain and \(F\) in its frequency domain. The result of the transformation is a complex number that holds both the amplitude and phase. In complex numbers, we have two parts. A real part which is the cosine part and the imaginary part which is the sine part. By taking the square root of the sum of these 2 elements, we express the magnitude as
\(A= \pm \sqrt{R\left ( \omega \right )^{2}+I\left ( \omega \right )^{2}} \)
and the phase as:
\(\Theta (u, v) = \tan^{-1} [\frac{I(u,v)}{R(u,v)}] \)
Where \(R(u,v)\) is the real part and \(I(u,v)\) is the imaginary. The magnitude is the amplitude of sine and cosine waves in the Fourier transform formula. As expected, \(theta( \Theta) \) is the phase of the sine and cosine waves.
5. Code for Discrete Fourier Transform in 2D
Let’s see a little experiment on how we could analyze an image by transforming it from its spatial domain into its frequency domain. Here we provided the implementation of the discrete Fourier Transform both in python and C++. Let’s take as an example an image of a rectangle and plot the magnitude spectrum of the signal
Looking at the Magnitude spectrum of the image you may say that the most influential components of the frequency domain are the brightest dots on the magnitude image. Here there is more power in the area where it’s really bright then it starts to fall out towards the edges in those darker regions.
Python
C++
#include <opencv2/opencv.hpp>
#include <opencv2/core/mat.hpp>
#include <opencv2/opencv.hpp>
#include <iostream>
#include <math.h>
#define CV_PI 3.1415926535897932384626433832795
using namespace std;
using namespace cv;
void magnitudeSpectrum(Mat complex);
void fftShift(Mat magI);
cv::Mat computeDFT(Mat image);
int main()
{
cv::Mat img;
img = imread("rectangle.jpg", IMREAD_GRAYSCALE);
cv::imshow("Original image", img);
waitKey(0);
cv::Mat dft = computeDFT(img);
magnitudeSpectrum(dft);
waitKey(0);
return 0;
}
// Compute the Discrete fourier transform
cv::Mat computeDFT(Mat image) {
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( image.rows );
int n = getOptimalDFTSize( image.cols ); // on the border add zero values
copyMakeBorder(image, padded, 0, m - image.rows, 0, n - image.cols, BORDER_CONSTANT, Scalar::all(0));
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complex;
merge(planes, 2, complex); // Add to the expanded another plane with zeros
dft(complex, complex, DFT_COMPLEX_OUTPUT); // furier transform
return complex;
}
void fftShift(Mat magI) {
// crop if it has an odd number of rows or columns
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
}
void magnitudeSpectrum(Mat complex) {
Mat magI;
Mat planes[] = {
Mat::zeros(complex.size(), CV_32F),
Mat::zeros(complex.size(), CV_32F)
};
split(complex, planes); // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], magI); // sqrt(Re(DFT(I))^2 + Im(DFT(I))^2)
// switch to logarithmic scale: log(1 + magnitude)
magI += Scalar::all(1);
log(magI, magI);
fftShift(magI); // rearrage quadrants
// Transform the magnitude matrix into a viewable image (float values 0-1)
normalize(magI, magI, 1, 0, NORM_INF);
imshow("Magnitude Spectrum", magI);
}
Output
Summary
If you are just starting with Fourier Analysis, this post gave you insights on what its about. We will use these ideas heavily in image analysis, and filtering in the later sections. In the next post, we will talk more about filtering in the frequency domain. This is more focused on applying the high and low pass filters in a 2d image. So sit tight, grab a cup of tea and let’s ride on.