#001 How to read a video and access a webcam with OpenCV in Python?

Highlight: In the previous post we talked about how to manipulate pixels and images in Python using the OpenCV library. Now, it’s time to turn our attention to videos. In this post, you will learn some basic operations that are necessary for building your computer vision applications. First, we will explain how you can load camera frames and video files. Second, you will see how you can read, display and save videos using OpenCV. Third, you will learn how to create some simple video animations. Let’s start!

Tutorial Overview:

  1. How to read a video file with OpenCV?
  2. How to process and display a video file?
  3. How to save a video using OpenCV in Python?
  4. Funny hacking with OpenCV

Every computer vision project commonly consists of three components: input, processing and output. In the diagram below, we can see that a project starts with some input files, and after processing, we get some output files.

Illustration of project with input and output files

First, let’s learn how to load or read a video and display it. It is actually a very simple task: we can either load a video file from our hard disk or capture a video stream directly from a camera connected to our computer.

1. How to read a video file with OpenCV?

First of all, a video is actually a sequence of images (frames) which, shown in quick succession, gives the appearance of motion.

First, since we are working in Google Colab, we need to upload a video from our computer. Second, we will create an instance of the class cv2.VideoCapture and store it in an object usually called cap. As an argument we can specify the input video file name. On the other hand, in order to access a video stream, we pass a camera index instead. In case you only have one camera, by default it will be indexed with 0. If you have more than one camera, the second will be indexed with 1, the third with 2, and so on. Let’s show this in an example.

# Necessary imports
import numpy as np
import cv2
import matplotlib.pyplot as plt
# If we are working in Google Colab, we can display our captured frame
# with the function cv2_imshow(). If not, we use the function cv2.imshow().
from google.colab.patches import cv2_imshow
# Create a VideoCapture object
cap = cv2.VideoCapture("Video.mp4")
# Capture the input video frame-by-frame
for i in range(10):
  ret, frame = cap.read()
# Display the last captured frame (the 10th one)
cv2_imshow(frame)

At the start of this process, our indicator is on the first frame. When we call cap.read(), the first frame of the video file is loaded and stored in the variable frame. If we call this command again, the second frame is loaded, and so on. The variable ret is a boolean which is True if the frame was read successfully. Our frame can be loaded as a color image (it will have 3 channels) or a grayscale image (it will have 1 channel). If we need more than one frame, we use a “for” or “while” loop, as we will explain in more detail later in the post.

Output:

# Create a VideoCapture object
# when we want to access a video camera
cap = cv2.VideoCapture(0)

There are many properties which we can read using the method cap.get(). Let’s see how we can get some of them from our video: the frame width, the frame height, and the number of frames per second (fps).

print(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

360

print(cap.get(cv2.CAP_PROP_FRAME_WIDTH))

640

print(cap.get(cv2.CAP_PROP_FPS))

25.0

2. How to process and display a video file?

After we read a video file or capture a live stream, we want to process and display the video output. The following code creates a while loop that continuously reads frames from the video with the command cap.read(). Each frame is stored in the variable frame, and ret is a boolean which is True if the frame was read successfully. Once we finish this process, we release the capture with the command cap.release().

# Loop that runs until we reach the end of the video
while(True):
  ret, frame = cap.read()
  # Stop the loop when no more frames can be read
  if not ret:
    break
  # We use this command if we are working in Python
  # on our computer.
  # We would use the command cv2_imshow(frame) in Google Colab,
  # but it would not make much sense because Colab is not
  # well suited for displaying video. So, for this example, we
  # recommend a standalone Python installation (e.g. Anaconda).
  cv2.imshow("Web cam", frame)
  # Press Q on the keyboard to exit the video
  if cv2.waitKey(1) & 0xFF == ord('q'):
    break
# Release the video capture object
cap.release()

# Close all the frames
cv2.destroyAllWindows()

Output:

Now, we can modify our frame. Let’s say that we want to convert our color video to a grayscale video. The code will be similar to the previous example. In particular, you need to define a variable (for example gray_frame) and use it to store the newly generated frame. Next, we call the method cv2.cvtColor() with the argument cv2.COLOR_BGR2GRAY. This converts a color (BGR) frame to a grayscale frame.

# Infinite loop that runs until the video is turned off
while(True):
  ret, frame = cap.read()
  if not ret:
    break
  gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
  # We use this command if we work in Python on our computer.
  # We could use the command cv2_imshow(frame) in Google Colab,
  # but it would not make much sense because Colab is not
  # well suited for displaying video.
  cv2.imshow("Web cam", frame)
  cv2.imshow("Grayscale video", gray_frame)
  # Press Q on the keyboard to exit the video
  if cv2.waitKey(1) & 0xFF == ord('q'):
    break
# Release the video capture object
cap.release()

# Close all the frames
cv2.destroyAllWindows()

Note that when working with video we use the cv2.waitKey() function quite often. Its argument tells OpenCV how many milliseconds to wait before displaying the next frame; commonly we choose a value between 1 and 50 milliseconds.

Once we are finished, it is good practice to close all windows by calling cv2.destroyAllWindows().

Finally, this code snippet will give us two outputs. The first one will be a color (BGR) video, whereas the second one will be a grayscale video.

Output:


3. How to save a video using OpenCV in Python?

After reading and displaying our video, we can move to the next step – how to save our output video. Similarly to the cv2.VideoCapture(), which we used for capturing a video, we are going to create a cv2.VideoWriter() object. By doing that, we will be able to write our file. This class takes a few arguments.

The first one is the output file name, where we specify both the name and the extension of the file.

The second argument is the Fourcc code (four-character code). It is a 4-byte code which is used to compress the frames and to specify the video codec. It is good to know that supported codecs are platform-dependent, which means that the codecs you want to work with should already be installed on your system. Some of the most common codecs are DIVX, XVID, X264 and MJPG. You can see the list of possible Fourcc codecs here. Video file formats such as AVI (.avi), MP4 (.mp4) and Windows Media Video (.wmv) are most commonly used to store digital video data. The combination of video file format and Fourcc code is not always straightforward (DIVX and AVI, for instance, don’t match), so when creating a video file in OpenCV we need to take these factors into consideration. In Google Colab, some combinations of file extension and Fourcc code will work, while others will not; later in this post we successfully use the MP4V code with the .mp4 extension.


The third argument is the number of frames per second (fps), which defines how many frames are stored per second in the output video file. The fourth argument is the size, that is, the width and the height of the video frame. In our code, we provide this argument as a tuple, e.g. (640, 480).

Let’s see an example of how we can save a video from our camera.

cap=cv2.VideoCapture(0)
# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc('X', 'V', 'I', 'D')
out = cv2.VideoWriter('output.avi', fourcc, 20, (640,480))
while(True):
  ret, frame = cap.read()
  # Stop when no more frames can be read
  if not ret:
    break
  cv2.imshow("Camera", frame)
  out.write(frame)
  # Press Q on the keyboard to exit
  if cv2.waitKey(1) & 0xFF == ord('q'):
    break
# When everything done, release the video capture object 
# and save the output
cap.release()
out.release()
# Closes all the windows
cv2.destroyAllWindows()

This code will save the video file “output.avi“. The program will exit once we press ‘Q’.

4. Funny hacking with OpenCV

It’s time for a little bit of fun. We want to load a video and try to change and modify its colors. To do this, we split every color frame into its three color channels (B, G, R). They are stored as matrices with the same height and width as our original video. We can then treat the blue channel as a NumPy matrix, and by changing and modifying its values we can create a fun visual effect. For instance, if we increase the pixel intensities of the blue channel, our output video will appear bluer than the original one. When you are doing similar experiments, make sure that the values in the matrices remain of the uint8 data type.

cap=cv2.VideoCapture("Video.mp4")
ret, frame=cap.read()
cv2_imshow(frame)

Output:

# Define the codec and create a VideoWriter object
fourcc = cv2.VideoWriter_fourcc('M', 'P', '4', 'V')
out = cv2.VideoWriter('output2.mp4', fourcc, 10, (640, 360))
# Split the frame into its channels
(b, g, r) = cv2.split(frame)
for i in range(100):
  # In every iteration increase the blue channel pixel values by 1
  b = b + 1
  frame = cv2.merge([b, g, r])
  out.write(frame)
out.release()

Output:

In addition, we can create a new example video which will start as a black image. Then, we will create some interesting effects in that image.

# Create 3 images that will represent our
# blue, green and red channels
b = np.zeros((256, 256), dtype='uint8')
g = np.zeros((256, 256), dtype='uint8')
r = np.zeros((256, 256), dtype='uint8')
fourcc = cv2.VideoWriter_fourcc('M', 'P', '4', 'V')
out = cv2.VideoWriter('output.mp4', fourcc, 10, (256, 256))

for i in range(256):
  # Gradually increase the intensity of the blue color
  # from the first column to the last
  b[:, i] = i
  # Gradually decrease the intensity of the green color
  # from the first row to the last
  g[i, :] = 255 - i
  r[:, i] = 0
  frame = cv2.merge([b, g, r])
  out.write(frame)
out.release()

Output:

To conclude, we hope that you enjoyed these experiments and that you will have fun creating your own.

Summary

In this post, we learned how to work with video files in Python with OpenCV. We covered three common steps that are necessary for a computer vision project. We explained how to load a video and how to perform some basic frame processing. In addition, we learned how to output the processed content and save it to our computer. In the next post, we will learn how to draw shapes and write text on images with OpenCV in Python.
