Category: Other

#024 CNN Convolutional Operation of Sliding Windows

Convolutional operation of sliding windows In the previous post we learned about the sliding windows object detection algorithm using a \(convnet \), but we saw that it was too slow. In this post we will see how to implement that algorithm convolutionaly. Let’s see what that means. To build up the convolutional implementation of sliding windows, let’s first see how we can turn \(Fully \enspace connected \) layers in our neural network into \(Convolutional\) layes.…
Read more

#023 CNN Object Detection

Object Detection We have learned about object localization as well as landmark detection, now let’s build an object detection algorithm. In this post we’ll learn how to use a \(convnet \) to perform object detection using a Sliding windows detection algorithm. Car detection – an example Let’s say we want to build a car detection algorithm. Some examples of a training set We can first create a labeled training set \((x,y) \) with closely cropped…
Read more

#022 CNN Landmark Detection

Landmark Detection In the previous post we saw how we can get a neural network to output \(4 \) numbers: \(b_{x} \), \(b_{y} \) ,\(b_{h} \), and \(b_{w} \) to specify the bounding box of an object we want neural network to localize. In more general cases we can have a neural network which outputs just \(x \) and \(y \) coordinates of important points in the image, sometimes called landmarks.  Let’s see a few…
Read more

#021 CNN Object Localization

Object Localization Object detection is one of the areas of computer vision that’s exploding and it’s working so much better than just a couple years ago. In order to build up object detection we first learn about object localization. Let’s start by defining what that means. We have already said that the image classification task is to look at a picture and say is there a car or not. Classification with localization means not only do…
Read more

#020 CNN Data Augmentation

Data Augmentation Most computer vision tasks could use more data and data augmentation is one of the techniques that is often used to improve the performance of computer vision systems. The computer vision is a pretty complicated task. For an input image we have to figure out what is in that picture and we need to learn a decently complicated function to do that. In practice, having more data will help  for almost all computer…
Read more