The convolutional operation is of the building block of CNNs. In this section we will see the example of Vertical Edge Detection
Here the 6 x 6 matrix is an image we want to work on. The 3 x 3 matrix is called as the filter. In some research papers it is also known as the kernel.
* is the convolution operator. Here it is not the multiplication operator.
Let us consider [1, 1] element which is -5 of the resultant matrix, we get this value as follows :
Imagine the 3 x 3 filter is kept on the image matrix with its top corner aligning with the top corner of the image matrix. Here I have written the values of the filter matrix in the subscript of that of the image matrix.
To find the value -5 we multiply the image matrix value on which the filter is there (i.e. ones with the subscript) with the filter value (i.e. the subscript value).
For the [1, 2] element of the resultant matrix, the filer is shifted one column towards the right. Similarly, for the [2, 1] element the filter is shifted one row towards the bottom.
This procedure is repeated until we do not find the complete resultant matrix.
But how does this help us with edge detection. Think of the filter matrix values as brightness. So the matrix used above has bright values in the first column and light values in the third column. The zero column means that the values might me in the mid range.
So we can say that vertical edge is a 3 x 3 region (since our filter is a 3x3 matrix) where there brighter pixels on one side and lighter pixels on the other.
When we convolute this filter onto the image the vertical edges become more visible.
Note
To implement this in code we use the following :
- In Python (manual implementation) -
conv-forward- In
tensorflow-tf.nn.conv2d- In
keras-Conv2D
For a horizontal edge we use the matrix :
There are several types of filters :
-
Sobel filter - It put more weight on the central pixel
-
Scharr filter -
In complicated datasets one option is to learn the filter matrix values as parameters during training.
By setting this values as parameters, it is seen that neural networks can learn to recognise lower level features.
In general if we have a n x n image and a f x f filter then our resultant image matrix will have dimensions (n - f + 1) x (n - f + 1)
In order to build deep neural networks, one modification which we need to make to the basic convolutional operation is Padding
In this example we performed convolution on only two dimensional images i.e. with only two colour channels (such as black and white images). To perform it on images with more channels were do Convolution over volumes.