Strided Convolution

One way to apply convolution is to use strides. In ordinary convolution, to compute the next element of the output in a row, we move the filter one column to the right. Similarly, to compute the next element in a column, we move the filter one row down.

In strided convolution, instead of moving the filter by one column or row, we move it by s columns or rows. The example below uses a stride of s = 2.

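$$
\begin{bmatrix}
2 & 3 & 7 & 4 & 6 & 2 & 9 \\
6 & 6 & 9 & 8 & 7 & 4 & 3 \\
3 & 4 & 8 & 3 & 8 & 9 & 7 \\
7 & 8 & 3 & 6 & 6 & 3 & 4 \\
4 & 2 & 1 & 8 & 3 & 4 & 6 \\
3 & 2 & 4 & 1 & 9 & 8 & 3 \\
0 & 1 & 3 & 9 & 2 & 1 & 4
\end{bmatrix}
*
\begin{bmatrix}
3 & 4 & 4 \\
1 & 0 & 2 \\
-1 & 0 & 3
\end{bmatrix}
=
\begin{bmatrix}
91 & 100 & 88 \\
69 & 91 & 117 \\
44 & 72 & 74
\end{bmatrix}
$$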

For the [1, 1] element of the result, we align the filter's top-left corner with the image's top-left corner. For the next element of the result, [1, 2], we shift the filter right by 2 columns (the subscripts below mark the filter weights overlaid on the image).

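$$
\begin{bmatrix}
2 & 3 & 7_{3} & 4_{4} & 6_{4} & 2 & 9 \\
6 & 6 & 9_{1} & 8_{0} & 7_{2} & 4 & 3 \\
3 & 4 & 8_{-1} & 3_{0} & 8_{3} & 9 & 7 \\
7 & 8 & 3 & 6 & 6 & 3 & 4 \\
4 & 2 & 1 & 8 & 3 & 4 & 6 \\
3 & 2 & 4 & 1 & 9 & 8 & 3 \\
0 & 1 & 3 & 9 & 2 & 1 & 4
\end{bmatrix}
= 100
$$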

Similarly, for element [2, 1], we start again from the top-left position and shift the filter down by 2 rows.

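$$
\begin{bmatrix}
2 & 3 & 7 & 4 & 6 & 2 & 9 \\
6 & 6 & 9 & 8 & 7 & 4 & 3 \\
3_{3} & 4_{4} & 8_{4} & 3 & 8 & 9 & 7 \\
7_{1} & 8_{0} & 3_{2} & 6 & 6 & 3 & 4 \\
4_{-1} & 2_{0} & 1_{3} & 8 & 3 & 4 & 6 \\
3 & 2 & 4 & 1 & 9 & 8 & 3 \\
0 & 1 & 3 & 9 & 2 & 1 & 4
\end{bmatrix}
= 69
$$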
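The sliding procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not a reference implementation: the function name `strided_conv2d` is hypothetical, it assumes no padding, and it does not flip the filter (matching the deep learning convention used in this note). The image and filter are the ones from the example, with a stride of 2.

```python
import numpy as np

def strided_conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` in steps of `stride` (no padding, no flip).

    Hypothetical helper for illustration only.
    """
    n_rows, n_cols = image.shape
    f_rows, f_cols = kernel.shape
    # Only positions where the filter fits entirely inside the image are used.
    out_rows = (n_rows - f_rows) // stride + 1
    out_cols = (n_cols - f_cols) // stride + 1
    out = np.zeros((out_rows, out_cols), dtype=image.dtype)
    for i in range(out_rows):
        for j in range(out_cols):
            top, left = i * stride, j * stride
            window = image[top:top + f_rows, left:left + f_cols]
            # Element-wise product of the window and the filter, then sum.
            out[i, j] = np.sum(window * kernel)
    return out

image = np.array([
    [2, 3, 7, 4, 6, 2, 9],
    [6, 6, 9, 8, 7, 4, 3],
    [3, 4, 8, 3, 8, 9, 7],
    [7, 8, 3, 6, 6, 3, 4],
    [4, 2, 1, 8, 3, 4, 6],
    [3, 2, 4, 1, 9, 8, 3],
    [0, 1, 3, 9, 2, 1, 4],
])
kernel = np.array([
    [3, 4, 4],
    [1, 0, 2],
    [-1, 0, 3],
])

result = strided_conv2d(image, kernel, stride=2)
print(result.shape)   # (3, 3)
print(result[0, 0])   # 91, the [1, 1] element in the 1-based notation above
print(result[0, 1])   # 100, the [1, 2] element
print(result[1, 0])   # 69, the [2, 1] element
```

Note that the code uses 0-based indices, so `result[0, 1]` corresponds to element [1, 2] in the text.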

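In general, if we have an $n \times n$ image and an $f \times f$ filter, and we use padding $p$ and stride $s$, then the resulting image has dimensions $\left(\left\lfloor \frac{n + 2 p - f}{s} \right\rfloor + 1\right) \times \left(\left\lfloor \frac{n + 2 p - f}{s} \right\rfloor + 1\right)$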

If the fraction $\frac{n + 2 p - f}{s}$ is not an integer, we round it down. By convention, if moving the filter by s columns or rows leaves part of the filter hanging outside the (padded) image, we skip that position and compute no output for it. The filter must lie entirely within the image.
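As a quick check of the formula, here is a minimal sketch of the output-size computation. The function name `conv_output_size` is hypothetical, and the sketch assumes a square image and filter with the filter no larger than the padded image.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the output for an n x n image, f x f filter,
    padding p and stride s. Hypothetical helper for illustration only."""
    if n + 2 * p < f:
        raise ValueError("filter is larger than the padded image")
    # Integer (floor) division implements the round-down rule.
    return (n + 2 * p - f) // s + 1

print(conv_output_size(7, 3, p=0, s=2))  # 3: (7 - 3) / 2 + 1
print(conv_output_size(6, 3, p=0, s=2))  # 2: 3 / 2 rounds down to 1, plus 1
print(conv_output_size(7, 3, p=1, s=2))  # 4: (7 + 2 - 3) / 2 + 1
```

The first call matches the 7 x 7 example above, which produces a 3 x 3 result. The second shows the rounding rule: the last filter position would extend past the image, so it is dropped.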

You might have seen the word convolution used for a slightly different process. Strictly speaking, the operation described here is cross-correlation, because the filter is not flipped before it is applied, but in deep learning literature this is what is called convolution. To read more on this, see: Technical note on cross-correlation vs. convolution

Now that we know how convolution is performed on two-dimensional images, let us move on to three-dimensional volumes (for example, RGB images) in Convolutions over Volume
