How does CNN depth change after applying filters?

In CNNs, the dimensions of the data change from layer to layer because each layer applies a different number of filters, often of different sizes. It is easy to get confused about how convolving filters with an input image changes the depth dimension.

To understand the intuition behind it, let's first take a basic input image of size 16x16x3 and convolve it with one 3x3 kernel.

Convolution with 1 filter (the 3rd dimension of the kernel always equals the 3rd dimension of the input, so the 3x3 kernel here is really 3x3x3)

At each position, every channel of the input patch is multiplied element-wise with the matching channel of the kernel, and all the products across all channels are summed up to output a single feature value.

After convolution with 1 kernel of size 3x3, the output is a 14x14x1 feature map
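The single-filter case can be sketched in plain NumPy. This is a minimal illustration, not a production implementation: the function name `conv2d_single_filter` and the random input values are assumptions made for the example, and it assumes no padding and a stride of 1. Like most deep learning libraries, it slides the kernel without flipping it.

```python
import numpy as np

def conv2d_single_filter(image, kernel):
    """Slide one (kh, kw, C) kernel over an (H, W, C) image.

    Assumes no padding and stride 1 (hypothetical helper for illustration).
    """
    h, w, c = image.shape
    kh, kw, kc = kernel.shape
    # The kernel depth must match the input depth.
    assert c == kc, "kernel depth must equal input depth"

    out_h, out_w = h - kh + 1, w - kw + 1
    out = np.zeros((out_h, out_w, 1))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw, :]
            # Multiply all channels element-wise and sum to ONE value.
            out[i, j, 0] = np.sum(patch * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16, 3))   # 16x16x3 input
kernel = rng.standard_normal((3, 3, 3))    # one 3x3 kernel with depth 3
feature_map = conv2d_single_filter(image, kernel)
print(feature_map.shape)  # (14, 14, 1)
```

The three input channels collapse into a single output channel because the sum runs over the full 3x3x3 patch, not over each channel separately.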

When there is more than one filter (2 in this case), the feature maps produced by the individual filters are stacked along the depth dimension.

The output dimension is now 14x14x2, because 2 filters of size 3x3 were applied

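Similarly, if N filters of size 3x3 are applied to the input image with padding = 0 and stride = 1, the output dimension changes to 14x14xN. The spatial size comes from (input size - kernel size + 2 x padding) / stride + 1 = (16 - 3 + 0) / 1 + 1 = 14, and the depth equals the number of filters, since each feature map is the output of one filter.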
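The N-filter case can be sketched the same way. The helper names `conv_output_size` and `conv2d_multi_filter`, the choice of N = 5, and the random values are assumptions for illustration only. The sketch assumes no padding and stride 1 inside the convolution loop, while the size helper accepts general padding and stride.

```python
import numpy as np

def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size along one side: (n - k + 2p) // s + 1."""
    return (n + 2 * padding - k) // stride + 1

def conv2d_multi_filter(image, filters):
    """Apply N filters of shape (kh, kw, C) to an (H, W, C) image.

    filters has shape (N, kh, kw, C). Assumes no padding and stride 1.
    Returns an array of shape (out_h, out_w, N).
    """
    h, w, c = image.shape
    n, kh, kw, kc = filters.shape
    assert c == kc, "filter depth must equal input depth"

    out_h = conv_output_size(h, kh)
    out_w = conv_output_size(w, kw)
    out = np.zeros((out_h, out_w, n))
    for f in range(n):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i:i + kh, j:j + kw, :]
                # Each filter fills exactly one output channel.
                out[i, j, f] = np.sum(patch * filters[f])
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16, 3))
num_filters = 5                                   # N, chosen arbitrarily
filters = rng.standard_normal((num_filters, 3, 3, 3))
output = conv2d_multi_filter(image, filters)
print(conv_output_size(16, 3))  # 14
print(output.shape)             # (14, 14, 5)
```

Changing `num_filters` changes only the last dimension of the output, while the 14x14 spatial size depends only on the input size, kernel size, padding, and stride.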


Anukool Chaturvedi

Data Scientist with keen interest in Image Segmentation and GANs