How does CNN depth change after applying filters?
In CNNs, the dimensions of the feature maps vary from layer to layer because each layer applies a different number of filters, often of different sizes. It can be confusing how convolving filters with an input image changes the depth dimension.
To build intuition, let's first take a basic input image of size 16x16x3 and convolve it with one 3x3 kernel.
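The kernel extends through the full depth of the input, so a 3x3 kernel applied to a 3-channel image is effectively 3x3x3. At each position, every channel of the input image is multiplied elementwise by the corresponding 3x3 slice of the kernel, and all the products are summed to output a single feature value. One filter therefore produces one 2D feature map, whatever the depth of the input.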
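The single-filter case can be sketched in a few lines of NumPy. This is a minimal illustration, not an optimized implementation: it assumes the usual convention in which the kernel has one 3x3 slice per input channel (shape 3x3x3), no padding, and stride 1. The function name conv2d_single_filter and the random test data are made up for this example.

```python
import numpy as np

def conv2d_single_filter(image, kernel):
    """Slide one kernel over an H x W x C image (no padding, stride 1).

    The kernel is assumed to have shape kh x kw x C, i.e. one 2D slice
    per input channel. Returns a 2D feature map.
    """
    h, w, c = image.shape
    kh, kw, kc = kernel.shape
    assert c == kc, "kernel depth must match image depth"

    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Patch covers all channels at this spatial position.
            patch = image[i:i + kh, j:j + kw, :]
            # Multiply elementwise and sum over height, width AND channels,
            # which collapses the depth into a single feature value.
            out[i, j] = np.sum(patch * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))   # hypothetical 16x16 RGB input
kernel = rng.random((3, 3, 3))    # one 3x3 filter spanning 3 channels

feature_map = conv2d_single_filter(image, kernel)
print(feature_map.shape)  # (14, 14)
```

Note that the depth of 3 disappears from the output: the sum over channels is what turns a 16x16x3 input into a single 14x14 map.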
When there is more than one filter (two in this case), the feature map produced by each filter is stacked along the depth dimension, so two filters give an output of depth 2.
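Similarly, if N filters of size 3x3 are applied to the input image with no padding (padding = 0) and stride = 1, the output dimension becomes 14x14xN. Each spatial side shrinks to (16 - 3)/1 + 1 = 14, and the depth equals the number of filters, because each of the N feature maps is the output of one filter.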
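The stacking of feature maps can be checked with a short NumPy sketch. It assumes the standard output-size formula (n + 2*padding - k) // stride + 1 and filters stored as an array of shape (N, 3, 3, 3); the helper names conv_output_size and conv2d_multi_filter are hypothetical, chosen only for this illustration, and the convolution itself is written for no padding and stride 1.

```python
import numpy as np

def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1

def conv2d_multi_filter(image, filters):
    """Apply N filters to an H x W x C image (no padding, stride 1).

    filters is assumed to have shape (N, kh, kw, C). Each filter yields
    one 2D feature map; the maps are stacked along the last axis.
    """
    h, w, c = image.shape
    n, kh, kw, kc = filters.shape
    assert c == kc, "filter depth must match image depth"

    out_h = conv_output_size(h, kh)
    out_w = conv_output_size(w, kw)
    out = np.zeros((out_h, out_w, n))
    for f in range(n):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i:i + kh, j:j + kw, :]
                # One feature value per filter per spatial position.
                out[i, j, f] = np.sum(patch * filters[f])
    return out

rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))

for n_filters in (1, 2, 8):
    filters = rng.random((n_filters, 3, 3, 3))
    print(n_filters, conv2d_multi_filter(image, filters).shape)
# 1 (14, 14, 1)
# 2 (14, 14, 2)
# 8 (14, 14, 8)
```

The output depth tracks the number of filters only; the input depth of 3 affects the shape of each filter, not the shape of the output.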