Understanding Convolutional Neural Networks with CNNs

  • #1
TL;DR Summary
Understand conv net layers in a CNN
Hello,
I have been learning about convolutional neural networks (CNNs) recently and wonder if I could get some help with a specific question:
  • Assume we start with an input grayscale image having size NxN pixels. The image is passed to the 1st convolutional layer which has 3 filters (kernels) of smaller size called K1, K2, K3.
  • Three convolutions are performed in this first conv layer: the 3 different kernels are sequentially applied to the input image to create the 3 different feature maps FP1, FP2, FP3 (the outputs of the convolution operations).
  • The 3 feature maps FP1, FP2, FP3 are then stacked in a 3D matrix called M1.
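To make the setup above concrete, here is a minimal sketch in plain Python. It assumes stride 1 and no padding, and the 5x5 image and 2x2 kernel values are made up for illustration. The helper name `conv2d_valid` is hypothetical, and like most deep learning libraries it computes cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide a 2D kernel over a 2D image (stride 1, no padding); return one feature map."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for i in range(n_rows - k_rows + 1):
        row = []
        for j in range(n_cols - k_cols + 1):
            # Weighted sum of the image patch whose top-left corner is (i, j).
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(k_rows) for b in range(k_cols)))
        out.append(row)
    return out

# Illustrative 5x5 grayscale image with pixel values 0..24.
image = [[r * 5 + c for c in range(5)] for r in range(5)]

# Three illustrative 2x2 kernels.
K1 = [[1, 0], [0, 1]]
K2 = [[1, 1], [1, 1]]
K3 = [[0, 1], [-1, 0]]

# One feature map per kernel, stacked along a leading "channel" axis.
M1 = [conv2d_valid(image, K) for K in (K1, K2, K3)]
print(len(M1), len(M1[0]), len(M1[0][0]))  # 3 4 4
```

The three kernels are independent of each other, so the order in which they are applied does not matter. Each feature map has (N - k + 1) rows and columns for an NxN image and a kxk kernel under these assumptions.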
Assume there is also a 2nd convolutional layer with 3 more and different kernels K4, K5, K6.
How are the 3 kernels K4, K5, K6 in the 2nd conv layer applied to the 3 feature maps FP1, FP2, FP3 generated in the 1st conv layer?
Is K4 convolved with FP1, FP2, FP3, then K5 is convolved with FP1, FP2, FP3, and finally K6 is convolved with FP1, FP2, FP3? If so, we end up with a volume containing 9 new feature maps. Is that correct?


At the very end, the 9 new feature maps from the last convolutional layer are all flattened into a vector (1D array) with as many elements as there are nodes in the input layer of the artificial neural network: starting with the first feature map, its rows are concatenated one by one in a straight line, and this process continues for the other 8 feature maps. What we get is a very long 1D vector that is then fed into the input layer of the ANN...
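The flattening step described here can be sketched in a few lines of plain Python. The function name `flatten` and the tiny 2-map volume are illustrative assumptions, chosen only to keep the output short. The same row-by-row, map-by-map order applies to any number of feature maps.

```python
def flatten(volume):
    """Concatenate the rows of each feature map, one map after another."""
    return [value
            for feature_map in volume
            for row in feature_map
            for value in row]

# Illustrative volume: 2 feature maps, each with 2 rows and 3 columns.
volume = [
    [[1, 2, 3],
     [4, 5, 6]],
    [[7, 8, 9],
     [10, 11, 12]],
]

vector = flatten(volume)
print(vector)       # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(len(vector))  # 12 = 2 maps * 2 rows * 3 columns
```

The vector length is always (number of maps) x (rows) x (columns), which is the number of input nodes the fully connected part of the network needs.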

Thanks!

Answers and Replies

  • #2
Not quite. In the standard CNN formulation (the one implemented by common deep learning frameworks), each kernel in the 2nd conv layer is itself a 3D array whose depth equals the number of input feature maps. So K4, K5, and K6 each have 3 slices: one for FP1, one for FP2, and one for FP3. For K4, each slice is convolved with its matching feature map, and the 3 results are summed element-wise (usually a bias is added and an activation applied) to produce a single output feature map. The same is done for K5 and K6. There are indeed 9 two-dimensional convolutions in total, but they are summed in groups of three, so the 2nd conv layer outputs a volume of 3 feature maps, not 9.

At the end, the feature maps from the last convolutional layer are flattened into a vector (1D array) with as many elements as there are nodes in the input layer of the artificial neural network. Starting with the first feature map, its rows are concatenated one after another, and the process continues for the remaining feature maps. The result is a long 1D vector that is then fed into the input layer of the ANN.
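For reference, here is a minimal plain-Python sketch of how a conv layer is usually applied to a multi-channel input in the standard formulation: each filter has one 2D slice per input feature map, and the per-channel results are summed into a single output map. The helper names (`conv2d_valid`, `conv_layer`, `const`) and all numeric values are hypothetical. Stride 1 and no padding are assumed, and bias and activation are omitted.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation of one feature map with one kernel slice (stride 1, no padding)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k_rows) for b in range(k_cols))
             for j in range(len(image[0]) - k_cols + 1)]
            for i in range(len(image) - k_rows + 1)]

def conv_layer(volume, filters):
    """Apply each multi-channel filter to the whole input volume.

    volume:  list of C feature maps (2D lists).
    filters: list of filters, each a list of C kernel slices (2D lists).
    Returns one output feature map per filter.
    """
    outputs = []
    for filt in filters:
        # One 2D convolution per input channel...
        per_channel = [conv2d_valid(fm, sl) for fm, sl in zip(volume, filt)]
        rows, cols = len(per_channel[0]), len(per_channel[0][0])
        # ...then an element-wise sum across channels gives ONE output map.
        outputs.append([[sum(m[i][j] for m in per_channel) for j in range(cols)]
                        for i in range(rows)])
    return outputs

def const(value, n):
    """n x n map filled with a constant (illustrative data only)."""
    return [[value] * n for _ in range(n)]

# Illustrative input volume M1: FP1, FP2, FP3 are 5x5 maps of 1s, 2s, and 3s.
M1 = [const(1, 5), const(2, 5), const(3, 5)]

# Each second-layer filter has 3 slices of size 2x2, one per input feature map.
K4 = [const(1, 2), const(1, 2), const(1, 2)]
K5 = [const(1, 2), const(0, 2), const(0, 2)]
K6 = [[[1, 0], [0, 0]]] * 3

M2 = conv_layer(M1, [K4, K5, K6])
print(len(M2), len(M2[0]), len(M2[0][0]))  # 3 4 4
print(M2[0][0][0], M2[1][0][0], M2[2][0][0])  # 24 4 6
```

Nine 2D convolutions are computed here (3 filters x 3 channels), but the output volume has 3 feature maps, one per filter.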
 
