Overview
- CNNs are particularly good at detecting features in an image, such as edges, textures and shapes.
- They build up a hierarchical representation of what they see, an approach inspired by the way our own visual cortex works.
- A convolutional layer differs from a fully connected layer in that each neuron is connected only to a small, local group of neurons in the previous layer.
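
To make the difference in connectivity concrete, the short sketch below counts the weights feeding a single neuron in each kind of layer. The 28x28 input size and the 3x3 patch size are assumptions picked only for illustration; they do not come from these notes.

```python
# Hypothetical sizes, chosen only for illustration.
height, width = 28, 28   # a small grayscale input image
kernel_size = 3          # each convolutional neuron sees a 3x3 patch

# A fully connected neuron has one weight for every input pixel.
fc_weights_per_neuron = height * width

# A convolutional neuron has one weight for every pixel in its local patch.
conv_weights_per_neuron = kernel_size * kernel_size

print(fc_weights_per_neuron)    # 784
print(conv_weights_per_neuron)  # 9
```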
Mathematical Background
- The kernel (or filter) is just a small matrix of weights, for example a 3x3 grid of numbers.
- The input image is likewise represented as a matrix, but of pixel values rather than weights, and with larger dimensions than the kernel.
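
As a rough illustration of how a kernel is applied to an image matrix, here is a minimal NumPy sketch. The function name `convolve2d_valid`, the 5x5 image and the vertical-edge kernel are all made up for this example. Like most deep learning libraries, it slides the kernel without flipping it (strictly speaking, a cross-correlation), and it assumes no padding and a stride of 1.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch under the kernel element-wise and sum the result.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Illustrative 5x5 "image": dark (0) on the left, bright (1) on the right.
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# Illustrative 3x3 kernel that responds to dark-to-bright vertical edges.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

feature_map = convolve2d_valid(image, kernel)
print(feature_map)  # 3x3 map; every row holds the values 3, 3, 0
```

The output is largest where the kernel straddles the edge and zero over the uniformly bright region, which is the sense in which a kernel "finds" a feature.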
Process of Operation
- Convolutional layer
- Nonlinearity layer
- Pooling layer
Networks that contain only the above three layers are known as 'Fully Convolutional Networks'.
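
The second and third stages can be sketched in a few lines of NumPy. This example assumes ReLU as the nonlinearity and non-overlapping 2x2 max pooling; both are common choices, but the notes do not specify either. The input `feature_map` is a made-up stand-in for the output of a convolutional layer.

```python
import numpy as np

def relu(x):
    """Nonlinearity: replace negative values with zero."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping size x size max pooling; rows/columns that do not fit are dropped."""
    h_out, w_out = x.shape[0] // size, x.shape[1] // size
    x = x[:h_out * size, :w_out * size]
    # Split into (h_out, size, w_out, size) blocks and take the max of each block.
    return x.reshape(h_out, size, w_out, size).max(axis=(1, 3))

# Hypothetical 4x4 feature map, as might come out of a convolutional layer.
feature_map = np.array([
    [ 1.0, -2.0,  3.0,  0.0],
    [-1.0,  5.0, -3.0,  2.0],
    [ 0.0, -4.0,  1.0, -1.0],
    [ 2.0,  1.0, -2.0, -6.0],
])

activated = relu(feature_map)
pooled = max_pool2d(activated, size=2)
print(pooled)  # 2x2 result with values 5, 3 (top row) and 2, 1 (bottom row)
```

Pooling halves each spatial dimension here while keeping the strongest response in every 2x2 block.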

- Once trained, the network classifies a new image by passing it through its learned layers and reporting the class whose learned features the image matches most closely.
Training
- CNNs are given a large, labelled data set containing many images of the type of object(s) they aim to recognise.
- The outputs of the network are compared to the expected outputs, which is possible because the training data is labelled.
- The cost function used in this comparison depends on the task; cross-entropy, for example, is a common choice for classification.
- Once the loss is measured, the backpropagation algorithm calculates the partial derivatives of the loss with respect to the network's parameters. A regular gradient descent method then uses these derivatives to update the parameters.
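
The training loop described above can be sketched as follows. To keep the example short, a single linear layer with a softmax output stands in for a full CNN, the data is randomly generated rather than real images, and the gradient is written out by hand for this one layer (it is what backpropagation would compute here). The cross-entropy loss, learning rate, step count and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labelled data set (hypothetical): 100 samples, 4 features, 3 classes.
num_samples, num_features, num_classes = 100, 4, 3
X = rng.normal(size=(num_samples, num_features))
true_W = rng.normal(size=(num_features, num_classes))
labels = np.argmax(X @ true_W, axis=1)   # the "expected" outputs
Y = np.eye(num_classes)[labels]          # one-hot encoding of the labels

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot):
    return -np.mean(np.sum(one_hot * np.log(probs + 1e-12), axis=1))

W = np.zeros((num_features, num_classes))  # parameters to be learned
learning_rate = 0.5                         # assumed value

losses = []
for step in range(200):
    probs = softmax(X @ W)                      # forward pass: network outputs
    losses.append(cross_entropy(probs, Y))      # compare outputs with labels
    grad_W = X.T @ (probs - Y) / num_samples    # partial derivatives of loss w.r.t. W
    W -= learning_rate * grad_W                 # gradient descent update

# The first loss is ln(3) (about 1.0986) because W starts at zero; the last is lower.
print(losses[0], losses[-1])
```

In a real CNN the same loop applies, except that the forward pass runs through the convolutional, nonlinearity and pooling layers, and backpropagation propagates the derivatives back through each of them in turn.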