A convolutional neural network (CNN or ConvNet) is a type of neural network used in artificial intelligence that is commonly applied to analyzing images.
CNNs require relatively little pre-processing compared to other image classification algorithms. They have applications in image and video recognition, recommender systems, image classification, and natural language processing.
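$f$: filter size.
$g$: activation function.
The convolution operation is the most representative operation of a CNN. At each position, it takes the Hadamard (element-wise) product of the filter with the part of the input it covers and sums the result into a single output value.
We can rewrite our operation using our notation:
$$O_{i,j} = \sum_{a=1}^{f} \sum_{b=1}^{f} \left( X^{(i,j)} \odot F \right)_{a,b}$$
where $O$ is the output matrix and $X^{(i,j)}$ is the subset of the input matrix whose first element is $x_{i,j}$ and which has the same size as $F$, the filter.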
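The operation described above can be sketched in plain Python. This is only an illustration under some assumptions that the text does not state: the filter is square, it moves one position at a time (stride 1), and the input is not padded. The function name `conv2d` and the sample values are hypothetical.

```python
def conv2d(x, filt):
    """Slide a square filter over a 2-D input (assumed: stride 1, no padding).

    Each output value is the sum of the element-wise (Hadamard) product
    between the filter and the input window whose first element is x[i][j].
    """
    f = len(filt)                      # filter size: the filter is f x f
    out_rows = len(x) - f + 1
    out_cols = len(x[0]) - f + 1
    out = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Multiply the window and the filter element by element, then sum.
            total = sum(
                x[i + a][j + b] * filt[a][b]
                for a in range(f)
                for b in range(f)
            )
            row.append(total)
        out.append(row)
    return out


# Hypothetical 4x4 input and 2x2 filter, chosen only for illustration.
image = [
    [1, 2, 0, 1],
    [3, 1, 1, 0],
    [0, 2, 4, 1],
    [1, 0, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
result = conv2d(image, kernel)
print(result)  # [[0, 1, 0], [1, -3, 0], [0, 0, 1]]
```

With a 4x4 input and a 2x2 filter, the output is 3x3, because the filter fits in three positions along each axis.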
The pooling layer combines the outputs of a cluster of neurons into a single neuron. It is used to reduce the size of the image.
A pooling layer needs two hyperparameters: stride and filter size.
There are two kinds of pooling layers: max pooling and average pooling.
- Max pooling
Outputs the highest value of the cluster.
- Average pooling
Outputs the average of the cluster.
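Both kinds of pooling can be sketched with one small function. This is a minimal example, assuming a square pooling window and no padding; the name `pool2d`, its `mode` argument, and the sample values are hypothetical.

```python
def pool2d(x, size, stride, mode="max"):
    """Reduce a 2-D input by summarizing each size x size cluster.

    `size` is the filter size and `stride` is the step between clusters.
    """
    out = []
    for i in range(0, len(x) - size + 1, stride):
        row = []
        for j in range(0, len(x[0]) - size + 1, stride):
            # Collect the values of the cluster whose first element is x[i][j].
            cluster = [x[i + a][j + b] for a in range(size) for b in range(size)]
            if mode == "max":
                row.append(max(cluster))                 # highest value
            elif mode == "average":
                row.append(sum(cluster) / len(cluster))  # mean value
            else:
                raise ValueError("mode must be 'max' or 'average'")
        out.append(row)
    return out


# Hypothetical 4x4 input, chosen only for illustration.
feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 0, 1],
    [3, 1, 2, 5],
]
max_pooled = pool2d(feature_map, size=2, stride=2, mode="max")
avg_pooled = pool2d(feature_map, size=2, stride=2, mode="average")
print(max_pooled)  # [[7, 8], [4, 5]]
print(avg_pooled)  # [[4.0, 5.0], [2.5, 2.0]]
```

With a filter size of 2 and a stride of 2, the clusters do not overlap and the 4x4 input shrinks to 2x2.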
ReLU is the abbreviation of rectified linear unit. It is defined as $\mathrm{ReLU}(x) = \max(0, x)$.
Consequently, it replaces negative values with zero and leaves positive values unchanged.
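As a small illustration, ReLU can be applied to a list of values one element at a time. The function name `relu` and the sample values are hypothetical.

```python
def relu(x):
    """Rectified linear unit: negative inputs become 0, others pass through."""
    return max(0.0, x)


# Hypothetical sample values, chosen only for illustration.
values = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = [relu(v) for v in values]
print(activated)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```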
Fully connected layers connect every neuron in one layer to every neuron in another layer. It is in principle the same as the traditional multilayer perceptron neural network (MLP).
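A fully connected layer can be sketched as follows. The example assumes the usual MLP formulation, which the text does not spell out: each output neuron computes a weighted sum of all inputs plus a bias. The name `fully_connected` and all the numbers are hypothetical.

```python
def fully_connected(inputs, weights, biases):
    """Compute one fully connected layer (assumed: weighted sum plus bias).

    weights[k][n] is the weight from input neuron n to output neuron k,
    so every input contributes to every output.
    """
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical layer with 3 inputs and 2 outputs.
inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 0.0],
    [1.0, 1.0, 1.0],
]
biases = [0.5, -1.0]
outputs = fully_connected(inputs, weights, biases)
print(outputs)  # [-1.0, 5.0]
```

In practice an activation function would usually be applied to these outputs; it is left out here to keep the sketch short.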