09.05.2023 Views

pdfcoffee

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Convolutional Neural Networks

In other words, a CNN has multiple filters stacked together that learn to recognize

specific visual features independently from the location in the image itself. Those

visual features are simple in the initial layers of the network and become more

and more sophisticated deeper in the network. Training of a CNN requires the

identification of the right values for each filter so that an input, when passed through

multiple layers, activates certain neurons of the last layer so that it will predict the

correct values.

An example of DCNN ‒ LeNet

Yann LeCun, who very recently won the Turing Award, proposed [1] a family

of convnets named LeNet trained for recognizing MNIST handwritten characters

with robustness to simple geometric transformations and distortion. The core idea

of LeNets is to have lower layers alternating convolution operations with maxpooling

operations. The convolution operations are based on carefully chosen local

receptive fields with shared weights for multiple feature maps. Then, higher levels

are fully connected based on a traditional MLP with hidden layers and softmax as

output layer.

LeNet code in TensorFlow 2.0

To define a LeNet in code we use a convolutional 2D module:

layers.Convolution2D(20, (5, 5), activation='relu', input_shape=input_

shape))

Note that tf.keras.layers.Conv2D is an alias of tf.keras.

layers.Convolution2D so the two can be used in an

interchangeable way. See https://www.tensorflow.org/

api_docs/python/tf/keras/layers/Conv2D.

Where the first parameter is the number of output filters in the convolution, and

the next tuple is the extension of each filter. An interesting optional parameter is

padding. There are two options: padding='valid' means that the convolution is

only computed where the input and the filter fully overlap and therefore the output

is smaller than the input, while padding='same' means that we have an output

which is the same size as the input, for which the area around the input is padded

with zeros.

In addition, we use a MaxPooling2D module:

layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2))

[ 114 ]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!