In fact, it is possible to slide the submatrices by only 23 positions before touching the borders of the images. In Keras, the number of pixels along one edge of the kernel, or submatrix, is the kernel size, and the stride length is the number of pixels by which the kernel is moved at each step in the convolution.

Let's define the feature map from one layer to another. Of course, we can have multiple feature maps that learn independently from each hidden layer. For example, we can start with 28×28 input neurons for processing MNIST images, and then compute k feature maps of size 24×24 neurons each (again using 5×5 kernels and a stride of 1) in the next hidden layer.

Shared weights and bias

Let's suppose that we want to move away from the pixel representation in a raw image, by gaining the ability to detect the same feature independently of the location where it is placed in the input image. A simple approach is to use the same set of weights and biases for all the neurons in the hidden layers. In this way, each layer will learn a set of position-independent latent features derived from the image, bearing in mind that a layer consists of a set of kernels in parallel, and each kernel only learns one feature.

A mathematical example

One simple way to understand convolution is to think about a sliding window function applied to a matrix. In the following example, given the input matrix I and the kernel K, we get the convolved output. The 3×3 kernel K (sometimes called the filter or feature detector) is multiplied elementwise with the input matrix to get one cell in the output matrix. All the other cells are obtained by sliding the window over I:

Figure 2: Input matrix I and kernel K producing a convolved output
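To make the sliding-window computation concrete, here is a minimal NumPy sketch of a valid (no padding) convolution with a stride of 1; the values of I and K are made-up examples, not the ones shown in Figure 2:

import numpy as np

# Illustrative 5x5 input and 3x3 kernel (example values only).
I = np.array([[1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 1, 1, 0],
              [0, 1, 1, 0, 0]])
K = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 0, 1]])

# Slide the 3x3 window over I, multiplying elementwise and summing
# to fill one cell of the output at each position.
out_h = I.shape[0] - K.shape[0] + 1
out_w = I.shape[1] - K.shape[1] + 1
output = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        output[i, j] = np.sum(I[i:i+3, j:j+3] * K)

print(output)  # a 3x3 convolved output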

In this example we decided to stop the sliding window as soon as we touch the borders of I (so the output is 3×3). Alternatively, we could have chosen to pad the input with zeros (so that the output would have been 5×5). This decision relates to the padding choice adopted. Note that the kernel depth is equal to the input depth (number of channels).

Another choice is how far along we slide our sliding window at each step. This is called the stride. A larger stride generates fewer applications of the kernel and a smaller output size, while a smaller stride generates a larger output and retains more information.

The size of the filter, the stride, and the type of padding are hyperparameters that can be fine-tuned when designing and validating the network.

ConvNets in TensorFlow 2.x

In TensorFlow 2.x, if we want to add a convolutional layer with 32 parallel features and a filter size of 3×3, we write:

import tensorflow as tf
from tensorflow.keras import datasets, layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(28, 28, 1)))

This means that we are applying a 3×3 convolution on 28×28 images with one input channel (or input filter), resulting in 32 output channels (or output filters). An example of convolution is provided in Figure 3:

Figure 3: An example of convolution
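To see how the padding and stride choices discussed above affect the output shape of such a layer, here is a minimal sketch applied to a dummy batch; it assumes the Keras defaults of a 1×1 stride and 'valid' (no) padding when these arguments are not set:

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 28, 28, 1))  # a dummy batch with one 28x28, 1-channel image

# Default stride of 1 and 'valid' (no) padding: 28 - 3 + 1 = 26 positions per edge.
print(layers.Conv2D(32, (3, 3))(x).shape)                  # (1, 26, 26, 32)

# Zero padding ('same') keeps the 28x28 spatial size.
print(layers.Conv2D(32, (3, 3), padding='same')(x).shape)  # (1, 28, 28, 32)

# A larger stride applies the kernel at fewer positions.
print(layers.Conv2D(32, (3, 3), strides=2)(x).shape)       # (1, 13, 13, 32)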

