Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide


import random
import numpy as np
from PIL import Image
import torch
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset
from torchvision.transforms import Compose, Normalize
from data_generation.image_classification import generate_dataset
from helpers import index_splitter, make_balanced_sampler
from stepbystep.v1 import StepByStep

Convolutions

In Chapter 4, we talked about pixels as features. We considered each pixel as an individual, independent feature, thus losing information while flattening the image. We also talked about weights as pixels and how we could interpret the weights used by a neuron as an image or, more specifically, a filter.

Now, it is time to take that one step further and learn about convolutions. A convolution is "a mathematical operation on two functions (f and g) that produces a third function (f * g) expressing how the shape of one is modified by the other." [90] In image processing, a convolution matrix is also called a kernel or filter. Typical image processing operations, like blurring, sharpening, and edge detection, are accomplished by performing a convolution between a kernel and an image.

Filter / Kernel

Simply put, one defines a filter (or kernel, but we're sticking with "filter" here) and applies this filter to an image (that is, one convolves the image). Usually, the filters are small square matrices. The convolution itself is performed by applying the filter to the image repeatedly, one region at a time. Let's work through a concrete example to make this clearer.
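Before the concrete example, here is a minimal sketch of that sliding-window idea in plain NumPy. The function name conv2d_naive is just an illustrative choice, not something from the book or from PyTorch; it assumes a single-channel 2D image, no padding, and a stride of one, and, like deep learning frameworks in general, it slides the filter without flipping it.

def conv2d_naive(image, kernel):
    # image: 2D array (H, W); kernel: 2D array (kH, kW)
    # no padding, stride of one
    kH, kW = kernel.shape
    out_h = image.shape[0] - kH + 1
    out_w = image.shape[1] - kW + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # the region of the image the filter currently "sees"
            region = image[i:i + kH, j:j + kW]
            # element-wise multiplication followed by a sum
            out[i, j] = (region * kernel).sum()
    return out

Each iteration applies the filter to one region of the image, which is exactly the repeated application described above.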


We're using a single-channel image, and the most boring filter ever, the identity filter.

Figure 5.1 - Identity filter

See the gray region in the top-left corner of the image, which is the same size as the filter? That's the region to which the filter is being applied, and it is called the receptive field, drawing an analogy to the way human vision works.

Moreover, look at the shapes underneath the images: they follow the NCHW shape convention used by PyTorch. There is one image, one channel, six-by-six pixels in size. There is one filter, one channel, three-by-three pixels in size. Finally, the asterisk represents the convolution operation between the two.

Let's create NumPy arrays to follow the operations; after all, everything gets easier to understand in code, right?

single = np.array(
    [[[[5, 0, 8, 7, 8, 1],
       [1, 9, 5, 0, 7, 7],
       [6, 0, 2, 4, 6, 6],
       [9, 7, 6, 6, 8, 4],
       [8, 3, 8, 5, 1, 3],
       [7, 2, 7, 0, 1, 0]]]]
)

single.shape

Output

(1, 1, 6, 6)

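Figure 5.1 shows the identity filter itself; as a sketch, assuming the usual 3x3 identity kernel (a one in the center, zeros everywhere else), we can build it with the same NCHW convention and apply it to the gray top-left region, the receptive field, of the image above:

identity = np.array(
    [[[[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]]]
)
identity.shape  # (1, 1, 3, 3)

# the top-left receptive field: a region the same size as the filter
region = single[:, :, 0:3, 0:3]
# element-wise multiplication followed by a sum
(region * identity).sum()  # 9, the center pixel of that region

Sliding the filter over every such region (as the conv2d_naive sketch above does) would produce the full convolved image. And when it is time to hand this data to PyTorch, the same NCHW array can be converted with torch.as_tensor(single).float(), keeping the same shape.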
