Daniel Voigt Godoy, Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)


A pooling kernel of two-by-two results in an image whose dimensions are half of the original. A pooling kernel of three-by-three makes the resulting image one-third the size of the original, and so on. Moreover, only full chunks count: if we try a kernel of four-by-four in our six-by-six image, only one chunk fits, and the resulting image would have a single pixel.

In PyTorch, as usual, we have both forms: F.max_pool2d() and nn.MaxPool2d. Let's use the functional form to replicate the max pooling in the figure above:

pooled = F.max_pool2d(conv_padded, kernel_size=2)
pooled

Output

tensor([[[[22., 23., 11.],
          [24.,  7.,  1.],
          [13., 13., 13.]]]])

And then let's use the module version to illustrate the large four-by-four pooling:

maxpool4 = nn.MaxPool2d(kernel_size=4)
pooled4 = maxpool4(conv_padded)
pooled4

Output

tensor([[[[24.]]]])

A single pixel, as promised!

"Can I perform some other operation?"

Sure, besides max pooling, average pooling is also fairly common. As the name suggests, it outputs the average pixel value of each chunk. In PyTorch, we have F.avg_pool2d() and nn.AvgPool2d.

"Can I use a stride of a different size?"
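As a quick sketch of the two operations side by side (using a small hypothetical four-by-four image instead of the chapter's conv_padded tensor), average pooling behaves exactly like max pooling, only with a different reduction:

```python
import torch
import torch.nn.functional as F

# Hypothetical single-channel 4x4 image: shape (batch, channels, height, width)
img = torch.tensor([[[[ 1.,  2.,  3.,  4.],
                      [ 5.,  6.,  7.,  8.],
                      [ 9., 10., 11., 12.],
                      [13., 14., 15., 16.]]]])

max_out = F.max_pool2d(img, kernel_size=2)  # maximum of each 2x2 chunk
avg_out = F.avg_pool2d(img, kernel_size=2)  # average of each 2x2 chunk

print(max_out)  # tensor([[[[ 6.,  8.], [14., 16.]]]])
print(avg_out)  # tensor([[[[ 3.5,  5.5], [11.5, 13.5]]]])
```

Either way, the four-by-four image collapses into a two-by-two one; only the value kept from each chunk changes.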

Of course, you can! In this case, there will be an overlap between regions instead of a clean split into chunks. So, it looks like a regular kernel of a convolution, but the operation is already defined (max or average, for instance). Let's go through a quick example:

F.max_pool2d(conv_padded, kernel_size=3, stride=1)

Output

tensor([[[[24., 24., 23., 23.],
          [24., 24., 23., 23.],
          [24., 24., 13., 13.],
          [13., 13., 13., 13.]]]])

The max pooling kernel, sized three-by-three, will move over the image (just like the convolutional kernel) and compute the maximum value of each region it goes over. The resulting shape follows the formula in Equation 5.4.

Flattening

We've already seen this one! It simply flattens a tensor, preserving the first dimension such that we keep the number of data points while collapsing all other dimensions. It has a module version, nn.Flatten:

flattened = nn.Flatten()(pooled)
flattened

Output

tensor([[22., 23., 11., 24., 7., 1., 13., 13., 13.]])

It has no functional version, but there is no need for one since we can accomplish the same thing using view():

pooled.view(1, -1)
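Both claims above, the output shape given by Equation 5.4 and the equivalence between nn.Flatten and view(), are easy to check. A minimal sketch, assuming a hypothetical 1x1x6x6 input in place of conv_padded:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical 1x1x6x6 image standing in for conv_padded
img = torch.arange(36, dtype=torch.float).view(1, 1, 6, 6)

# Overlapping pooling: kernel of three, stride of one.
# With no padding, the shape formula gives (6 - 3) / 1 + 1 = 4 per dimension.
out = F.max_pool2d(img, kernel_size=3, stride=1)
print(out.shape)  # torch.Size([1, 1, 4, 4])

# nn.Flatten keeps dimension 0 (the data points) and collapses the rest;
# view() with -1 accomplishes the same thing.
pooled = F.max_pool2d(img, kernel_size=2)  # shape (1, 1, 3, 3)
print(torch.equal(nn.Flatten()(pooled), pooled.view(1, -1)))  # True
```

The same shape arithmetic applies to any kernel size and stride, which is why pooling with stride one shrinks the image far less than pooling with the default stride (equal to the kernel size).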

