Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


Figure 7.4 - 1x1 convolution

The input is an RGB image, and there are two filters; each filter has three 1x1 kernels, one for each channel of the input. What are these filters actually doing? Let’s check it out!

Figure 7.5 - 1x1 convolution

Maybe it is even more clear if it is presented as a formula:

Equation 7.1 - Filter arithmetic

A filter using a 1x1 convolution corresponds to a weighted average of the input channels.

In other words, a 1x1 convolution is a linear combination of the input channels, computed pixel by pixel.

There is another way to get a linear combination of the inputs: a linear layer, also

referred to as a fully connected layer. Performing a 1x1 convolution is akin to applying a linear layer to each individual pixel over its channels.

This is the reason why a 1x1 convolution is said to be equivalent to a fully connected (linear) layer.
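
The equivalence can be checked numerically. The snippet below is a minimal sketch, not code from the book: the random 4x5 "image", the seed, and the variable names are all assumptions made for illustration. It copies the parameters of a 1x1 convolution with two filters into a linear layer and applies that layer to the channels of every pixel.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)

# Dummy mini-batch (assumption): one 3-channel image of 4x5 pixels, (N, C, H, W)
image = torch.randn(1, 3, 4, 5)

# Two filters, each with three 1x1 kernels (one per input channel)
conv = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=1)

# A linear layer that takes the three channel values of a single pixel
linear = nn.Linear(in_features=3, out_features=2)

# Reuse the convolution's parameters: weights go from (2, 3, 1, 1) to (2, 3)
with torch.no_grad():
    linear.weight.copy_(conv.weight.view(2, 3))
    linear.bias.copy_(conv.bias)

conv_out = conv(image)  # shape (1, 2, 4, 5)

# Move channels last so the linear layer acts on each pixel's channels,
# then move them back to the (N, C, H, W) layout
pixels = image.permute(0, 2, 3, 1)               # shape (1, 4, 5, 3)
linear_out = linear(pixels).permute(0, 3, 1, 2)  # shape (1, 2, 4, 5)

# Both routes should agree up to floating-point error
same = torch.allclose(conv_out, linear_out, atol=1e-6)
print(same)
```

The only difference between the two routes is the layout: the convolution expects channels first, while the linear layer operates on the last dimension, hence the two calls to `permute`.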

In the example above, each of the two filters produces a different linear combination of the RGB channels. Does this ring any bells? In Chapter 6, we saw that grayscale images can be computed using a linear combination of the red, green, and blue channels of colored images. So, we can convert an image to grayscale using a 1x1 convolution!

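import torch
import torch.nn.functional as F
from PIL import Image
from torchvision.transforms import ToTensor, ToPILImage

# Keep the first three (RGB) channels and add a batch dimension
scissors = Image.open('rps/scissors/scissors01-001.png')
image = ToTensor()(scissors)[:3, :, :].view(1, 3, 300, 300)
# One filter with three 1x1 kernels: one weight per channel
weights = torch.tensor([0.2126, 0.7152, 0.0722]).view(1, 3, 1, 1)
convolved = F.conv2d(input=image, weight=weights)
converted = ToPILImage()(convolved[0])
grayscale = scissors.convert('L')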

Figure 7.6 - Convolution vs conversion

See? They are the same … or are they? If you have a really sharp eye, maybe you are able to notice a subtle difference between the two shades of gray. It doesn’t have anything to do with the use of convolutions, though: it turns out that PIL uses slightly different weights for converting RGB into grayscale.
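
To get a feel for the size of that difference, the sketch below applies both sets of weights as 1x1 convolutions. It is an illustration under stated assumptions rather than the book's code: the input is a random 8x8 tensor instead of the scissors image, and the second set of weights (0.299, 0.587, 0.114) is the ITU-R 601-2 luma transform that Pillow's documentation gives for converting to mode 'L'. Pillow also rounds its result to 8-bit integers, which this float-only comparison ignores.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(13)

# Synthetic RGB mini-batch (assumption) with values in [0, 1], (N, C, H, W)
image = torch.rand(1, 3, 8, 8)

# Weights used in the text above
w_text = torch.tensor([0.2126, 0.7152, 0.0722]).view(1, 3, 1, 1)
# Weights documented by Pillow for convert('L') (ITU-R 601-2 luma transform)
w_pil = torch.tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)

gray_text = F.conv2d(image, w_text)  # shape (1, 1, 8, 8)
gray_pil = F.conv2d(image, w_pil)    # shape (1, 1, 8, 8)

# The 1x1 convolution is just a per-pixel weighted sum of the channels
manual = (image * w_text).sum(dim=1, keepdim=True)
matches_manual = torch.allclose(gray_text, manual, atol=1e-6)

# Largest per-pixel gap between the two grayscale versions
max_gap = (gray_text - gray_pil).abs().max().item()
print(matches_manual, f"{max_gap:.4f}")
```

Both sets of weights add up to one, so a pure white or pure black pixel looks identical either way; the gap only shows up on colored pixels, which is why it is so easy to miss.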

526 | Chapter 7: Transfer Learning
