Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)
Moreover, notice that if we were to take another step of two pixels, the gray region
would be placed partially outside the underlying image. This is a big no-no, so there
are only two valid operations while moving horizontally. The same thing will
eventually happen when we move vertically: the first stride of two pixels down is
fine, but the second will, once again, be a failed operation. The resulting image,
after the only four valid operations, looks like this.

Figure 5.10 - Shrinking even more!

The identity kernel may be boring, but it is definitely useful for highlighting the
inner workings of the convolutions. It is crystal clear in the figure above where the
pixel values in the resulting image come from. Also, notice that using a larger
stride made the shape of the resulting image even smaller.

The larger the stride, the smaller the resulting image.

Once again, it makes sense: if we are skipping pixels in the input image, there are
fewer regions of interest to apply the filter to. We can extend our previous formula
to include the stride size (s):

Equation 5.3 - Shape after a convolution with stride

(hi, wi) * filter (f, f) = ((hi - f) / s + 1, (wi - f) / s + 1)

As we've seen before, the stride is only an argument of the convolution, so let's use
PyTorch's functional convolution to double-check the results:
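The shape formula above can be verified directly. This is a minimal sketch, not the book's own code: the 5x5 image below is a hypothetical stand-in for the chapter's example, and the kernel is a 3x3 identity kernel (so f = 3).

```python
import torch
import torch.nn.functional as F

# Hypothetical 5x5 input image (h_i = w_i = 5)
image = torch.arange(25.0).reshape(1, 1, 5, 5)

# 3x3 identity kernel: a single 1 at the center, zeros elsewhere (f = 3)
kernel_identity = torch.zeros(1, 1, 3, 3)
kernel_identity[0, 0, 1, 1] = 1.0

# Compare the actual output size against Equation 5.3: (h_i - f) / s + 1
for s in (1, 2):
    out = F.conv2d(image, kernel_identity, stride=s)
    expected = (5 - 3) // s + 1
    print(f"stride={s}: output is {out.shape[2]}x{out.shape[3]}, "
          f"formula says {expected}x{expected}")
```

With stride 1 the output is 3x3, and with stride 2 it shrinks to 2x2, matching the formula in both cases.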
convolved_stride2 = F.conv2d(image, kernel_identity, stride=2)
convolved_stride2

Output

tensor([[[[9., 0.],
          [7., 6.]]]])

Cool, it works!

So far, the operations we have performed have been shrinking the images. What
about restoring them to their original glory, I mean, size?

Padding

Padding means stuffing. We need to stuff the original image so it can sustain the
"attack" on its size.

"How do I stuff an image?"

Glad you asked! Simply add zeros around it. An image is worth a thousand words in
this case.

Figure 5.11 - Zero-padded image

See what I mean? By adding columns and rows of zeros around it, we expand the
input image such that the gray region starts centered in the actual top left corner
of the input image. This simple trick can be used to preserve the original size of the
image.
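The stuffing described above can be sketched in code. This is an illustrative example, not the chapter's own listing: the 5x5 image and 3x3 identity kernel below are hypothetical stand-ins.

```python
import torch
import torch.nn.functional as F

# Hypothetical 5x5 input image and 3x3 identity kernel
image = torch.randn(1, 1, 5, 5)
kernel_identity = torch.zeros(1, 1, 3, 3)
kernel_identity[0, 0, 1, 1] = 1.0

# Stuff the image manually: one column/row of zeros on each side
# (pad order is left, right, top, bottom)
padded = F.pad(image, pad=(1, 1, 1, 1), mode='constant', value=0.0)
print(padded.shape)  # torch.Size([1, 1, 7, 7])

# Or simply let the convolution do the padding for us
convolved = F.conv2d(image, kernel_identity, stride=1, padding=1)
print(convolved.shape)  # torch.Size([1, 1, 5, 5]), same as the input
```

Since the kernel is the identity, the padded convolution returns the input image unchanged, which makes it easy to see that the original size was preserved.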