Equation 4.2 - Equivalence of deep and shallow models

The first row below the line shows the sequence of matrices. The bottom row shows the result of the matrix multiplication. This result is exactly the same operation shown in the "Notation" subsection of the shallow model; that is, the logistic regression.

In a nutshell, a model with any number of hidden layers has an equivalent model with no hidden layers. We're not including the bias here, because it would make it much harder to illustrate this point.
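In symbols, using the layer names from the code below (a sketch of what the equation expresses; the rendered equation is not shown here, so the notation is an assumption):

$$\underbrace{W_{output}}_{1 \times 3} \; \underbrace{W_{hidden1}}_{3 \times 5} \; \underbrace{W_{hidden0}}_{5 \times 25} \; = \; \underbrace{W_{equiv}}_{1 \times 25}$$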
Show Me the Code!

If equations are not your favorite way of looking at this, let's try using some code. First, we need to get the weights for the layers in our deep-ish model. We can use the weight attribute of each layer, without forgetting to detach() it from the computation graph, so we can freely use them in other operations:

w_nn_hidden0 = model_nn.hidden0.weight.detach()
w_nn_hidden1 = model_nn.hidden1.weight.detach()
w_nn_output = model_nn.output.weight.detach()

w_nn_hidden0.shape, w_nn_hidden1.shape, w_nn_output.shape

Output
(torch.Size([5, 25]), torch.Size([3, 5]), torch.Size([1, 3]))

The shapes should match both our model's definition and the weight matrices in the equations above the line.

We can compute the bottom row (that is, the equivalent model) using matrix multiplication, which happens from right to left, as in the equations:

w_nn_equiv = w_nn_output @ w_nn_hidden1 @ w_nn_hidden0
w_nn_equiv.shape

Output
torch.Size([1, 25])

"What is @ doing in the expression above?"

It is performing a matrix multiplication, exactly like torch.mm() does. We could have written the expression above like this:
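# chained torch.mm() calls (the exact grouping is assumed here)
w_nn_equiv = w_nn_output.mm(w_nn_hidden1).mm(w_nn_hidden0)

Since matrix multiplication is associative, the grouping of the mm() calls does not change the result; w_nn_output.mm(w_nn_hidden1.mm(w_nn_hidden0)) would produce the same matrix.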
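To double-check the equivalence numerically, here is a minimal sketch (not from the book; the dummy input x and the torch.allclose() check are assumptions) that pushes an input through the three weight matrices one at a time and through the single equivalent matrix:

import torch

# Sketch (assumed): compare the layer-by-layer path against the collapsed one.
# Biases and activation functions are left out, matching the text above.
x = torch.randn(25, 1)  # a dummy input with 25 features, as a column vector

deep = w_nn_output @ (w_nn_hidden1 @ (w_nn_hidden0 @ x))  # three layers
shallow = w_nn_equiv @ x                                   # equivalent model

print(torch.allclose(deep, shallow))  # True (up to floating-point error)

Both paths yield the same output for any input, which is exactly why a stack of linear layers without activation functions collapses into a single linear layer.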