Deep Learning with PyTorch Step-by-Step: A Beginner's Guide - Daniel Voigt Godoy
OK, now we can clearly see a difference: the decision boundary in the original feature space has a corner, a direct consequence of the ReLU's own corner when its input is zero. On the right, we can also verify that the range of the activated feature space has only positive values, as expected.

Next, let's try the Parametric ReLU (PReLU).

Figure B.9 - Activated feature space (PReLU)

This is even more different! Given that the PReLU learns a slope for the negative values, effectively bending the feature space instead of simply chopping off parts of it like the plain ReLU does, the result looks like the feature space was folded in two different places (see the short sketch below for the chopping vs. bending contrast in code). I don't know about you, but I find this really cool!

So far, all models were trained for 160 epochs, which was about enough training for them to converge to a solution that completely separated both parabolas.

This seems like quite a lot of epochs to solve a rather simple problem, right? But keep in mind what we discussed in Chapter 3: increasing dimensionality makes it easier to separate the classes. So we're actually imposing a severe restriction on these models by keeping them two-dimensional (two units in the hidden layer) and performing only one transformation (only one hidden layer).

Let's cut our models some slack and give them more power…
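Before we do, here is a minimal sketch of the chopping vs. bending contrast described above (my illustration, not one of the book's listings); the tensor name dummy_z and PReLU's default initial slope of 0.25 are assumptions for the example:

import torch
import torch.nn as nn

dummy_z = torch.tensor([-3., 0., 3.])

# ReLU chops off the negative part entirely
nn.ReLU()(dummy_z)   # tensor([0., 0., 3.])

# PReLU starts with a learnable slope of 0.25 for negative inputs,
# so negatives are scaled down instead of zeroed out
nn.PReLU()(dummy_z)  # tensor([-0.7500, 0.0000, 3.0000], grad_fn=...)

Since the negative slope is a learnable parameter, training is free to bend the feature space by a different amount at each fold, which is exactly what Figure B.9 shows.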
More Layers, More Boundaries
One way to give a model more power is to make it deeper. We can make it deeper,
while keeping it strictly two-dimensional, by adding another hidden layer with
two units, as shown in the diagram below:
Figure B.10 - Deeper model
A sequence of one or more hidden layers, all with the same size as
the input layer, as in the figure above (up to "Activation #1"), is a
typical architecture used to model the hidden state in recurrent
neural networks (RNNs). We’ll get back to it in a later chapter.
And it looks like this in code (we’re using a hyperbolic tangent as an activation
function because it looks good when visualizing a sequence of transformations):
import torch
import torch.nn as nn

fs_model = nn.Sequential()
# First hidden layer + activation: 2D inputs to 2D activated features
fs_model.add_module('hidden0', nn.Linear(2, 2))
fs_model.add_module('activation0', nn.Tanh())
# Second hidden layer + activation: one more 2D transformation
fs_model.add_module('hidden1', nn.Linear(2, 2))
fs_model.add_module('activation1', nn.Tanh())
# Output layer: 2D features to a single logit, squashed into a probability
fs_model.add_module('output', nn.Linear(2, 1))
fs_model.add_module('sigmoid', nn.Sigmoid())
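As a quick sanity check (my addition, not one of the book's listings), we can push a random mini-batch through the model and confirm the shapes; the batch size of 16 is an arbitrary choice:

dummy_points = torch.randn(16, 2)     # mini-batch of 16 two-dimensional points
predictions = fs_model(dummy_points)  # forward pass through all six modules
print(predictions.shape)              # torch.Size([16, 1])
# every value lies strictly between 0 and 1, thanks to the final sigmoid
print(predictions.min() > 0, predictions.max() < 1)

Each Linear + Tanh pair is one affine transformation followed by a nonlinearity, so this model folds the feature space twice before the output layer draws its decision boundary.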