Figure 4.16 - Deep model (for real)

Let's see how it performs now.

Model Configuration

First, we translate the model above to code:

Model Configuration

# Sets learning rate - this is "eta" ~ the "n"-like Greek letter
lr = 0.1

torch.manual_seed(17)
# Now we can create a model
model_relu = nn.Sequential()
model_relu.add_module('flatten', nn.Flatten())
model_relu.add_module('hidden0', nn.Linear(25, 5, bias=False))
model_relu.add_module('activation0', nn.ReLU())
model_relu.add_module('hidden1', nn.Linear(5, 3, bias=False))
model_relu.add_module('activation1', nn.ReLU())
model_relu.add_module('output', nn.Linear(3, 1, bias=False))
model_relu.add_module('sigmoid', nn.Sigmoid())

# Defines an SGD optimizer to update the parameters
# (now retrieved directly from the model)
optimizer_relu = optim.SGD(model_relu.parameters(), lr=lr)

# Defines a binary cross-entropy loss function
binary_loss_fn = nn.BCELoss()
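Since the whole point is to compare this model to the previous, activation-free one, it may help to confirm its size. Here is a minimal sketch (not from the book) that inspects the model we just built:

# Sketch only: inspecting the model's structure and parameter count
print(model_relu)
# Linear(25, 5) + Linear(5, 3) + Linear(3, 1), all without bias:
# 25*5 + 5*3 + 3*1 = 143 trainable parameters
n_params = sum(p.numel() for p in model_relu.parameters() if p.requires_grad)
print(n_params)  # 143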
The chosen activation function is the rectified linear unit (ReLU), one of the most
commonly used activation functions.
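If you'd like to see ReLU in action, here is a quick sketch (the tensor below is just an illustrative example, not taken from this excerpt); it zeroes out negative inputs and passes positive ones through unchanged:

dummy_z = torch.tensor([-3., 0., 3.])
nn.ReLU()(dummy_z)   # tensor([0., 0., 3.])
# the functional form produces the same result
torch.relu(dummy_z)  # tensor([0., 0., 3.])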
We kept the bias out of the picture to make this model directly comparable to the
previous one: the two are identical, except for the activation functions
introduced after each hidden layer.
In real problems, as a general rule, you should keep bias=True.
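For reference, bias=True is already PyTorch's default for nn.Linear, so simply omitting the argument keeps the bias. A quick sketch to illustrate:

# nn.Linear includes a bias term unless you explicitly turn it off
nn.Linear(25, 5).bias.shape               # torch.Size([5])
print(nn.Linear(25, 5, bias=False).bias)  # None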
Model Training
Let’s train our new, deep, and activated model for 50 epochs using the StepByStep
class and visualize the losses:
Model Training
n_epochs = 50

sbs_relu = StepByStep(model_relu, binary_loss_fn, optimizer_relu)
sbs_relu.set_loaders(train_loader, val_loader)
sbs_relu.train(n_epochs)
fig = sbs_relu.plot_losses()
Figure 4.17 - Losses
This is more like it! But, to really grasp the difference made by the activation
functions, let’s plot all models on the same chart.
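A minimal sketch of how such a comparison chart could be put together, assuming the earlier, activation-free model was trained with its own StepByStep instance (called sbs_nn here, a hypothetical name in this excerpt) and that each instance keeps its per-epoch losses in losses and val_losses attributes (also an assumption, since the class internals aren't shown here):

# Sketch only: overlaying the loss curves of both models
# (sbs_nn and the losses / val_losses attributes are assumptions)
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(sbs_nn.losses, 'b--', label='Training - no activation')
ax.plot(sbs_nn.val_losses, 'r--', label='Validation - no activation')
ax.plot(sbs_relu.losses, 'b', label='Training - ReLU')
ax.plot(sbs_relu.val_losses, 'r', label='Validation - ReLU')
ax.set_xlabel('Epochs')
ax.set_ylabel('Loss')
ax.legend()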