Fancier Model (Constructor)

class CNN2(nn.Module):
    def __init__(self, n_filters, p=0.0):
        super(CNN2, self).__init__()
        self.n_filters = n_filters
        self.p = p
        # Creates the convolution layers
        self.conv1 = nn.Conv2d(
            in_channels=3, out_channels=n_filters, kernel_size=3
        )
        self.conv2 = nn.Conv2d(
            in_channels=n_filters, out_channels=n_filters, kernel_size=3
        )
        # Creates the linear layers
        # Where does this 5 * 5 come from?! Check it below
        self.fc1 = nn.Linear(n_filters * 5 * 5, 50)
        self.fc2 = nn.Linear(50, 3)
        # Creates dropout layers
        self.drop = nn.Dropout(self.p)

There are two convolutional layers, and two linear layers: fc1 (the hidden layer) and fc2 (the output layer).

"Where are the layers for activation functions and max pooling?"

Well, the max pooling layer doesn’t learn anything, so we can use its functional form: F.max_pool2d(). The same goes for the chosen activation function: F.relu().

If you choose the parametric ReLU (PReLU), you shouldn’t use the functional form, since it needs to learn the coefficient of leakage (the slope of the negative part).

On the one hand, you keep the model’s attributes to a minimum. On the other hand, you don’t have layers to hook anymore, so you cannot capture the outputs of activation functions and max pooling operations.
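To make the distinction concrete, here is a minimal, self-contained sketch (not a listing from the book) contrasting the two forms; the tensor is just an illustrative input:

import torch
import torch.nn as nn
import torch.nn.functional as F

dummy = torch.randn(4)  # hypothetical input tensor
# ReLU has no parameters to learn, so the functional form is enough
out = F.relu(dummy)
# PReLU must be a module: the slope of its negative part is a
# learnable parameter that gets registered and updated in training
prelu = nn.PReLU()
out = prelu(dummy)
print(list(prelu.parameters()))  # one learnable coefficient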
Let’s create our two convolutional blocks in a method aptly named featurizer:
Fancier Model (Featurizer)
def featurizer(self, x):
    # First convolutional block
    # 3@28x28 -> n_filters@26x26 -> n_filters@13x13
    x = self.conv1(x)
    x = F.relu(x)
    x = F.max_pool2d(x, kernel_size=2)
    # Second convolutional block
    # n_filters@13x13 -> n_filters@11x11 -> n_filters@5x5
    x = self.conv2(x)
    x = F.relu(x)
    x = F.max_pool2d(x, kernel_size=2)
    # Input dimension (n_filters@5x5)
    # Output dimension (n_filters * 5 * 5)
    x = nn.Flatten()(x)
    return x
This structure, where an argument x is both input and output of every operation in a sequence, is fairly common. The shapes in the comments follow from the layers’ arithmetic: a convolution with a 3x3 kernel (and no padding) trims two pixels off each dimension, and a max pooling with kernel size two halves it (rounding down), so 28 → 26 → 13 in the first block and 13 → 11 → 5 in the second. The featurizer thus produces a feature tensor of size n_filters times 25 (n_filters * 5 * 5).
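If you want to double-check that arithmetic, a quick sanity check like the one below does the trick (it assumes the full CNN2 class, with the methods above, has been defined; the batch size and number of filters are arbitrary choices for illustration):

import torch

dummy_x = torch.randn(16, 3, 28, 28)  # mini-batch of 16 RGB 28x28 images
dummy_model = CNN2(n_filters=5, p=0.3)
dummy_features = dummy_model.featurizer(dummy_x)
print(dummy_features.shape)  # torch.Size([16, 125]) = (16, n_filters * 5 * 5)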
The next step is to build the classifier using the linear layers, one as a hidden layer, the other as the output layer. But there is more to it: there is a dropout layer before each linear layer, and it will drop values with probability p (the second argument of our constructor method).
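As a preview, a minimal sketch of what such a classifier method could look like follows; the ReLU after the hidden layer mirrors the featurizer, but the exact body is an assumption rather than the book’s listing:

def classifier(self, x):
    # Hidden layer, preceded by dropout
    x = self.drop(x)
    x = self.fc1(x)
    x = F.relu(x)
    # Output layer (logits), also preceded by dropout
    x = self.drop(x)
    x = self.fc2(x)
    return x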