Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)


Figure 5.15 - LeNet-5 architecture

Source: Generated using Alexander Lenail's NN-SVG and adapted by the author. For more details, see LeCun, Y., et al. (1998). "Gradient-based learning applied to document recognition," Proceedings of the IEEE, 86(11), 2278–2324. [92]

Do you see anything familiar? The typical convolutional blocks are already there (to some extent): convolutions (C layers), activation functions (not shown), and subsampling (S layers). There are some differences, though:

• Back then, the subsampling was more complex than today's max pooling, but the general idea still holds.
• The activation function, a sigmoid at the time, was applied after the subsampling instead of before, as is typical today (a sketch of this contrast follows the list).
• The F6 and Output layers were connected by something called Gaussian connections, which is more complex than the typical activation function one would use today.
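To make the ordering difference concrete, here is a minimal sketch contrasting the two conventions for a single convolutional block. It assumes LeNet-5's first-block sizes, and it uses average pooling as a simple stand-in for the original, more elaborate subsampling:

import torch
import torch.nn as nn

# LeNet-5's original ordering: convolution -> subsampling -> activation
# (average pooling as a stand-in for the original trainable subsampling)
original_style = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, padding=2),
    nn.AvgPool2d(kernel_size=2),
    nn.Sigmoid(),
)

# Today's typical ordering: convolution -> activation -> pooling
modern_style = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)

dummy = torch.randn(1, 1, 28, 28)
print(original_style(dummy).shape)  # torch.Size([1, 6, 14, 14])
print(modern_style(dummy).shape)    # torch.Size([1, 6, 14, 14])

Both orderings produce the same output shape; the difference lies in the values, since applying a non-linearity before or after pooling does not commute in general.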

Adapting LeNet-5 to today's standards, it could be implemented like this:

import torch.nn as nn

lenet = nn.Sequential()

# Featurizer
# Block 1: 1@28x28 -> 6@28x28 -> 6@14x14
lenet.add_module('C1', nn.Conv2d(in_channels=1, out_channels=6,
                                 kernel_size=5, padding=2))
lenet.add_module('func1', nn.ReLU())
lenet.add_module('S2', nn.MaxPool2d(kernel_size=2))
# Block 2: 6@14x14 -> 16@10x10 -> 16@5x5
lenet.add_module('C3', nn.Conv2d(in_channels=6, out_channels=16,
                                 kernel_size=5))
lenet.add_module('func2', nn.ReLU())
lenet.add_module('S4', nn.MaxPool2d(kernel_size=2))
# Block 3: 16@5x5 -> 120@1x1
lenet.add_module('C5', nn.Conv2d(in_channels=16, out_channels=120,
                                 kernel_size=5))
# NOTE: module names must be unique; a duplicated name silently
# overwrites the earlier module instead of appending a new one
lenet.add_module('func3', nn.ReLU())

# Flattening
lenet.add_module('flatten', nn.Flatten())

# Classification
# Hidden Layer
lenet.add_module('F6', nn.Linear(in_features=120, out_features=84))
lenet.add_module('func4', nn.ReLU())
# Output Layer
lenet.add_module('OUTPUT', nn.Linear(in_features=84, out_features=10))

LeNet-5 used three convolutional blocks, although the last one does not have a max pooling, because the convolution already produces a single pixel. Regarding the number of channels, they increase as the image size decreases (the sketch after this list verifies these shapes):

• input image: single-channel 28x28 pixels
• first block: produces six-channel 14x14 pixels
• second block: produces sixteen-channel 5x5 pixels
• third block: produces 120-channel 1x1 pixels (a single pixel)
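As a quick sanity check, one can pass a dummy single-channel 28x28 image through the model and print the output shape of each named module. This is just a sketch: the random tensor and the loop over named_children() are for illustration and assume the lenet model defined above.

import torch

dummy = torch.randn(1, 1, 28, 28)  # batch of one single-channel 28x28 image
x = dummy
for name, layer in lenet.named_children():
    x = layer(x)
    print(f'{name}: {tuple(x.shape)}')
# C1: (1, 6, 28, 28)    S2: (1, 6, 14, 14)    C3: (1, 16, 10, 10)
# S4: (1, 16, 5, 5)     C5: (1, 120, 1, 1)    flatten: (1, 120)
# F6: (1, 84)           OUTPUT: (1, 10)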

