Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner’s Guide (Leanpub)


Another advantage of these shortcuts is that they provide a shorter path for the gradients to travel back to the initial layers, thus preventing the vanishing gradients problem.

Residual Blocks

We’re finally ready to tackle the main component of the ResNet model (the top performer of ILSVRC-2015), the residual block.

Figure 7.10 - Residual block

The residual block isn’t so different from our own DummyResidual model, except for the fact that the residual block has two consecutive weight layers and a ReLU activation at the end. Moreover, it may have more than two consecutive weight layers, and the weight layers do not necessarily need to be linear.

For image classification, it makes much more sense to use convolutional instead of linear layers, right? Right! And why not throw some batch normalization layers in the mix? Sure! The residual block looks like this now:


import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBlock, self).__init__()
        # First "weight layer": a 3x3 convolution (optionally strided)
        self.conv1 = nn.Conv2d(
            in_channels, out_channels,
            kernel_size=3, padding=1, stride=stride,
            bias=False
        )
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Second "weight layer": a 3x3 convolution keeping the channels
        self.conv2 = nn.Conv2d(
            out_channels, out_channels,
            kernel_size=3, padding=1,
            bias=False
        )
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.downsample = None
        if out_channels != in_channels:
            self.downsample = nn.Conv2d(
                in_channels, out_channels,
                kernel_size=1, stride=stride
            )

    def forward(self, x):
        identity = x

        # First "weight layer" + activation
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        # Second "weight layer"
        out = self.conv2(out)
        out = self.bn2(out)

        # What is that?!
        if self.downsample is not None:
            identity = self.downsample(identity)

        # Adding inputs before activation
        out = out + identity
        out = self.relu(out)

        return out
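To see why the block needs that extra piece, here is a quick sanity check (a sketch of my own, not from the book, assuming a batch with a single 32x32 RGB image; the variable names are illustrative). When the number of channels changes from input to output, the identity is run through the 1x1 convolution so its shape matches the main path’s output before the addition:

import torch

block = ResidualBlock(in_channels=3, out_channels=16, stride=2)
dummy = torch.randn(1, 3, 32, 32)  # one 32x32 RGB image
out = block(dummy)
print(out.shape)  # torch.Size([1, 16, 16, 16])

If the number of channels doesn’t change, the identity is added as-is, just like in our DummyResidual model.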

