Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Daniel Voigt Godoy)


Model Configuration

optimizer_model = optim.Adam(model.parameters(), lr=3e-4)
sbs_incep = StepByStep(model, inception_loss, optimizer_model)

"Wait, aren’t we pre-processing the dataset this time?"

Unfortunately, no. The preprocessed_dataset() function cannot handle multiple outputs. Instead of making the process convoluted in order to handle the peculiarities of the Inception model, I am sticking with the simpler (yet slower) way of training the last layer while it is still attached to the rest of the model.

The Inception model is also different from the others in its expected input size: 299 pixels instead of 224. So, we need to recreate the data loaders accordingly:

Data Preparation

normalizer = Normalize(mean=[0.485, 0.456, 0.406],
                       std=[0.229, 0.224, 0.225])

composer = Compose([Resize(299),
                    ToTensor(),
                    normalizer])

train_data = ImageFolder(root='rps', transform=composer)
val_data = ImageFolder(root='rps-test-set', transform=composer)

# Builds a loader for each set
train_loader = DataLoader(train_data, batch_size=16, shuffle=True)
val_loader = DataLoader(val_data, batch_size=16)

We’re ready, so let’s train our model for a single epoch and evaluate the result:

Model Training

sbs_incep.set_loaders(train_loader, val_loader)
sbs_incep.train(1)

StepByStep.loader_apply(val_loader, sbs_incep.correct)


Output

tensor([[108, 124],
        [116, 124],
        [108, 124]])
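Each row of this tensor corresponds to one class, holding the number of correct predictions followed by the number of data points in that class, so the overall accuracy is the ratio between the two column sums. A quick standalone check using the printed values (the variable name is just illustrative):

import torch

# [correct, total] counts per class, copied from the output above
results = torch.tensor([[108, 124],
                        [116, 124],
                        [108, 124]])

accuracy = results[:, 0].sum() / results[:, 1].sum()
print(accuracy)  # tensor(0.8925) -> 332 correct out of 372 points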

It achieved an accuracy of 89.25% on the validation set. Not bad!
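By the way, the inception_loss function passed to StepByStep in the Model Configuration was defined earlier in the chapter: it needs to handle the two sets of logits (main and auxiliary) that the Inception model returns in training mode. A minimal sketch of that idea, assuming a 0.4 weight on the auxiliary term (a commonly used value), and not necessarily matching the exact implementation shown earlier:

import torch.nn as nn

def inception_loss(outputs, labels):
    # In training mode, Inception returns a (main_logits, aux_logits) tuple;
    # in evaluation mode, it returns the main logits only
    if isinstance(outputs, tuple):
        main, aux = outputs
    else:
        main, aux = outputs, None

    multi_loss_fn = nn.CrossEntropyLoss(reduction='mean')
    loss_main = multi_loss_fn(main, labels)
    loss_aux = multi_loss_fn(aux, labels) if aux is not None else 0.
    # The auxiliary loss only nudges the gradients, hence the small weight
    return loss_main + 0.4 * loss_aux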

There is more to the Inception model than auxiliary classifiers, though. Let’s check out some of its other architectural elements.

1x1 Convolutions

This particular architectural element is not exactly new, but it is a somewhat special case of an already known element. So far, the smallest kernel used in a convolutional layer had a size of three-by-three. These kernels performed an element-wise multiplication, and then they added up the resulting elements to produce a single value for each region to which they were applied. So far, nothing new.

The idea of a kernel of size one-by-one is somewhat counterintuitive at first. For a single channel, this kernel is only scaling the values of its input and nothing else. That hardly seems useful.

But everything changes if you have multiple channels! Remember the three-channel convolutions from Chapter 6? A filter has as many channels as its input. This means that each channel will be scaled independently and the results will be added up, resulting in one channel as output (per filter).

A 1x1 convolution can be used to reduce the number of channels; that is, it may work as a dimension-reduction layer.
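A minimal sketch in PyTorch (the tensor shapes are just illustrative) shows both points: per pixel, the output is a weighted sum of the input channels, and the number of channels can be reduced while the spatial dimensions stay untouched:

import torch
import torch.nn as nn

torch.manual_seed(13)

# A dummy "image": 1 data point, 3 channels, 5x5 pixels
x = torch.randn(1, 3, 5, 5)

# One 1x1 filter over 3 channels: each channel gets scaled and the results are added up
conv1x1 = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=1, bias=False)
y = conv1x1(x)

# Manual check: weighted sum of the channels using the filter's three weights
w = conv1x1.weight.view(3)                            # one weight per input channel
manual = (x * w.view(1, 3, 1, 1)).sum(dim=1, keepdim=True)
print(torch.allclose(y, manual))                      # True

# Used as a dimension-reduction layer: 64 channels in, 16 channels out
reducer = nn.Conv2d(in_channels=64, out_channels=16, kernel_size=1)
z = reducer(torch.randn(1, 64, 28, 28))
print(z.shape)                                        # torch.Size([1, 16, 28, 28])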

An image is worth a thousand words, so let’s visualize this.

