Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)




remained unchanged.

ResNet (MSRA Team)

The trick developed by Kaiming He et al. was to add residual connections, or shortcuts, to a very deep architecture.

We train neural networks with depth of over 150 layers. We propose a "deep residual learning" framework that eases the optimization and convergence of extremely deep networks.

Source: Results (ILSVRC2015) [120]

In a nutshell, it allows the network to more easily learn the identity function. We'll get back to it in the "Residual Connections" section later in this chapter. If you want to learn more about it, the paper is called "Deep Residual Learning for Image Recognition." [121]
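To make the idea of a shortcut concrete, here is a minimal sketch of a residual block (an illustration only; the real ResNet blocks, which we'll discuss in the "Residual Connections" section, also handle changes in the number of channels and in spatial size):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Two convolutions that keep channels and spatial size unchanged
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x  # the shortcut: keep the block's input around
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # If the convolutional branch outputs zeros, the block is the identity
        return self.relu(out + identity)
```

If the identity is the best thing the block can do, the network only has to drive the convolutional branch toward zero, instead of having to reproduce the input through a stack of layers.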

By the way, Kaiming He also has an initialization scheme named after him (sometimes referred to as "He initialization," sometimes as "Kaiming initialization"), and we'll learn about it in the next chapter.
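As a quick preview (just a sketch; the details come in the next chapter), PyTorch exposes this scheme through torch.nn.init:

```python
import torch.nn as nn

layer = nn.Linear(256, 128)
# He / Kaiming initialization, designed with ReLU-family activations in mind
nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')
nn.init.zeros_(layer.bias)
```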

Imagenette

If you are looking for a smaller, more manageable dataset that's ImageNet-like, Imagenette is for you! Developed by Jeremy Howard from fast.ai, it is a subset of ten easily classified classes from ImageNet.

You can find it here: https://github.com/fastai/imagenette.
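Since Imagenette is distributed as regular folders of images (one folder per class, split into train and val), a simple way to load it is with Torchvision's ImageFolder. The archive name and path below are illustrative; use whichever version you download from the repository above:

```python
from torchvision import datasets, transforms

# Path is illustrative: point it to wherever you extracted the archive
transform = transforms.Compose([
    transforms.Resize((160, 160)),
    transforms.ToTensor(),
])
train_data = datasets.ImageFolder(root='imagenette2-160/train', transform=transform)
val_data = datasets.ImageFolder(root='imagenette2-160/val', transform=transform)

print(train_data.classes)  # ten folder names, one per class
```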

Comparing Architectures

Now that you're familiar with some of the popular architectures (many of which are readily available as Torchvision models), let's compare their performance (Top-1 accuracy %), number of operations in a single forward pass (in billions), and sizes (in millions of parameters). The figure below is very illustrative in this sense.
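Before looking at the figure, here is how you could compute one of these dimensions yourself, the parameter counts of the Torchvision models (a quick sketch; the weights argument assumes a recent Torchvision version, and exact counts vary a bit across versions and variants):

```python
from torchvision import models

for name, constructor in [('alexnet', models.alexnet),
                          ('vgg16', models.vgg16),
                          ('resnet50', models.resnet50),
                          ('inception_v3', models.inception_v3)]:
    model = constructor(weights=None)  # random weights; we only need the sizes
    n_params = sum(p.numel() for p in model.parameters())
    print(f'{name}: {n_params / 1e6:.1f}M parameters')
```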

Figure 7.1 - Comparing architectures (size proportional to number of parameters)

Source: Data for accuracy and GFLOPs estimates obtained from this report [122]; number of parameters (proportional to the size of the circles) obtained from Torchvision's models. For a more detailed analysis, see Canziani, A., Culurciello, E., Paszke, A., "An Analysis of Deep Neural Network Models for Practical Applications" [123] (2017).

See how massive the VGG models are, both in size and in the number of operations required to deliver a single prediction? On the other hand, check out Inception-V3's and ResNet-50's positions in the plot: they would give you more bang for your buck. The former has slightly higher accuracy, and the latter is slightly faster.

These are the models you're likely to use for transfer learning: Inception and ResNet.

On the bottom left, there is AlexNet. It was miles ahead of anything else in 2012, but it is not competitive at all anymore.

"If AlexNet is not competitive, why are you using it to illustrate transfer learning?"

A fair point indeed. The reason is that its architectural elements are already familiar to you, which makes it easier for me to explain how we're modifying it to fit our purposes.
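To give you a rough idea of what that modification looks like (this is only a sketch: the number of classes is illustrative, and the weights argument assumes a recent Torchvision version), we load a pretrained AlexNet, freeze its parameters, and swap the last layer of its classifier for a fresh one:

```python
from torch import nn
from torchvision import models

# Load AlexNet with pretrained (ImageNet) weights and freeze everything
alex = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
for param in alex.parameters():
    param.requires_grad = False

# Replace the classifier's last layer; only this new layer will be trained
num_classes = 3  # illustrative number of classes for our own task
alex.classifier[6] = nn.Linear(4096, num_classes)
```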
