
OK, now we can clearly see a difference: The decision boundary on the original feature space has a corner, a direct consequence of the ReLU's own corner when its input is zero. On the right, we can also verify that the range of the activated feature space has only positive values, as expected.

Next, let's try the Parametric ReLU (PReLU).

Figure B.9 - Activated feature space—PReLU

This is even more different! Given that the PReLU learns a slope for the negative values, effectively bending the feature space instead of simply chopping off parts of it like the plain ReLU, the result looks like the feature space was folded in two different places. I don't know about you, but I find this really cool!

So far, all models were trained for 160 epochs, which was about enough training for them to converge to a solution that would completely separate both parabolas. This seems like quite a lot of epochs to solve a rather simple problem, right? But keep in mind what we discussed in Chapter 3: increasing dimensionality makes it easier to separate the classes. So, we're actually imposing a severe restriction on these models by keeping them two-dimensional (two units in the hidden layer) and performing only one transformation (only one hidden layer).

Let's cut our models some slack and give them more power…
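To make the comparison concrete, here is a minimal sketch (not the book's original code) of what the PReLU version of the model described above might look like, assuming the same two-feature input, a single two-unit hidden layer, and a sigmoid output; the name prelu_model is just illustrative:

import torch.nn as nn

# Sketch: two input features, one two-unit hidden layer, a PReLU
# activation (learnable slope for negative inputs), and a sigmoid
# output for binary classification.
prelu_model = nn.Sequential()
prelu_model.add_module('hidden0', nn.Linear(2, 2))
prelu_model.add_module('activation0', nn.PReLU())
prelu_model.add_module('output', nn.Linear(2, 1))
prelu_model.add_module('sigmoid', nn.Sigmoid())

Note that, by default, nn.PReLU() learns a single slope shared by both hidden units; passing num_parameters=2 would give each unit its own learnable slope.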


More Layers, More Boundaries

One way to give a model more power is to make it deeper. We can make it deeper, while keeping it strictly two-dimensional, by adding another hidden layer with two units. It looks like the diagram:

Figure B.10 - Deeper model

A sequence of one or more hidden layers, all with the same size as the input layer, as in the figure above (up to "Activation #1"), is a typical architecture used to model the hidden state in recurrent neural networks (RNNs). We'll get back to it in a later chapter.

And it looks like this in code (we're using a hyperbolic tangent as an activation function because it looks good when visualizing a sequence of transformations):

fs_model = nn.Sequential()
fs_model.add_module('hidden0', nn.Linear(2, 2))
fs_model.add_module('activation0', nn.Tanh())
fs_model.add_module('hidden1', nn.Linear(2, 2))
fs_model.add_module('activation1', nn.Tanh())
fs_model.add_module('output', nn.Linear(2, 1))
fs_model.add_module('sigmoid', nn.Sigmoid())
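To actually look at that sequence of transformations, one option (a sketch, not the book's own plotting code) is to register forward hooks that record each layer's output. The snippet below assumes the fs_model defined above and uses a random placeholder tensor called points standing in for the two parabolas:

import torch

points = torch.randn(100, 2)  # placeholder inputs standing in for the parabolas

activations = {}

def make_hook(name):
    # Returns a hook that stores the given layer's output under its name
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for name, layer in fs_model.named_children():
    layer.register_forward_hook(make_hook(name))

with torch.no_grad():
    predictions = fs_model(points)

# activations['hidden0'], activations['activation0'], and so on now hold
# the intermediate (activated) feature spaces, ready to be scatter-plotted.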

