
In the model above, the sigmoid function isn’t an activation function: It is there only to convert logits into probabilities.

You may be wondering: "Can I mix different activation functions in the same model?" It is definitely possible, but it is also highly unusual. In general, models are built using the same activation function across all hidden layers. The ReLU or one of its variants are the most common choices because they lead to faster training, while TanH and sigmoid activation functions are used in very specific cases (recurrent neural networks, for instance).

But, more importantly, since the model can perform two transformations now (and activations, obviously), this is how it is working:

Figure B.11 - Activated feature space - deeper model

First of all, these plots were built using a model trained for 15 epochs only (compared to 160 epochs in all previous models). Adding another hidden layer surely makes the model more powerful, thus leading to a satisfactory solution in a much shorter amount of time.

"Great, let’s just make ridiculously deep models and solve everything! Right?"

Not so fast! As models grow deeper, other issues start popping up, like the (in)famous vanishing gradients problem. We’ll get back to that later. For now, adding one or two extra layers is likely safe, but please don’t get carried away with it.
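
For concreteness, a deeper model like the one discussed above could be assembled along these lines. This is only a sketch: the hidden layer sizes (5 and 3) and the ReLU activations are assumptions for illustration, not necessarily the exact configuration behind Figure B.11.

import torch.nn as nn

deeper_model = nn.Sequential()
deeper_model.add_module('hidden0', nn.Linear(2, 5))    # first transformation: 2D -> 5D
deeper_model.add_module('activation0', nn.ReLU())      # first activation (assumed ReLU)
deeper_model.add_module('hidden1', nn.Linear(5, 3))    # second transformation: 5D -> 3D
deeper_model.add_module('activation1', nn.ReLU())      # second activation (assumed ReLU)
deeper_model.add_module('output', nn.Linear(3, 1))     # produces a logit
deeper_model.add_module('sigmoid', nn.Sigmoid())       # converts the logit into a probability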


More Dimensions, More Boundaries

We can also make a model more powerful by adding more units to a hidden layer. By doing this, we’re increasing dimensionality; that is, mapping our two-dimensional feature space into a, say, ten-dimensional feature space (which we cannot visualize). But we can map it back to two dimensions in a second hidden layer with the sole purpose of taking a peek at it.

I am skipping the diagram, but here is the code:

fs_model = nn.Sequential()
fs_model.add_module('hidden0', nn.Linear(2, 10))    # 2D features -> 10D hidden space
fs_model.add_module('activation0', nn.PReLU())      # activation applied in 10D
fs_model.add_module('hidden1', nn.Linear(10, 2))    # projects 10D back down to 2D
fs_model.add_module('output', nn.Linear(2, 1))      # produces a logit
fs_model.add_module('sigmoid', nn.Sigmoid())        # converts the logit into a probability

Its first hidden layer has ten units now and uses PReLU as an activation function. The second hidden layer, however, has no activation function: This layer is working as a projection of 10D into 2D, such that the decision boundary can be visualized in two dimensions.
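
Not shown in the text here, but one way to actually take that peek is to capture the output of hidden1 with a forward hook and scatter-plot it. This is a minimal sketch under the assumption that X is a tensor of features with shape (N, 2); the random tensor below is just a stand-in for the real dataset.

import torch

X = torch.randn(100, 2)          # stand-in for the actual feature tensor

projections = {}

def save_projection(module, inputs, output):
    # 'output' is the (N, 2) tensor produced by the 'hidden1' layer
    projections['hidden1'] = output.detach()

handle = fs_model.hidden1.register_forward_hook(save_projection)
fs_model(X)                      # the forward pass fills projections['hidden1']
handle.remove()

points_2d = projections['hidden1']   # shape (N, 2), ready for plotting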

In practice, this extra hidden layer is redundant. Remember, without an activation function between two layers, they are equivalent to a single layer. We are doing this here with the sole purpose of visualizing it.
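
A quick numerical check makes that "equivalent to a single layer" claim concrete: composing hidden1 and output without an activation in between is just another affine transformation, with weight W_out @ W_h1 and bias W_out @ b_h1 + b_out. This is only a sketch, reusing the fs_model defined above.

import torch

with torch.no_grad():
    z = torch.randn(5, 10)       # pretend activations coming out of activation0
    two_layers = fs_model.output(fs_model.hidden1(z))
    # Collapse the two linear layers into a single 10 -> 1 linear layer
    W = fs_model.output.weight @ fs_model.hidden1.weight
    b = fs_model.output.weight @ fs_model.hidden1.bias + fs_model.output.bias
    one_layer = z @ W.t() + b
    print(torch.allclose(two_layers, one_layer, atol=1e-6))   # True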

And here are the results, after training it for ten epochs only.

Figure B.12 - Activated feature space - wider model

By mapping the original feature space into some crazy ten-dimensional one, we make it easier for our model to figure out a way of separating the data. Remember,

