OK, now we can clearly see a difference: the decision boundary in the original feature space has a corner, a direct consequence of the ReLU's own corner when its input is zero. On the right, we can also verify that the range of the activated feature space contains only non-negative values, as expected.
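If you would like to verify that non-negative range yourself, here is a minimal sketch (the weights are random, not the ones from the trained model above): a two-unit linear layer, as described in the text, followed by a ReLU.

import torch
import torch.nn as nn

torch.manual_seed(42)
hidden = nn.Linear(2, 2)   # the two-unit hidden layer described above
activation = nn.ReLU()

points = torch.tensor([[-1.0, 2.0], [0.5, -0.5], [3.0, 1.0]])
z = hidden(points)          # pre-activations: may be negative
activated = activation(z)   # ReLU clamps every negative value to zero

print(activated.min() >= 0)  # tensor(True): the activated space is non-negative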

Next, let’s try the Parametric ReLU (PReLU).

Figure B.9 - Activated feature space—PReLU

This is even more different! Given that the PReLU learns a slope for the negative values, effectively bending the feature space instead of simply chopping off parts of it like the plain ReLU, the result looks like the feature space was folded in two different places. I don't know about you, but I find this really cool!
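For a quick sanity check of that "bending" behavior, here is a small sketch comparing the two activation functions; the 0.25 slope is just PyTorch's default initialization for PReLU, not a learned value.

import torch
import torch.nn as nn

relu = nn.ReLU()
prelu = nn.PReLU()  # one learnable slope, initialized at PyTorch's default 0.25

x = torch.tensor([-2.0, -0.5, 0.0, 1.0])
print(relu(x))   # tensor([0., 0., 0., 1.]) -> negatives chopped off
print(prelu(x))  # negatives scaled by the slope: [-0.5, -0.125, 0.0, 1.0]

# The slope is a Parameter, so it receives gradients and is learned
# during training, just like any weight.
print(list(prelu.parameters()))

After training, the slope would hold whatever value the model found best for folding the feature space.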

So far, all models were trained for 160 epochs, which was about enough training for them to converge to a solution that would completely separate both parabolas.

This seems like quite a lot of epochs to solve a rather simple problem, right? But keep in mind what we discussed in Chapter 3: increasing dimensionality makes it easier to separate the classes. So we're actually imposing a severe restriction on these models by keeping them two-dimensional (two units in the hidden layer) and performing only one transformation (only one hidden layer).

Let’s cut our models some slack and give them more power…
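As a rough sketch of what "more power" could look like in code (the architectures below, and the width of ten units, are my illustrative choices, not necessarily the models used next), compare the restricted network with a wider and deeper one:

import torch.nn as nn

# The restricted setup from above: two hidden units, one transformation.
restricted = nn.Sequential(
    nn.Linear(2, 2),
    nn.PReLU(),
    nn.Linear(2, 1),   # single logit for the binary classification
)

# "More power": more hidden units (higher dimensionality) and one extra
# hidden layer (one more transformation).
stronger = nn.Sequential(
    nn.Linear(2, 10),
    nn.PReLU(),
    nn.Linear(10, 10),
    nn.PReLU(),
    nn.Linear(10, 1),
)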
