
Activation Functions

"What are activation functions?"

Activation functions are nonlinear functions. They either squash or bend straight lines. They will break the equivalence between the deep-ish and the shallow models.
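To see what "breaking the equivalence" means in practice, here is a minimal sketch (not from the book): two stacked linear layers collapse into a single, equivalent linear layer, but inserting a nonlinearity between the very same layers makes that collapse impossible.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)

# Two stacked linear layers with NO activation in between
stack = nn.Sequential(nn.Linear(2, 4), nn.Linear(4, 1))
w1, b1 = stack[0].weight, stack[0].bias  # shapes (4, 2) and (4,)
w2, b2 = stack[1].weight, stack[1].bias  # shapes (1, 4) and (1,)

# Compose the two layers into ONE equivalent linear layer
w = w2 @ w1        # shape (1, 2)
b = w2 @ b1 + b2   # shape (1,)

x = torch.randn(5, 2)
# True: the "deep-ish" stack is just a shallow linear model in disguise
print(torch.allclose(stack(x), x @ w.t() + b))

# Inserting a nonlinearity between the SAME two layers breaks this
bent = stack[1](torch.sigmoid(stack[0](x)))
print(torch.allclose(bent, x @ w.t() + b))  # False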

"What exactly do you mean by squash or bend straight lines?"

Excellent question! Please hold that thought, as I will illustrate it in the next chapter, "Feature Space." First, let's take a look at some common activation functions. PyTorch has plenty of activation functions to choose from, but we are focusing on only five of them.

Sigmoid

Let’s start with the most traditional of the activation functions, the sigmoid, which we’ve already used to transform logits into probabilities. Nowadays, that is pretty much its only usage, but in the early days of neural networks, one would find it everywhere!
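For reference (this formula is standard, not reproduced from the page above), the sigmoid and its gradient can be written as:

$$\sigma(z) = \frac{1}{1 + e^{-z}} \qquad \frac{d\sigma}{dz} = \sigma(z)\,\big(1 - \sigma(z)\big)$$

Since $\sigma(0) = 0.5$, the gradient attains its maximum value, $0.5 \times 0.5 = 0.25$, at $z = 0$.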

Figure 4.11 - Sigmoid function and its gradient

Let’s quickly recap the shape of a sigmoid: As you can see in the figure above, a sigmoid activation function "squashes" its input values (z) into the range (0, 1) (the same range probabilities can take, which is why it is used in the output layer for binary classification tasks). It is also possible to verify that its gradient peak value is 0.25, attained at z = 0, and that the gradient approaches zero as |z| grows.
