

Figure 7: Tanh activation function

Activation function – ReLU

The sigmoid is not the only kind of smooth activation function used for neural networks. Recently, a very simple function called ReLU (REctified Linear Unit) became very popular because it helps address some of the optimization problems observed with sigmoids. We will discuss these problems in more detail when we talk about the vanishing gradient in Chapter 9, Autoencoders. A ReLU is simply defined as f(x) = max(0, x), and this non-linear function is represented in Figure 8. As you can see, the function is zero for negative values and grows linearly for positive values. The ReLU is also very simple to implement (generally, three instructions are enough), while the sigmoid requires a few orders of magnitude more computation. This helped squeeze neural networks onto early GPUs:

Figure 8: A ReLU function
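To make the definition concrete, here is a minimal sketch of a ReLU written with NumPy; the relu helper name and the sample values are illustrative, not taken from the book's code:

import numpy as np

def relu(x):
    # Element-wise ReLU: zero for negative inputs, identity for positive inputs.
    return np.maximum(0, x)

# Quick check on a few sample values: negatives are clipped to zero,
# positives pass through unchanged.
print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]

Note how the whole operation is a single element-wise maximum, which is why it is so much cheaper than evaluating a sigmoid's exponential.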
