
The derivative can be computed as follows:

$$\sigma'(z) = \frac{d}{dz}\left(\frac{1}{1+e^{-z}}\right) = -(1+e^{-z})^{-2}\,\frac{d}{dz}\left(e^{-z}\right) = \frac{e^{-z}}{(1+e^{-z})^{2}} = \frac{(1+e^{-z})-1}{1+e^{-z}}\cdot\frac{1}{1+e^{-z}} = \left(\frac{1+e^{-z}}{1+e^{-z}} - \frac{1}{1+e^{-z}}\right)\frac{1}{1+e^{-z}} = \left(1 - \frac{1}{1+e^{-z}}\right)\left(\frac{1}{1+e^{-z}}\right) = (1-\sigma(z))\,\sigma(z)$$

Therefore, the derivative of $\sigma(z)$ takes a very simple form:

$$\sigma'(z) = (1-\sigma(z))\,\sigma(z)$$

Derivative of tanh

Remember that the tanh function is defined as $\tanh(z) = \frac{e^{z}-e^{-z}}{e^{z}+e^{-z}}$, as seen in Figure 7:

Figure 7: Tanh activation function

If you remember that $\frac{d}{dz}e^{z} = e^{z}$ and $\frac{d}{dz}e^{-z} = -e^{-z}$, then the derivative is computed as:

$$\frac{d}{dz}\tanh(z) = \frac{(e^{z}+e^{-z})(e^{z}+e^{-z}) - (e^{z}-e^{-z})(e^{z}-e^{-z})}{(e^{z}+e^{-z})^{2}} = 1 - \frac{(e^{z}-e^{-z})^{2}}{(e^{z}+e^{-z})^{2}} = 1 - \tanh^{2}(z)$$



Therefore, the derivative of tanh(z) takes a very simple form:

$$\tanh'(z) = 1 - \tanh^{2}(z)$$
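As a quick sanity check, the following is a minimal NumPy sketch (not from the book's code; the function names are illustrative) that implements the two closed forms derived above and compares them against a central finite-difference approximation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Closed form derived above: sigma'(z) = (1 - sigma(z)) * sigma(z)
    s = sigmoid(z)
    return (1.0 - s) * s

def tanh_prime(z):
    # Closed form derived above: tanh'(z) = 1 - tanh(z)**2
    return 1.0 - np.tanh(z) ** 2

# Compare both closed forms against a central finite-difference approximation.
z = np.linspace(-5.0, 5.0, 101)
h = 1e-5
print(np.allclose(sigmoid_prime(z), (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)))  # True
print(np.allclose(tanh_prime(z), (np.tanh(z + h) - np.tanh(z - h)) / (2 * h)))     # True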

Derivative of ReLU

The ReLU function is defined as f(x) = max(0, x) (see Figure 8). The derivative of

ReLU is:

$$f'(x) = \begin{cases} 1, & \text{if } x > 0 \\ 0, & \text{otherwise} \end{cases}$$

Note that ReLU is non-differentiable at zero. However, it is differentiable everywhere
else, and the value of the derivative at zero can be arbitrarily chosen to be 0 or 1,

as demonstrated in Figure 8:

Figure 8: ReLU activation function
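The same definitions are easy to express in a few lines of NumPy; this is a minimal sketch (the helper names are my own, not from the book) that picks 0 as the derivative's value at x = 0:

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_prime(x):
    # 1 where x > 0, 0 otherwise; the value at exactly x == 0 is a convention
    # (here it is set to 0).
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # [0.  0.  0.  0.5 2. ]
print(relu_prime(x))  # [0. 0. 0. 1. 1.]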

Backpropagation

Now that we have computed the derivatives of the activation functions, we can

describe the backpropagation algorithm – the mathematical core of deep learning.

Sometimes, backpropagation is called backprop for short.

