The derivative of the sigmoid $\sigma(z) = \frac{1}{1+e^{-z}}$ can be computed as follows:

$$\sigma'(z) = \frac{d}{dz}\left(\frac{1}{1+e^{-z}}\right) = (1+e^{-z})^{-2}\,e^{-z} = \frac{(1+e^{-z}) - 1}{1+e^{-z}} \cdot \frac{1}{1+e^{-z}} = \left(1 - \frac{1}{1+e^{-z}}\right)\frac{1}{1+e^{-z}} = (1 - \sigma(z))\,\sigma(z)$$

Therefore, the derivative of $\sigma(z)$ can be computed in a very simple form:

$$\sigma'(z) = (1 - \sigma(z))\,\sigma(z)$$
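As a quick sanity check, we can compare this analytic form against a central finite-difference approximation. This is a minimal NumPy sketch (our illustration, not code from the chapter; the helper names sigmoid and sigmoid_prime are our own):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Analytic derivative derived above: (1 - sigma(z)) * sigma(z)
    s = sigmoid(z)
    return (1.0 - s) * s

z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # central difference
print(np.allclose(sigmoid_prime(z), numeric))  # True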
Derivative of tanh

Remember that the tanh function is defined as

$$\tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}$$

as seen in Figure 7:

Figure 7: Tanh activation function

If you remember that $\frac{d}{dz} e^z = e^z$ and $\frac{d}{dz} e^{-z} = -e^{-z}$, then the derivative is computed with the quotient rule as:

$$\frac{d}{dz}\tanh(z) = \frac{(e^z+e^{-z})(e^z+e^{-z}) - (e^z-e^{-z})(e^z-e^{-z})}{(e^z+e^{-z})^2} = 1 - \frac{(e^z-e^{-z})^2}{(e^z+e^{-z})^2} = 1 - \tanh^2(z)$$
Therefore, the derivative of tanh(z) can be computed in a very simple form:

$$\tanh'(z) = 1 - \tanh^2(z)$$
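The same finite-difference check works here. Again, a minimal NumPy sketch of ours (the helper name tanh_prime is an assumption, not from the book):

import numpy as np

def tanh_prime(z):
    # Analytic derivative derived above: 1 - tanh(z)^2
    return 1.0 - np.tanh(z) ** 2

z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (np.tanh(z + eps) - np.tanh(z - eps)) / (2 * eps)  # central difference
print(np.allclose(tanh_prime(z), numeric))  # True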
Derivative of ReLU
The ReLU function is defined as $f(x) = \max(0, x)$ (see Figure 8). The derivative of ReLU is:

$$f'(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
Note that ReLU is non-differentiable at zero. However, it is differentiable everywhere else, and the value of the derivative at zero can be arbitrarily chosen to be 0 or 1, as demonstrated in Figure 8:
Figure 8: ReLU activation function
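A minimal NumPy sketch of ours (helper names relu and relu_prime are assumptions), which picks the common convention of 0 for the derivative at zero:

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_prime(x):
    # 1 where x > 0, 0 elsewhere; at exactly x == 0 this convention
    # arbitrarily picks 0, as discussed above.
    return (x > 0).astype(np.float64)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # [0.  0.  0.  0.5 2. ]
print(relu_prime(x))  # [0. 0. 0. 1. 1.]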
Backpropagation
Now that we have computed the derivatives of the activation functions, we can
describe the backpropagation algorithm – the mathematical core of deep learning.
Sometimes, backpropagation is called backprop for short.
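Before walking through the algorithm, it can help to see that the closed-form derivatives above are exactly the gradients an automatic-differentiation framework produces during backprop. A minimal TensorFlow 2 sketch (our illustration, not from the chapter):

import tensorflow as tf

z = tf.Variable(0.5)
with tf.GradientTape(persistent=True) as tape:
    s = tf.sigmoid(z)
    t = tf.tanh(z)
    r = tf.nn.relu(z)

# The gradients computed by backprop match the closed forms derived above.
print(tape.gradient(s, z).numpy())  # equals (1 - sigmoid(0.5)) * sigmoid(0.5)
print(tape.gradient(t, z).numpy())  # equals 1 - tanh(0.5) ** 2
print(tape.gradient(r, z).numpy())  # 1.0, since z = 0.5 > 0
del tape  # release tape resources when persistent=True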