
The Math Behind Deep Learning

Here, L denotes the error from the generic previous layer:

Figure 14: An example of the math behind backpropagation

If we want to explicitly compute the gradient with respect to the output layer biases, it can be proven that

$$\frac{\partial E}{\partial b_j} = v_j$$

We leave this as an exercise for the reader.
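Since this is left as an exercise, here is one way the result can be derived. The derivation assumes (these symbols are my notation, not necessarily the book's) that $z_j = \sum_i w_{ij} y_i + b_j$ is the pre-activation of output neuron $j$ and that $v_j = \partial E / \partial z_j$ is its error signal:

$$\frac{\partial E}{\partial b_j} = \frac{\partial E}{\partial z_j}\,\frac{\partial z_j}{\partial b_j} = v_j \cdot 1 = v_j$$

The second factor is 1 because the bias enters $z_j$ additively.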

In short, for case 2 (hidden-to-hidden connection) the weight delta is $\Delta w_{ij} = \eta\, v_j y_i$, and the weight update equation for each of the hidden-hidden connections is simply:

$$w_{ij} \leftarrow w_{ij} - \eta \frac{\partial E}{\partial w_{ij}}$$
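To make the rule concrete, here is a minimal NumPy sketch of one such update applied to a whole weight matrix at once. The shapes, the learning rate, and the variable names are illustrative assumptions; it takes $\partial E / \partial w_{ij} = v_j y_i$ so that the descent step matches the delta above (sign conventions for $v_j$ vary between texts):

    import numpy as np

    eta = 0.1                        # learning rate (eta), illustrative value
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 3))      # weights w_ij from a 4-unit layer i to a 3-unit layer j
    y_i = rng.normal(size=4)         # outputs y_i of the previous layer
    v_j = rng.normal(size=3)         # error signals v_j of the next layer

    # Gradient of E w.r.t. each weight: dE/dw_ij = y_i * v_j (outer product).
    grad_W = np.outer(y_i, v_j)

    # Weight update for every hidden-hidden connection: w_ij <- w_ij - eta * dE/dw_ij
    W -= eta * grad_W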

We have arrived at the end of this section, and all the mathematical tools are now defined to make our final statement. The essence of backpropagation is nothing other than applying the weight update rules one layer after another, starting from the last output layer and moving back toward the first input layer. Difficult to derive, to be sure, but extremely easy to apply once defined. The whole forward-backward algorithm at the core of deep learning is then the following (a short code sketch of the first two steps appears after the list):

1. Compute the feed-forward signals from the input to the output.

2. Compute the output error E based on the predictions $y_o$ and the true value $t_o$.
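A minimal NumPy sketch of these first two steps follows. The network shape, the sigmoid activation, and the mean-squared-error form of E are illustrative assumptions rather than choices fixed by the text:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Illustrative network: 4 inputs -> 5 hidden units -> 3 outputs.
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)
    W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)

    x = rng.normal(size=4)            # input vector
    t_o = np.array([0.0, 1.0, 0.0])   # true values t_o

    # Step 1: compute the feed-forward signals from the input to the output.
    h = sigmoid(x @ W1 + b1)          # hidden layer activations
    y_o = sigmoid(h @ W2 + b2)        # predictions y_o

    # Step 2: compute the output error E from the predictions y_o and the true value t_o.
    E = 0.5 * np.sum((y_o - t_o) ** 2)
    print(E)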

