22.02.2024 Views

Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

What’s the impact of one update on our model? Let’s visually check its predictions.

Figure 0.9 - Updated model’s predictions

It looks better … at least it started pointing in the right direction!

Learning Rate

The learning rate is the most important hyper-parameter. There is a gigantic

amount of material on how to choose a learning rate, how to modify the learning

rate during the training, and how the wrong learning rate can completely ruin the

model training.

Maybe you’ve seen this famous graph [38] (from Stanford’s CS231n class) that shows

how a learning rate that is too high or too low affects the loss during training. Most

people will see it (or have seen it) at some point in time. This is pretty much general

knowledge, but I think it needs to be thoroughly explained and visually

demonstrated to be truly understood. So, let’s start!

I will tell you a little story (trying to build an analogy here, please bear with me!):

Imagine you are coming back from hiking in the mountains and you want to get

back home as quickly as possible. At some point in your path, you can either choose

to go ahead or to make a right turn.

The path ahead is almost flat, while the path to your right is kinda steep. The

steepness is the gradient. If you take a single step one way or the other, it will lead

to different outcomes (you’ll descend more if you take one step to the right instead

of going ahead).

Step 4 - Update the Parameters | 43

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!