22.02.2024 Views

Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


High Learning Rate

What would have happened if we had used a high learning rate instead, say, a step size of 0.8? As we can see in the plots below, we start to, literally, run into trouble.

Figure 0.11 - Using a high learning rate

Even though everything is still OK on the left plot, the right plot shows us a completely different picture: the step was so large that we ended up on the other side of the curve. That is not good… You'd be going back and forth, alternately hitting both sides of the curve.

"Well, even so, I may still reach the minimum; why is it so bad?"

In our simple example, yes, you'd eventually reach the minimum because the curve is nice and round.
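To make the back-and-forth concrete, here is a minimal sketch in plain Python. It does not use the book's dataset or model: the loss (w - 2)**2, the starting point 0.0, and the function name gradient_descent are all assumptions chosen for illustration. Only the step size of 0.8 comes from the text. On this round, bowl-shaped curve every update lands on the opposite side of the minimum, yet each landing is closer than the one before.

```python
def gradient_descent(start, lr, n_steps, target=2.0):
    """Minimise the toy loss (w - target)**2 and return every value of w visited.

    The loss, its minimum at `target`, and the starting point are illustrative
    assumptions, not the book's regression example.
    """
    w = start
    path = [w]
    for _ in range(n_steps):
        grad = 2.0 * (w - target)  # derivative of (w - target)**2
        w = w - lr * grad          # gradient descent update
        path.append(w)
    return path


path = gradient_descent(start=0.0, lr=0.8, n_steps=10)

for step, w in enumerate(path):
    side = "left of" if w < 2.0 else "right of"
    print(f"step {step:2d}: w = {w:8.4f} ({side} the minimum, distance {abs(w - 2.0):.4f})")
```

Printing the path shows the side flipping at every step while the distance to the minimum keeps shrinking, which is the "eventually reach the minimum" behavior described above.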

But, in real problems, the "curve" has a really weird shape that allows for bizarre outcomes, such as going back and forth without ever approaching the minimum.
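A real loss surface is hard to reproduce in a few lines, but even a small change to a toy curve is enough to show the failure. The sketch below is an assumption made for illustration, not the book's example: it uses the loss curvature * (w - 2)**2 with a hypothetical curvature of 1.5, a starting point of 0.0, and the same step size of 0.8. On this steeper curve each jump overshoots by more than the distance it started from, so the updates bounce from side to side while moving away from the minimum.

```python
def descend(start, lr, n_steps, curvature, target=2.0):
    """Run gradient descent on the toy loss curvature * (w - target)**2.

    `curvature` controls how steep the bowl is; all values used here are
    illustrative assumptions.
    """
    w = start
    visited = [w]
    for _ in range(n_steps):
        grad = 2.0 * curvature * (w - target)  # derivative of the toy loss
        w = w - lr * grad                      # same update rule, same step size
        visited.append(w)
    return visited


steep_path = descend(start=0.0, lr=0.8, n_steps=10, curvature=1.5)

for step, w in enumerate(steep_path):
    print(f"step {step:2d}: w = {w:10.4f}, distance to minimum = {abs(w - 2.0):.4f}")
```

The step size did not change; only the shape of the curve did. That is why a learning rate that merely zigzags on one problem can fail outright on another.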

In our analogy, you moved so fast that you fell down and hit the other side of the valley, then kept going down, bouncing from side to side like a ping-pong ball. Hard to believe, I know, but you definitely don't want that!

46 | Chapter 0: Visualizing Gradient Descent
