
optimizer = optim.Adam(model.parameters(), lr=0.1,
                       betas=(0.9, 0.999), eps=1e-8)
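Just as a reminder of what those arguments control, here is a minimal sketch of Adam's update rule for a single parameter (ignoring weight decay and AMSGrad). The helper name adam_step and its structure are illustrative only, not PyTorch's implementation: the betas drive the two EWMAs, and eps keeps the denominator away from zero.

# Minimal sketch of Adam's update for one parameter (illustrative only;
# optim.Adam is the real implementation)
def adam_step(param, grad, m, v, t, lr=0.1, betas=(0.9, 0.999), eps=1e-8):
    beta1, beta2 = betas
    m = beta1 * m + (1 - beta1) * grad           # EWMA of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # EWMA of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias-corrected EWMAs
    v_hat = v / (1 - beta2 ** t)
    adapted_grad = m_hat / (v_hat ** 0.5 + eps)  # the "adapted" gradient
    return param - lr * adapted_grad, m, v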

Visualizing Adapted Gradients

Now, I’d like to give you the chance to visualize the gradients, the EWMAs, and the resulting adapted gradients. To make it easier, let’s bring back our simple linear regression problem from Part I of this book and, somewhat nostalgically, perform the training loop so that we can log the gradients.

From now on and until the end of the "Learning Rates" section, we’ll be ONLY using the simple linear regression dataset to illustrate the effects of different parameters on the minimization of the loss. We’ll get back to the Rock Paper Scissors dataset in the "Putting It All Together" section.

First, we generate the data points again and run the typical data preparation step (building the dataset, splitting it, and building the data loaders):

Data Generation & Preparation

%run -i data_generation/simple_linear_regression.py
%run -i data_preparation/v2.py
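In case you don’t have those companion scripts at hand, here is a rough sketch of what they do, assuming the same synthetic dataset from Part I (true_b = 1, true_w = 2, plus Gaussian noise) and a preparation pipeline that builds a dataset, splits it, and creates the data loaders; the actual scripts may differ in details such as seeds and batch size.

import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Rough equivalent of the data generation script (assumed values from Part I)
true_b, true_w, N = 1, 2, 100
np.random.seed(42)
x = np.random.rand(N, 1)
y = true_b + true_w * x + .1 * np.random.randn(N, 1)

# Rough equivalent of data_preparation/v2.py: tensors, dataset, split, loaders
torch.manual_seed(13)
x_tensor = torch.as_tensor(x).float()
y_tensor = torch.as_tensor(y).float()
dataset = TensorDataset(x_tensor, y_tensor)
n_train = int(N * .8)
train_data, val_data = random_split(dataset, [n_train, N - n_train])
train_loader = DataLoader(train_data, batch_size=16, shuffle=True)
val_loader = DataLoader(val_data, batch_size=16)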

Then, we go over the model configuration and change the optimizer from SGD to Adam:

Model Configuration

torch.manual_seed(42)
model = nn.Sequential()
model.add_module('linear', nn.Linear(1, 1))
optimizer = optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss(reduction='mean')

We would be ready to use the StepByStep class to train our model if it weren’t for a minor detail: we still do not have a way of logging gradients. So, let’s tackle this issue by adding yet another method to our class: capture_gradients(). Like the attach_hooks() method, it will take a list of layers that should be monitored for their gradients.
