Model Configuration (2)

loss_fn = nn.BCEWithLogitsLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)
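For reference, the training code below assumes a model whose linear layers are named h1, h2, and so on, and whose activation layers are named a1, a2, and so on. Here is a minimal sketch of how such a model could be built; the build_model helper, the layer sizes, and the choice of activation function are all assumptions for illustration, not the book's actual code:

import torch.nn as nn

def build_model(n_features, n_layers, units, activation_fn):
    # Hypothetical helper: stacks Linear + activation blocks named
    # h1, a1, h2, a2, ..., followed by a single-logit output layer
    model = nn.Sequential()
    for i in range(1, n_layers + 1):
        in_features = n_features if i == 1 else units
        model.add_module(f'h{i}', nn.Linear(in_features, units))
        model.add_module(f'a{i}', activation_fn())
    # No sigmoid at the end: BCEWithLogitsLoss expects raw logits
    model.add_module('output', nn.Linear(units, 1))
    return model

model = build_model(n_features=10, n_layers=5, units=10, activation_fn=nn.Sigmoid)

Naming the modules explicitly is what allows the capture methods below to target individual layers by name.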

Weights, Activations, and Gradients

To visualize what’s happening with the weights, the activation values, and the gradients, we need to capture these values first. Luckily, we already have the appropriate methods for these tasks: capture_parameters(), attach_hooks(), and capture_gradients(), respectively. We only need to create an instance of our StepByStep class, configure these methods to capture these values for the corresponding layers, and train it for a single epoch:

Model Training

hidden_layers = [f'h{i}' for i in range(1, n_layers + 1)]
activation_layers = [f'a{i}' for i in range(1, n_layers + 1)]

sbs = StepByStep(model, loss_fn, optimizer)
sbs.set_loaders(ball_loader)
sbs.capture_parameters(hidden_layers)
sbs.capture_gradients(hidden_layers)
sbs.attach_hooks(activation_layers)
sbs.train(1)

Since we’re not using mini-batches this time, training the model for one epoch will use all data points to:

• perform one forward pass, thus capturing the initial weights and generating activation values; and
• perform one backward pass, thus computing the gradients.
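Once that single epoch is done, the captured values can be read straight off the StepByStep instance. The attribute names below (_parameters, _gradients, and visualization) are assumptions about where the capture methods store their results, so treat this as a sketch:

# Sketch only: attribute names are assumed, not confirmed by the text
weights = sbs._parameters        # parameters captured during the forward pass
grads = sbs._gradients           # gradients captured during the backward pass
activations = sbs.visualization  # activation values captured by the hooks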

To make matters even easier, I’ve also created a function, get_plot_data(), that takes a data loader and a model as arguments and returns the captured values after training it for a single epoch. This way, you can experiment with different models too!
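In case you’re curious about what such a function could look like, here is a minimal sketch that simply wraps the steps above. The loss function, optimizer, and the h{i}/a{i} naming convention mirror this section’s code, and the returned attribute names are the same assumptions as before, not the book’s actual implementation:

def get_plot_data(train_loader, model):
    # Sketch only: wraps the capture-and-train steps from this section
    loss_fn = nn.BCEWithLogitsLoss()
    optimizer = optim.SGD(model.parameters(), lr=1e-2)

    # Derive layer names from the model's named modules, following
    # the h{i} (linear) / a{i} (activation) naming convention
    hidden_layers = [name for name, _ in model.named_modules()
                     if name.startswith('h')]
    activation_layers = [name for name, _ in model.named_modules()
                         if name.startswith('a')]

    sbs = StepByStep(model, loss_fn, optimizer)
    sbs.set_loaders(train_loader)
    sbs.capture_parameters(hidden_layers)
    sbs.capture_gradients(hidden_layers)
    sbs.attach_hooks(activation_layers)
    sbs.train(1)  # a single epoch, as described above

    # Returned attribute names are assumptions (see the note above)
    return sbs._parameters, sbs._gradients, sbs.visualization

A call like get_plot_data(ball_loader, model) would then hand back the weights, gradients, and activations for plotting.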

