
just did), or, if you are training a deeper model from scratch, it is probably best to initialize the layers manually so you have total control over the process.

Don’t worry too much about initialization schemes just now! This is a somewhat advanced topic already, but I thought it was worth introducing after going over batch normalization. As I mentioned before, you’re likely using transfer learning with deeper models anyway.

"What if I really want to try initializing the weights myself—how can I

do it?"

Let’s go over a simple example. Let’s say you’d like to initialize all linear layers using the Kaiming uniform scheme with the proper nonlinearity function for the weights, and set all the biases to zero. You’ll have to build a function that takes a layer as its argument:

Weight Initialization

def weights_init(m):
    if isinstance(m, nn.Linear):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)

The function may set both weight and bias attributes of the layer passed as the argument. Notice that both methods from nn.init used to initialize the attributes have an underscore at the end, so they are making changes in-place.
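If you want to see the in-place behavior for yourself, here is a minimal sketch (the layer sizes are arbitrary, chosen just for illustration): it initializes a single linear layer and checks that the weight tensor is still the very same object afterward.

import torch.nn as nn

layer = nn.Linear(10, 5)
weight_before = layer.weight  # keep a reference to the weight tensor
nn.init.kaiming_uniform_(layer.weight, nonlinearity='relu')
nn.init.zeros_(layer.bias)
# the trailing-underscore methods modify the existing tensors in place,
# so the old reference still points to the (re-initialized) weights
print(weight_before is layer.weight)  # True
print(layer.bias)                     # all zeros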

To actually apply the initialization scheme to your model, simply call the apply() method of your model, and it will recursively apply the initialization function to all its internal layers:

Model Configuration (3)

with torch.no_grad():
    model.apply(weights_init)

You should also use the no_grad() context manager while initializing or modifying the weights and biases of your model.
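Just to illustrate, here is a minimal sketch putting the two snippets together. The tiny two-layer model is only a stand-in for whatever architecture you are configuring, and it reuses the weights_init() function defined above.

import torch
import torch.nn as nn

torch.manual_seed(42)
# a stand-in model: apply() visits the Sequential container itself
# and every module inside it, passing each one to weights_init()
model = nn.Sequential(
    nn.Linear(10, 4),
    nn.ReLU(),
    nn.Linear(4, 1),
)

with torch.no_grad():
    model.apply(weights_init)

# both linear layers were initialized; the ReLU was simply skipped
print(model[0].bias)  # zeros
print(model[2].bias)  # zeros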

