Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)


        if callable(self.clipping):                                    # 1
            self.clipping()                                            # 1
        # Step 4 - Updates parameters
        self.optimizer.step()
        self.optimizer.zero_grad()
        # Returns the loss
        return loss.item()
    # Returns the function that will be called inside the train loop
    return perform_train_step_fn

setattr(StepByStep, 'set_clip_grad_value', set_clip_grad_value)
setattr(StepByStep, 'set_clip_grad_norm', set_clip_grad_norm)
setattr(StepByStep, 'remove_clip', remove_clip)
setattr(StepByStep, '_make_train_step_fn', _make_train_step_fn)

1 Gradient clipping after computing gradients and before updating parameters

The gradient clipping methods above work just fine for most models, but they are of little use for recurrent neural networks (we'll discuss them in Chapter 8), which require gradients to be clipped during backpropagation. Fortunately, we can use the backward hooks code to accomplish that:

StepByStep Method

def set_clip_backprop(self, clip_value):
    if self.clipping is None:
        self.clipping = []
    for p in self.model.parameters():
        if p.requires_grad:
            func = lambda grad: torch.clamp(grad, -clip_value, clip_value)
            handle = p.register_hook(func)
            self.clipping.append(handle)

def remove_clip(self):
    if isinstance(self.clipping, list):
        for handle in self.clipping:
            handle.remove()
    self.clipping = None

setattr(StepByStep, 'set_clip_backprop', set_clip_backprop)
setattr(StepByStep, 'remove_clip', remove_clip)

The method above attaches hooks to all of the model's parameters and performs gradient clipping on the fly, as the gradients are being computed. We also adjusted the remove_clip() method so it removes any handles associated with the hooks.

Model Configuration & Training

Let's use the method below to initialize the weights, so we can reset the parameters of our model whose gradients exploded:

Weight Initialization

def weights_init(m):
    if isinstance(m, nn.Linear):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)
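The sketch below ties these pieces together: it resets a model's weights with weights_init() via model.apply(), attaches the clipping hooks, and removes them afterwards. The model, loss function, optimizer, and the commented-out set_loaders()/train() calls are hypothetical placeholders, assuming the usual StepByStep constructor from earlier chapters.

import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(42)
# Hypothetical model used only for illustration
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))
model.apply(weights_init)      # resets every nn.Linear layer using weights_init()

loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

sbs = StepByStep(model, loss_fn, optimizer)
sbs.set_clip_backprop(1.0)     # clamps each gradient to [-1.0, 1.0] during backprop
# sbs.set_loaders(train_loader, val_loader)   # loaders assumed to exist
# sbs.train(n_epochs=10)
sbs.remove_clip()              # detaches the hooks once training is done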

