Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide


The procedure is exactly the same, whether you are checkpointing a partially trained model to resume training later or saving a fully trained model to deploy and make predictions.

OK, what about loading it back? In that case, it will be a bit different, depending on what you're doing.

Resuming Training

If we're starting fresh (as if we had just turned on the computer and started Jupyter), we have to set the stage before actually loading the model. This means we need to load the data and configure the model.

Luckily, we have code for that already: Data Preparation V2 and Model Configuration V3:

Notebook Cell 2.5

%run -i data_preparation/v2.py
%run -i model_configuration/v3.py

Let's double-check that we do have an untrained model:

print(model.state_dict())

Output

OrderedDict([('0.weight', tensor([[0.7645]], device='cuda:0')),
             ('0.bias', tensor([0.8300], device='cuda:0'))])

Now we are ready to load the model back, which is easy:

• load the dictionary back using torch.load()
• load model and optimizer state dictionaries back using the load_state_dict() method
• load everything else into their corresponding variables
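For reference, a minimal sketch of how such a checkpoint could have been saved in the first place, assuming the training loop kept track of the current epoch and of losses and val_losses lists (hypothetical names here; the dictionary keys match the ones read back in Notebook Cell 2.6 below):

# a minimal sketch of saving a checkpoint; assumes `epoch`, `losses`, and
# `val_losses` were kept by the training loop
checkpoint = {'epoch': epoch,
              'model_state_dict': model.state_dict(),
              'optimizer_state_dict': optimizer.state_dict(),
              'loss': losses,
              'val_loss': val_losses}

# serialize the whole dictionary to disk
torch.save(checkpoint, 'model_checkpoint.pth')

Bundling everything into a single dictionary is what makes restoring the model, the optimizer, and the bookkeeping variables a simple matter of indexing into it.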

Notebook Cell 2.6 - Loading checkpoint to resume training

checkpoint = torch.load('model_checkpoint.pth')

model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])

saved_epoch = checkpoint['epoch']
saved_losses = checkpoint['loss']
saved_val_losses = checkpoint['val_loss']

model.train() # always use TRAIN for resuming training

Never forget to set the mode!

print(model.state_dict())

Output

OrderedDict([('0.weight', tensor([[1.9448]], device='cuda:0')),
             ('0.bias', tensor([1.0295], device='cuda:0'))])

Cool, we recovered our model's state, and we can resume training.

After loading a model to resume training, make sure you ALWAYS set it to training mode:

model.train()

In our example, this is going to be redundant because our train_step_fn() function already does it. But it is important to pick up the habit of setting the mode of the model accordingly.

Next, we can run Model Training V5 to train it for another 200 epochs.

"Why 200 more epochs? Can't I choose a different number?"

Well, you could, but you'd have to change the code in Model Training V5. This clearly isn't ideal, but we will make our model training code more flexible very soon.
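The inflexibility comes from the number of epochs being hard-coded inside the training loop. A minimal sketch of the kind of change being hinted at, where the loop body is a hypothetical stand-in for whatever Model Training V5 actually does (x_train_tensor and y_train_tensor are assumed from earlier in the chapter):

n_epochs = 200  # a variable now, so any number of additional epochs works

# continue the epoch count from where the checkpoint left off
for epoch in range(saved_epoch, saved_epoch + n_epochs):
    # hypothetical stand-in for the real loop body of Model Training V5
    loss = train_step_fn(x_train_tensor, y_train_tensor)
    losses.append(loss)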

