Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)
The procedure is exactly the same, whether you are checkpointing a partially
trained model to resume training later or saving a fully trained model to deploy
and make predictions.

OK, what about loading it back? In that case, it will be a bit different, depending on
what you're doing.

Resuming Training

If we're starting fresh (as if we had just turned on the computer and started
Jupyter), we have to set the stage before actually loading the model. This means
we need to load the data and configure the model.

Luckily, we have code for that already: Data Preparation V2 and Model
Configuration V3:

Notebook Cell 2.5

%run -i data_preparation/v2.py
%run -i model_configuration/v3.py

Let's double-check that we do have an untrained model:

print(model.state_dict())

Output

OrderedDict([('0.weight', tensor([[0.7645]], device='cuda:0')),
             ('0.bias', tensor([0.8300], device='cuda:0'))])

Now we are ready to load the model back, which is easy:

• load the dictionary back using torch.load()
• load the model and optimizer state dictionaries back using the
  load_state_dict() method
• load everything else into their corresponding variables

Saving and Loading Models | 165
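Before loading anything back, it helps to recall what was written to disk in the first place. Here is a minimal, self-contained sketch of the saving side; the model, optimizer, and loss lists below are stand-ins for the objects built by the configuration scripts, but the dictionary keys match the ones the loading code reads back:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Stand-ins for the objects created by model_configuration/v3.py
model = nn.Sequential(nn.Linear(1, 1))
optimizer = optim.SGD(model.parameters(), lr=0.1)
losses, val_losses = [], []  # would be filled in by the training loop

# Bundle everything needed to resume training into one dictionary...
checkpoint = {
    'epoch': 200,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': losses,
    'val_loss': val_losses,
}

# ...and serialize it to disk in one call
torch.save(checkpoint, 'model_checkpoint.pth')
```

Saving the optimizer's state alongside the model's is what makes a true resume possible: optimizers like Adam carry running statistics that would otherwise restart from scratch.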
Notebook Cell 2.6 - Loading checkpoint to resume training

checkpoint = torch.load('model_checkpoint.pth')

model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])

saved_epoch = checkpoint['epoch']
saved_losses = checkpoint['loss']
saved_val_losses = checkpoint['val_loss']

model.train() # always use TRAIN for resuming training[11]

[11] Never forget to set the mode!

print(model.state_dict())

Output

OrderedDict([('0.weight', tensor([[1.9448]], device='cuda:0')),
             ('0.bias', tensor([1.0295], device='cuda:0'))])

Cool, we recovered our model's state, and we can resume training.

After loading a model to resume training, make sure you
ALWAYS set it to training mode:

model.train()

In our example, this is going to be redundant because our
train_step_fn() function already does it. But it is important to
pick up the habit of setting the mode of the model accordingly.

Next, we can run Model Training V5 to train it for another 200 epochs.

"Why 200 more epochs? Can't I choose a different number?"

Well, you could, but you'd have to change the code in Model Training V5. This
clearly isn't ideal, but we will make our model training code more flexible very
soon, so please bear with me for now.

166 | Chapter 2: Rethinking the Training Loop
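To convince yourself that load_state_dict() really does restore the trained weights, you can run a small round trip end to end. This is a toy CPU-only sketch (the file name and models below are made up for illustration, not the book's actual artifacts): save one model's state, create a freshly initialized model, load the state into it, and check the parameters now match:

```python
import torch
import torch.nn as nn

torch.manual_seed(42)

# Pretend this model has already been trained
trained = nn.Sequential(nn.Linear(1, 1))
torch.save({'model_state_dict': trained.state_dict()}, 'toy_checkpoint.pth')

# A freshly created model starts with different random weights...
fresh = nn.Sequential(nn.Linear(1, 1))

# ...until we load the saved state dictionary into it
checkpoint = torch.load('toy_checkpoint.pth')
fresh.load_state_dict(checkpoint['model_state_dict'])
fresh.train()  # back to training mode before resuming

# Both models now hold identical parameters
for p_trained, p_fresh in zip(trained.parameters(), fresh.parameters()):
    assert torch.equal(p_trained, p_fresh)
```

One practical note: the book's checkpoints were saved from a GPU (the tensors show device='cuda:0'), so restoring them on a CPU-only machine requires passing map_location=torch.device('cpu') to torch.load().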