Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner’s Guide
• Step 1: compute model’s predictions
• Step 2: compute the loss
• Step 3: compute the gradients
• Step 4: update the parameters

This sequence is repeated over and over until the number of epochs is reached.

The corresponding code for this part also comes from Notebook Cell 1.10, lines 17-36.

"What happened to the random initialization step?"

Since we are not manually creating parameters anymore, the initialization is handled inside each layer during model creation.

Define - Model Training V0

%%writefile model_training/v0.py

# Defines number of epochs
n_epochs = 1000

for epoch in range(n_epochs):
    # Sets model to TRAIN mode
    model.train()

    # Step 1 - Computes model's predicted output - forward pass
    yhat = model(x_train_tensor)

    # Step 2 - Computes the loss
    loss = loss_fn(yhat, y_train_tensor)

    # Step 3 - Computes gradients for both "b" and "w" parameters
    loss.backward()

    # Step 4 - Updates parameters using gradients and the learning rate
    optimizer.step()
    optimizer.zero_grad()
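The loop above only trains the model; it assumes the tensors, model, loss function, and optimizer were already created in the data preparation and model configuration parts. The sketch below shows a minimal version of that setup; the synthetic data, seed, learning rate, and device handling here are assumptions for illustration, not the exact notebook cells.

import torch
import torch.nn as nn
import torch.optim as optim

# Everything the training loop refers to must already exist in the session
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Synthetic data standing in for the training set (values are assumptions)
torch.manual_seed(42)
x_train_tensor = torch.rand(80, 1, device=device)
y_train_tensor = 1 + 2 * x_train_tensor + 0.1 * torch.randn(80, 1, device=device)

# A single Linear layer wrapped in Sequential, which is why the parameters
# show up as '0.weight' and '0.bias' in state_dict()
model = nn.Sequential(nn.Linear(1, 1)).to(device)

# Loss function and optimizer; the learning rate is an assumption here
loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=0.1)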
Run - Model Training V0

%run -i model_training/v0.py

One last check to make sure we have everything right:

print(model.state_dict())

Output

OrderedDict([('0.weight', tensor([[1.9690]], device='cuda:0')),
             ('0.bias', tensor([1.0235], device='cuda:0'))])

Now, take a close, hard look at the code inside the training loop.

Ready? I have a question for you then…

"Would this code change if we were using a different optimizer, or loss, or even model?"

Before I give you the answer, let me address something else that may be on your mind: "What is the point of all this?"

Well, in the next chapter we’ll get fancier, using more of PyTorch’s classes (like Dataset and DataLoader) to further refine our data preparation step, and we’ll also try to reduce boilerplate code to a minimum. So, splitting our code into three logical parts will allow us to better handle these improvements.

And here is the answer: NO, the code inside the loop would not change.

I guess you figured out which boilerplate I was referring to, right?
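To make that "NO" concrete, here is a sketch of the same loop with a different model, loss, and optimizer plugged in. These particular choices (a small two-layer network, L1 loss, Adam) are assumptions for illustration only; the point is that the loop body stays identical, line for line.

import torch.nn as nn
import torch.optim as optim

# Different configuration choices (illustrative assumptions only);
# the tensors and device come from the setup sketch shown earlier
model = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1)).to(device)
loss_fn = nn.L1Loss()                                # instead of MSELoss
optimizer = optim.Adam(model.parameters(), lr=0.01)  # instead of SGD

# The training loop itself does not change at all
n_epochs = 1000
for epoch in range(n_epochs):
    model.train()
    yhat = model(x_train_tensor)
    loss = loss_fn(yhat, y_train_tensor)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()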