Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide
piece of code that's going to be used repeatedly into its own function: the mini-batch inner loop!

The inner loop depends on three elements:

• the device where data is being sent
• a data loader to draw mini-batches from
• a step function, returning the corresponding loss

Taking these elements as inputs and using them to perform the inner loop, we'll end up with a function like this:

Helper Function #2

def mini_batch(device, data_loader, step_fn):
    mini_batch_losses = []
    for x_batch, y_batch in data_loader:
        x_batch = x_batch.to(device)
        y_batch = y_batch.to(device)

        mini_batch_loss = step_fn(x_batch, y_batch)
        mini_batch_losses.append(mini_batch_loss)

    loss = np.mean(mini_batch_losses)
    return loss

In the last section, we realized that we were executing five times more updates (the train_step_fn() function) per epoch due to the mini-batch inner loop. Before, 1,000 epochs meant 1,000 updates. Now, we only need 200 epochs to perform the same 1,000 updates.

What does our training loop look like now? It's very lean!

Run - Data Preparation V1, Model Configuration V1

%run -i data_preparation/v1.py
%run -i model_configuration/v1.py
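Note that mini_batch() assumes NumPy has already been imported as np, and that step_fn returns a plain number rather than a tensor, so the collected losses can be averaged with np.mean(). For reference, here is a minimal sketch of what a step function builder compatible with it might look like; the names (make_train_step_fn, perform_train_step_fn) and exact structure are illustrative, assuming model, loss_fn, and optimizer were created in the model configuration part:

def make_train_step_fn(model, loss_fn, optimizer):
    # Builds a function that performs one update using a single mini-batch
    def perform_train_step_fn(x, y):
        # Sets model to TRAIN mode
        model.train()
        # Step 1 - Computes model's predictions (forward pass)
        yhat = model(x)
        # Step 2 - Computes the loss
        loss = loss_fn(yhat, y)
        # Step 3 - Computes gradients for the parameters
        loss.backward()
        # Step 4 - Updates parameters and zeroes gradients
        optimizer.step()
        optimizer.zero_grad()
        # Returns a plain Python float, so mini_batch() can average it
        return loss.item()
    return perform_train_step_fn

A step function built this way is what train_step_fn refers to in the training loop below.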
Define - Model Training V3

%%writefile model_training/v3.py

# Defines number of epochs
n_epochs = 200

losses = []

for epoch in range(n_epochs):
    # inner loop
    loss = mini_batch(device, train_loader, train_step_fn)  # (1)
    losses.append(loss)

(1) Performing mini-batch gradient descent
Run - Model Training V3
%run -i model_training/v3.py
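Since each epoch now performs one parameter update per mini-batch, the number of updates per epoch is simply the number of batches the data loader yields. A quick, illustrative sanity check of the arithmetic above (it assumes train_loader was built in Data Preparation V1):

# Number of parameter updates performed in each epoch
updates_per_epoch = len(train_loader)
print(updates_per_epoch)             # expected: 5 mini-batches per epoch
print(n_epochs * updates_per_epoch)  # 200 epochs x 5 updates = 1,000 updates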
After updating the model training part, our current state of
development is:
• Data Preparation V1
• Model Configuration V1
• Model Training V3
Let’s inspect the model’s state:
# Checks model's parameters
print(model.state_dict())
Output
OrderedDict([('0.weight', tensor([[1.9687]], device='cuda:0')),
('0.bias', tensor([1.0236], device='cuda:0'))])
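The parameter names in the output, '0.weight' and '0.bias', are prefixed with the index of the layer inside the model. This is what we would get from a Sequential container holding a single linear layer at index 0; the snippet below is a minimal sketch of such a configuration (the actual contents of model_configuration/v1.py are not shown in this excerpt, so treat it as an assumption):

import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(42)
# A single Linear layer wrapped in a Sequential model: the layer sits at
# index 0, so its parameters show up as '0.weight' and '0.bias'
model = nn.Sequential(nn.Linear(1, 1)).to(device)

print(model.state_dict())  # keys: '0.weight' and '0.bias'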
So far, we’ve focused on the training data only. We built a dataset and a data loader