Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)
Output

<torch.utils.data.dataset.Subset at 0x7fc6e7944290>

Each subset contains the corresponding indices as an attribute:

train_idx.indices

Output

[118,
170,
...
10,
161]

Next, each Subset object is used as an argument to the corresponding sampler:

train_sampler = SubsetRandomSampler(train_idx)
val_sampler = SubsetRandomSampler(val_idx)

So, we can use a single dataset from which to load the data, since the split is controlled by the samplers. But we still need two data loaders, each using its corresponding sampler:

# Builds a loader for each set
train_loader = DataLoader(
    dataset=dataset, batch_size=16, sampler=train_sampler
)
val_loader = DataLoader(
    dataset=dataset, batch_size=16, sampler=val_sampler
)

If you're using a sampler, you cannot set shuffle=True.

Data Preparation | 287

We can also check if the loaders are returning the correct number of mini-batches:

len(iter(train_loader)), len(iter(val_loader))

Output

(15, 4)

There are 15 mini-batches in the training loader (15 mini-batches * 16 batch size = 240 data points), and four mini-batches in the validation loader (4 mini-batches * 16 batch size = 64 data points). In the validation set, the last mini-batch will have only 12 points, since there are only 60 points in total.

OK, cool, this means we don't need two (split) datasets anymore; we only need two samplers. Right? Well, it depends.

Data Augmentation Transforms

No, I did not change topics :-) The reason why we may still need two split datasets is exactly that: data augmentation. In general, we want to apply data augmentation to the training data only (yes, there is test-data augmentation too, but that's a different matter). Data augmentation is accomplished by composing transforms, which will be applied to all points in the dataset. See the problem? If we need some data points to be augmented, but not others, the easiest way to accomplish this is to create two composers and use them in two different datasets. We can still use the indices, though:

# Uses indices to perform the split
x_train_tensor = x_tensor[train_idx]
y_train_tensor = y_tensor[train_idx]
x_val_tensor = x_tensor[val_idx]
y_val_tensor = y_tensor[val_idx]

Then come the two composers: the train_composer() augments the data and then scales it (min-max); the val_composer() only scales the data (min-max).

288 | Chapter 4: Classifying Images
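Putting the single-dataset pieces together, here is a minimal, self-contained sketch. The names (x_tensor, dataset, train_idx) mirror the chapter's, but the data is synthetic; the split is done over a tensor of indices, so the resulting Subsets contain indices, just like the chapter's helper produces:

```python
import torch
from torch.utils.data import (DataLoader, TensorDataset,
                              SubsetRandomSampler, random_split)

torch.manual_seed(13)

# Synthetic stand-in data: 300 points, one feature, binary labels
x_tensor = torch.rand(300, 1)
y_tensor = (x_tensor > 0.5).float()
dataset = TensorDataset(x_tensor, y_tensor)

# Split the INDICES, not the data: random_split over torch.arange(300)
# returns Subsets whose items are the shuffled indices themselves
train_idx, val_idx = random_split(torch.arange(300), [240, 60])

# Each Subset of indices feeds a sampler
train_sampler = SubsetRandomSampler(train_idx)
val_sampler = SubsetRandomSampler(val_idx)

# A single dataset feeds both loaders; the samplers control the split
# (so shuffle=True must NOT be set here)
train_loader = DataLoader(
    dataset=dataset, batch_size=16, sampler=train_sampler
)
val_loader = DataLoader(
    dataset=dataset, batch_size=16, sampler=val_sampler
)

# 240 / 16 = 15 full mini-batches; 60 points -> 3 full + 1 partial = 4
print(len(train_loader), len(val_loader))  # 15 4
```

Each epoch, the samplers draw the same index sets in a fresh random order, so the train/validation membership stays fixed while the mini-batch composition varies.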