Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub
argument of nn.BCEWithLogitsLoss(). To compensate for the imbalance, one can
set the weight to equal the ratio of negative to positive examples (N_neg / N_pos).
In our imbalanced dummy example, the result would be 3.0. This way, every point
in the positive class would have its corresponding loss multiplied by three. Since
there is a single label for each data point (c = 1), the tensor used as an argument for
pos_weight has only one element: tensor([3.0]). We could compute it like this:

n_neg = (dummy_imb_labels == 0).sum().float()
n_pos = (dummy_imb_labels == 1).sum().float()
pos_weight = (n_neg / n_pos).view(1,)
pos_weight

Output
tensor([3.])

Now, let’s create yet another loss function, including the pos_weight argument this
time:

loss_fn_imb = nn.BCEWithLogitsLoss(reduction='mean', pos_weight=pos_weight)

Then, we can use this weighted loss function to compute the loss for our
imbalanced dataset. I guess one would expect the same loss as before; after all,
this is a weighted loss. Right?

loss = loss_fn_imb(dummy_imb_logits, dummy_imb_labels)
loss
Output
tensor(0.2464)
Wrong! It was 0.1643 when we had two data points, one of each class. Now it is
0.2464, even though we assigned a weight to the positive class.
"Why is it different?"
Well, it turns out, PyTorch does not compute a weighted average. Here’s what you
would expect from a weighted average:
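Equation 3.16 - Weighted average of losses
$$
\text{weighted average} = \frac{\text{pos\_weight} \cdot \sum_{n=1}^{N_{\text{pos}}} \text{loss}_n + \sum_{n=1}^{N_{\text{neg}}} \text{loss}_n}{\text{pos\_weight} \cdot N_{\text{pos}} + N_{\text{neg}}}
$$
But this is what PyTorch does:
Equation 3.17 - PyTorch’s BCEWithLogitsLoss
$$
\text{BCEWithLogitsLoss} = \frac{\text{pos\_weight} \cdot \sum_{n=1}^{N_{\text{pos}}} \text{loss}_n + \sum_{n=1}^{N_{\text{neg}}} \text{loss}_n}{N_{\text{pos}} + N_{\text{neg}}}
$$
See the difference in the denominator? Of course, if you multiply the losses of the
positive examples without multiplying their count ($N_{\text{pos}}$), you’ll end up with a
number larger than an actual weighted average.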
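The effect of the two denominators can be checked by hand. The sketch below uses plain Python instead of PyTorch, and it assumes dummy data that is not shown in this excerpt: one positive point predicted with probability 0.9 and three negative points predicted with probability 0.2. These values are an assumption chosen because they are consistent with the numbers quoted in the text. The names bce, pytorch_style_mean, and weighted_average are illustrative only.

```python
import math

# Assumed dummy data (not taken from the excerpt): one positive point predicted
# with probability 0.9 and three negative points predicted with probability 0.2.
labels = [1.0, 0.0, 0.0, 0.0]
probs = [0.9, 0.2, 0.2, 0.2]

n_pos = sum(1 for y in labels if y == 1.0)
n_neg = sum(1 for y in labels if y == 0.0)
pos_weight = n_neg / n_pos  # ratio of negative to positive examples


def bce(y, p):
    """Unweighted binary cross-entropy of a single data point."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# Numerator shared by both formulas: positive losses are scaled by pos_weight
numerator = sum(
    (pos_weight if y == 1.0 else 1.0) * bce(y, p)
    for y, p in zip(labels, probs)
)

# Dividing by the raw count mimics the reduction='mean' behavior described above
pytorch_style_mean = numerator / (n_pos + n_neg)

# Dividing by the weighted count gives an actual weighted average
weighted_average = numerator / (pos_weight * n_pos + n_neg)

print(f"{pytorch_style_mean:.4f}")  # 0.2464
print(f"{weighted_average:.4f}")    # 0.1643
```

Under these assumptions, only the denominator changes between the two results: dividing by four data points gives the larger value, while dividing by the weighted count of six recovers the loss of the balanced example.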
"What if I really want the weighted average?"
230 | Chapter 3: A Simple Classification Problem