
To make it clear: In this chapter, we're dealing with a single-label binary classification (we have only one label per data point), and the label is binary (there are only two possible values for it, zero or one). If the label is zero, we say it belongs to the negative class. If the label is one, it belongs to the positive class.

Please do not confuse the positive and negative classes of our single label with c, the so-called class number in the documentation. That c corresponds to the number of different labels associated with a data point. In our example, c = 1.

You can use this argument (pos_weight) to handle imbalanced datasets, but there's more to it than meets the eye. We'll get back to it in the next sub-section.

Enough talking (or writing!): Let's see how to use this loss in code. We start by creating the loss function itself:

loss_fn_logits = nn.BCEWithLogitsLoss(reduction='mean')

loss_fn_logits

Output

BCEWithLogitsLoss()

Next, we use logits and labels to compute the loss. Following the same principle as before, logits first, then labels. To keep the example consistent, let's get the values of the logits corresponding to the probabilities we used before, 0.9 and 0.2, using our log_odds_ratio() function, which is recapped right below.
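In case its definition isn't fresh in your mind, log_odds_ratio() simply converts a probability into the corresponding logit; that is, the log of the odds ratio p / (1 - p). A minimal sketch of it, assuming NumPy is imported as np (the actual definition appears earlier in the chapter):

import numpy as np

def log_odds_ratio(prob):
    # logit: the log of the odds ratio p / (1 - p)
    return np.log(prob / (1 - prob))

With that in mind, let's compute the logits: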


logit1 = log_odds_ratio(.9)
logit2 = log_odds_ratio(.2)

dummy_labels = torch.tensor([1.0, 0.0])
dummy_logits = torch.tensor([logit1, logit2])

print(dummy_logits)

Output

tensor([ 2.1972, -1.3863])

We have logits, and we have labels. Time to compute the loss:

loss = loss_fn_logits(dummy_logits, dummy_labels)
loss

Output

tensor(0.1643)

OK, we got the same result, as expected.
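If you'd like to double-check that equivalence yourself, you can convert the logits back into probabilities with a sigmoid and feed them to a plain nn.BCELoss(). A quick sketch (the loss_fn_probs name is just for illustration, assuming the same torch and nn imports used throughout the chapter):

probabilities = torch.sigmoid(dummy_logits)
probabilities  # should print tensor([0.9000, 0.2000])

loss_fn_probs = nn.BCELoss(reduction='mean')
loss_fn_probs(probabilities, dummy_labels)  # should print tensor(0.1643)

Sure enough, the sigmoid recovers the original probabilities, and the loss value matches.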

Imbalanced Dataset

In our dummy example with two data points, we had one of each class: positive and negative. The dataset was perfectly balanced. Let's create another dummy example but with an imbalance, adding two extra data points belonging to the negative class. For the sake of simplicity and to illustrate a quirk in the behavior of nn.BCEWithLogitsLoss(), I will give those two extra points the same logits as the other data point in the negative class. It looks like this:

dummy_imb_labels = torch.tensor([1.0, 0.0, 0.0, 0.0])
dummy_imb_logits = torch.tensor([logit1, logit2, logit2, logit2])

Clearly, this is an imbalanced dataset. There are three times more data points in the negative class than in the positive one. Now, let's turn to the pos_weight argument.

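We'll dig into the details in the next sub-section but, as a rough preview, the usual recipe is to weigh the positive class by the ratio of negative to positive data points. A minimal sketch of it (the n_neg, n_pos, and loss_fn_imb names are just for illustration, and the exact computation in the next sub-section may differ):

# counting data points in each class: three negative, one positive
n_neg = (dummy_imb_labels == 0).sum().float()
n_pos = (dummy_imb_labels == 1).sum().float()

# weighing the positive class by the negative-to-positive ratio
pos_weight = (n_neg / n_pos).view(1)  # tensor([3.])

loss_fn_imb = nn.BCEWithLogitsLoss(reduction='mean', pos_weight=pos_weight)
loss_fn_imb(dummy_imb_logits, dummy_imb_labels)

The next sub-section shows how this weight actually affects the loss value, and where the quirk mentioned above comes in.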
