
IMPORTANT: I can’t stress this enough: you must use the right combination of model and loss function!

Option 1: nn.LogSoftmax as the last layer, meaning your model is producing log probabilities, combined with the nn.NLLLoss() function.

Option 2: No log softmax in the last layer, meaning your model is producing logits, combined with the nn.CrossEntropyLoss() function.

Mixing nn.LogSoftmax and nn.CrossEntropyLoss() is just wrong.
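
Here is a minimal sketch of the two valid pairings (the toy setup below, with ten features, three classes, and names like model1 and model2, is our own assumption, not the book's example):

import torch
import torch.nn as nn

torch.manual_seed(42)
x = torch.randn(16, 10)            # a batch of 16 points with 10 features
y = torch.randint(0, 3, (16,))     # integer labels for 3 classes

linear = nn.Linear(10, 3)          # shared layer, so both models have identical weights
model1 = nn.Sequential(linear, nn.LogSoftmax(dim=-1))  # Option 1: log probabilities
model2 = nn.Sequential(linear)                         # Option 2: raw logits

loss1 = nn.NLLLoss()(model1(x), y)
loss2 = nn.CrossEntropyLoss()(model2(x), y)
print(torch.isclose(loss1, loss2))  # tensor(True)

The only difference is where the log softmax lives: inside the model (Option 1) or inside the loss function (Option 2).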

Now that the difference between the arguments is clear, let’s take a closer look at the nn.CrossEntropyLoss() function. It is a higher-order function, and it takes the same three optional arguments as nn.NLLLoss() (a short sketch exercising all three follows the list):

• reduction: It takes either mean, sum, or none, and the default is mean.

• weight: It takes a tensor of length C; that is, containing as many weights as there are classes.

• ignore_index: It takes one integer, which corresponds to the one (and only one) class index that should be ignored.
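
To make these three arguments concrete, here is a minimal sketch (the tiny tensors below are our own, chosen for illustration only):

import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5,  0.3]])  # two points, three classes (C = 3)
labels = torch.tensor([0, 2])

# reduction='none': one loss per data point instead of their mean
nn.CrossEntropyLoss(reduction='none')(logits, labels)

# weight: a tensor of length C, rescaling each class's contribution
nn.CrossEntropyLoss(weight=torch.tensor([1.0, 1.0, 3.0]))(logits, labels)

# ignore_index=2: points labeled with class 2 are excluded from the loss
nn.CrossEntropyLoss(ignore_index=2)(logits, labels)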

Let’s see a quick example of its usage, taking dummy logits as input:

import torch
import torch.nn as nn

torch.manual_seed(11)
dummy_logits = torch.randn((5, 3))           # five points, three classes
dummy_labels = torch.tensor([0, 0, 1, 2, 1])

loss_fn = nn.CrossEntropyLoss()
loss_fn(dummy_logits, dummy_labels)

Output

tensor(1.6553)

No logsoftmax whatsoever, but the same resulting loss, as expected.
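
As a quick sanity check (a sketch reusing the tensors above), applying the log softmax ourselves and switching to nn.NLLLoss() recovers the exact same value:

import torch.nn.functional as F

log_probs = F.log_softmax(dummy_logits, dim=-1)  # turn logits into log probabilities
nn.NLLLoss()(log_probs, dummy_labels)            # tensor(1.6553), matching the output above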
