LogSoftmax

The logsoftmax function returns, well, the logarithm of the softmax function above. But, instead of manually taking the logarithm, PyTorch provides F.log_softmax() and nn.LogSoftmax out of the box. These functions are fast and also have better numerical properties. But, I guess your main question at this point is:

"Why do I need to take the log of the softmax?"

The simple and straightforward reason is that the loss function expects log probabilities as input.
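As a quick sanity check, here is a minimal sketch using made-up logits: taking the logarithm of the softmax by hand and calling F.log_softmax() produce the same values, the latter just doing it in a single, numerically safer step.

import torch
import torch.nn.functional as F

# made-up logits for two data points and three classes
logits = torch.tensor([[1.3, 0.2, -0.8],
                       [0.1, 2.5, -1.1]])

# taking the logarithm of the softmax "manually"
manual = torch.log(F.softmax(logits, dim=-1))

# letting PyTorch do it in one fused, numerically safer step
log_probs = F.log_softmax(logits, dim=-1)

print(torch.allclose(manual, log_probs))  # True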

Negative Log-Likelihood Loss

Since softmax returns probabilities, logsoftmax returns log probabilities. And that’s the input for computing the negative log-likelihood loss, or nn.NLLLoss() for short. This loss is simply an extension of the binary cross-entropy loss for handling multiple classes.
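Here is a minimal sketch of how the two pieces fit together, using made-up logits and labels: the log probabilities produced by nn.LogSoftmax are fed straight into nn.NLLLoss.

import torch
import torch.nn as nn

# made-up logits for four data points and three classes, plus their labels
logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])

# LogSoftmax turns logits into log probabilities...
log_probs = nn.LogSoftmax(dim=-1)(logits)

# ...which is exactly the input NLLLoss expects
loss_fn = nn.NLLLoss()
print(loss_fn(log_probs, labels))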

This was the formula for computing binary cross-entropy:
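With N_pos points in the positive class and N_neg points in the negative one (using that notation here), the loss can be written as:

$$\text{BCE}(y) = -\frac{1}{N_{pos}+N_{neg}}\left[\sum_{i=1}^{N_{pos}} \log\big(P(y_i=1)\big) + \sum_{i=1}^{N_{neg}} \log\big(P(y_i=0)\big)\right]$$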

Equation 5.8 - Binary cross-entropy

See the log probabilities in the summation terms? In our example, there are three classes; that is, our labels (y) could be either zero, one, or two. So, the loss function will look like this:
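With N_0, N_1, and N_2 points in classes zero, one, and two, respectively:

$$\text{NLL}(y) = -\frac{1}{N_0+N_1+N_2}\left[\sum_{i=1}^{N_0} \log\big(P(y_i=0)\big) + \sum_{i=1}^{N_1} \log\big(P(y_i=1)\big) + \sum_{i=1}^{N_2} \log\big(P(y_i=2)\big)\right]$$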

Equation 5.9 - Negative log-likelihood loss for a three-class classification problem

Take, for instance, the first class (y=0). For every data point belonging to this class (there are N_0 of them), we take the logarithm of the predicted probability for that point and class (log(P(y_i = 0))) and add them all up. Next, we repeat the process for the other two classes, add all three results up, divide by the total number of data points, and flip the sign (that’s the negative in negative log-likelihood).
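That recipe is easy to verify in code. In the sketch below (the log probabilities and labels are made up), picking each point’s log probability for its true class, averaging, and flipping the sign matches what F.nll_loss() returns.

import torch
import torch.nn.functional as F

# made-up log probabilities for five data points and three classes
log_probs = F.log_softmax(torch.randn(5, 3), dim=-1)
labels = torch.tensor([0, 0, 1, 2, 2])

# by hand: take each point's log probability for its true class,
# add them up, divide by the number of points, and flip the sign
picked = log_probs[torch.arange(len(labels)), labels]
manual_nll = -picked.mean()

print(torch.allclose(manual_nll, F.nll_loss(log_probs, labels)))  # True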
