22.02.2024 Views

Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


curves that intersect each other at some point. If that’s the case, there is no clear winner.

One possible solution to this dilemma is to look at the area under the curve: the curve with more area under it wins! Luckily, Scikit-Learn has an auc() (area under the curve) function, which we can use to compute the area under the curves for our (good) model:

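# Area under the curves of our model
# (fpr, tpr, rec, and prec are assumed to be available from the
# earlier ROC and PR curve computations)
from sklearn.metrics import auc

auroc = auc(fpr, tpr)
aupr = auc(rec, prec)
print(auroc, aupr)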

Output

0.9797979797979798 0.9854312354312356
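To make the computation reproducible on its own, here is a minimal sketch under stated assumptions: the labels and probabilities below are made up for illustration (they are not the book's data), and the names y_true, y_prob, and manual_auroc are hypothetical. It builds both curves with Scikit-Learn, computes their areas with auc(), and checks the ROC area against a hand-written trapezoidal sum, which is the rule auc() applies.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve, precision_recall_curve

# Hypothetical labels and predicted probabilities (NOT the book's data)
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.6, 0.55, 0.95])

# ROC curve: false positive rate (x) against true positive rate (y)
fpr, tpr, _ = roc_curve(y_true, y_prob)
# PR curve: note that precision comes first in the returned tuple
prec, rec, _ = precision_recall_curve(y_true, y_prob)

# auc(x, y) integrates y over x using the trapezoidal rule
auroc = auc(fpr, tpr)
aupr = auc(rec, prec)

# The same ROC area computed by hand: sum of trapezoid areas
manual_auroc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)

print(auroc, aupr, manual_auroc)
```

Recall is returned in decreasing order, which auc() accepts as long as the x values are monotonic.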

Very close to the perfect value of one! But then again, this is a test example, so you shouldn’t expect figures so high in real-life problems. What about the random model? It sets the practical floor: the theoretical area under its ROC curve is 0.5, which is the area under the diagonal, and the theoretical area under its PR curve is the proportion of positive samples in the dataset, which is 0.55 in our case. A model scoring below these values would be doing worse than random guessing.

# Area under the curves of the random model
auroc_random = auc(fpr_random, tpr_random)
aupr_random = auc(rec_random, prec_random)
print(auroc_random, aupr_random)

Output

0.505050505050505 0.570559046216941

Close enough; after all, the curves produced by our random model were only roughly approximating the theoretical ones.
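The two baselines can also be checked empirically. The sketch below is an illustration under stated assumptions, not the book's code: it draws a larger, synthetic dataset with roughly 55% positives and assigns scores that are independent of the labels (the names rng, n, pos_rate, and y_random are hypothetical). With enough samples, the ROC area should land near 0.5 and the PR area near the proportion of positives.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve, precision_recall_curve

rng = np.random.default_rng(42)

# Synthetic labels with about 55% positives (assumed, to mirror the text)
n = 20000
pos_rate = 0.55
y_true = (rng.random(n) < pos_rate).astype(int)

# A "random model": scores carry no information about the labels
y_random = rng.random(n)

fpr_random, tpr_random, _ = roc_curve(y_true, y_random)
prec_random, rec_random, _ = precision_recall_curve(y_true, y_random)

auroc_random = auc(fpr_random, tpr_random)
aupr_random = auc(rec_random, prec_random)

# Compare against the two theoretical baselines
print(auroc_random, 0.5)
print(aupr_random, y_true.mean())
```

With only a handful of points, as in the small example in the text, the empirical areas can drift further from these baselines, which is why the figures above are only roughly 0.5 and 0.55.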

258 | Chapter 3: A Simple Classification Problem
