
"If the models are equivalent, how come the weights ended up being

slightly different?"

That’s a fair question. First, remember that every model is randomly initialized. We did use the same random seed, but this was not enough to make both models identical at the beginning. Why not? Simply put, the deep-ish model had many more weights to be initialized, so they couldn’t have been identical at the start.
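A minimal sketch makes the point; here we simply draw as many random values as each model has weights, since the shapes are all that matters:

import torch

torch.manual_seed(42)
w_logistic = torch.randn(25)   # as many draws as the logistic regression has weights

torch.manual_seed(42)
w_deepish = torch.randn(143)   # as many draws as the deep-ish model has weights

# Even starting from the same seed, a 25-element tensor and a 143-element
# tensor can never be element-for-element identical, so the two models
# must differ from the very start.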

It is fairly straightforward to see that the logistic regression model has 25 weights. But how many weights does the deep-ish model have? We could work it out: 25 features times five units in Hidden Layer #0 (125), plus those five units times three units in Hidden Layer #1 (15), plus the last three weights from Hidden Layer #1 to the Output Layer, adding up to a total of 143.
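To make that arithmetic concrete, here is a minimal sketch of a model with those dimensions (the bias-free linear layers and the ReLU activations are assumptions made for illustration; biases are not part of the count above):

import torch.nn as nn

# Hypothetical reconstruction of the deep-ish model's dimensions
deepish = nn.Sequential(
    nn.Linear(25, 5, bias=False),  # Hidden Layer #0: 25 x 5 = 125 weights
    nn.ReLU(),
    nn.Linear(5, 3, bias=False),   # Hidden Layer #1: 5 x 3 = 15 weights
    nn.ReLU(),
    nn.Linear(3, 1, bias=False),   # Output Layer: 3 x 1 = 3 weights
)
# Total: 125 + 15 + 3 = 143 weights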

Or we could just use PyTorch’s numel() instead to return the total number of elements (clever, right?) in a tensor. Even better, let’s make it a method of our StepByStep class, and take only gradient-requiring tensors, so we count only those weights that need to be updated.

StepByStep Method

def count_parameters(self):
    # Sum the number of elements in every parameter tensor
    # that requires gradients, i.e., every trainable weight
    return sum(p.numel()
               for p in self.model.parameters()
               if p.requires_grad)

setattr(StepByStep, 'count_parameters', count_parameters)

Right now, it is all of them, sure, but that will not necessarily be the case anymore when we use transfer learning in Chapter 7 (see the sketch after the output below).

sbs_logistic.count_parameters(), sbs_nn.count_parameters()

Output

(25, 143)
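To preview how that plays out, here is a minimal sketch; the model below is illustrative, not one from the book. Freezing a layer's parameters, as one typically does in transfer learning, removes them from the count:

import torch.nn as nn

# Illustrative model, not the book's
model = nn.Sequential(
    nn.Linear(25, 5, bias=False),
    nn.ReLU(),
    nn.Linear(5, 1, bias=False),
)
# Freeze the first layer's parameters in place
for param in model[0].parameters():
    param.requires_grad_(False)

sum(p.numel() for p in model.parameters() if p.requires_grad)
# 5: only the last layer's weights are still trainable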

