
Even though I chose b and w to represent these parameters, so it becomes even clearer that there is nothing special about this transformation, you'll find them represented as beta and gamma, respectively, in the literature. Moreover, the terms may appear in a different order, like this:

$$\text{BN}(x) = \gamma \, \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$$

Equation 7.5 - Batch normalization with affine transformation
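As a minimal sketch of this correspondence (assuming PyTorch's nn.BatchNorm1d, whose learnable weight and bias play the roles of gamma and beta):

```python
import torch.nn as nn

# BatchNorm1d for two features; affine=True (the default) creates the
# learnable parameters of the affine transformation
bn = nn.BatchNorm1d(num_features=2)

# bn.weight plays the role of gamma (initialized to ones), and
# bn.bias plays the role of beta (initialized to zeros)
print(bn.weight)  # Parameter containing: tensor([1., 1.], requires_grad=True)
print(bn.bias)    # Parameter containing: tensor([0., 0.], requires_grad=True)
```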

We're now leaving the affine transformation aside and focusing on a different aspect of batch normalization: It not only computes statistics for each mini-batch, but also keeps track of…

Running Statistics

Since batch normalization computes statistics on mini-batches, and mini-batches contain a small number of points, these statistics are likely to fluctuate a lot. The smaller the mini-batches, the more the statistics will fluctuate. But, more important, which statistics should it use for unseen data (like the data points in the validation set)?

During the evaluation phase (or when the model is already trained and deployed), there are no mini-batches. It is perfectly natural to feed the model a single input to get its prediction. Clearly, there are no statistics for a single data point: It is its own mean, and the variance is zero. How can you standardize that? You can't! So, I repeat the question:

"Which statistics should the model use when applying batch

normalization to unseen data?"

What about keeping track of running statistics (that is, moving averages of the statistics)? It is a good way of smoothing the fluctuations. Besides, every data point will have a chance to contribute to the statistics. That's what batch normalization does.
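As a rough sketch of the idea (PyTorch's batch norm layers use an exponential moving average with a momentum argument, 0.1 by default; the function below is an illustrative stand-in, not PyTorch's internal code):

```python
def update_running(running, batch_stat, momentum=0.1):
    # Exponential moving average: each mini-batch nudges the running
    # statistic toward its own statistic; PyTorch's batch norm layers
    # use momentum=0.1 by default.
    return (1 - momentum) * running + momentum * batch_stat

# hypothetical example: the running mean starts at zero, and a
# mini-batch with mean 1.0 moves it only a tenth of the way there
print(update_running(0.0, 1.0))  # 0.1
```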

Let's see it in action using code. We'll use a dummy dataset with 200 random data points and two features:
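A minimal sketch of what such a snippet might look like (assuming PyTorch's nn.BatchNorm1d; the seed, scaling, and batch size here are illustrative choices, not necessarily the book's actual listing):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)  # illustrative seed

# dummy dataset: 200 random data points with two features
x = torch.randn(200, 2) * 1.5 + 1.0
loader = DataLoader(TensorDataset(x), batch_size=32, shuffle=True)

bn = nn.BatchNorm1d(num_features=2)

# before seeing any data, the running statistics sit at their
# defaults: mean zero and variance one
print(bn.running_mean, bn.running_var)

# in training mode, every forward pass standardizes the mini-batch
# using its own statistics and updates the running statistics
bn.train()
batch = next(iter(loader))[0]
normalized = bn(batch)
print(bn.running_mean, bn.running_var)

# in evaluation mode, the layer uses the running statistics instead
bn.eval()
```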

