Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

Zero Mean and Unit Standard Deviation

Let’s start with the unit standard deviation; that is, scaling each feature’s

values such that their standard deviation equals one. This is one of the most

important pre-processing steps, not only for the sake of improving the

performance of gradient descent, but for other techniques such as principal

component analysis (PCA) as well. The goal is to have all numerical features

on a similar scale, so the results are not affected by the original range of each

feature.

Think of two common features in a model: age and salary. While age usually

varies between 0 and 110, salaries can go from the low hundreds (say, 500)

to several thousand (say, 9,000). If we compute the corresponding standard

deviations, we may get values like 25 and 2,000, respectively. Thus, we need

to standardize both features to have them on equal footing.
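To make the age and salary comparison concrete, here is a minimal NumPy sketch. The data is hypothetical (uniform random values drawn over the ranges mentioned above), and the `standardize` helper is an illustrative name, not something from the book. It only shows that two features with very different spreads end up on the same scale once each is centered and divided by its own standard deviation.

```python
import numpy as np

# Hypothetical data: made-up values covering the ranges mentioned in the text
rng = np.random.default_rng(42)
age = rng.uniform(0, 110, size=1000)
salary = rng.uniform(500, 9000, size=1000)

def standardize(values):
    # Subtract the mean, then divide by the (population) standard deviation
    return (values - values.mean()) / values.std()

scaled_age = standardize(age)
scaled_salary = standardize(salary)

# The raw standard deviations differ by orders of magnitude
print(age.std(), salary.std())
# After standardizing, both features have zero mean and unit standard deviation
print(scaled_age.mean(), scaled_age.std())
print(scaled_salary.mean(), scaled_salary.std())
```

After this step, a one-unit change means "one standard deviation" for both features, so neither dominates simply because of its original range.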

And then there is the zero mean; that is, centering the feature at zero.

Deeper neural networks may suffer from a very serious condition called

vanishing gradients. Since the gradients are used to update the parameters,

smaller and smaller (that is, vanishing) gradients mean smaller and smaller

updates, up to the point of a standstill: The network simply stops learning.

One way to help the network to fight this condition is to center its inputs,

the features, at zero. We’ll get back to this later on in Chapter 4.

The code below standardizes the feature using Scikit-Learn’s StandardScaler:

from sklearn.preprocessing import StandardScaler

# with_mean centers at zero; with_std scales to unit standard deviation
scaler = StandardScaler(with_mean=True, with_std=True)
# We use the TRAIN set ONLY to fit the scaler
scaler.fit(x_train)
# Now we can use the already fit scaler to TRANSFORM
# both TRAIN and VALIDATION sets
scaled_x_train = scaler.transform(x_train)
scaled_x_val = scaler.transform(x_val)

Notice that we are not regenerating the data—we are using the original feature x

as input for the StandardScaler and transforming it into a scaled x. The labels (y)

are left untouched.
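As a rough illustration of why the scaler is fit on the training set only, the sketch below mimics the fit and transform steps by hand in NumPy. The arrays are hypothetical stand-ins for x_train and x_val (random values, not the book's data), and the variable names `train_mean` and `train_std` are assumptions introduced for this example.

```python
import numpy as np

# Hypothetical stand-ins for the chapter's x_train and x_val
rng = np.random.default_rng(13)
x_train = rng.uniform(0, 1, size=(80, 1))
x_val = rng.uniform(0, 1, size=(20, 1))

# "fit": compute the statistics from the TRAIN set only
train_mean = x_train.mean(axis=0)
train_std = x_train.std(axis=0)

# "transform": apply those same statistics to both sets
scaled_x_train = (x_train - train_mean) / train_std
scaled_x_val = (x_val - train_mean) / train_std

# The training set has zero mean and unit standard deviation by construction
print(scaled_x_train.mean(), scaled_x_train.std())
# The validation set is only approximately standardized
print(scaled_x_val.mean(), scaled_x_val.std())
```

The validation set ends up close to, but not exactly at, zero mean and unit standard deviation. That is expected: its own statistics are never used, so no information from it leaks into the pre-processing step.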
