Deep Learning with PyTorch Step-by-Step: A Beginner’s Guide - Daniel Voigt Godoy (Leanpub)


Equation 10.7 - Data points' means over features (D)

\bar{x}_{n,l} = \frac{1}{D}\sum_{d=1}^{D}{x_{n,l,d}}

inputs_mean = inputs.mean(axis=2).unsqueeze(2)
inputs_mean

Output

tensor([[[-0.3529],
         [ 0.2426]],

        [[ 0.9496],
         [-1.3038]],

        [[ 1.6489],
         [ 3.6841]]])

As expected, six mean values, one for each data point. The unsqueeze() is there to preserve the original dimensionality, thus making the result a tensor of (N, L, 1) shape.

Next, we compute the biased standard deviations over the same dimension (D). Notice that the code actually returns the variance, that is, the squared standard deviation; its square root is only taken later, during the standardization itself:

Equation 10.8 - Data points' standard deviations over features (D)

\sigma_{n,l} = \sqrt{\frac{1}{D}\sum_{d=1}^{D}{\left(x_{n,l,d} - \bar{x}_{n,l}\right)^2}}

inputs_var = inputs.var(axis=2, unbiased=False).unsqueeze(2)
inputs_var


Output

tensor([[[6.3756],
         [1.6661]],

        [[4.0862],
         [0.3153]],

        [[2.3135],
         [4.6163]]])

No surprises here.
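The inputs tensor itself is not shown in this excerpt; here is a minimal sketch that reproduces both statistics on a hypothetical random tensor of the same (N, L, D) = (3, 2, 4) shape (the printed values will, of course, differ from the ones above):

import torch

torch.manual_seed(42)  # hypothetical seed, for reproducibility only
inputs = torch.randn(3, 2, 4)  # N=3 sequences, L=2 data points each, D=4 features

# Mean and biased variance over the features dimension (D);
# unsqueeze(2) keeps the (N, L, 1) shape for broadcasting
inputs_mean = inputs.mean(axis=2).unsqueeze(2)
inputs_var = inputs.var(axis=2, unbiased=False).unsqueeze(2)

print(inputs_mean.shape, inputs_var.shape)  # torch.Size([3, 2, 1]) twice

Using keepdim=True in mean() and var() would yield the same (N, L, 1) shape without the extra unsqueeze(2) call.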

The actual standardization is then computed using the mean, the biased standard deviation, and a tiny epsilon to guarantee numerical stability:

Equation 10.9 - Layer normalization

\hat{x}_{n,l,d} = \frac{x_{n,l,d} - \bar{x}_{n,l}}{\sqrt{\sigma_{n,l}^2 + \epsilon}}

(inputs - inputs_mean)/torch.sqrt(inputs_var+1e-5)

Output

tensor([[[-1.3671,  0.9279, -0.5464,  0.9857],
         [ 1.1953,  0.4438, -0.1015, -1.5376]],

        [[-1.6706,  0.2010,  0.9458,  0.5238],
         [ 0.4782,  0.0485, -1.6106,  1.0839]],

        [[-1.6129,  0.2116,  1.1318,  0.2695],
         [ 0.2520,  1.5236, -1.0272, -0.7484]]])
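As a quick sanity check, here is a minimal sketch, assuming the result above is stored in a hypothetical standardized variable: every data point should now show a mean of roughly zero and a biased standard deviation of roughly one over its features.

standardized = (inputs - inputs_mean) / torch.sqrt(inputs_var + 1e-5)

# Both statistics are taken over the features dimension (D)
print(standardized.mean(axis=2))                 # all values close to zero
print(standardized.std(axis=2, unbiased=False))  # all values close to one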

The values above are layer normalized. It is possible to achieve the very same results by using PyTorch’s own nn.LayerNorm, of course:

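Here is a minimal sketch of that approach, assuming inputs carries D = 4 features in its last dimension, as in the outputs above:

import torch
import torch.nn as nn

# normalized_shape must match the size of the dimension(s) being
# standardized; eps plays the same role as the 1e-5 used above
layer_norm = nn.LayerNorm(normalized_shape=4, eps=1e-5)
normalized = layer_norm(inputs)

# The layer's learnable weight and bias start out as ones and zeros,
# so, before any training, its output matches the manual standardization
print(torch.allclose(normalized, standardized, atol=1e-6))  # True

Unlike the manual version, nn.LayerNorm also carries learnable weight (scale) and bias (shift) parameters, which allow the model to adjust, or even undo, the standardization during training.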
