


During training, the weights in early layers naturally change, and therefore the inputs to later layers can change significantly. In other words, each layer must continuously readjust its weights to a different input distribution for every batch. This can slow down the model's training considerably. The key idea is to make the layer inputs more similar in distribution, batch after batch and epoch after epoch.

Another issue is that the sigmoid activation function works very well close to zero, but tends to "get stuck" when its inputs move sufficiently far away from zero: in those saturated regions the gradient is nearly zero. If a neuron's outputs drift far from the sigmoid's zero region, that neuron becomes unable to update its own weights effectively.
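To see why saturation matters, here is a short NumPy sketch (the input values are chosen purely for illustration) that prints the sigmoid and its gradient s·(1−s) for a few inputs; the gradient is largest at zero and nearly vanishes once the input is far from zero:

import numpy as np

# Sigmoid and its derivative s * (1 - s): the gradient peaks at 0
# and nearly vanishes once the input is far from zero.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for x in (0.0, 2.0, 10.0):
    s = sigmoid(x)
    print(f"x={x:5.1f}  sigmoid={s:.5f}  gradient={s * (1.0 - s):.5f}")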

The other key idea is therefore to transform the layer outputs into a unit Gaussian distribution centered close to zero. In this way, layers will see significantly less variation from batch to batch. Mathematically, the formula is very simple. The activation input x is centered around zero by subtracting the batch mean μ from it. The result is then divided by √(σ² + ε), where σ² is the batch variance and ε is a small constant added to prevent division by zero. Finally, a linear transformation y = λx̂ + β is applied to the normalized value x̂ to make sure that the normalizing effect is applied during training.
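As a concrete illustration, here is a minimal NumPy sketch of the transform just described; the toy batch x and the parameter names lam and beta are illustrative placeholders, not code from the book:

import numpy as np

# A toy batch of activations: 3 samples, 2 features.
x = np.array([[0.5, 2.0],
              [1.5, -1.0],
              [3.0, 0.5]])
eps = 1e-5            # small constant to avoid division by zero
lam, beta = 1.0, 0.0  # learned scale and shift (initial values shown)

mu = x.mean(axis=0)                    # batch mean, per feature
var = x.var(axis=0)                    # batch variance, per feature
x_hat = (x - mu) / np.sqrt(var + eps)  # centered, unit-variance activations
y = lam * x_hat + beta                 # learned linear transformation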

In this way, λ and β are parameters that get optimized during the training phase in a similar way to the weights of any other layer. BatchNormalization has proven to be a very effective way to increase both the speed of training and accuracy, because it helps to prevent activations from becoming either too small (vanishing) or too big (exploding).
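In tf.keras, batch normalization is available as a ready-made layer. The following is a minimal sketch of how it might be inserted into a small dense classifier such as the MNIST models built earlier in this chapter; the layer sizes and hyperparameters are illustrative only:

import tensorflow as tf

# A small dense classifier with BatchNormalization after the hidden layer.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, input_shape=(784,)),
    tf.keras.layers.BatchNormalization(),  # normalize activations batch by batch
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()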

Playing with Google Colab – CPUs, GPUs, and TPUs

Google offers a truly intuitive tool for training neural networks and for playing with TensorFlow (including 2.x) at no cost: Colab, which can be freely accessed at https://colab.research.google.com/. If you are familiar with Jupyter notebooks, you will find a very familiar web-based environment here. Colab stands for Colaboratory, and it is a Google research project created to help disseminate machine learning education and research.

Let's see how it works, starting with the screenshot shown in Figure 32:

Figure 32: An example of notebooks in Colab

By accessing Colab, you can either check a listing of notebooks generated in the past or create a new notebook. Different versions of Python are supported. When we create a new notebook, we can also select whether we want to run it on CPUs, GPUs, or on Google's TPUs, as shown in Figure 25 (see Chapter 16, Tensor Processing Unit, for more details on these).
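Once the notebook is running, a quick way to confirm which accelerator the selected runtime actually exposes is to query TensorFlow directly. This is a minimal sketch (the TPU lines, shown as comments, assume a TPU runtime has been selected in Colab):

import tensorflow as tf

# List the GPUs visible to the current runtime (empty list on a CPU runtime).
gpus = tf.config.list_physical_devices('GPU')
print("Num GPUs available:", len(gpus))

# On a TPU runtime, TensorFlow connects through a cluster resolver instead:
# resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
# tf.config.experimental_connect_to_cluster(resolver)
# tf.tpu.experimental.initialize_tpu_system(resolver)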
