
Both functions allow small updates if x is negative, which might be useful in certain conditions.

Activation functions

Sigmoid, tanh, ELU, LeakyReLU, and ReLU are generally called activation functions in neural network jargon. In the gradient descent section, we will see that the gradual changes typical of sigmoid and ReLU functions are the basic building blocks used to develop a learning algorithm that adapts little by little by progressively reducing the mistakes made by our nets. An example of using the activation function σ with input vector (x₁, x₂, ..., xₘ), weight vector (w₁, w₂, ..., wₘ), bias b, and summation ∑ is given in Figure 11. Note that TensorFlow 2.0 supports many activation functions, a full list of which is available online.

Figure 11: An example of an activation function applied after a linear function

In short – what are neural networks after all?

In one sentence, machine learning models are a way to compute a function that maps some inputs to their corresponding outputs. The function is nothing more than a number of addition and multiplication operations. However, when combined with a non-linear activation and stacked in multiple layers, these functions can learn almost anything [8]. You also need a meaningful metric capturing what you want to optimize (this being the so-called loss function that we will cover later in the book), enough data to learn from, and sufficient computational power.
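To make the idea in Figure 11 concrete, here is a minimal sketch in TensorFlow 2.0 (with arbitrary, made-up input, weight, and bias values) that computes the linear part z = w₁x₁ + ... + wₘxₘ + b and then applies the sigmoid activation to it:

import tensorflow as tf

# Toy input vector (x1, ..., xm), weight vector (w1, ..., wm), and bias b.
# The values are arbitrary and only serve to illustrate the computation.
x = tf.constant([1.0, 2.0, 3.0])
w = tf.constant([0.5, -0.25, 0.1])
b = tf.constant(0.2)

# Linear part: z = sum_i(w_i * x_i) + b
z = tf.reduce_sum(w * x) + b

# Non-linear activation applied after the linear function
y = tf.sigmoid(z)
print(float(z), float(y))

In tf.keras this is the same pattern that a Dense layer with activation='sigmoid' computes for each of its units.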



Now, it might be beneficial to stop for a moment and ask ourselves what "learning" really is. Well, we can say for our purposes that learning is essentially a process aimed at generalizing established observations [9] in order to predict future results. So, in short, this is exactly the goal we want to achieve with neural networks.

A real example – recognizing handwritten digits

In this section we will build a network that can recognize handwritten numbers. In order to achieve this goal, we'll use MNIST (http://yann.lecun.com/exdb/mnist/), a database of handwritten digits made up of a training set of 60,000 examples and a test set of 10,000 examples. The training examples are annotated by humans with the correct answer. For instance, if the handwritten digit is the number "3", then 3 is simply the label associated with that example.

In machine learning, when a dataset with correct answers is available, we say that we can perform a form of supervised learning. In this case we can use the training examples to improve our net. Testing examples also have the correct answer associated with each digit. In this case, however, the idea is to pretend that the label is unknown, let the network make its prediction, and then later reconsider the label to evaluate how well our neural network has learned to recognize digits. Unsurprisingly, testing examples are used only to test the performance of our net.

Each MNIST image is in grayscale and consists of 28×28 pixels. A subset of these images of numbers is shown in Figure 12:

Figure 12: A collection of MNIST images
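To get a feel for the dataset described above, here is a minimal sketch of loading MNIST through tf.keras.datasets; the shapes of the returned arrays match the 60,000/10,000 split and the 28×28 pixel size:

import tensorflow as tf

# Load the MNIST training and test splits (downloaded on first use).
(X_train, Y_train), (X_test, Y_test) = tf.keras.datasets.mnist.load_data()

print(X_train.shape)  # (60000, 28, 28) – 60,000 grayscale training images
print(X_test.shape)   # (10000, 28, 28) – 10,000 grayscale test images
print(Y_train[0])     # the human-provided label of the first training image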

One-hot encoding (OHE)

We are going to use OHE as a simple tool to encode information used inside neural networks. In many applications it is convenient to transform categorical (non-numerical) features into numerical variables. For instance, the categorical feature "digit" with value d in [0–9] can be encoded into a binary vector with 10 positions, which has the value 0 everywhere except at the d-th position, where a 1 is present.
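As an illustrative sketch, one convenient way to obtain this encoding in TensorFlow 2.0 is tf.keras.utils.to_categorical (whether or not this exact helper is used later in the chapter):

import tensorflow as tf

# One-hot encode the digit d = 3 into a binary vector with 10 positions.
d = 3
one_hot = tf.keras.utils.to_categorical(d, num_classes=10)
print(one_hot)  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]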

