

Stochastic Gradient Descent (SGD) (see Chapter 15, The Math Behind Deep Learning) is a particular kind of optimization algorithm that adjusts the network's weights step by step to reduce the errors the network makes on the training data. We will review SGD and other optimization algorithms in the next chapters. Once the model is compiled, it can then be trained with the fit() method, which takes a few parameters:

• epochs is the number of times the model is exposed to the training set. At each epoch, the optimizer tries to adjust the weights so that the objective function is minimized.
• batch_size is the number of training instances observed before the optimizer performs a weight update; there are usually many batches per epoch, as the sketch after this list shows.
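For instance, with the 48,000 training samples used later in this chapter and an assumed batch size of 128 (a hypothetical value; any reasonable batch size works), one epoch performs:

import math

# 48,000 samples at 128 per batch = number of weight updates per epoch.
updates_per_epoch = math.ceil(48_000 / 128)
print(updates_per_epoch)  # 375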

Training a model in TensorFlow 2.0 is very simple:

# Training the model.
model.fit(X_train, Y_train,
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          verbose=VERBOSE, validation_split=VALIDATION_SPLIT)
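This snippet assumes the hyperparameters have been defined earlier. A minimal sketch of plausible definitions follows; the 200 epochs and the 20% validation split are taken from the run shown later in this chapter, while the batch size of 128 and the verbosity level are assumptions:

# Hyperparameters assumed by the call to fit() above.
EPOCHS = 200            # the run below trains for 200 epochs
BATCH_SIZE = 128        # assumed value
VERBOSE = 1             # print progress during training
VALIDATION_SPLIT = 0.2  # hold out 20% of the training data (12,000 of 60,000)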

Note that we've reserved part of the training set for validation. The key idea is to hold back a portion of the training data and use it to measure the model's performance on data it is not trained on. This is a good practice to follow for any machine learning task, and one that we will adopt in all of our examples. Please note that we will return to validation later in this chapter when we talk about overfitting.

Once the model is trained, we can evaluate it on the test set, which contains new examples never seen by the model during the training phase.

Note that, of course, the training set and the test set are rigorously separated. There is no point evaluating a model on an example that was already used for training. In TensorFlow 2.0 we can use the method evaluate(X_test, Y_test) to compute the test_loss and the test_acc:

# Evaluate the model.
test_loss, test_acc = model.evaluate(X_test, Y_test)
print('Test accuracy:', test_acc)
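Beyond the aggregate metrics, we can also inspect individual predictions with the predict() method, which returns one probability per class for each input. A minimal sketch (the slice of five test images is just an illustrative choice):

import numpy as np

# Class probabilities for the first 5 test images.
predictions = model.predict(X_test[:5])

# The predicted digit is the class with the highest probability.
print('Predicted digits:', np.argmax(predictions, axis=1))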

So, congratulations! You have just defined your first neural network in TensorFlow 2.0. A few lines of code, and your computer should be able to recognize handwritten numbers. Let's run the code and see what the performance is.

Running a simple TensorFlow 2.0 net and establishing a baseline

So let's see what happens when we run the code:

Figure 13: Code run from our test neural network

First, the net architecture is dumped, and we can see the different types of layers used, their output shape, how many parameters (that is, how many weights) they need to optimize, and how they are connected. Then, the network is trained on 48,000 samples, and 12,000 are reserved for validation. Once the neural model is built, it is then tested on 10,000 samples. For now, we won't go into the internals of how the training happens, but we can see that the program runs for 200 epochs, and each time the accuracy improves. When the training ends, we test our model on the test set and achieve about 89.96% accuracy on training, 90.70% on validation, and 90.71% on test:

Figure 14: Results from testing the model, with accuracies displayed

This means that nearly 1 in 10 images is incorrectly classified. We can certainly do better than that.
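The per-epoch numbers quoted above can also be retrieved programmatically: fit() returns a History object whose history dictionary maps each metric name to its per-epoch values. A minimal sketch (the metric names assume the model was compiled with metrics=['accuracy']):

# Capture the history returned by fit().
history = model.fit(X_train, Y_train,
                    batch_size=BATCH_SIZE, epochs=EPOCHS,
                    verbose=VERBOSE, validation_split=VALIDATION_SPLIT)

# Final-epoch training and validation accuracy.
print('Training accuracy:  ', history.history['accuracy'][-1])
print('Validation accuracy:', history.history['val_accuracy'][-1])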
