
       [ 0., 1., 0.],
       [ 0., 0., 1.],
       [ 1., 0., 0.]], dtype=float32)

(tail of the one-hot label output continued from the previous page)

Let's run the code and see what results we get with this multi-layer network:

Figure 15: Running the code for a multi-layer network

The previous screenshot shows the initial steps of the run, while the following screenshot shows the conclusion. Not bad. As seen in the following screenshot, by adding two hidden layers we reached 90.81% accuracy on the training set, 91.40% on validation, and 91.18% on test. This means that, with respect to the previous network, we have increased accuracy on the test set and reduced the number of iterations from 200 to 50. That's good, but we want more.

If you want, you can experiment yourself and see what happens if you add only one hidden layer instead of two, or if you add more than two layers. I leave this experiment as an exercise:

Figure 16: Results after adding two hidden layers, with accuracies shown

Note that improvements stop (or become almost imperceptible) after a certain number of epochs. In machine learning, this phenomenon is called convergence.
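For reference, here is a minimal sketch of what the two-hidden-layer model discussed above could look like. It is an illustration rather than the book's exact listing: the layer names and the constants N_HIDDEN, NB_CLASSES, and RESHAPED are assumed to match the hyperparameters used in the listing in the next section.

import tensorflow as tf
from tensorflow import keras

N_HIDDEN = 128    # assumed hidden-layer width
NB_CLASSES = 10   # ten digit classes
RESHAPED = 784    # 28 x 28 pixels flattened into a vector

# Two ReLU hidden layers followed by a softmax output layer.
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(N_HIDDEN, input_shape=(RESHAPED,),
                             name='dense_layer', activation='relu'))
model.add(keras.layers.Dense(N_HIDDEN,
                             name='dense_layer_2', activation='relu'))
model.add(keras.layers.Dense(NB_CLASSES,
                             name='dense_layer_3', activation='softmax'))
model.summary()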

Further improving the simple net in TensorFlow with Dropout

Now our baseline is 90.81% accuracy on the training set, 91.40% on validation, and 91.18% on test. A second improvement is very simple: we decide to randomly drop, with probability DROPOUT, some of the values propagated inside our internal dense network of hidden layers during training. In machine learning this is a well-known form of regularization. Surprisingly enough, this idea of randomly dropping a few values can improve our performance. The idea behind this improvement is that random dropout forces the network to learn redundant patterns that are useful for better generalization:

import tensorflow as tf
import numpy as np
from tensorflow import keras

# Network and training.
EPOCHS = 200
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10          # number of outputs = number of digits
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2   # how much TRAIN is reserved for VALIDATION
DROPOUT = 0.3

# Loading MNIST dataset.
# Labels have one-hot representation.
mnist = keras.datasets.mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# X_train is 60000 rows of 28x28 values; we reshape it to 60000 x 784.
RESHAPED = 784
X_train = X_train.reshape(60000, RESHAPED)
X_test = X_test.reshape(10000, RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Normalize inputs within [0, 1].
X_train, X_test = X_train / 255.0, X_test / 255.0
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# One-hot representations for labels.
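The listing above is cut off by the page break right after the one-hot comment. The sketch below shows one plausible way it continues, under the following assumptions: labels are one-hot encoded with tf.keras.utils.to_categorical, each hidden Dense layer is followed by a Dropout layer with rate DROPOUT, and training uses SGD with categorical cross-entropy, mirroring the earlier experiments. The layer names and the optimizer choice are illustrative, not taken verbatim from the book.

# One-hot encode the labels (assumed continuation of the listing).
Y_train = tf.keras.utils.to_categorical(Y_train, NB_CLASSES)
Y_test = tf.keras.utils.to_categorical(Y_test, NB_CLASSES)

# Model: each hidden Dense layer is followed by Dropout(DROPOUT).
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(N_HIDDEN, input_shape=(RESHAPED,),
                             name='dense_layer', activation='relu'))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(N_HIDDEN,
                             name='dense_layer_2', activation='relu'))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(NB_CLASSES,
                             name='dense_layer_3', activation='softmax'))
model.summary()

# Compile and train (optimizer choice is an assumption).
model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, Y_train,
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          verbose=VERBOSE, validation_split=VALIDATION_SPLIT)

# Evaluate on the held-out test set.
test_loss, test_acc = model.evaluate(X_test, Y_test)
print('Test accuracy:', test_acc)

During training, each Dropout layer randomly zeroes a fraction DROPOUT of the previous layer's activations (scaling the remaining ones up accordingly); at inference time it passes values through unchanged, which is exactly the regularization effect described above.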

