

Chapter 1

As shown in the following image, we reach an accuracy of 85%, which is not bad at

all for a simple network:

Figure 37: Testing the accuracy of a simple network

Hyperparameter tuning and AutoML

The experiments defined above give some opportunities for fine-tuning a net.

However, what works for this example will not necessarily work for other examples.

For a given net, there are indeed multiple parameters that can be optimized (such

as the number of hidden neurons, BATCH_SIZE, number of epochs, and many

more depending on the complexity of the net itself). These parameters are called

"hyperparameters" to distinguish them from the parameters of the network itself,

that is, the values of the weights and biases.

Hyperparameter tuning is the process of finding the optimal combination of those

hyperparameters that minimizes the cost function. The key idea is that if we have n

hyperparameters, then we can imagine that they define a space with n dimensions

and the goal is to find the point in this space that corresponds to an optimal value

for the cost function. One way to achieve this goal is to create a grid in this space

and systematically check the value assumed by the cost function for each grid

vertex. In other words, the hyperparameters are divided into buckets and different

combinations of values are checked via a brute-force approach.
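
The grid idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the grid values are hypothetical, and the cost function is a made-up stand-in for "train the net with these hyperparameters and return the validation loss", so that the example runs without training anything.

```python
import itertools

# Hypothetical grid: each hyperparameter is bucketed into a few candidate values.
grid = {
    "n_hidden": [64, 128, 256],
    "batch_size": [32, 64, 128],
    "epochs": [20, 50],
}

def cost(n_hidden, batch_size, epochs):
    """Made-up stand-in for training a net and measuring its validation loss."""
    return ((n_hidden - 128) / 128) ** 2 + ((batch_size - 64) / 64) ** 2 + 1.0 / epochs

def grid_search(grid, cost_fn):
    """Evaluate cost_fn at every grid vertex and return the best point found."""
    names = list(grid)
    best_params, best_cost = None, float("inf")
    # itertools.product enumerates every combination of bucketed values.
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = cost_fn(**params)
        if current < best_cost:
            best_params, best_cost = params, current
    return best_params, best_cost

best_params, best_cost = grid_search(grid, cost)
print(best_params, best_cost)
```

With 3 x 3 x 2 buckets, this checks 18 combinations. The count grows multiplicatively with every added hyperparameter, which is why the brute-force approach becomes expensive so quickly.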

If you think that this process of fine-tuning the hyperparameters is manual and

expensive, then you are absolutely right! However, during the last few years we

have seen significant results in AutoML, a set of research techniques aiming at both

automatically tuning hyperparameters and automatically searching for an optimal

network architecture. We will discuss this further in Chapter 14, An introduction

to AutoML.

Predicting output

Once a net is trained, it can of course be used for making predictions. In TensorFlow

this is very simple. We can use the method:

# Making predictions.

predictions = model.predict(X)
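
For a classifier, the returned predictions usually still need to be turned into class labels. The sketch below assumes that each row of the output holds one probability per class, as a softmax output layer would produce. The rows are hard-coded, hypothetical values standing in for the result of model.predict(X), so that the example runs without a trained model.

```python
# Hypothetical stand-in for the output of model.predict(X):
# one row per input sample, one probability per class (3 classes for brevity).
predictions = [
    [0.05, 0.90, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
]

def to_labels(rows):
    """Return, for each row, the index of its largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

labels = to_labels(predictions)
print(labels)
```

On a real NumPy output array, the same step is typically written as np.argmax(predictions, axis=1).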
