

However, this might not be enough. A model can become excessively complex in order to capture all the relations inherently expressed by the training data. This increase in complexity can have two negative consequences. First, a complex model might require a significant amount of time to run. Second, a complex model might achieve very good performance on the training data but perform quite badly on the validation data. This happens because the model contrives relationships among its many parameters that hold in the specific training context but do not exist in a more general one. Causing a model to lose its ability to generalize in this way is called "overfitting." Again, learning is more about generalization than memorization:

Figure 31: Loss function and overfitting

As a rule of thumb, if during training we see the validation loss increase after an initial decrease, then we have a problem of model complexity: the model is overfitting the training data.
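In practice, we spot this by tracking the validation loss epoch by epoch. The following is a minimal sketch of that pattern in Keras; the toy data and network are illustrative assumptions, not examples from this chapter, and the EarlyStopping callback simply halts training once val_loss stops improving:

import numpy as np
import tensorflow as tf

# Hypothetical toy problem: 1,000 samples, 20 features, binary labels.
x = np.random.rand(1000, 20).astype("float32")
y = (x.sum(axis=1) > 10.0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Stop as soon as the validation loss starts rising again, and keep
# the weights from the epoch where it was lowest.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(x, y, validation_split=0.2, epochs=100,
                    callbacks=[early_stop], verbose=0)

# history.history["loss"] keeps falling with enough epochs, while
# history.history["val_loss"] turns upward once overfitting begins.
print(min(history.history["val_loss"]))

Plotting the training and validation curves stored in history.history side by side makes the divergence described above immediately visible.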

In order to solve the overfitting problem, we need a way to capture the complexity of a model, that is, to measure how complex it is. What could the solution be? Well, a model is nothing more than a vector of weights. Each weight affects the output, except for those that are zero or very close to it. Therefore, the complexity of a model can be conveniently represented as the number of non-zero weights. In other words, if we have two models M1 and M2 achieving pretty much the same performance in terms of the loss function, then we should choose the simpler one, the model with the minimum number of non-zero weights.
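As a quick illustration, this count is easy to compute for a Keras model via model.get_weights(); the helper below is our own sketch, and the tolerance tol is an assumption, since trained floating-point weights are rarely exactly zero:

import numpy as np

def count_nonzero_weights(model, tol=1e-6):
    # Complexity proxy: the number of weights whose magnitude
    # exceeds tol. tol approximates "zero or very close to it".
    return sum(int((np.abs(w) > tol).sum()) for w in model.get_weights())

# Hypothetical tie-breaker between two trained models m1 and m2 whose
# validation losses are roughly equal: prefer the simpler one.
# chosen = m1 if count_nonzero_weights(m1) <= count_nonzero_weights(m2) else m2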

