
advanced-algorithmic-trading


Chapter 20

Model Selection and Cross-Validation

In this chapter I want to discuss one of the most important and tricky issues in machine learning: model selection and the bias-variance tradeoff. The latter is one of the most crucial issues to address when building profitable trading strategies based on machine learning techniques.

Model selection refers to our ability to assess the performance of differing machine learning models in order to choose the best one.

The bias-variance tradeoff is a property of all (supervised) machine learning models that enforces a tradeoff between how "flexible" the model is and how well it performs on new, unseen data. The latter is known as a model's generalisation performance.
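
The link between flexibility and generalisation performance can be made concrete with a small experiment. The following is a minimal sketch, not taken from the text: it assumes a synthetic noisy sine curve as the data, uses polynomial degree as a stand-in for "flexibility", and measures performance by mean squared error. The function names, the degrees and the noise level are all illustrative choices.

```python
import numpy as np

def train_test_mse(degree, x_train, y_train, x_test, y_test):
    """Fit a polynomial of the given degree by least squares and
    return (training MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return float(train_mse), float(test_mse)

# Assumption: synthetic data, a sine curve plus Gaussian noise.
rng = np.random.default_rng(42)

def sample(n):
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = sample(30)    # small training set
x_test, y_test = sample(200)     # "new, unseen" data

# Higher degree means a more flexible model.
results = {d: train_test_mse(d, x_train, y_train, x_test, y_test)
           for d in (1, 3, 10)}

for degree, (train_mse, test_mse) in results.items():
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

Because each polynomial family contains the lower-degree ones, the training error cannot increase as the degree grows. The test error carries no such guarantee, which is why it is the quantity to watch when judging generalisation.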

20.1 Bias-Variance Trade-Off

We will begin by understanding why model selection is important and then discuss the bias-variance tradeoff qualitatively. We will wrap up the chapter by deriving the bias-variance tradeoff mathematically and discussing measures to minimise the problems it introduces.

In this chapter we are considering supervised regression models, that is, models which are trained on a set of labelled training data and produce a quantitative response. An example of this would be attempting to predict future stock prices based on other factors such as past prices, interest rates or foreign exchange rates.
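
As a rough illustration of what "labelled training data" with a quantitative response looks like in this setting, the sketch below builds a feature matrix from past prices and fits a linear model by ordinary least squares. Everything here is an assumption made for illustration: the price series is synthetic, only lagged prices are used as factors, and the helper `make_lagged_features` is hypothetical. It is not a trading model.

```python
import numpy as np

def make_lagged_features(series, n_lags):
    """Return (X, y) where each row of X holds the n_lags values
    immediately preceding the corresponding entry of y."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    X = np.column_stack([series[i : n - n_lags + i] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

# Assumption: a synthetic price path built from random log returns.
rng = np.random.default_rng(0)
log_returns = rng.normal(0.0, 0.01, 250)
prices = 100.0 * np.exp(np.cumsum(log_returns))

n_lags = 3
X, y = make_lagged_features(prices, n_lags)

# Add an intercept column and fit by ordinary least squares.
X_design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Predict the next (unobserved) price from the last n_lags prices.
next_input = np.concatenate([[1.0], prices[-n_lags:]])
prediction = float(next_input @ beta)
print(f"predicted next price: {prediction:.2f}")
```

Other factors mentioned above, such as interest rates or foreign exchange rates, would enter as additional columns of `X`.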

This is in contrast to a categorical or binary response model, as in the case of supervised classification. An example of classification would be attempting to assign a topic to a text document from a finite set of topics, as was discussed in the previous chapter on support vector machines.

The bias-variance tradeoff and model selection considerations for classification are extremely similar to those in the regression setting and simply require modification to handle the differing ways in which errors and performance are measured.
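
To make the difference in error measurement concrete, here is a minimal sketch of one common measure for each setting: mean squared error for a quantitative response and misclassification rate for a categorical one. These are standard choices assumed here for illustration, not necessarily the measures used later in the chapter, and the sample values and topic labels are invented.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Regression error: average squared difference between
    observed and predicted quantitative responses."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def misclassification_rate(labels_true, labels_pred):
    """Classification error: fraction of predictions whose
    label differs from the true label."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    return float(np.mean(labels_true != labels_pred))

# Regression: predictions are numbers, so errors have a magnitude.
regression_error = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Classification: predictions are labels, so each one is right or wrong.
classification_error = misclassification_rate(
    ["fx", "rates", "fx", "equity"],
    ["fx", "fx", "fx", "equity"],
)

print(regression_error, classification_error)
```

In the regression example one prediction is off by 2, giving a mean squared error of 4/3. In the classification example one label in four is wrong, giving a rate of 0.25.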

20.1.1 Machine Learning Models

As with most of our discussions in machine learning the basic model is given by the following:

