
\hat{y} = \hat{f}(x) = \underset{z \in \mathbb{R}}{\mathrm{argmax}} \; p(y = z \mid x) \quad (16.3)

Common regression techniques include Linear Regression, Support Vector Regression and Random Forests.
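As an illustrative sketch (not taken from the text), the following fits each of these three techniques with scikit-learn on a small synthetic dataset; the library choice and the data are assumptions for demonstration only.

# Fit the three regression techniques mentioned above to synthetic data.
# scikit-learn and the synthetic dataset are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))                       # feature matrix
y = X @ np.array([0.5, -1.2, 2.0]) + rng.normal(scale=0.1, size=500)

for model in (LinearRegression(), SVR(), RandomForestRegressor()):
    model.fit(X, y)                                 # estimate f-hat from the data
    print(type(model).__name__, model.score(X, y))  # in-sample R^2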

Regression will be utilised in this book to estimate future asset prices in an intraday trading strategy, given previously known information.

16.3.1 Financial Example

To make this concrete consider an example of modelling the next day's price of the London-based equity Royal Dutch Shell (LSE:RDSB) via the historical prices of crude oil and natural gas.

Each day, which is indexed by i, has an associated pair (x_i, y_i) representing the feature vector and response for that day. The feature vector x_i represents the historical prices of crude oil (c_{ij}) at current day i and historical lag j; similarly for natural gas (g_{ij}), over a period of N days, itself indexed by j ∈ {1, ..., N}. y_i represents the price of RDSB tomorrow, that is, at day i + 1.

In this example the goal is to estimate the function f that relates tomorrow's price for RDSB to the historical prices of crude oil and natural gas.
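A hedged sketch of how such a dataset might be assembled with pandas follows; the Series names crude, gas and rdsb, and the choice of five lags, are hypothetical placeholders rather than anything prescribed by the text.

# Build (x_i, y_i) pairs: lagged crude oil and natural gas prices as
# features, with tomorrow's RDSB price as the response. The input
# Series (crude, gas, rdsb) are assumed to share a daily DatetimeIndex.
import pandas as pd

def make_dataset(crude, gas, rdsb, n_lags=5):
    features = {}
    for j in range(1, n_lags + 1):                     # j = 1, ..., N
        features["crude_lag_%d" % j] = crude.shift(j)  # c_{ij}
        features["gas_lag_%d" % j] = gas.shift(j)      # g_{ij}
    X = pd.DataFrame(features)
    y = rdsb.shift(-1).rename("y")                     # price at day i + 1
    data = pd.concat([X, y], axis=1).dropna()          # drop incomplete edge rows
    return data.drop(columns="y"), data["y"]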

16.4 Training

Now that the probabilistic formulation has been defined, it remains to discuss how it is possible to "supervise" or "train" the model with a specific set of data.

In order to train the model it is necessary to define a loss function between the true value of the response y and its estimate from the model ŷ, given by L(y, ŷ).

In the classification setting common loss models include the 0-1 loss and the cross-entropy.

In the regression setting a common loss model is given by the Mean Squared Error (MSE):

\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|^2 \quad (16.4)

This states that the total error of a model, given a particular set of data, is the average of the sum of the squared differences between all of the training values y_i and their associated estimates ŷ_i.
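As a direct translation of equation (16.4), a small NumPy helper might look as follows; the function name and the toy values are purely illustrative.

# Mean Squared Error: the average of the squared differences between
# the true responses y_i and the model estimates y-hat_i.
import numpy as np

def mse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # 0.02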

This loss function heavily penalises estimates far from their true values, as the differences are squared. Notice also that the important aspect is the squared distance between values, not whether the deviations are positive or negative. MSE will be discussed in more depth in the next chapter on Linear Regression.

Equipped with this loss function it is now possible to make estimates of f̂, and thus ŷ, by carrying out "fitting" algorithms on particular machine learning techniques that attempt to minimise the value of the loss function by adjusting the parameters θ of the model.
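One concrete illustration of such a fitting algorithm is batch gradient descent on the MSE loss for a linear model; this is an assumed choice for the sketch, since each technique has its own fitting procedure.

# Batch gradient descent on the MSE loss for a linear model
# y-hat = X @ theta. An illustrative fitting procedure, not the only one.
import numpy as np

def fit_linear_mse(X, y, lr=0.01, n_iter=1000):
    n, p = X.shape
    theta = np.zeros(p)                       # initial parameter vector
    for _ in range(n_iter):
        residuals = X @ theta - y             # y-hat_i - y_i for each sample
        grad = (2.0 / n) * (X.T @ residuals)  # gradient of the MSE w.r.t. theta
        theta -= lr * grad                    # adjust theta to reduce the loss
    return theta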

A minimal value of the loss function indicates that the errors between the true values and the estimated values are not too severe. This leads to the hope that the model will perform similarly when exposed to data that it has not been "trained" on.
