
..
import pylab as plt
..
..


def plot_test_error_curves_vs(sample_dict, random_seeds, degrees):
    fig, ax = plt.subplots()
    ds = range(1, degrees+1)
    # Plot the test MSE curve for each random train/test split
    for i in range(1, random_seeds+1):
        ax.plot(
            ds,
            sample_dict["seed_%s" % i],
            lw=2,
            label='Test MSE - Sample %s' % i
        )
    ..
    ..
    # Plot the average test MSE across all of the splits
    ax.plot(
        ds,
        sample_dict["avg"],
        linestyle='--',
        color="black",
        lw=3,
        label='Avg Test MSE'
    )
    ax.legend(loc=0)
    ax.set_xlabel('Degree of Polynomial Fit')
    ax.set_ylabel('Mean Squared Error')
    fig.set_facecolor('white')
    plt.show()
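To make the structure of sample_dict concrete, the following is a minimal sketch of how it might be constructed. The helper name calculate_test_mse, the 50/50 split size and the use of scikit-learn's train_test_split, PolynomialFeatures and LinearRegression are assumptions for illustration rather than the exact listing used to generate the figure; X and y stand for the matrix of lagged Amazon close prices and the response vector, respectively.

import numpy as np

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures


def calculate_test_mse(X, y, random_seeds, degrees):
    # Hypothetical helper: for each random train/test split, fit a
    # polynomial regression of each degree and record the test MSE
    sample_dict = {}
    for i in range(1, random_seeds+1):
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.5, random_state=i
        )
        mses = []
        for d in range(1, degrees+1):
            poly = PolynomialFeatures(degree=d)
            lr = LinearRegression()
            lr.fit(poly.fit_transform(X_train), y_train)
            pred = lr.predict(poly.transform(X_test))
            mses.append(mean_squared_error(y_test, pred))
        sample_dict["seed_%s" % i] = np.array(mses)
    # Average the per-split MSE curves to obtain the dashed line
    sample_dict["avg"] = np.mean(
        [sample_dict["seed_%s" % i] for i in range(1, random_seeds+1)],
        axis=0
    )
    return sample_dict

Under these assumptions the figure would then be produced with plot_test_error_curves_vs(calculate_test_mse(X, y, 10, 3), 10, 3).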

We have selected the degree of our polynomial features to vary from d = 1 to d = 3, thus providing up to cubic order in our features. Figure 20.4 displays the ten different random splittings of the training and testing data, along with the average test MSE (the black dashed line).
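As a concrete illustration of what cubic-order features look like, scikit-learn's PolynomialFeatures expands each input row into all monomials up to the chosen degree; the two-feature input below is purely hypothetical:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])  # e.g. two lagged close prices (illustrative)
# degree=3 produces the bias term plus every monomial up to cubic
# order: 1, x1, x2, x1^2, x1*x2, x2^2, x1^3, x1^2*x2, x1*x2^2, x2^3
print(PolynomialFeatures(degree=3).fit_transform(X))
# [[ 1.  2.  3.  4.  6.  9.  8. 12. 18. 27.]]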

It is immediately apparent how much variation there is across the different random splits into training and validation sets. Since it is not easy to obtain a predictive signal using previous days' historical close prices of Amazon, we see that the test MSE increases as the degree of the polynomial features increases.

In addition, it is clear that the validation-set estimate of the test error suffers from high variance. In order to minimise this issue we will now implement k-fold cross-validation on the same Amazon dataset.

20.2.6 k-Fold Cross Validation

Since we have already taken care of the imports above, I will simply outline the new functions for carrying out k-fold cross-validation. They are almost identical to the functions used for the validation set approach.
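As a rough sketch of what such a function might look like, the following uses scikit-learn's KFold and reuses the imports and polynomial-fitting logic from the sketch above; the helper name calculate_kfold_mse and the particular fold shuffling are assumptions rather than the exact listing:

from sklearn.model_selection import KFold


def calculate_kfold_mse(X, y, folds, degrees):
    # Hypothetical helper: average the test MSE for each polynomial
    # degree over the k folds, giving one (less noisy) error curve
    kf = KFold(n_splits=folds, shuffle=True, random_state=42)
    avg_mses = np.zeros(degrees)
    for train_idx, test_idx in kf.split(X):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        for d in range(1, degrees+1):
            poly = PolynomialFeatures(degree=d)
            lr = LinearRegression()
            lr.fit(poly.fit_transform(X_train), y_train)
            pred = lr.predict(poly.transform(X_test))
            avg_mses[d-1] += mean_squared_error(y_test, pred)
    return avg_mses / folds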
