
if __name__ == "__main__":
    random_state = 42
    n_estimators = 400
    n_jobs = 1
    csv_filepath = "/path/to/your/AREX.csv"
    lookback_minutes = 30
    lookforward_minutes = 5

    print("Importing and creating CSV DataFrame...")
    start_date = datetime.datetime(2007, 11, 8)
    end_date = datetime.datetime(2012, 12, 31)

    ts = create_up_down_dataframe(
        csv_filepath,
        lookback_minutes=lookback_minutes,
        lookforward_minutes=lookforward_minutes,
        start=start_date, end=end_date
    )
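The helper create_up_down_dataframe is defined earlier in the full listing. As a reference for what the resulting DataFrame contains, the following is a minimal sketch of such a helper. It assumes a minutely CSV whose first column is a timestamp and which contains a "Close" column, and it uses a simple sign-based rule for the UpDown response; both of these details are assumptions rather than the exact logic used above:

import pandas as pd


def create_up_down_dataframe(
    csv_filepath, lookback_minutes=30,
    lookforward_minutes=5, start=None, end=None
):
    # Assumed CSV layout: a timestamp index in the first column plus
    # a "Close" column of minutely closing prices
    ts = pd.read_csv(csv_filepath, index_col=0, parse_dates=True)
    if start is not None:
        ts = ts[ts.index >= start]
    if end is not None:
        ts = ts[ts.index <= end]

    # Minutely percentage returns of the closing price
    ts["Returns"] = ts["Close"].pct_change() * 100.0

    # Lagged returns: Lookback0 is the most recent completed bar,
    # Lookback1 the bar before that, and so on
    for i in range(lookback_minutes):
        ts["Lookback%s" % str(i)] = ts["Returns"].shift(i)

    # Forward return over the lookforward window
    ts["Lookforward"] = (
        ts["Close"].shift(-lookforward_minutes) / ts["Close"] - 1.0
    ) * 100.0
    ts.dropna(inplace=True)

    # Simplified binary response: +1 if the forward return is
    # positive, -1 otherwise
    ts["UpDown"] = ts["Lookforward"].apply(lambda x: 1 if x > 0.0 else -1)
    return ts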

The following code uses the Pandas DataFrame generated above to create the X feature matrix and the y response vector. In this instance only the first five prior lags are used, although thirty are available from the initial data generation.

Once the data has been formatted into the X and y data structures, a training-testing split is created, with the test set equal to 30% of the data.

Note that in the final model development the data should not be split in this fashion. Instead, all of the data up to 2012 should be used to train the model, with the remaining 2013/2014 data reserved for out-of-sample trading strategy validation; a sketch of such a chronological split is given after the listing below:

    # Use the first five prior minutely lags of AREX closing prices
    print("Preprocessing data...")
    X = ts[
        [
            "Lookback%s" % str(i)
            for i in range(0, 5)
        ]
    ]
    y = ts["UpDown"]

    # Create the training-testing split, with 70% of the data in the
    # training set and the remaining 30% in the testing set
    print("Creating train/test split of data...")
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state
    )
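The note above about using all of the data up to 2012 for training can be made concrete with a chronological split rather than the random one used here. The following is a minimal sketch, assuming X and y carry the timestamp index from ts and that the DataFrame has been generated over the full 2007-2014 period; split_date is a hypothetical name introduced only for this illustration:

    # Chronological split for the final model: train on everything up
    # to the end of 2012 and hold out the later data for out-of-sample
    # trading strategy validation
    split_date = datetime.datetime(2012, 12, 31)
    X_train, X_test = X[X.index <= split_date], X[X.index > split_date]
    y_train, y_test = y[y.index <= split_date], y[y.index > split_date]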

In this section the model is actually fit to the data. The important parameter is max_depth, which controls the maximum depth of the grown trees in the Random Forest. It controls the degree to which the model can overfit the training data: deeper trees capture finer structure in the training set but tend to generalise less well.
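The fitting code itself is cut off in this extract, so the following is a minimal sketch of how the classifier might be fit and scored using the parameters defined above; the max_depth value of 10 and the use of the mean accuracy "hit rate" as the headline metric are assumptions for illustration:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import confusion_matrix

    print("Fitting classifier model...")
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=10,  # illustrative value; tune to control overfitting
        n_jobs=n_jobs,
        random_state=random_state
    )
    model.fit(X_train, y_train)

    print("Outputting metrics...")
    print("Hit-Rate: %s" % model.score(X_test, y_test))
    print("Confusion Matrix:\n%s" % confusion_matrix(
        y_test, model.predict(X_test)
    ))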
