13.08.2022 Views

advanced-algorithmic-trading

Chapter 21

Unsupervised Learning

The previous chapters have all discussed supervised learning techniques. This chapter introduces

unsupervised learning, which will allow analysis of unlabelled datasets.

Supervised learning involves working with feature/response pairs. Its goal is to predict

the response from the associated features. It is "supervised" because in the training phase of

the learning process the algorithm has access to the ground truth via known responses to certain

input features. It uses these to adjust its model parameters such that when exposed to new

features it can make an estimate of the response.
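As a minimal sketch of this train-then-predict cycle, the following fits a straight line to a handful of feature/response pairs by ordinary least squares and then estimates the response for an unseen feature. The toy data and the helper names `fit_line` and `predict` are assumptions made purely for illustration; they do not come from the text.

```python
# Hypothetical toy data: each feature has a known response (the "ground truth").
features = [1.0, 2.0, 3.0, 4.0, 5.0]
responses = [3.0, 5.0, 7.0, 9.0, 11.0]  # generated from y = 2x + 1


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and variance (the common 1/n factor cancels in the ratio).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Estimate the response for a new, previously unseen feature."""
    return slope * x + intercept


# Training phase: the known responses are used to adjust the model parameters.
slope, intercept = fit_line(features, responses)

# Prediction phase: estimate the response for a feature outside the training set.
estimate = predict(6.0, slope, intercept)
print(slope, intercept, estimate)
```

The known responses are what make the fit possible: without `responses` there would be nothing to compare the line against.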

In unsupervised learning the features are still present but there is no associated response.

Instead interest lies solely in attributes of the features themselves. This might include whether

the features form specific clusters or sub-groups in feature space. It might also include whether

very high-dimensional data can be described in a much lower-dimensional setting.
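To make the idea of clusters in feature space concrete, here is a small sketch of one common clustering method, K-Means, written with the standard library only. The six two-dimensional points, the choice of two clusters and the starting centroids are all assumptions chosen for illustration. The point to notice is that the function receives features only and never sees a response.

```python
import math

# Hypothetical unlabelled 2-D feature vectors: there is no response column.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]


def k_means(data, centroids, iterations=10):
    """Basic K-Means: alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in data
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(data, assignment) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Starting centroids are picked by hand here; in practice they are often random.
labels, centres = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centres)
```

The cluster indices returned are arbitrary identifiers rather than predictions of a known response, which is why judging the quality of such a grouping is harder than scoring a supervised model.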

Unsupervised learning techniques are often motivated by the prohibitive cost, in time and/or

money, of "labelling" feature data so that it can be analysed using supervised

techniques. An additional motivation is that images, video, natural

language documents and scientific research data (such as gene expressions), once quantified, possess

very high dimensionality. Such high dimensionality requires supervised learning techniques

with many degrees of freedom, potentially leading to overfitting and thus poor test performance.

Unsupervised learning techniques are a partial solution to these problems.

Unfortunately the lack of ground truth or supervision for unsupervised techniques often leads

to subjective assessment of their performance. There are no widely agreed approaches for quantifying

how effective unsupervised algorithms are. Performance is largely determined on a case-by-case

basis using heuristic approaches. Such judgement-based assessments might seem unscientific

to quantitatively trained individuals, but unsupervised techniques have proven to be extremely

useful in many research areas.

Unsupervised learning techniques are often deployed in the realms of anomaly detection,

purchasing habit analysis, recommendation systems and natural language processing. In quantitative

finance they are used for de-noising datasets, portfolio/asset clustering, market regime

detection and trading signal generation with natural language processing.
