
"risk manager". They will be used to assess how algorithmic trading performance varies with

and without regime detection.

14.1 Markov Models

Prior to the discussion of Hidden Markov Models it is necessary to consider the broader concept of a Markov Model. Such a stochastic state space model involves random transitions between states, where the probability of each jump depends only upon the current state rather than on any of the previous states. The model is said to possess the Markov Property and is thus "memoryless". Random walk models, which were discussed in a previous chapter, are a familiar example of a Markov Model.
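To make the Markov Property concrete, the following short Python snippet (a minimal sketch, not taken from the text, with a purely illustrative transition matrix) simulates a two-state Markov chain. Note that the next state is sampled using only the current state.

import numpy as np

# Minimal two-state Markov chain simulation.
# The transition matrix below is purely illustrative:
# row i gives P(next state | current state = i).
P = np.array([
    [0.9, 0.1],   # from state 0: stay with prob 0.9, jump with prob 0.1
    [0.2, 0.8],   # from state 1: jump with prob 0.2, stay with prob 0.8
])

rng = np.random.default_rng(42)

def simulate_chain(P, n_steps, initial_state=0):
    """Simulate a Markov chain; each step depends only on the current state."""
    states = [initial_state]
    for _ in range(n_steps - 1):
        current = states[-1]
        states.append(rng.choice(len(P), p=P[current]))
    return np.array(states)

path = simulate_chain(P, 500)
print("Empirical time spent in each state:", np.bincount(path) / len(path))

A random walk is recovered as a special case in which the "states" are positions on a line and the transition probabilities to the neighbouring positions do not depend on how the current position was reached.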

Markov Models can be categorised into four broad classes depending upon the autonomy of the system and whether all or part of the information about the system can be observed at each state. The Markov Model page at Wikipedia[8] provides a useful matrix that outlines these differences, which will be repeated here:

                Fully Observable               Partially Observable
Autonomous      Markov Chain[6]                Hidden Markov Model[5]
Controlled      Markov Decision Process[7]     Partially Observable Markov Decision Process[9]

The simplest model, the Markov Chain, is both autonomous and fully observable. It cannot be modified by the actions of an "agent", as in the controlled processes, and all information is available from the model at any point in time. Good examples of Markov Chains are the various Markov Chain Monte Carlo (MCMC) algorithms used heavily in computational Bayesian inference, which were discussed in previous chapters.
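To illustrate that connection, the sketch below (illustrative only, assuming a standard normal target density rather than any posterior from earlier chapters) implements a random-walk Metropolis sampler. The proposal and accept/reject step use only the current sample, so the sequence of draws is itself a Markov Chain.

import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Log-density (up to a constant) of a standard normal target - an illustrative choice.
    return -0.5 * x * x

def random_walk_metropolis(n_samples, step=1.0, x0=0.0):
    """Random-walk Metropolis: each proposal depends only on the current sample."""
    samples = np.empty(n_samples)
    x = x0
    for i in range(n_samples):
        proposal = x + step * rng.normal()
        # Accept with probability min(1, target(proposal) / target(x)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[i] = x
    return samples

draws = random_walk_metropolis(10000)
print("Sample mean and std:", draws.mean(), draws.std())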

If the model is still fully autonomous but only partially observable then it is known as a Hidden Markov Model. In such a model there are underlying latent states, and probability transitions between them, but they are not directly observable. Instead these latent states influence the observations. While the latent states possess the Markov Property there is no need for the observations to do so. The most common use of HMMs outside of quantitative finance is in the field of speech recognition, where they have been extremely successful.
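The distinction between latent states and observations can be illustrated with the following sketch (purely illustrative, with made-up regime parameters rather than values estimated from data). An unobserved "calm"/"volatile" state evolves as a Markov chain, and each day's return is drawn from a Gaussian whose parameters depend on that hidden state; only the returns would be visible to the modeller.

import numpy as np

rng = np.random.default_rng(7)

# Hidden-state transition matrix (illustrative): 0 = calm regime, 1 = volatile regime.
A = np.array([
    [0.98, 0.02],
    [0.05, 0.95],
])

# Gaussian emission parameters per hidden state (mean, std of daily returns).
# These numbers are assumptions for the sketch, not estimates from data.
mu    = np.array([0.0005, -0.001])
sigma = np.array([0.005,   0.02])

def simulate_hmm(n_days, initial_state=0):
    """Simulate hidden regime states and the observed returns they generate."""
    states = np.empty(n_days, dtype=int)
    returns = np.empty(n_days)
    s = initial_state
    for t in range(n_days):
        states[t] = s
        returns[t] = rng.normal(mu[s], sigma[s])   # observation depends on the hidden state
        s = rng.choice(2, p=A[s])                  # hidden state follows the Markov Property
    return states, returns

hidden, observed = simulate_hmm(1000)
print("Fraction of days in the volatile regime:", hidden.mean())

Fitting an HMM works in the opposite direction: given only the observed returns, the task is to infer the transition matrix, the emission parameters and the most likely sequence of hidden regimes.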

Once the system is allowed to be controlled by an agent the processes come under the heading of Reinforcement Learning, which is often considered to be the "third pillar" of machine learning along with Supervised Learning and Unsupervised Learning. If the system is fully observable, but controlled, then the model is called a Markov Decision Process (MDP). A related technique known as Q-Learning[15] is used to optimise the action-selection policy for an agent under a Markov Decision Process model. In 2015 Google DeepMind pioneered the use of deep reinforcement learning, via Deep Q Networks, to create an agent capable of playing Atari 2600 video games solely from the pixel data within the screen buffer[70].
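As a concrete, if highly simplified, illustration of tabular Q-Learning (not DeepMind's deep-network implementation), the sketch below learns an action-value table for a small one-dimensional grid world in which the agent receives a reward of 1 only upon reaching the right-hand goal state. All states, rewards and parameters are invented for the example.

import numpy as np

rng = np.random.default_rng(1)

N_STATES, N_ACTIONS = 6, 2           # states 0..5; actions: 0 = left, 1 = right
GOAL = N_STATES - 1                  # reaching state 5 yields reward 1
alpha, gamma, epsilon = 0.1, 0.9, 0.1

Q = np.zeros((N_STATES, N_ACTIONS))  # action-value table

def step(state, action):
    """Environment dynamics: move left/right; reward 1 only at the goal."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    done = next_state == GOAL
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection over the current Q estimates.
        if rng.uniform() < epsilon:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-Learning update: bootstrap from the best action in the next state.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print("Greedy policy (0=left, 1=right):", Q.argmax(axis=1))

After training, the greedy policy derived from the Q table moves right in every non-terminal state, which is the optimal action-selection policy for this toy MDP.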

If the system is both controlled and only partially observable then such Reinforcement Learning models are termed Partially Observable Markov Decision Processes (POMDP). Techniques to solve high-dimensional POMDPs are the subject of current academic research. The non-profit
