
13.2.1 A Bayesian Approach

Recall from the prior chapters on Bayesian inference that Bayes’ Rule is given by:

\[
P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)} \tag{13.6}
\]

where $\theta$ refers to our parameters and $D$ refers to our data or observations.
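Purely as an illustrative aside (not taken from the text), Bayes' Rule in Equation 13.6 can be evaluated directly for a discrete case before we move to the continuous state setting below. In the following minimal Python sketch the two state labels and all probabilities are made-up numbers chosen only to show the mechanics of the update:

```python
# Illustrative discrete application of Bayes' Rule (Eq. 13.6).
# The state labels and all probabilities below are made up for demonstration.

# Prior beliefs P(theta) over two hypothetical market regimes
prior = {"bull": 0.5, "bear": 0.5}

# Likelihood P(D | theta) of the observed data under each regime (assumed values)
likelihood = {"bull": 0.7, "bear": 0.2}

# Evidence P(D), the normalising constant, summed over all regimes
evidence = sum(likelihood[s] * prior[s] for s in prior)

# Posterior P(theta | D) = P(D | theta) P(theta) / P(D)
posterior = {s: likelihood[s] * prior[s] / evidence for s in prior}

print(posterior)  # {'bull': 0.777..., 'bear': 0.222...}
```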

We want to apply the rule to the idea of updating the probability of seeing a state given all of the previous data we have and our current observation. Unfortunately we need to introduce more notation!

If we are at time $t$ then we can represent all of the data known about the system by the quantity $D_t$. Our current observations are denoted by $y_t$. Thus we can say that $D_t = (D_{t-1}, y_t)$: our current knowledge is a mixture of our previous knowledge and our most recent observation.

Applying Bayes’ Rule to this situation gives the following:

\[
P(\theta_t \mid D_{t-1}, y_t) = \frac{P(y_t \mid \theta_t)\, P(\theta_t \mid D_{t-1})}{P(y_t)} \tag{13.7}
\]

What does this mean? It says that the posterior or updated probability of obtaining a state $\theta_t$, given our current observation $y_t$ and previous data $D_{t-1}$, is equal to the likelihood of seeing an observation $y_t$ given the current state $\theta_t$, multiplied by the prior or previous belief of the current state given only the previous data $D_{t-1}$, normalised by the probability of seeing the observation $y_t$ regardless.

While the notation may be somewhat verbose, it is a very natural statement. It says that we can update our view on the state, $\theta_t$, in a rational manner given the fact that we have new information in the form of the current observation, $y_t$.

One of the extremely useful aspects of Bayesian inference is that if our prior and likelihood are both normally distributed then we can use the concept of conjugate priors to state that our posterior of $\theta_t$ will also be normally distributed.

We utilised the same concept, albeit with different distributional forms, in our previous discussion on the inference of binomial proportions.
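To make the conjugacy point concrete, the following is a minimal sketch, not taken from the text, of the scalar normal-normal conjugate update. The function name normal_update and the numerical values are assumptions chosen purely for illustration; the multivariate version used by the Kalman Filter follows the same precision-weighting logic.

```python
def normal_update(prior_mean, prior_var, obs, obs_var):
    """Conjugate update for a scalar normal prior and normal likelihood.

    Prior:      theta ~ N(prior_mean, prior_var)
    Likelihood: y | theta ~ N(theta, obs_var)
    Returns the mean and variance of the (normal) posterior of theta.
    """
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)            # precisions add
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var

# Vague prior around 0.0; a single observation at 1.0 pulls the posterior mean towards it
print(normal_update(prior_mean=0.0, prior_var=4.0, obs=1.0, obs_var=1.0))
# (0.8, 0.8)
```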

So how does this help us produce a Kalman Filter?

Let us specify the terms that we will be using from Bayes' Rule above. Firstly, we specify the distributional form of the prior:

\[
\theta_t \mid D_{t-1} \sim \mathcal{N}(a_t, R_t) \tag{13.8}
\]

This says that the prior view of $\theta$ at time $t$, given our knowledge at time $t-1$, is distributed as a multivariate normal distribution with mean $a_t$ and variance-covariance $R_t$. The latter two parameters will be defined below.

Now let us consider the likelihood:

\[
y_t \mid \theta_t \sim \mathcal{N}(F^T_t \theta_t, V_t) \tag{13.9}
\]
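Although $a_t$, $R_t$, $F_t$ and $V_t$ are only defined formally in what follows, a brief NumPy sketch may help to show how the prior (13.8) and likelihood (13.9) combine into a normally distributed posterior. This is a sketch of the standard Kalman Filter measurement update rather than a listing from the text; the function name bayesian_measurement_update and the example values are assumptions for illustration only:

```python
import numpy as np

def bayesian_measurement_update(a, R, F, V, y):
    """One Bayesian update of the state given a new scalar observation.

    Prior:      theta_t | D_{t-1} ~ N(a, R)             (Eq. 13.8)
    Likelihood: y_t | theta_t     ~ N(F^T theta_t, V)   (Eq. 13.9)
    Returns the mean and covariance of the normal posterior theta_t | D_t.
    """
    f = F @ a                     # predicted observation F^T a_t
    e = y - f                     # innovation (forecast error)
    Q = F @ R @ F + V             # innovation variance F^T R_t F + V_t
    A = R @ F / Q                 # gain vector
    m = a + A * e                 # posterior mean
    C = R - np.outer(A, A) * Q    # posterior covariance
    return m, C

# Hypothetical two-dimensional state with a single noisy observation
a = np.array([0.0, 1.0])          # prior mean a_t
R = np.eye(2)                     # prior covariance R_t
F = np.array([1.0, 0.5])          # observation vector F_t
m, C = bayesian_measurement_update(a, R, F, V=0.25, y=1.2)
```

Repeating such an update as each new observation $y_t$ arrives is precisely the recursive behaviour that the Kalman Filter formalises.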
