13.08.2022 Views

advanced-algorithmic-trading


Definition 8.1.1. Expectation. The expected value or expectation, E(x), of a random variable

x is its mean value in the population. We denote the expectation of x by µ, such that

E(x) = µ.

Now that we have the definition of expectation we can define the variance, which characterises

the "spread" of a random variable:

Definition 8.1.2. Variance. The variance of a random variable is the expectation of the squared

deviations of the variable from the mean, denoted by $\sigma^2(x) = E[(x - \mu)^2]$.

Notice that the variance is always non-negative. This allows us to define the standard deviation:

Definition 8.1.3. Standard Deviation. The standard deviation of a random variable x, σ(x), is

the square root of the variance of x.

Now that we’ve outlined these elementary statistical definitions we can generalise the variance

to the concept of covariance between two random variables. Covariance tells us how linearly

related these two variables are:

Definition 8.1.4. Covariance. The covariance of two random variables x and y, each having

respective expectations $\mu_x$ and $\mu_y$, is given by $\sigma(x, y) = E[(x - \mu_x)(y - \mu_y)]$.

Covariance tells us how two variables move together.
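
As a quick check that covariance does generalise variance, setting y = x in Definition 8.1.4 recovers Definition 8.1.2:

$$\sigma(x, x) = E[(x - \mu_x)(x - \mu_x)] = E[(x - \mu_x)^2] = \sigma^2(x)$$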

In practice we only observe a sample, so we do not have access to the population means $\mu_x$ and

$\mu_y$. Instead we must estimate the covariance from the sample. For this we use the respective

sample means $\bar{x}$ and $\bar{y}$.

If we consider a set of n paired observations of the random variables x and y, given by $(x_i, y_i)$,

the sample covariance, Cov(x, y) (also sometimes denoted by q(x, y)), is given by:

$$\text{Cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \tag{8.1}$$

Note: You may be wondering why we divide by n − 1 in the denominator, rather than n. This

is a valid question! The reason we choose n − 1 is that it makes Cov(x, y) an unbiased estimator of the population covariance σ(x, y).
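
As a small worked illustration of the sample covariance formula, take three data pairs invented purely for this example: (1, 2), (2, 4) and (3, 6). Here n = 3, the sample mean of x is 2 and the sample mean of y is 4, so:

$$\text{Cov}(x, y) = \frac{1}{3 - 1}\left[(1 - 2)(2 - 4) + (2 - 2)(4 - 4) + (3 - 2)(6 - 4)\right] = \frac{1}{2}(2 + 0 + 2) = 2$$

Dividing by n = 3 instead would give 4/3. The n − 1 divisor compensates for the fact that the sample means were themselves estimated from the same data.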

8.1.1 Example: Sample Covariance in R

This will be our first usage of the R statistical language in the book. We have previously discussed

the installation procedure, so you can refer back to the introductory chapter if you need to install

R. Assuming you have R installed you can open up the R terminal.

In the following commands we are going to simulate two vectors of length 100, each with a

linearly increasing sequence of integers with some normally distributed noise added. Thus we

are constructing linearly associated variables by design.

We will first construct a scatter plot and then calculate the sample covariance using the

cov function. In order to ensure you see exactly the same data as I do, we will set a random

seed of 1 and 2 respectively for each variable:
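
A minimal sketch of the steps just described, which should not be read as the book's exact listing. The noise standard deviation of 20 and the names noise_sd, sample_cov and manual_cov are assumptions made for illustration. The functions set.seed, rnorm, plot and cov are standard R. The last lines recompute the covariance directly from the sample covariance formula as a sanity check.

```r
# Assumed noise scale; the text does not state a value.
noise_sd <- 20

# First variable: integers 1..100 plus Gaussian noise, seed 1
set.seed(1)
x <- seq(1, 100) + rnorm(100, mean = 0, sd = noise_sd)

# Second variable: integers 1..100 plus Gaussian noise, seed 2
set.seed(2)
y <- seq(1, 100) + rnorm(100, mean = 0, sd = noise_sd)

# Scatter plot of the two linearly associated variables
plot(x, y)

# Sample covariance via the built-in function (uses the n - 1 divisor)
sample_cov <- cov(x, y)

# The same quantity computed by hand from the sample covariance formula
n <- length(x)
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

print(sample_cov)
print(all.equal(sample_cov, manual_cov))
```

Because both vectors share the same underlying increasing trend, the sample covariance should come out positive, and the hand-computed value should agree with cov up to floating-point tolerance.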
