- Page 2 and 3: Contents I Introduction 1 1 Introduct
- Page 4 and 5: 3 8.2 Correlation
- Page 6 and 7: 5 13.2 The Kalman Filter
- Page 8 and 9: 7 19 Support Vector Machines
- Page 10 and 11: 9 28 Kalman Filter-Based Pairs Tradi
- Page 12 and 13: Limit of Liability/Disclaimer of War
- Page 16 and 17: Chapter 1 Introduction To Advanced Al
- Page 18 and 19: 5The main idea in Time Series Analy
- Page 20 and 21: 7 1.4.1 Mathematics To get the most o
- Page 22 and 23: 9All of the code in the Python sect
- Page 24: Part II Bayesian Statistics 11
- Page 27 and 28: 14This is in contrast to another fo
- Page 29 and 30: 16P (B)P (A|B) = P (A ∩ B) (2.2)B
- Page 31 and 32: 18• P (θ|D) is the posterior. Th
- Page 33 and 34: 20Thus it can be seen that Bayesian
- Page 35 and 36: 22
- Page 37 and 38: 24 3.1 The Bayesian Approach While we
- Page 39 and 40: 26k = 0 represents a tail, then the
- Page 41 and 42: 28P (θ|α, β) = θ α−1 (1 −
- Page 43 and 44: 30Using a beta distribution for the
- Page 45 and 46: 32P (θ|z, N) = P (z, N|θ)P (θ)/P
- Page 47 and 48: 34
- Page 49 and 50: 36 4.2 Why Markov Chain Monte Carlo?
- Page 51 and 52: 38We then generate a uniform random
- Page 53 and 54: 40 4.5.2 Inferring a Binomial Propor
- Page 55 and 56: 42plt.plot(x, stats.beta.pdf(x, alp
- Page 57 and 58: 44# Parameter values for prior and
- Page 59 and 60: 46
- Page 61 and 62: 48find the orientation of the hyper
- Page 63 and 64: 50Var(y) = V (E(y)) = V (g −1 (X
- Page 65 and 66:
52Figure 5.1: Simulation of noisy l
- Page 67 and 68:
54Figure 5.2: Using PyMC3 to fit a
- Page 69 and 70:
56 5.5 Full Code import matplotlib.py
- Page 71 and 72:
58sns.lmplot(x="x", y="y", data=df,
- Page 73 and 74:
60stochastic differential equations
- Page 75 and 76:
62changes as the number of degrees-
- Page 77 and 78:
64Pandas is used to obtain the raw
- Page 79 and 80:
66pm.traceplot(trace, model.vars[:-
- Page 81 and 82:
68"""Download, calculate and plot t
- Page 83 and 84:
70
- Page 86 and 87:
Chapter 7 Introduction to Time Serie
- Page 88 and 89:
75We will learn R in a problem-solv
- Page 90 and 91:
Chapter 8 Serial Correlation In the p
- Page 92 and 93:
79> set.seed(1)> x <- seq(1,100) +
- Page 94 and 95:
81There are two important points to
- Page 96 and 97:
83As with the above definitions of
- Page 98 and 99:
85Figure 8.3: Correlogram plotted i
- Page 100 and 101:
Chapter 9 Random Walks and White Noi
- Page 102 and 103:
89 9.3 White Noise We will begin by m
- Page 104 and 105:
91as a confirmation that we have el
- Page 106 and 107:
93Figure 9.3: Correlogram of a Rand
- Page 108 and 109:
95Once quantmod is installed we can
- Page 110 and 111:
Chapter 10 Autoregressive Moving Ave
- Page 112 and 113:
99Definition 10.2.1. Strictly Stati
- Page 114 and 115:
101In order to determine whether an
- Page 116 and 117:
103with parameter estimates for the
- Page 118 and 119:
105Figure 10.3: Realisation of AR(2
- Page 120 and 121:
107This allows us to compare "apple
- Page 122 and 123:
109In particular we note that the a
- Page 124 and 125:
111Figure 10.9: Correlogram of Firs
- Page 126 and 127:
113⎧1 if k = 0⎪⎨ q−k∑q∑
- Page 128 and 129:
115[1] 0.440208 0.764392We can see
- Page 130 and 131:
117Figure 10.12: Realisation of MA(
- Page 132 and 133:
119Let us try a MA(2) model:> amznr
- Page 134 and 135:
121This provided evidence of both c
- Page 136 and 137:
123Figure 10.17: Residuals of MA(2)
- Page 138 and 139:
125 10.6.1 Bayesian Information Crit
- Page 140 and 141:
127We can start with the simplest p
- Page 142 and 143:
129> plot(x)The output of our ARMA(
- Page 144 and 145:
131residuals to determine if we hav
- Page 146 and 147:
133> getSymbols("^GSPC")> sp = diff
- Page 148 and 149:
Chapter 11 Autoregressive Integrated
- Page 150 and 151:
137Alternatively it is possible to
- Page 152 and 153:
139Figure 11.2: Correlogram of the
- Page 156 and 157:
143> Box.test(resid(spfinal.arima),
- Page 158 and 159:
145would further depress the price
- Page 160 and 161:
147√ √√√α0 p∑ɛ t = w t
- Page 162 and 163:
149Figure 11.7: Correlogram of a si
- Page 164 and 165:
151> ft <- as.numeric(ftrt)> ft <-
- Page 166 and 167:
153Figure 11.12: Residuals of a GAR
- Page 168 and 169:
Chapter 12 Cointegrated Time Series I
- Page 170 and 171:
157 12.3.1 Augmented Dickey-Fuller T
- Page 172 and 173:
159autocorrelation:> layout(1:2)> a
- Page 174 and 175:
161library:> library("tseries")> ad
- Page 176 and 177:
163section we are going to consider
- Page 178 and 179:
165firms paired with an ETF that tr
- Page 180 and 181:
167Figure 12.7: Scatter plot of bac
- Page 182 and 183:
169We can replicate the above steps
- Page 184 and 185:
171Figure 12.11: Residuals of the f
- Page 186 and 187:
173## Plot the residuals of the fir
- Page 188 and 189:
175∆x t = µ + Ax t−1 + Γ 1
- Page 190 and 191:
177p.d -0.1381095 -0.771055116 -0.0
- Page 192 and 193:
179> library("quantmod")We now need
- Page 194 and 195:
181Test type: trace statistic , wit
- Page 196 and 197:
183spyAdj = unclass(SPY$SPY.Adjuste
- Page 198 and 199:
Chapter 13 State Space Models and th
- Page 200 and 201:
187Where F T is the transpose of F
- Page 202 and 203:
189This says that the likelihood fu
- Page 204 and 205:
191 13.3 Dynamic Hedge Ratio Between
- Page 206 and 207:
193Note: You will likely need to ru
- Page 208 and 209:
195Utilise the Kalman Filter from t
- Page 210 and 211:
197from __future__ import print_fun
- Page 212 and 213:
199prices = pd.DataFrame(index=etf_
- Page 214 and 215:
Chapter 14 Hidden Markov Models A con
- Page 216 and 217:
203team at OpenAI spend significant
- Page 218 and 219:
205Model will tend to stay in a par
- Page 220 and 221:
207 14.3.1 Market Regimes Applying Hi
- Page 222 and 223:
209Figure 14.3: Simulated market re
- Page 224 and 225:
211> # Fit a Hidden Markov Model wi
- Page 226 and 227:
213highly volatile state. Subsequen
- Page 228 and 229:
215# Obtain S&P500 data from 2004 o
- Page 230:
Part IV Statistical Machine Learning
- Page 235 and 236:
222 15.4 Machine Learning Applicatio
- Page 237 and 238:
224disadvantages are the need to ha
- Page 239 and 240:
226to machine learning, which is th
- Page 241 and 242:
228However, a significant concern a
- Page 243 and 244:
230variance σ 2 . ɛ represents th
- Page 245 and 246:
232Figure 17.1:(2012)[71]Plot of p(
- Page 247 and 248:
234N∑NLL(θ) = − log p(y i | x
- Page 249 and 250:
236The goal of the exercise will be
- Page 251 and 252:
238lr_model.predict(X_test),color=
- Page 253 and 254:
240from sklearn import linear_model
- Page 255 and 256:
242
- Page 257 and 258:
244Where w m is the mean response i
- Page 259 and 260:
246The first task is to sum across
- Page 261 and 262:
248 18.4 Advantages and Disadvantage
- Page 263 and 264:
250Unfortunately this gain in predi
- Page 265 and 266:
252import datetimeimport matplotlib
- Page 267 and 268:
254X = amzn[["Lag1", "Lag2", "Lag3"
- Page 269 and 270:
256plt.plot(estimators, bagging_mse
- Page 271 and 272:
258"""# Obtain stock information fr
- Page 273 and 274:
260)rf = RandomForestRegressor(n_es
- Page 275 and 276:
262between categories.Formally, in
- Page 277 and 278:
264b 0 + b 1 x 1 + ... + b p x p =
- Page 279 and 280:
266andb · x i + b 0 < 0, if y i =
- Page 281 and 282:
268Maximise M ∈ R, by varying b 1
- Page 283 and 284:
270andy i (b · x + b 0 ) ≥ M(1
- Page 285 and 286:
272p∑〈u, v〉 = u j v j (19.14)
- Page 287 and 288:
274
- Page 289 and 290:
276y = f(x) + ɛ (20.1)This states
- Page 291 and 292:
278the following day, we are only c
- Page 293 and 294:
280However if we plot the test MSE,
- Page 295 and 296:
282 20.2.1 Overview of Cross-Validat
- Page 297 and 298:
284Secondly, note that in the 50-50
- Page 299 and 300:
286tslag["Volume"] = ts["Volume"]#
- Page 301 and 302:
288# Increase degree of linear# reg
- Page 303 and 304:
290Figure 20.4: Test MSE curves for
- Page 305 and 306:
292..We can plot these curves with
- Page 307 and 308:
294Trading volume, as well as the D
- Page 309 and 310:
296sample_dict["seed_%s" % i] = np.
- Page 311 and 312:
298label=’Avg Test MSE’)ax.lege
- Page 313 and 314:
300
- Page 315 and 316:
302 21.1 High Dimensional Data Quanti
- Page 317 and 318:
304The canonical algorithm for clus
- Page 319 and 320:
306Figure 22.1: K-Means Clustering
- Page 321 and 322:
308 22.1.3 Simulated Data In this sec
- Page 323 and 324:
310Figure 22.2: K-Means Algorithm o
- Page 325 and 326:
312["Open", "High", "Low","Close",
- Page 327 and 328:
314alldays = DayLocator()weekFormat
- Page 329 and 330:
316# Plot the full OHLC candles re-
- Page 331 and 332:
318changes. Clearly this motivates
- Page 333 and 334:
320Obtains a pandas DataFrame conta
- Page 335 and 336:
322ax.xaxis.set_major_locator(monda
- Page 339 and 340:
326• Continually monitor the syst
- Page 341 and 342:
328... 1049566 Dec 4 1996 reut2-020
- Page 343 and 344:
330of a quant researchers day is sp
- Page 345 and 346:
332import pprintimport retry:from h
- Page 347 and 348:
334self.docs.append( (self.topics,
- Page 349 and 350:
336’90,000 yen per 15 tonne lot f
- Page 351 and 352:
338 23.4 Vectorisation At this stage
- Page 353 and 354:
340# Open the first Reuters data se
- Page 355 and 356:
342 23.7 Performance Metrics The two
- Page 357 and 358:
344print(svm.score(X_test, y_test))
- Page 359 and 360:
346def handle_endtag(self, tag):"""
- Page 361 and 362:
348def train_svm(X, y):"""Create an
- Page 364 and 365:
Chapter 24 Introduction to QSTrader I
- Page 366 and 367:
353The design calls for the infrast
- Page 368 and 369:
Chapter 25 Introductory Portfolio St
- Page 370 and 371:
357Ticker Name Period LinkSPY SPDR
- Page 372 and 373:
359):ticker = event.tickerif self.t
- Page 374 and 375:
361from qstrader import settingsfro
- Page 376 and 377:
363Figure 25.2: "Strategic" Weight
- Page 378 and 379:
365 25.6 Full Code # monthly_rebalanc
- Page 380 and 381:
367import datetimefrom qstrader imp
- Page 382 and 383:
Chapter 26 ARIMA+GARCH Trading Strat
- Page 384 and 385:
371We use the same procedure as in
- Page 386 and 387:
373value would represent data not k
- Page 388 and 389:
375Figure 26.1: Equity curve of ARI
- Page 390 and 391:
377}final.aic <- current.aicfinal.o
- Page 392 and 393:
379new_str = "%s,%s\n" % (strpf, ol
- Page 394 and 395:
Chapter 27 Cointegration-Based Pairs
- Page 396 and 397:
383Figure 27.1: Backward-adjusted p
- Page 398 and 399:
385By definition a mean-reverting s
- Page 400 and 401:
387ues of the market value of a "un
- Page 402 and 403:
389commands, so that the positions
- Page 404 and 405:
391thresholds are set to 1.5 and 0.
- Page 406 and 407:
393surprising since no statisticall
- Page 408 and 409:
395self.bars_elapsed = 0def _set_co
- Page 410 and 411:
397self.invested = Nonedef calculat
- Page 412 and 413:
399# Use the ExampleCompliance comp
- Page 414 and 415:
Chapter 28 Kalman Filter-Based Pairs
- Page 416 and 417:
403• TLT - For the period 3rd Aug
- Page 418 and 419:
405self.time = event.time# Set the
- Page 420 and 421:
407print("LONG: %s" % event.time)se
- Page 422 and 423:
409# Use the Naive Position Sizer (
- Page 424 and 425:
411Figure 28.1: Kalman Filter-Based
- Page 426 and 427:
413"""Sets the correct price and ev
- Page 428 and 429:
415))self.invested = "long"elif et
- Page 430 and 431:
417strategy = Strategies(strategy,
- Page 432 and 433:
Chapter 29 Supervised Learning for I
- Page 434 and 435:
421 29.2 Building a Prediction Model
- Page 436 and 437:
423The next steps consist of filter
- Page 438 and 439:
425if __name__ == "__main__":random
- Page 440 and 441:
427from sklearn.externals import jo
- Page 442 and 443:
429SignalEvent(self.tickers[0], "BO
- Page 444 and 445:
431)# Use the ExampleCompliance com
- Page 446 and 447:
433Figure 29.1: Tearsheet for Linea
- Page 448 and 449:
435total return. The Sharpe Ratio i
- Page 450 and 451:
437)# Create the lookback and lookf
- Page 452 and 453:
439#model = LinearDiscriminantAnaly
- Page 454 and 455:
441self.minutes += 1# Allow enough
- Page 456 and 457:
443# Use the Tearsheet Statisticsst
- Page 458 and 459:
Chapter 30 Sentiment Analysis via Se
- Page 460 and 461:
447sentiment").This sample file for
- Page 462 and 463:
449Ticker Name Period LinkXOM Exxon
- Page 464 and 465:
451tuples, it is useful to create a
- Page 466 and 467:
453if event.type == EventType.TICK
- Page 468 and 469:
455threshold, then it closes the po
- Page 470 and 471:
457Figure 30.1:is mostly down or fl
- Page 472 and 473:
459Figure 30.3:Defence stocks provi
- Page 474 and 475:
461if (self.invested[ticker] is Fal
- Page 478 and 479:
Chapter 31 Market Regime Detection w
- Page 480 and 481:
467Ticker Name Period LinkSPY SPDR
- Page 482 and 483:
469def obtain_prices_df(csv_filepat
- Page 484 and 485:
471adjusted closing prices are plot
- Page 486 and 487:
473# short and long window barspric
- Page 488 and 489:
475# Determine the HMM predicted re
- Page 490 and 491:
477pickle_path = "/path/to/your/mod
- Page 492 and 493:
479The underlying strategy is desig
- Page 494 and 495:
481Obtain the prices DataFrame from
- Page 496 and 497:
483Requires:tickers - The list of t
- Page 498 and 499:
485)hidden_state = self.hmm_model.p
- Page 500 and 501:
487)csv_dir, events_queue, tickers,
- Page 502 and 503:
Chapter 32 Strategy Decay In this cha
- Page 504 and 505:
491It should be well remembered tha
- Page 506 and 507:
493)label=’Backtest’, ax=ax, **
- Page 508 and 509:
495Figure 32.1: Kalman Filter Pairs
- Page 510 and 511:
497Figure 32.3: Sentiment Sentdex S
- Page 512 and 513:
Bibliography [1] Wikipedia: Standard
- Page 514 and 515:
501[39] Efron, B. Bootstrap methods
- Page 516 and 517:
503[74] O’Mahony, A. Online linea