advanced-algorithmic-trading

Chapter 30

Sentiment Analysis via Sentdex Vendor Sentiment Data with QSTrader

In addition to the "usual" tricks of statistical arbitrage, trend-following and fundamental analysis, many quant shops and retail quants use natural language processing (NLP) techniques to build systematic strategies. Such techniques fall under the banner of Sentiment Analysis.

In this chapter a group of quantitative trading strategies will be developed that utilise a set of sentiment signals generated from a vendor API. Each signal is an integer on a scale ranging from -3 ("Strongest negative sentiment") to +6 ("Strongest positive sentiment") and is associated with a date and a ticker symbol. Comparing these values against entry and exit thresholds allows them to drive trades in an event-driven backtesting simulation.
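The threshold idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the QSTrader implementation: the `SentimentSignal` class, the `decide` function and the default threshold values of 6 and -1 are all hypothetical names and numbers chosen for the example. Only the -3 to +6 scale comes from the text.

```python
from dataclasses import dataclass
from datetime import date

# Bounds of the vendor sentiment scale described in the text.
SENTIMENT_MIN = -3
SENTIMENT_MAX = 6


@dataclass(frozen=True)
class SentimentSignal:
    """Hypothetical container for one date-ticker-sentiment record."""
    day: date
    ticker: str
    sentiment: int


def decide(signal, invested, entry_threshold=6, exit_threshold=-1):
    """Return "BUY", "SELL" or None for a single sentiment signal.

    The default thresholds are illustrative assumptions only.
    """
    if not SENTIMENT_MIN <= signal.sentiment <= SENTIMENT_MAX:
        raise ValueError(f"sentiment out of range: {signal.sentiment}")
    # Enter a long position when sentiment is at or above the entry level.
    if not invested and signal.sentiment >= entry_threshold:
        return "BUY"
    # Exit the position when sentiment is at or below the exit level.
    if invested and signal.sentiment <= exit_threshold:
        return "SELL"
    # Otherwise leave the current position unchanged.
    return None
```

Under these assumed thresholds a signal of +6 for a ticker that is not yet held produces "BUY", while a later signal of -1 for the same ticker, once held, produces "SELL". Signals between the two thresholds leave the position untouched.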

A key challenge in developing such a system is integrating the events representing sentiment, which are stored in a CSV file of "datetime-ticker-sentiment" rows, into an event-driven trading system that is usually designed to trade directly off pricing data.
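One way to picture this integration is to parse the CSV rows into timestamped events and interleave them with price events in time order. The sketch below uses only the Python standard library and is not the QSTrader code. The timestamp format, the tuple layout of an event, the names `read_sentiment_events` and `merge_events`, and every value in the sample data are assumptions made up for illustration.

```python
import csv
import heapq
import io
from datetime import datetime

# Made-up sample rows in "datetime,ticker,sentiment" order (no header row).
SAMPLE_CSV = """\
2016-01-04 00:00:00,MSFT,3
2016-01-04 00:00:00,AMZN,-1
2016-01-05 00:00:00,MSFT,6
"""


def read_sentiment_events(handle):
    """Yield (timestamp, "SENTIMENT", ticker, value) tuples from CSV rows."""
    for stamp, ticker, sentiment in csv.reader(handle):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        yield (when, "SENTIMENT", ticker, int(sentiment))


def merge_events(sentiment_events, price_events):
    """Interleave two time-sorted event streams by timestamp.

    Both inputs must already be sorted by timestamp. When timestamps tie,
    heapq.merge emits items from the first iterable before the second.
    """
    return heapq.merge(sentiment_events, price_events, key=lambda e: e[0])


# Made-up price bars: (timestamp, "BAR", ticker, closing price).
price_events = [
    (datetime(2016, 1, 4), "BAR", "MSFT", 100.0),
    (datetime(2016, 1, 5), "BAR", "MSFT", 101.0),
]

sentiment_events = read_sentiment_events(io.StringIO(SAMPLE_CSV))
for event in merge_events(sentiment_events, price_events):
    print(event)
```

Passing the sentiment stream first means that, for a given timestamp, a strategy would see the sentiment value before the corresponding price bar. Whether that ordering is appropriate depends on when the vendor data actually becomes available, which this sketch does not model.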

The chapter will begin with a brief discussion of how sentiment analysis is carried out, along with an outline of the nature of vendor APIs and sample files. It will continue by discussing the sentiment functionality present in QSTrader, including snippets of the associated Python code. It will conclude by presenting the results of three separate backtests of the sentiment strategy applied to S&P500 stocks in the tech, defence and energy sectors. The full code for these strategies is presented at the end of the chapter.

30.1 Sentiment Analysis

The goal of sentiment analysis is, generally, to take large quantities of "unstructured" data (such as blog posts, newspaper articles, research reports, tweets, video and images) and use NLP techniques to quantify positive or negative "sentiment" about certain assets.

For equities in particular this often amounts to a statistical machine learning analysis of the language utilised and whether it contains bullish or bearish phrasing. This phrasing can be quantified in terms of strength of sentiment, which translates into numerical values. Often this means positive values reflecting bullish sentiment and negative values representing bearish sentiment.
