25.10.2016

SAP HANA Predictive Analysis Library (PAL)


3.7.5 Grubbs' Test

Grubbs’ test is used to detect outliers in a given univariate data set $Y = \{Y_1, Y_2, \ldots, Y_n\}$. The algorithm assumes that Y comes from a Gaussian distribution.

The basic steps of the algorithm are as follows:

1. Define the hypotheses.

H0: There are no outliers in the data set Y.

H1: There is at least one outlier in the data set Y.

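2. Calculate Grubbs’ test statistic:

$$G = \frac{\max_{i=1,\ldots,n} \left| Y_i - \bar{Y} \right|}{s}$$

Here $\bar{Y}$ is the sample mean and $s$ is the sample standard deviation of Y.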

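3. Given the significance level α, if the test statistic G satisfies

$$G > \frac{n-1}{\sqrt{n}} \sqrt{\frac{t^2_{\alpha/(2n),\,n-2}}{n-2+t^2_{\alpha/(2n),\,n-2}}}$$

the algorithm rejects the null hypothesis H0 at the significance level α, which means that the data set contains at least one outlier. Here $t_{\alpha/(2n),\,n-2}$ denotes the quantile value of the t-distribution with n-2 degrees of freedom and a significance level of $\alpha/(2n)$.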
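The two-sided test described above can be sketched in a few lines of Python. This is an illustration only, not the PAL implementation: the function names are hypothetical, `s` is assumed to be the sample standard deviation with an n-1 denominator, and the t-distribution critical value `t_crit` is assumed to be supplied by the caller (for example from a statistical table or from `scipy.stats.t.ppf`).

```python
import math
import statistics


def grubbs_statistic(y):
    """Two-sided Grubbs statistic: largest absolute deviation from the mean, divided by s."""
    mean = statistics.mean(y)
    s = statistics.stdev(y)  # assumption: sample standard deviation (n - 1 denominator)
    return max(abs(v - mean) for v in y) / s


def grubbs_critical_value(n, t_crit):
    """Rejection threshold for n observations.

    t_crit is the upper critical value of the t-distribution with n - 2 degrees
    of freedom at level alpha / (2 * n); it must be supplied by the caller.
    """
    t2 = t_crit ** 2
    return (n - 1) / math.sqrt(n) * math.sqrt(t2 / (n - 2 + t2))


def has_outlier(y, t_crit):
    """Return True if H0 (no outliers) is rejected for the data set y."""
    return grubbs_statistic(y) > grubbs_critical_value(len(y), t_crit)


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]
    # 3.0 is a placeholder, not a real quantile; look up the actual value
    # for your alpha and n before drawing any conclusion.
    print(grubbs_statistic(data), grubbs_critical_value(len(data), 3.0))
```

Keeping the t quantile as an input avoids tying the sketch to a particular statistics library; the statistic and the threshold only need the standard library.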

The test above is the two-sided test. There are also one-sided versions that test whether the minimum value or the maximum value is an outlier.

● For minimum value:

$$G = \frac{\bar{Y} - Y_{\min}}{s}$$

● For maximum value:

$$G = \frac{Y_{\max} - \bar{Y}}{s}$$

Note that you must replace $\alpha/(2n)$ with $\alpha/n$ for the one-sided test.
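As a rough illustration of the one-sided variants, the sketch below computes the statistic for the minimum and for the maximum, and the tail probability at which the t quantile is taken. The function names are hypothetical, and `s` is again assumed to be the sample standard deviation with an n-1 denominator.

```python
import statistics


def grubbs_min_statistic(y):
    """One-sided statistic testing whether the minimum value is an outlier."""
    mean = statistics.mean(y)
    s = statistics.stdev(y)  # assumption: n - 1 denominator
    return (mean - min(y)) / s


def grubbs_max_statistic(y):
    """One-sided statistic testing whether the maximum value is an outlier."""
    mean = statistics.mean(y)
    s = statistics.stdev(y)  # assumption: n - 1 denominator
    return (max(y) - mean) / s


def t_tail_probability(alpha, n, two_sided=True):
    """Level at which the t quantile is taken: alpha/(2n) two-sided, alpha/n one-sided."""
    return alpha / (2 * n) if two_sided else alpha / n
```

The critical value keeps the same form as in the two-sided case; only the level passed to the t-distribution changes.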

Suppose $Y_{\max}$ is identified as an outlier by Grubbs’ test. You can then calculate the statistic value U as follows:

1. Remove $Y_{\max}$ from the original data to get $Z = \{Z_1, Z_2, \ldots, Z_{n-1}\}$.
