25.10.2016 Views

SAP HANA Predictive Analysis Library (PAL)

sap_hana_predictive_analysis_library_pal_en


4 End-to-End Scenarios

This section provides end-to-end scenarios of predictive analysis with PAL algorithms.

4.1 Scenario: Predict Segmentation of New Customers for a Supermarket

The goal is to predict the segment (cluster) that each new supermarket customer belongs to. First, use the K-means function in PAL to segment the existing customers. The resulting cluster assignments can then be used as training data for the C4.5 Decision Tree function, which predicts the segment of new customers.
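The second stage treats each cluster ID as a class label. As a rough illustration of how a C4.5-style tree chooses an attribute to split on, the sketch below computes the gain ratio (information gain divided by split information) for categorical attributes. It is plain Python, not PAL code; the function names, the attribute values, and the cluster labels are all hypothetical, and it does not cover the continuous attributes or pruning that a full C4.5 implementation handles.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio obtained by splitting `labels` on the categorical `values`."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical cluster IDs (as a clustering step might produce) and two
# made-up customer attributes.
clusters = [0, 0, 0, 1, 1, 1]
loyalty_card = ["yes", "yes", "yes", "no", "no", "no"]
store = ["a", "b", "a", "b", "a", "b"]

ratio_loyalty = gain_ratio(loyalty_card, clusters)
ratio_store = gain_ratio(store, clusters)
```

In this made-up data the loyalty-card attribute separates the two clusters perfectly, so it scores higher than the store attribute and would be chosen for the first split.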

Technology Background

● K-means clustering is a method of cluster analysis in which the algorithm partitions N observations or records into K clusters, with each observation belonging to the cluster with the nearest center. It is one of the most commonly used clustering algorithms.

● Decision trees are powerful and popular tools for classification and prediction. Decision tree learning, used in statistics, data mining, and machine learning, uses a decision tree as a predictive model that maps observations about an item to conclusions about the item's target value.
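The K-means description above can be sketched in a few lines of plain Python. This is a minimal illustration of the standard assign-then-update iteration (Lloyd's algorithm), not the PAL implementation; the function name, the sample points, and the choice to seed the centers with the first K points are assumptions made for the example.

```python
import math


def kmeans(points, k, max_iter=100):
    """Partition `points` into k clusters; returns (centers, assignments)."""
    # Assumption for this sketch: seed the centers with the first k points.
    centers = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the result is stable.
        assignments = new_assignments
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, assignments


# Hypothetical two-dimensional customer features, for illustration only.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0),
          (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centers, assignments = kmeans(points, k=2)
```

Each entry of `assignments` is the index of the cluster that the corresponding point ends up in, which plays the same role as a cluster assignment column in the scenario.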

Implementation Steps

Assume that:

● DM_PAL is a schema belonging to USER1; and

● USER1 has been assigned the AFLPM_CREATOR_ERASER_EXECUTE role; and

● USER1 has been assigned the AFL__SYS_AFL_AFLPAL_EXECUTE or AFL__SYS_AFL_AFLPAL_EXECUTE_WITH_GRANT_OPTION role.

Step 1

Input the customer data and use the K-means function to partition the data set into K clusters. In this example, nine rows of data are input and K equals 3, which means the customers are partitioned into three levels.

SET SCHEMA DM_PAL;
DROP TYPE PAL_KMEANS_RESASSIGN_T;
CREATE TYPE PAL_KMEANS_RESASSIGN_T AS TABLE(
    "ID" INT,
    "CENTER_ASSIGN" INT,
