SPSS® 12.0 Command Syntax Reference

CLUSTER

Operations

• More than one clustering method can be specified on the METHOD subcommand.

The CLUSTER procedure involves four steps:

• First, CLUSTER obtains measures of similarity between, or distance separating, the initial clusters (individual cases, or individual variables if the input is a matrix of distances between variables).

• Second, it combines the two nearest clusters to form a new cluster.

• Third, it recomputes the similarities or distances between the existing clusters and the new cluster.

• Fourth, it returns to the second step and repeats until all items are combined in one cluster.
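The four steps above can be sketched in Python. This is a minimal illustration, not the SPSS implementation: it assumes squared Euclidean distance and average linkage between groups (the defaults named in the example later in this section), and it recomputes linkage distances from the cluster members on every pass instead of updating a stored proximity matrix. The function names and the sample data are hypothetical.

```python
from itertools import combinations


def squared_euclidean(a, b):
    """Squared Euclidean distance between two cases (tuples of values)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(c1, c2, cases):
    """Mean distance over all pairs with one member in each cluster."""
    total = sum(squared_euclidean(cases[i], cases[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(cases):
    """Return the merge schedule as (cluster_a, cluster_b, distance) tuples."""
    # Step 1: each case starts as its own cluster (a tuple of case indices).
    clusters = [(i,) for i in range(len(cases))]
    schedule = []
    while len(clusters) > 1:  # Step 4: repeat until one cluster remains.
        # Step 2: find the two nearest clusters and combine them.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], cases),
        )
        schedule.append((a, b, average_linkage(a, b, cases)))
        # Step 3: the new cluster replaces its two parts; its distances to the
        # remaining clusters are recomputed on the next pass of the loop.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return schedule


# Hypothetical data: five cases measured on two variables.
cases = [(0, 0), (0, 1), (5, 5), (5, 7), (20, 20)]
schedule = agglomerate(cases)
for a, b, distance in schedule:
    print(a, "+", b, "at distance", distance)
```

With n cases the loop always records n - 1 merges, which is why the procedure ends with a single cluster containing every item.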

This process yields a hierarchy of cluster solutions, ranging from one overall cluster to as many clusters as there are items being clustered. Clusters at a higher level can contain several lower-level clusters. Within each level, the clusters are disjoint (each item belongs to only one cluster).

• CLUSTER identifies clusters in solutions by sequential integers (1, 2, 3, and so on).
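To make the hierarchy of solutions concrete, the following Python sketch replays a merge schedule and records the membership of every item at each level. It is an illustration under stated assumptions: the schedule is hand-written rather than computed, the function name `solutions` is hypothetical, and numbering clusters in order of first appearance is a choice made for this sketch, not documented CLUSTER behavior.

```python
def solutions(n_items, merges):
    """Replay merges and return {number_of_clusters: labels per item}.

    Each merge is a pair of item indices, one taken from each of the two
    clusters being combined.
    """
    parent = list(range(n_items))  # every item begins as its own cluster

    def find(i):
        # Follow parent links up to the representative of i's cluster.
        while parent[i] != i:
            i = parent[i]
        return i

    def labels():
        # Assign sequential integers 1, 2, 3, ... in order of first appearance.
        seen = {}
        return [seen.setdefault(find(i), len(seen) + 1) for i in range(n_items)]

    result = {n_items: labels()}
    for a, b in merges:
        parent[find(b)] = find(a)  # combine the two clusters
        k = len({find(i) for i in range(n_items)})
        result[k] = labels()
    return result


# Hypothetical schedule for five items: four merges give solutions 5 down to 1.
levels = solutions(5, [(0, 1), (2, 3), (0, 2), (0, 4)])
for k in sorted(levels, reverse=True):
    print(k, "clusters:", levels[k])
```

Each item receives exactly one label per level, so the clusters within a level are disjoint, and every cluster at one level is contained in a cluster at the level above it.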

Limitations

• CLUSTER stores cases and a lower-triangular matrix of proximities in memory. For n cases the matrix holds n(n − 1)/2 proximities, so storage requirements increase rapidly with the number of cases. You should be able to cluster 100 cases using a small number of variables in an 80K workspace.

• CLUSTER does not honor weights.

Example

CLUSTER V1 TO V4
  /PLOT=DENDROGRAM
  /PRINT=CLUSTER (2 4).

• This example clusters cases based on their values for all variables between and including V1 and V4 in the working data file.

• The analysis uses the default measure of distance (squared Euclidean) and the default clustering method (average linkage between groups).

• PLOT requests a dendrogram.

• PRINT displays a table of the cluster membership of each case for the two-, three-, and four-cluster solutions.
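The first bullet depends on how TO selects variables: by position in the working data file, not by the numbers in the names. The Python sketch below illustrates that reading. The helper `expand_to` and the variable names in `working_file` are hypothetical and exist only for this example.

```python
def expand_to(file_order, first, last):
    """Return every variable from first through last, in file order."""
    start = file_order.index(first)
    end = file_order.index(last)
    if start > end:
        raise ValueError(f"{first} must precede {last} in the file")
    return file_order[start:end + 1]


# Hypothetical working data file in which AGE sits between V1 and V2.
working_file = ["ID", "V1", "AGE", "V2", "V3", "V4", "REGION"]
selected = expand_to(working_file, "V1", "V4")
print(selected)
```

In this hypothetical file, AGE falls inside the range and would be part of the analysis, so it is worth checking the file order before relying on TO.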

Variable List

The variable list identifies the variables used to compute similarities or distances between cases.
