K-means clustering

Database segmentation

• The words "segmentation" and "clustering" are used interchangeably.
• Clustering is unsupervised classification: no predefined classes and no labeled examples.


Database segmentation

• Cluster analysis, or database segmentation: grouping a set of data objects into clusters.
• "The goal of database segmentation is to partition a database into segments of similar records, that is, records that share a number of properties."


Database segmentation

[Figure: scatter plot of income (thousands) versus age, showing two clusters: young, well-educated professionals around age 20 and older, highly paid managers around age 40.]


Example of data mining using clustering techniques

• Fitting the troops: the US Army commissioned a study on how to redesign the uniforms of female soldiers.


Example of data mining using clustering techniques

• The Army's goal is to reduce the number of different sizes that have to be kept in inventory while still providing each soldier with a well-fitting uniform.


Example of data mining using clustering techniques

• The k-means algorithm, a clustering technique, is employed in this case. The database they mined contained more than 100 measurements for each of nearly 3000 women.


Major Clustering Approaches

• Partitioning approach: construct various partitions and then evaluate them by some criterion.


Partitioning approach

[Figures (1)-(3): three slides of illustrations of the partitioning approach.]


Major Clustering Approaches

• Neural network approach

[Figure: a small network with inputs x1, x2 and output nodes y1, y2, y3.]


Major Clustering Approaches

• Hierarchy approach: create a hierarchical decomposition of the set of data (or objects) using some criterion.


Hierarchy approach

[Figure: dendrogram over objects a, b, c, d, e. Reading left to right (Steps 0-4) gives agglomerative clustering (AGNES); reading right to left (Steps 4-0) gives divisive clustering (DIANA).]
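As an illustration only (not part of the original slides), a minimal agglomerative clustering sketch in Python using SciPy; the five 2-D points standing in for objects a-e are made up:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Made-up 2-D points standing in for objects a, b, c, d, e.
X = np.array([[1.0, 1.0], [1.2, 1.1], [5.0, 5.0], [5.1, 4.9], [9.0, 1.0]])

# Agglomerative (AGNES-style): repeatedly merge the closest groups.
Z = linkage(X, method="average")                  # linkage matrix encodes the dendrogram
labels = fcluster(Z, t=3, criterion="maxclust")   # cut the tree into 3 clusters
print(labels)
```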


K-means clustering

• K-means is the clustering technique most commonly used in practice.


K-means clustering

• Concepts of the k-means algorithm
• Step 1: Select an initial partition with k clusters.


K-means clustering

• Step 2: Compute seed points as the centroids of the clusters of the current partition. The centroid is the center (mean point) of the cluster.


Step 2

[Figure: the centroid of each cluster marked with "+" among the data points.]


K-means clustering

• Step 3: Generate a new partition by assigning each pattern to its closest cluster center.


Step 3

[Figure: each point assigned to its closest cluster center ("+").]


K-means clustering

• Step 4: Compute new cluster centers as the centroids of the clusters.
• Step 5: Repeat Steps 3 and 4 until an optimum value of the criterion function is found.


Step 4

[Figure: cluster centers ("+") recomputed as the centroids of the new clusters.]


K-means clustering

• The k-means algorithm
• Problem statement: suppose that the given set of n patterns in d dimensions has somehow been partitioned into (continued on the next slide)


K-means clustering

K clusters $\{C_1, C_2, \ldots, C_K\}$ such that cluster $C_k$ has $n_k$ patterns and each pattern is in exactly one cluster, so that

$$\sum_{k=1}^{K} n_k = n$$


K-means clustering

The mean vector, or center, of cluster $C_k$ is defined as the centroid of the cluster:

$$m^{(k)} = \frac{1}{n_k} \sum_{i=1}^{n_k} x_i^{(k)}$$

where $x_i^{(k)}$ is the $i$-th pattern belonging to cluster $C_k$.


K-means clustering

The distance between a pattern $x_i$ and the cluster center $m^{(k)}$ of cluster $C_k$ is the squared Euclidean distance between them:

$$d(x_i, m^{(k)}) = (x_i - m^{(k)})^T (x_i - m^{(k)})$$


K-means clustering

The square error for cluster $C_k$ is the sum of the squared Euclidean distances between each pattern in $C_k$ and its cluster center $m^{(k)}$:

$$e_k^2 = \sum_{i=1}^{n_k} \left(x_i^{(k)} - m^{(k)}\right)^T \left(x_i^{(k)} - m^{(k)}\right)$$


K-means clustering

The square error for the entire clustering containing K clusters is the sum of the within-cluster variations:

$$E_K^2 = \sum_{k=1}^{K} e_k^2$$
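A minimal NumPy sketch (an illustration, not from the slides) of these quantities; the patterns and the cluster assignment are made up to match the worked example further below:

```python
import numpy as np

X = np.array([[1.0, 1.0], [2.0, 2.0], [4.0, 4.0]])  # patterns x_i
labels = np.array([0, 0, 1])                         # assignment to K = 2 clusters

E2 = 0.0
for k in range(2):
    Ck = X[labels == k]
    m_k = Ck.mean(axis=0)            # centroid m^(k)
    e2_k = ((Ck - m_k) ** 2).sum()   # within-cluster squared error e_k^2
    E2 += e2_k
print(E2)                            # total squared error E_K^2 -> 1.0
```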


K-means clustering

• K-means algorithm
  – Step 1: Choose K initial cluster centers $m_1(1), m_2(1), \ldots, m_K(1)$.


K-means clustering

  – Step 2: At the a-th iterative step, distribute the samples {x} among the K clusters using the relation:

$$x \in C_B(a) \quad \text{if} \quad d(x, m_B(a)) \le d(x, m_J(a)) \ \text{ for all } J = 1, 2, \ldots, K, \ J \neq B$$


K-means clustering

  – Step 3: Compute the new cluster centers and the new square error for the entire clustering:

$$m_J(a+1) = \frac{1}{n_J} \sum_{x \in C_J(a)} x, \quad J = 1, 2, \ldots, K$$


K-means clustering

  – Step 4: If $m_J(a+1) = m_J(a)$ for $J = 1, 2, \ldots, K$, then stop; otherwise return to Step 2.
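Putting Steps 1-4 together, a minimal self-contained sketch in Python (an illustration, not the slides' own code; the function name `kmeans` is mine):

```python
import numpy as np

def kmeans(X, init_centers, max_iter=100):
    """Plain k-means: assign each pattern to its nearest center,
    then recompute the centers, until the centers stop changing."""
    m = np.asarray(init_centers, dtype=float)   # Step 1: initial centers m_1(1)..m_K(1)
    for _ in range(max_iter):
        # Step 2: assign each x to the cluster with the nearest center
        d2 = ((X[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)  # squared distances
        labels = d2.argmin(axis=1)
        # Step 3: recompute each center as the centroid of its cluster
        new_m = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else m[j]
                          for j in range(len(m))])
        # Step 4: stop when m_J(a+1) == m_J(a) for all J
        if np.allclose(new_m, m):
            break
        m = new_m
    return labels, m
```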


Example: k-means clustering (1)

K = 2, $M_1 = (1, 1)$, $M_2 = (2, 2)$

[Figure: three points (1,1), (2,2), (4,4) in the x-y plane.]


Example: k-means clustering (2)

For $x_1 = \begin{bmatrix}1\\1\end{bmatrix}$:

$$d(x_1, M_1) = \left(\begin{bmatrix}1\\1\end{bmatrix} - \begin{bmatrix}1\\1\end{bmatrix}\right)^{T}\left(\begin{bmatrix}1\\1\end{bmatrix} - \begin{bmatrix}1\\1\end{bmatrix}\right) = 0$$

$$d(x_1, M_2) = \left(\begin{bmatrix}1\\1\end{bmatrix} - \begin{bmatrix}2\\2\end{bmatrix}\right)^{T}\left(\begin{bmatrix}1\\1\end{bmatrix} - \begin{bmatrix}2\\2\end{bmatrix}\right) = 2$$

$$\therefore x_1 \in C_1$$


Example: k-means clustering (3)

For $x_2 = \begin{bmatrix}2\\2\end{bmatrix}$:

$$d(x_2, M_1) = \left(\begin{bmatrix}2\\2\end{bmatrix} - \begin{bmatrix}1\\1\end{bmatrix}\right)^{T}\left(\begin{bmatrix}2\\2\end{bmatrix} - \begin{bmatrix}1\\1\end{bmatrix}\right) = 2$$

$$d(x_2, M_2) = \left(\begin{bmatrix}2\\2\end{bmatrix} - \begin{bmatrix}2\\2\end{bmatrix}\right)^{T}\left(\begin{bmatrix}2\\2\end{bmatrix} - \begin{bmatrix}2\\2\end{bmatrix}\right) = 0$$

$$\therefore x_2 \in C_2$$


Example: k-means clustering (4)

For $x_3 = \begin{bmatrix}4\\4\end{bmatrix}$:

$$d(x_3, M_1) = \left(\begin{bmatrix}4\\4\end{bmatrix} - \begin{bmatrix}1\\1\end{bmatrix}\right)^{T}\left(\begin{bmatrix}4\\4\end{bmatrix} - \begin{bmatrix}1\\1\end{bmatrix}\right) = 18$$

$$d(x_3, M_2) = \left(\begin{bmatrix}4\\4\end{bmatrix} - \begin{bmatrix}2\\2\end{bmatrix}\right)^{T}\left(\begin{bmatrix}4\\4\end{bmatrix} - \begin{bmatrix}2\\2\end{bmatrix}\right) = 8$$

$$\therefore x_3 \in C_2$$


Example: k-means clustering (5)

$$C_1 = \left\{\begin{bmatrix}1\\1\end{bmatrix}\right\}, \quad C_2 = \left\{\begin{bmatrix}2\\2\end{bmatrix}, \begin{bmatrix}4\\4\end{bmatrix}\right\}$$


Example: k-means clustering (6)

• Find the new means:

$$M_1 = \frac{1}{1}\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}1\\1\end{bmatrix}, \quad M_2 = \frac{1}{2}\left(\begin{bmatrix}2\\2\end{bmatrix} + \begin{bmatrix}4\\4\end{bmatrix}\right) = \begin{bmatrix}3\\3\end{bmatrix}$$

• Repeat until the cluster memberships stop changing, then compute $E_K^2$:

$$C_1 = \left\{\begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}2\\2\end{bmatrix}\right\}, \quad C_2 = \left\{\begin{bmatrix}4\\4\end{bmatrix}\right\}, \quad K = 2, \quad E_2^2 = 1$$
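Running the `kmeans` sketch from above on this data reproduces the slide's result (a usage illustration, assuming the function and `numpy` import defined earlier):

```python
X = np.array([[1.0, 1.0], [2.0, 2.0], [4.0, 4.0]])
labels, centers = kmeans(X, init_centers=[[1.0, 1.0], [2.0, 2.0]])
print(labels)   # [0 0 1]  ->  C1 = {(1,1), (2,2)}, C2 = {(4,4)}
print(centers)  # [[1.5 1.5] [4.  4. ]]
```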


Determining the number of clusters

• Plot the error for 2, 3, ... clusters and find the knee in the curve (a sketch follows this list).
• Use domain-specific knowledge and inspect the clusters for the desired characteristics.
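A sketch of the first approach, the knee (elbow) heuristic, assuming the `kmeans` function above; the data points are made up, and matplotlib is assumed available for the plot:

```python
import matplotlib.pyplot as plt
import numpy as np

X = np.array([[1.0, 1.0], [2.0, 2.0], [4.0, 4.0], [4.2, 3.9], [0.9, 1.2]])

errors = []
for K in range(1, 5):
    labels, centers = kmeans(X, init_centers=X[:K])            # first K patterns as seeds
    errors.append(float(((X - centers[labels]) ** 2).sum()))   # E_K^2

plt.plot(range(1, 5), errors, marker="o")   # look for the knee in this curve
plt.xlabel("K")
plt.ylabel("$E_K^2$")
plt.show()
```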


Example: k-means clustering

[Figure: plot of the total square error $E_K^2$ against the number of clusters K, showing the knee in the curve.]


Variations of the K-Means Method

• A few variants of k-means differ in:
  – Selection of the initial k means
  – Dissimilarity calculations
  – Strategies to calculate cluster means


Variations of the K-Means Method

• Handling categorical data: k-modes (see the sketch after this list)
  – Replacing means of clusters with modes
  – Using new dissimilarity measures to deal with categorical objects
  – Using a frequency-based method to update modes of clusters
• A mixture of categorical and numerical data: the k-prototypes method
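As an illustration of the k-modes ideas (my own sketch, not Huang's reference implementation), simple-matching dissimilarity and a frequency-based mode update might look like:

```python
from collections import Counter

def matching_dissimilarity(a, b):
    # Number of attributes on which two categorical records disagree
    return sum(ai != bi for ai, bi in zip(a, b))

def cluster_mode(records):
    # Frequency-based update: per attribute, take the most frequent category
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))

records = [("red", "small"), ("red", "large"), ("blue", "small")]
print(matching_dissimilarity(records[0], records[2]))  # 1
print(cluster_mode(records))                           # ('red', 'small')
```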


Neural Networks

• A neural network is based on concepts of how the human brain is organized and how it learns.


Neural Networks

[Figure: a unit of an artificial neural network with inputs a1, a2, a3, weights w1, w2, w3, a summation node, a transfer function F(.), and output O.]

$$\text{net} = \sum_{i=1}^{3} w_i a_i$$


Neural Networks

Typical transfer functions are:

$$F(\text{net}) = \frac{2}{1 + \exp(-\lambda \cdot \text{net})} - 1, \quad \lambda > 0 \quad \text{(bipolar sigmoid)}$$

$$F(\text{net}) = \frac{1}{1 + \exp(-\lambda \cdot \text{net})}, \quad \lambda > 0 \quad \text{(unipolar sigmoid)}$$

$$F(\text{net}) = \mathrm{sgn}(\text{net}) = \begin{cases} +1, & \text{net} > 0 \\ -1, & \text{net} < 0 \end{cases}$$

$$F(\text{net}) = \begin{cases} 1, & \text{net} > 0 \\ 0, & \text{net} < 0 \end{cases}$$
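A small sketch (an illustration, not from the slides) computing a unit's output with these transfer functions; the inputs and weights are made up:

```python
import numpy as np

def unit_output(a, w, lam=1.0, kind="sigmoid"):
    net = float(np.dot(w, a))              # net = sum_i w_i * a_i
    if kind == "sigmoid":                  # unipolar sigmoid
        return 1.0 / (1.0 + np.exp(-lam * net))
    if kind == "bipolar":                  # bipolar sigmoid
        return 2.0 / (1.0 + np.exp(-lam * net)) - 1.0
    if kind == "sign":                     # hard limiter, sgn(net)
        return 1.0 if net > 0 else -1.0
    return 1.0 if net > 0 else 0.0         # step function

print(unit_output([1.0, 0.5, -0.2], [0.3, 0.8, 0.1]))  # made-up inputs and weights
```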


Sigmoid function

[Figure: plot of the sigmoid transfer function F(net) against net, for net from -10 to 10.]


Neural Networks

• Examples of neural networks

[Figure: a multi-layer feedforward neural network.]


Neural Networks

[Figure: a recurrent neural network with delay elements (Δ).]


Neural Networks

• Modes of operation of neural networks
  – Training (learning) mode
    · Supervised learning
    · Unsupervised learning
  – Deploying mode


Neural Networks for clustering

• Kohonen neural networks, or Kohonen self-organizing maps
• Adaptive Resonance Theory 1 (ART1)
• Adaptive Resonance Theory 2 (ART2)


Kohonen neural networks

[Figure: map of the human cortex, highlighting the visual cortex (areas 17, 18, 19) and the auditory cortex (areas 41 and 42).]


Kohonen neural network architecture

[Figure: a Kohonen network with inputs x1, ..., xn fully connected through weights w_mi to output nodes y1, ..., yp.]


Learning Mode

• Competitive process
• $y_m$ is the winning node if

$$y_m = (w_{m1} - x_1)^2 + (w_{m2} - x_2)^2 + \cdots + (w_{mn} - x_n)^2 = \min_{i = 1, 2, \ldots, p} \left[ (w_{i1} - x_1)^2 + (w_{i2} - x_2)^2 + \cdots + (w_{in} - x_n)^2 \right]$$


Learning Mode

• Adaptive process

$$w_{mi}^{new} = w_{mi}^{old} + \alpha_k \left(x_i - w_{mi}^{old}\right) \quad \text{for } i = 1, 2, \ldots, n \text{ and } m \in h_k$$

$$w_{ji}^{new} = w_{ji}^{old} \quad \text{for } j \notin h_k$$

where $h_k$ is the neighborhood function centered around the winning node.


Neighborhood function

[Figures: two slides illustrating the neighborhood function around the winning node.]


Kohonen’s Learning Algorithm

• Step 0: Choose random numbers for the initial weights.
• Step 1: While the stopping condition is false, do Steps 2-8.
• Step 2: For each input vector, do Steps 3-5.


Kohonen’s Learning Algorithm

• Step 3: For each output node m, compute

$$y_m = (w_{m1} - x_1)^2 + (w_{m2} - x_2)^2 + \cdots + (w_{mn} - x_n)^2$$

• Step 4: Find the index J such that $y_J$ is a minimum.


Kohonen’s Learning Algorithm

• Step 5: For all output nodes within a specified neighborhood of node J, update

$$w_{mi}^{new} = w_{mi}^{old} + \alpha_k \left(x_i - w_{mi}^{old}\right)$$


Kohonen’s Learning Algorithm

• Step 6: Update the learning rate.
• Step 7: Reduce the radius of the topological neighborhood at specified times.
• Step 8: Test the stopping condition.
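A compact sketch of Steps 0-8 in Python (my own illustration; the 1-D neighborhood and the decay schedules for the learning rate and radius are assumptions, since the slides leave them unspecified):

```python
import numpy as np

def kohonen_train(X, n_out, epochs=50, alpha=0.5, radius=1, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.random((n_out, X.shape[1]))        # Step 0: random initial weights
    for epoch in range(epochs):                # Step 1: loop until done
        for x in X:                            # Step 2: for each input vector
            y = ((W - x) ** 2).sum(axis=1)     # Step 3: squared distance per output node
            J = int(y.argmin())                # Step 4: winning node index
            for m in range(n_out):             # Step 5: update the neighborhood of J
                if abs(m - J) <= radius:       # simple 1-D neighborhood (assumption)
                    W[m] += alpha * (x - W[m])
        alpha *= 0.95                          # Step 6: decay learning rate (assumption)
        if epoch % 10 == 9 and radius > 0:     # Step 7: shrink the radius at set times
            radius -= 1
    return W                                   # Step 8: stop after `epochs` (simplified)
```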


Example: Kohonen neural network

[Figure: a Kohonen network with inputs X1, X2 and output nodes Y1, Y2, Y3, connected by weights w11, ..., w32.]

$$X = \begin{bmatrix}0.1\\0.2\end{bmatrix}, \quad W_1 = \begin{bmatrix}0.5\\0.1\end{bmatrix}, \quad W_2 = \begin{bmatrix}0.1\\0.2\end{bmatrix}, \quad W_3 = \begin{bmatrix}0.1\\0.1\end{bmatrix}$$


Example: Kohonen neural network

• Competitive process:

$$y_1 = (0.5 - 0.1)^2 + (0.1 - 0.2)^2 = 0.17$$

$$y_2 = (0.1 - 0.1)^2 + (0.2 - 0.2)^2 = 0$$

$$y_3 = (0.1 - 0.1)^2 + (0.1 - 0.2)^2 = 0.01$$

$$\therefore Y_2 \text{ is the winner node}$$


Example: Kohonen neural network

• Adaptive process (with α = 0.5):

$$w_{21} = 0.1 + 0.5\,(0.1 - 0.1) = 0.1$$

$$w_{22} = 0.2 + 0.5\,(0.2 - 0.2) = 0.2$$

$$\therefore W_1 = \begin{bmatrix}0.5\\0.1\end{bmatrix}, \quad W_2 = \begin{bmatrix}0.1\\0.2\end{bmatrix}$$

The winning node's weights are unchanged here because they already match the input exactly.
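These two steps can be checked with a few lines of Python (an illustration; `W` stacks the three weight vectors above):

```python
import numpy as np

X = np.array([0.1, 0.2])
W = np.array([[0.5, 0.1], [0.1, 0.2], [0.1, 0.1]])  # W1, W2, W3

y = ((W - X) ** 2).sum(axis=1)   # competitive process
J = int(y.argmin())              # winner: index 1 (node Y2)
W[J] += 0.5 * (X - W[J])         # adaptive process: no change, since W2 == X
print(y, J, W[J])                # [0.17 0.   0.01] 1 [0.1 0.2]
```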


Learning Mode

[Figure: a two-input, two-output network with inputs x1, x2, outputs y1, y2, and weights w11, w12, w21, w22.]

$$\vec{X} = \begin{bmatrix}1\\1\end{bmatrix}, \quad \vec{W}_1 = \begin{bmatrix}w_{11}\\w_{12}\end{bmatrix} = \begin{bmatrix}0.5\\1\end{bmatrix}$$

$$y_1 = (0.5 - 1)^2 + (1 - 1)^2 = 0.25$$


Learning Mode

$$\vec{W}_1^{new} = \vec{W}_1^{old} + 0.5\,\left(\vec{x} - \vec{W}_1^{old}\right) = \begin{bmatrix}0.5\\1\end{bmatrix} + 0.5\left(\begin{bmatrix}1\\1\end{bmatrix} - \begin{bmatrix}0.5\\1\end{bmatrix}\right) = \begin{bmatrix}0.75\\1\end{bmatrix}$$


Deploying a Kohonen neural network

[Figure: the trained Kohonen network with inputs x1, ..., xn, weights w_m1, ..., w_mn, and output nodes y1, ..., yp.]


Strengths of cluster detection

• Cluster detection is an undirected knowledge discovery technique.
• Cluster detection is easy to apply.


Weaknesses of cluster detection

• Sensitivity to initial parameters
• It can be hard to interpret the resulting clusters


References

• Anil K. Jain and Richard C. Dubes, Algorithms for Clustering Data, Prentice Hall, 1988.
• Eric Backer, Computer-Assisted Reasoning in Cluster Analysis, Prentice Hall, 1995.
• Maria Halkidi et al., "On Clustering Validation Techniques," Journal of Intelligent Information Systems, pp. 107-145, 2001.
• A. K. Jain, M. N. Murty, and P. J. Flynn, "Data Clustering: A Review," ACM Computing Surveys, Vol. 31, No. 3, 1999.
• Zhexue Huang, "Extensions to the k-means Algorithm for Clustering Large Data Sets with Categorical Values," Data Mining and Knowledge Discovery 2, pp. 283-304, 1998.
• Jacek M. Zurada, Introduction to Artificial Neural Systems, West Publishing Company, 1992.
• http://www.ncs.co.uk/nn_intro.htm
• IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, 1996.
• Joseph S. Zirilli, Financial Prediction Using Neural Networks, International Thomson Computer Press, 1997.
• Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall, 1994.
