
The Pennsylvania State University
The Graduate School
Capital College

A Queen Lead Ant-Based Algorithm for Database Clustering

A Master's Paper in Computer Science

by Jayanthi Pathapati

© 2004 Jayanthi Pathapati

Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science

December 2004


Abstract

In this paper we introduce a novel ant-based clustering algorithm to solve the unsupervised data clustering problem. The main inspiration behind ant-based algorithms is the chemical recognition system of ants. In our algorithm, we introduce a new method of ant clustering in which the queen of the ant colony controls the colony formation. We compared our results to the k-means method and to the AntClust algorithm on artificial and real data sets.


Table of Contents

Abstract
Acknowledgements
List of Figures
List of Tables
1 Introduction
2 Database Clustering
3 Algorithm
    3.1 Initialization
    3.2 Training
    3.3 Clustering
    3.4 Final Clustering
4 Performance Analysis
5 Conclusion
References


Acknowledgements

I am very grateful to Dr. T. Bui for his constant support and guidance throughout the period of this project. It has been a great pleasure and inspiration working with him. I would also like to thank the committee members: Dr. S. Chang, Dr. Q. Ding, Dr. P. Naumov, and Dr. L. Null for reviewing this paper. Finally, I would like to thank my husband Dr. S.K. Daya for his support throughout this course.


List of Figures

1 QLA algorithm for database clustering
2 Initialization Algorithm
3 Algorithm for Training Stage
4 Cluster Formation Algorithm
5 Rules followed when the ants meet
6 Local Optimization Algorithm
7 Algorithm for Assigning Free Ants


List of Tables

1 QLA Results
2 Results of all the algorithms


1 Introduction

Database clustering is one of the fundamental operations of data mining. The main idea behind it is to identify homogeneous groups of data based on their attributes. Obtaining a partition as close as possible to the natural partition is a complex task because there is no a priori information regarding the structure and the original classification of the data.

Artificial ants have been applied to problems of database clustering. The initial pioneering work in this field was performed by Deneubourg [1] and by Lumer and Faieta [8]. Since this early work, many authors have used artificial ants and proposed several data clustering techniques [9, 10, 11]. Ant-based clustering algorithms seek to model the collective behavior of real ants by utilizing the colonial odor to form clusters.

This paper proposes a new ant-based clustering algorithm that incorporates several ant clustering features as well as local optimization techniques. Unlike other ant clustering algorithms, where the ants form nests on their own, we introduce a queen for each colony to take care of the colony creation process. The algorithm does not require the number of expected clusters as part of the input to find the final partition.

The rest of the paper is organized as follows. Section 2 outlines the database clustering problem and its applications. Section 3 describes the algorithm in detail. Section 4 contains the performance analysis and the experimental results comparing our algorithm against other known algorithms. The conclusion is given in Section 5.

2 Database Clustering

The aim of database clustering is to partition a heterogeneous data set into groups with more homogeneous characteristics. The formation of clusters is based on the principle of maximizing the similarity between patterns in the same cluster while simultaneously minimizing the similarity between patterns belonging to distinct clusters. The database clustering problem can be


formally defined as follows: given a database of n objects, find the best partition of the data objects into groups such that each group contains objects that are similar to each other.

Several algorithms have been proposed for the clustering problem in the published literature, among them BIRCH [14], ISODATA [2, 7], CLARANS [12], P-CLUSTER [6], and DBSCAN [3].

Most clustering algorithms belong to two main categories: hierarchical and partitional. Hierarchical clustering algorithms fall into two groups: agglomerative and divisive. In the agglomerative approach, each object is first assigned to an individual cluster, and similar clusters are then merged to form larger clusters; repeating this process leads to the desired partition. Divisive algorithms, on the other hand, begin with all the data objects placed in one cluster and then split them into smaller clusters until the final partition is obtained.

In partitional clustering algorithms, given a set of data objects and a partition criterion, the objects are partitioned into clusters based on their similarities. Clustering is performed so as to meet the specified criterion, such as minimizing the sum of squared distances from the means within each cluster.

Fuzzy clustering and k-means clustering are two well-known clustering techniques. In fuzzy clustering, an object can belong to many clusters; each object is associated with a cluster membership that represents the likelihood of the object being in that cluster.

The k-means algorithm starts by choosing k cluster centers at random. It then assigns each object to the closest cluster and recomputes each cluster's center. This step is repeated until the cluster centers no longer change, thus obtaining the desired partition.

Associations between data attributes in a given database can be found by utilizing data clustering techniques, which play a major role in data analysis procedures. Database clustering has been applied in various fields including biology, psychology, commerce, and information technology. The main areas where it has been prominently used are image segmentation [13], statistical data analysis [4], and data compression [14], among others.
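Since k-means is the main baseline used later in the paper, a minimal sketch may help fix the idea. This is our own illustration, not code from the paper; the toy data and the squared Euclidean distance are assumptions.

```python
import random

def kmeans(points, k, max_iters=100):
    """Minimal k-means: random initial centers, then an assign/recompute loop."""
    centers = random.sample(points, k)
    for _ in range(max_iters):
        # Assign each object to the closest cluster center (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        # Recompute each cluster's center as the mean of its members.
        new_centers = [tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centers[i]
                       for i, cl in enumerate(clusters)]
        if new_centers == centers:   # centers unchanged: the desired partition is reached
            return clusters, centers
        centers = new_centers
    return clusters, centers

# Example: two well-separated groups in the plane.
data = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
clusters, centers = kmeans(data, k=2)
```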


3 Algorithm

As mentioned earlier, the main aim of the Queen Lead Ant-based (QLA) algorithm is to partition the given data set into clusters of similar objects. The idea of the algorithm is to simulate the way real ants form colonies, which are queen dominant. The ants that possess a cuticle odor similar to that of the queen of a colony will join that colony. Each object of the database is treated as an ant, and each ant follows a common set of rules (Figure 5) that leads to the required partition. A local optimization technique is applied to assist the ants in achieving a good partition.

The input for the algorithm is just the database to be clustered. It does not need any additional parameters such as the number of desired clusters or any information about the structure of the database. Each object in the database has a certain number of attributes, and attributes can be of either numeric or symbolic type. The QLA algorithm, given in Figure 1, consists of four main stages. The first stage is the initialization stage, in which all the ants are initialized with the attributes of the objects they represent. The second stage is the training stage, in which the ants learn about their possible colony mates by meeting and learning about other ants.

The third stage consists of two phases: cluster formation and local optimization. In the cluster formation phase, we stimulate meetings between ants so that the ants can decide which colony they belong to, thus giving us the clusters of similar ants. In the second phase, the local optimization step is designed to eliminate any ants that have been prematurely added to a colony even though they should not belong to it. It also checks whether any other ant in the colony represents the colony better than the queen.

The last stage is the final clustering stage, in which the free ants, i.e., the ants that do not belong to any colony, are added to the closest colony possible. We describe all the stages of the algorithm in greater detail in the following sections.


QLA Algorithm
Input: Database to be clustered
Stage 1: Initialization
    Assign one ant for each object in the database
Stage 2: Training
    Train the ants so that they learn about other ants
Stage 3: Clustering Process
    Colony formation process and local optimization
Stage 4: Final Clustering
    Assign all ants which do not belong to any colony to the closest colony available
return the clusters formed

Figure 1: QLA algorithm for database clustering


Initialization
for each ant do
    genome = the index of the object it represents
    colony = -1
    Assign all of its attribute values with the values of the object it represents
end-for

Figure 2: Initialization Algorithm

3.1 Initialization

The algorithm starts by assigning one artificial ant to each object in the given input database, and then initializes each ant with the attribute values of the object it represents. Each ant contains the following parameters.

The genome parameter of the ant corresponds to an object of the data set. It is initialized to the index of the object that the ant represents. Each ant represents the same object during the whole process of clustering; thus the genome of an ant remains unchanged to the end of the algorithm.

The colony parameter indicates the group to which the ant belongs and is simply coded by a number. This attribute is assigned the value −1 because the ant initially does not belong to any colony, i.e., all the ants at the beginning of the algorithm are free ants.

The initialization stage of the algorithm is accomplished as shown in Figure 2.
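The paper does not prescribe an implementation, but the initialization stage maps naturally onto a small record type. A sketch in Python; the field names (including the running-statistics fields used by the later stages) are our own:

```python
from dataclasses import dataclass

@dataclass
class Ant:
    genome: int             # index of the object this ant represents; never changes
    attributes: list        # the attribute values of that object (numeric or symbolic)
    colony: int = -1        # -1 means the ant is free, i.e., belongs to no colony
    age: int = 0            # number of meetings this ant has had
    mean_sim: float = 0.0   # running mean similarity MeanSim, learned during training
    max_sim: float = 0.0    # running maximum similarity MaxSim
    n_met: int = 0          # |N_i|: how many ants this ant has met so far

def initialize(database):
    """Stage 1 (Figure 2): one ant per database object, all ants initially free."""
    return [Ant(genome=i, attributes=list(obj)) for i, obj in enumerate(database)]
```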


3.2 Training

In this stage, meetings between randomly selected ants are simulated, and each ant learns a threshold value that lies between 0 and 1. This threshold value is used to determine how well the ants accept each other. We have defined the threshold value of an ant to fall between the mean and the maximum similarity values of the ant. Every time an ant meets another ant in this period, we find the similarity between the two ants and update the respective mean and maximum similarities of both ants. So every time an ant has a meeting, its threshold value changes. The following equation shows how this threshold is learned:

$$T_{a_i} \leftarrow \frac{\mathrm{MeanSim}(a_i) + \mathrm{MaxSim}(a_i)}{2}$$

The MeanSim(a_i) in the above equation is the mean similarity of the ant a_i and is calculated as follows:

$$\mathrm{MeanSim}(a_i) = \frac{1}{|N_i|} \sum_{a_j \in N_i} \mathrm{Sim}(a_i, a_j)$$

where N_i is the set of ants that a_i has met so far. The MaxSim(a_i) is the maximum similarity of the ant a_i and is calculated as follows:

$$\mathrm{MaxSim}(a_i) = \max_{a_j \in N_i} \{\mathrm{Sim}(a_i, a_j)\}$$

where a_j is any ant with which ant a_i has had a meeting. The training stage is accomplished as shown in Figure 3.

Training Stage
for i = 0 to δ do
    Select 2 ants randomly
    Find the similarity between them and update their threshold and similarity values
    Increment their ages by 1
end-for

Figure 3: Algorithm for Training Stage
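The training updates translate directly into code. A minimal sketch using the Ant fields introduced above; sim(a, b) stands for the paper's similarity function on [0, 1], which we take as given here:

```python
import random

def update_similarity_stats(ant, s):
    """Fold a new similarity value s into the ant's running mean and maximum."""
    ant.n_met += 1
    ant.mean_sim += (s - ant.mean_sim) / ant.n_met   # incremental mean over N_i
    ant.max_sim = max(ant.max_sim, s)

def threshold(ant):
    """T_a = (MeanSim(a) + MaxSim(a)) / 2, relearned after every meeting."""
    return (ant.mean_sim + ant.max_sim) / 2.0

def training_stage(ants, sim, delta):
    """Figure 3: delta random meetings; each updates both ants' statistics and ages."""
    for _ in range(delta):
        a_i, a_j = random.sample(ants, 2)
        s = sim(a_i, a_j)
        update_similarity_stats(a_i, s)
        update_similarity_stats(a_j, s)
        a_i.age += 1
        a_j.age += 1
```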


3.3 Clustering

There are two phases in this stage. The first phase is the cluster formation phase, in which clusters are created. During this phase, meetings between random ants are stimulated, and the behavior of the ants is described in Figure 5. After every α (= 6 × n_ants) iterations in this step, we run a local optimization algorithm on the clusters found so far. This stage is performed until at least β of the ants join their colonies, or for γ iterations, where β is 50% and γ is 40 × n_ants. This is done as shown in Figure 4. Tests were performed with different values of α, β, and γ; for many data sets, the best partitions were achieved with the values given above.

When two ants meet in this stage, the threshold value is used to decide whether the two ants accept each other. The acceptance A(a_i, a_j) for ants a_i and a_j is defined as follows:

$$A(a_i, a_j) \equiv (\mathrm{Sim}(a_i, a_j) > 0.96 \cdot T_{a_i}) \wedge (\mathrm{Sim}(a_i, a_j) > 0.96 \cdot T_{a_j})$$

The acceptance A(a_i, queen_j) for ant a_i and queen queen_j is defined as follows:

$$A(a_i, \mathit{queen}_j) \equiv (\mathrm{Sim}(a_i, \mathit{queen}_j) > 0.98 \cdot T_{a_i}) \wedge (\mathrm{Sim}(a_i, \mathit{queen}_j) > 0.98 \cdot T_{\mathit{queen}_j})$$

The acceptance A(queen_i, queen_j) for queens queen_i and queen_j is defined as follows:

$$A(\mathit{queen}_i, \mathit{queen}_j) \equiv (\mathrm{Sim}(\mathit{queen}_i, \mathit{queen}_j) > 0.94 \cdot T_{\mathit{queen}_i}) \wedge (\mathrm{Sim}(\mathit{queen}_i, \mathit{queen}_j) > 0.94 \cdot T_{\mathit{queen}_j})$$
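The three acceptance tests differ only in their slack factors. A sketch, reusing the threshold() helper from the training sketch; queens are ordinary Ant values here:

```python
def accept_ants(a_i, a_j, sim):
    """Ant-ant: similarity may fall slightly below both thresholds (factor 0.96)."""
    s = sim(a_i, a_j)
    return s > 0.96 * threshold(a_i) and s > 0.96 * threshold(a_j)

def accept_ant_queen(a_i, queen, sim):
    """Ant-queen is the strictest test (factor 0.98): joining a colony is a crucial step."""
    s = sim(a_i, queen)
    return s > 0.98 * threshold(a_i) and s > 0.98 * threshold(queen)

def accept_queens(q_i, q_j, sim):
    """Queen-queen is the most permissive test (factor 0.94), enabling colony merges."""
    s = sim(q_i, q_j)
    return s > 0.94 * threshold(q_i) and s > 0.94 * threshold(q_j)
```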


The acceptance A(a_i, a_j) is defined differently between ants and queens. When two ants or two queens meet, they mutually accept each other if the similarity between them is a little less than or greater than their threshold values. However, when an ant meets the queen of a colony, both similarity values must be very close to or greater than their threshold values.

The threshold and similarity values are calculated using the same formulas as in the training stage, but the MeanSim(a_i) and MaxSim(a_i) of the ant a_i are calculated using different formulas: as the ants get older, they update their threshold values less frequently, and the chance that they move from one colony to another becomes quite small. The following equations are used to update a_i's mean and maximum similarities when it meets ant a_j:

$$\mathrm{MeanSim}(a_i) = \frac{Age_{a_i}}{Age_{a_i} + 1} \mathrm{MeanSim}(a_i) + \frac{1}{Age_{a_i} + 1} \mathrm{Sim}(a_i, a_j)$$

$$\mathrm{MaxSim}(a_i) = 0.8 \cdot \mathrm{MaxSim}(a_i) + 0.2 \cdot \mathrm{Sim}(a_i, a_j), \quad \text{only if } \mathrm{Sim}(a_i, a_j) > \mathrm{MaxSim}(a_i)$$

The next phase in this stage is the local optimization phase. The local optimization algorithm is described in Figure 6. The aim of local optimization is to make sure that the queen of a colony is the best representative of the colony and to remove ants that should not belong to that colony. The reasoning behind the need for the queen to be the best representative of the colony is that crucial decisions, such as the addition and deletion of ants, are made by the queen. It also allows us to obtain a better partition without having to have more meetings between an ant and its potential colony mates.

At the end of this stage, most of the ants are in their respective colonies. We check if any two queens are similar to each other and merge their colonies if they accept each other. In the next stage, we address the clustering of the remaining ants.
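The age-weighted updates above can be sketched as follows; whether the age counter keeps incrementing during this stage is our assumption, since the paper only increments ages explicitly in training:

```python
def update_stats_clustering(ant, s):
    """Clustering-stage update: older ants adjust their statistics more slowly."""
    # MeanSim <- Age/(Age+1) * MeanSim + 1/(Age+1) * Sim
    ant.mean_sim = (ant.age * ant.mean_sim + s) / (ant.age + 1)
    # MaxSim is nudged upward only when the new similarity exceeds the current maximum.
    if s > ant.max_sim:
        ant.max_sim = 0.8 * ant.max_sim + 0.2 * s
    ant.age += 1   # assumed: ants keep aging as they meet during clustering
```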


Cluster Formation
until 50% of the total ants find their colonies, or for i = 0 to γ do
    Randomly select 2 ants
    Stimulate a meeting between them
    Follow the rules in the meeting algorithm and take appropriate action
    if i mod α = 0 then
        Run the local optimization algorithm
end-for / end-until
for each colony C do
    Check if the queen of colony C is similar to any other colony's queen and merge them if they accept
end-for

Figure 4: Cluster Formation Algorithm
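A sketch of the loop in Figure 4, with the parameter values reported in Section 3.3 (α = 6·n_ants, β = 50%, γ = 40·n_ants); meet() and local_optimization() are the helpers sketched after Figures 5 and 6 below:

```python
import random

def cluster_formation(ants, colonies, sim):
    """Stage 3, phase 1 (Figure 4): random meetings with periodic local optimization."""
    n = len(ants)
    alpha, gamma = 6 * n, 40 * n
    for i in range(1, gamma + 1):
        x, y = random.sample(ants, 2)        # stimulate a meeting between two random ants
        meet(x, y, colonies, sim)            # apply the Figure 5 rules
        if i % alpha == 0:                   # every alpha meetings, clean up the colonies
            local_optimization(colonies, sim)
        if sum(a.colony != -1 for a in ants) >= 0.5 * n:   # beta: 50% of ants placed
            break
```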


Rules when two ants X and Y meet

Case 1: If both ants do not belong to any colony and they accept each other:
    Create a new colony, add these two ants to that colony, and make one of them the queen of that colony.

Case 2: If one of the ants belongs to a colony A and the other is free:
    If they accept each other and the queen of colony A accepts the free ant:
        Add the free ant to colony A and update the queen of that colony with the information that the new ant has been added.

Case 3: If both ants are present in the same colony A:
    If they accept each other:
        Leave them as they are.
    Else (they do not accept each other):
        Check if the queen of colony A accepts the ants. Delete the ants that the queen has not accepted and inform the queen about the removal of the ants.

Case 4: If the ants are present in different colonies A and B:
    If the ants accept each other and the queens of colonies A and B accept each other:
        Merge both colonies into one colony and make the queen of the bigger colony the queen of the newly formed colony.

Figure 5: Rules followed when the ants meet
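A compact sketch of the four cases follows. The colony bookkeeping (a dict from colony id to queen and member list) is our own representation, and edge cases such as the queen itself being one of the meeting ants are glossed over:

```python
def meet(x, y, colonies, sim):
    """Apply the Figure 5 rules to a meeting between ants x and y."""
    if x.colony == -1 and y.colony == -1:                    # Case 1: both ants are free
        if accept_ants(x, y, sim):
            cid = max(colonies, default=-1) + 1
            x.colony = y.colony = cid
            colonies[cid] = {"queen": x, "members": [x, y]}  # one of them becomes queen
    elif (x.colony == -1) != (y.colony == -1):               # Case 2: exactly one is free
        free, member = (x, y) if x.colony == -1 else (y, x)
        col = colonies[member.colony]
        if accept_ants(free, member, sim) and accept_ant_queen(free, col["queen"], sim):
            free.colony = member.colony
            col["members"].append(free)                      # the queen is informed
    elif x.colony == y.colony:                               # Case 3: same colony
        if not accept_ants(x, y, sim):
            col = colonies[x.colony]
            for ant in (x, y):                               # queen evicts rejected ants
                if ant is not col["queen"] and not accept_ant_queen(ant, col["queen"], sim):
                    col["members"].remove(ant)
                    ant.colony = -1
    else:                                                    # Case 4: different colonies
        qa, qb = colonies[x.colony]["queen"], colonies[y.colony]["queen"]
        if accept_ants(x, y, sim) and accept_queens(qa, qb, sim):
            big, small = sorted((x.colony, y.colony),
                                key=lambda c: len(colonies[c]["members"]), reverse=True)
            for ant in colonies[small]["members"]:           # bigger colony keeps its queen
                ant.colony = big
                colonies[big]["members"].append(ant)
            del colonies[small]
```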


Local Optimization
for each colony do
    Find the ant a that is most similar to all of the rest of the ants in that colony. This is done in a two-step process as follows:
    Step 1:
        for each ant a in the colony do
            Find the similarities between the ant a and all other ants in the colony. Calculate the average of all those similarities to obtain the average similarity of ant a
        end-for
    Step 2:
        Pick the ant a with the highest average similarity in the colony
        If ant a is not the present queen of that colony, then make a the queen
    Find the average similarity of the colony and remove the ants that have an average similarity less than 0.97 · CS, where CS is the colony's average similarity
end-for

Figure 6: Local Optimization Algorithm
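A sketch under the same colony representation as the meet() helper above:

```python
def local_optimization(colonies, sim):
    """Figure 6: make the most central ant the queen, then evict dissimilar members."""
    for col in colonies.values():
        members = col["members"]
        if len(members) < 2:
            continue
        # Step 1: average similarity of each ant to the rest of its colony.
        avg = {a.genome: sum(sim(a, b) for b in members if b is not a) / (len(members) - 1)
               for a in members}
        # Step 2: the ant with the highest average similarity becomes the queen.
        col["queen"] = max(members, key=lambda a: avg[a.genome])
        # Evict ants whose average similarity is below 0.97 * colony average CS.
        cs = sum(avg.values()) / len(members)
        keep = [a for a in members if avg[a.genome] >= 0.97 * cs]
        for a in members:
            if a not in keep:
                a.colony = -1      # evicted ants become free again
        col["members"] = keep
```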


Final Clustering
for each free ant do
    Find the similarities between the free ant and all the queens of the available colonies. Then find the most similar queen and assign the free ant to the colony to which that queen belongs
end-for
for each colony C do
    Check if the queen of colony C is similar to any other colony's queen and merge them if they accept
end-for

Figure 7: Algorithm for Assigning Free Ants

3.4 Final Clustering

There might be some free ants that have not yet joined any colony. For each of the free ants, we find the closest colony possible and add the ant to that colony. The algorithm for assigning free ants is described in Figure 7.
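A sketch of this stage under the same representation; the queen-merging pass at the end of Figure 7 is omitted for brevity:

```python
def assign_free_ants(ants, colonies, sim):
    """Stage 4 (Figure 7): attach each free ant to the colony of its most similar queen."""
    for ant in ants:
        if ant.colony != -1 or not colonies:
            continue
        best = max(colonies, key=lambda c: sim(ant, colonies[c]["queen"]))
        ant.colony = best
        colonies[best]["members"].append(ant)
```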


4 Performance Analysis

The algorithm was tested on artificially generated databases and on real databases from the Machine Learning Repository [15]. These databases are used as a benchmark, as they have been used to test a number of different ant-based clustering algorithms; thus the results can be compared to other algorithms. The databases that we used have up to 1473 objects and 35 attributes. The artificial data sets Art1, Art2, Art3, Art4, Art5, and Art6 were generated according to Gaussian or uniform distributions in such a way that each poses a unique situation for the algorithm to handle in the process of clustering.

Generally, real data sets are more difficult to cluster because they may be noisier than artificial data sets. The real data sets that we used to test our algorithm are Soybean, Iris, Wine, Thyroid, Glass, Haberman, Ecoli, Pima, and CMC.

The Soybean data set contains data on four kinds of soybean diseases. The data set contains 47 instances, each with 35 attributes, and so provides the ability to test the algorithm on a data set with many attributes.

The Iris data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. The Wine data set is the result of an analysis of wines grown at a particular place; it contains 178 instances in 3 classes. The Thyroid data set contains 215 instances of thyroid diagnoses. It has three classes, each representing a kind of thyroid disorder (such as hyperthyroid and hypothyroid).

The Glass data set is obtained from a study of the classification of types of glass. This data set contains 7 classes; hence it is used to test the algorithm on a data set that has records from many classes. The Haberman data set is obtained from a study on the survival of patients who had undergone surgery for breast cancer. It has 306 instances in 2 classes.

The Ecoli data set contains information about protein localization sites. It has 8 classes and 336 instances. The Balance-Scale data set was generated to model psychological experimental results. It has 625 instances in 3 classes. The Pima data set has 798 instances of diabetes diagnoses for women of Pima Indian heritage; the data set has 2 classes. The CMC data set was obtained from the 1987 National Indonesia Contraceptive Prevalence Survey. It has 1473 instances in 3 classes.

To test the performance of our algorithm, we used the classification success measure C_s developed by Fowlkes and Mallows [5] and modified by


Labroche et al. [9]. This measure allows us to compare the clusters obtained by our algorithm to the real clusters in the database. We denote the number of objects in the database by N. The following equation is used to find the clustering success C_s:

$$C_s = 1 - \frac{2}{N(N-1)} \sum_{\substack{(i,j) \in \{1,\ldots,N\}^2 \\ i < j}} \epsilon_{ij}$$

where ε_ij = 0 if ants a_i and a_j are placed in the same cluster by both the expected partition and the partition found by the algorithm, or in different clusters by both, and ε_ij = 1 otherwise.
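A sketch of the measure in Python, with ε_ij as defined above; partitions are given as lists of cluster labels indexed by object:

```python
def clustering_success(expected, found):
    """C_s = 1 - 2/(N(N-1)) * sum of epsilon_ij over all pairs i < j."""
    n = len(expected)
    disagreements = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_real = expected[i] == expected[j]
            same_found = found[i] == found[j]
            if same_real != same_found:    # epsilon_ij = 1 when the partitions disagree
                disagreements += 1
    return 1.0 - 2.0 * disagreements / (n * (n - 1))

# A perfect partition scores 1.0 regardless of the label names used.
assert clustering_success([0, 0, 1, 1], [5, 5, 9, 9]) == 1.0
```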


In the tables below, O is the number of objects, A the number of attributes, and N the number of natural clusters in each database; MeanCS is the mean clustering success, MeanNC the mean number of clusters found, and AvgRT the average running time, with standard deviations in brackets.

Database       O     A   N   MeanCS [Std]   MeanNC [Std]    AvgRT
Soybean        47    35  4   0.92 [0.03]    5.84 [0.70]     0.02
Iris           150   4   3   0.84 [0.02]    4.98 [1.00]     0.22
Wine           178   13  3   0.81 [0.12]    4.56 [1.2]      0.26
Thyroid        215   5   3   0.80 [0.07]    4.92 [1.26]     4.87
Glass          214   9   7   0.65 [0.03]    6.62 [1.61]     5.39
Haberman       306   3   2   0.94 [0.01]    3.24 [0.90]     0.63
Ecoli          336   8   8   0.79 [0.06]    7.74 [2.12]     4.80
Balance-Scale  625   4   3   0.60 [0.01]    11.09 [1.23]    0.30
Pima           798   8   2   0.54 [0.01]    3.34 [1.30]     0.90
CMC            1473  9   3   0.61 [0.00]    15.16 [2.04]    0.77
Art1           400   2   4   0.84 [0.01]    7.46 [1.15]     0.71
Art2           1000  2   2   0.78 [0.04]    5.9 [1.66]      34.28
Art3           1100  2   4   0.80 [0.03]    6.9 [1.72]      40.20
Art4           200   2   2   0.74 [0.02]    6.04 [1.32]     0.24
Art5           900   2   9   0.85 [0.02]    8.54 [1.25]     2.77
Art6           400   8   8   0.95 [0.02]    6.52 [1.40]     6.41

Table 1: QLA Results

Database  O     A   N   QLA            AntClust       K-means
                        MeanCS [Std]   MeanCS [Std]   MeanCS [Std]
Soybean   47    35  4   0.92 [0.03]    0.93 [0.04]    0.91 [0.08]
Iris      150   4   3   0.84 [0.02]    0.78 [0.01]    0.86 [0.03]
Thyroid   215   5   3   0.80 [0.07]    0.84 [0.03]    0.82 [0.00]
Glass     214   9   7   0.65 [0.03]    0.64 [0.02]    0.68 [0.01]
Pima      798   8   2   0.54 [0.01]    0.54 [0.01]    0.56 [0.00]
Art1      400   2   4   0.84 [0.01]    0.78 [0.03]    0.89 [0.00]
Art2      1000  2   2   0.78 [0.04]    0.93 [0.02]    0.96 [0.00]
Art3      1100  2   4   0.80 [0.03]    0.85 [0.02]    0.78 [0.02]
Art4      200   2   2   0.74 [0.02]    0.77 [0.05]    1.00 [0.00]
Art5      900   2   9   0.85 [0.02]    0.74 [0.02]    0.91 [0.02]
Art6      400   8   8   0.95 [0.02]    0.95 [0.01]    0.99 [0.04]

Table 2: Results of all the algorithms


5 Conclusion

This paper proposed a new ant-based clustering algorithm. During the process of clustering, the queen of a colony plays an important role in its construction because it represents the colony better than its colony mates. This method, coupled with local optimization, results in a better partition of the objects. The results obtained by this algorithm were close to the best known results for the set of benchmark databases.

Future work could be done on clustering the given data based on some given criteria. Another idea is to make the problem distributed by dividing the data set into subsets and finding the solutions for all the subsets in parallel; from these sub-solutions, we could then try to find the solution for the whole data set. This could make the algorithm even more competitive.

References

[1] J.L. Deneubourg, S. Goss, N. Franks, A. Sendova-Franks, C. Detrain, and L. Chretien, "The Dynamics of Collective Sorting: Robot-like Ants and Ant-like Robots," Proceedings of the First Conference on Simulation of Adaptive Behavior, J.A. Meyer and S.W. Wilson, Eds., MIT Press/Bradford Books, 1990, pp. 356–363.

[2] R.C. Dubes and A.K. Jain, Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, NJ, 1988.

[3] M. Ester, H.P. Kriegel, J. Sander, and X. Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise," Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, 1996, pp. 226–231.

[4] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, "The KDD Process for Extracting Useful Knowledge from Volumes of Data," Communications of the ACM 39(11), Nov. 1996, pp. 27–34.


[5] J. Heer and E. Chi, "Mining the Structure of User Activity Using Cluster Stability," Proceedings of the Workshop on Web Analytics, SIAM Conference on Data Mining, Arlington, VA, April 2002.

[6] D. Judd, P. McKinley, and A. Jain, "Large-Scale Parallel Data Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence 20(8), August 1998, pp. 871–876.

[7] L. Kaufman and P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley and Sons, Chichester, 1990.

[8] E.D. Lumer and B. Faieta, "Diversity and Adaptation in Populations of Clustering Ants," From Animals to Animats 3: Simulation of Adaptive Behavior, D. Cliff, P. Husbands, J.A. Meyer, and S.W. Wilson, Eds., MIT Press, 1994, pp. 501–508.

[9] N. Labroche, N. Monmarché, and G. Venturini, "AntClust: Ant Clustering and Web Usage Mining," Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2003, pp. 25–36.

[10] N. Labroche, N. Monmarché, and G. Venturini, "A New Clustering Algorithm Based on the Chemical Recognition System of Ants," Proceedings of the European Conference on Artificial Intelligence, Lyon, France, 2002, pp. 345–349.

[11] N. Monmarché, "On Data Clustering with Artificial Ants," in A. Freitas, Ed., AAAI-99 and GECCO-99 Workshop on Data Mining with Evolutionary Algorithms: Research Directions, Florida, 1999, pp. 23–26.

[12] R. Ng and J. Han, "Efficient and Effective Clustering Methods for Spatial Data Mining," Proceedings of the International Conference on Very Large Data Bases, Santiago, Chile, Sept. 1994, pp. 144–155.

[13] B.J. Schachter, L.S. Davis, and A. Rosenfeld, "Some Experiments in Image Segmentation by Clustering of Local Feature Values," Pattern Recognition 11(1), 1979, pp. 19–28.


[14] T. Zhang, R. Ramakrishnan, and M. Livny, "BIRCH: A New Data Clustering Algorithm and its Applications," Data Mining and Knowledge Discovery 1(2), 1997, pp. 141–182.

[15] UCI Machine Learning Repository, ftp://ftp.ics.uci.edu/pub/machine-learning-databases/
