

$$w_j = \frac{\sum_{i=1}^{n}\sum_{k=1}^{n} |z_{ij} - z_{kj}|}{\sum_{j=1}^{p}\sum_{i=1}^{n}\sum_{k=1}^{n} |z_{ij} - z_{kj}|} \qquad (3)$$

where $w_j$ denotes the weight of the $j$th index and

$$z_{ij} = x_{ij} \Big/ \sum_{i=1}^{n} x_{ij}, \quad i = 1,2,\cdots,n;\ j = 1,2,\cdots,p.$$
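As a concrete illustration, the weighting scheme of (3) takes only a few lines. This is a minimal NumPy sketch; the names `index_weights`, `X`, `Z`, and `w` are illustrative, not from the paper:

```python
import numpy as np

def index_weights(X):
    """Compute the index weights w_j of (3) from an n-by-p data matrix X.

    Each column is first normalized to z_ij = x_ij / sum_i x_ij; the weight
    of index j is its total pairwise spread |z_ij - z_kj|, normalized over
    all p indexes so the weights sum to 1.
    """
    Z = X / X.sum(axis=0)                       # z_ij, one column per index
    # spread_j = sum_i sum_k |z_ij - z_kj| for every column j
    spread = np.abs(Z[:, None, :] - Z[None, :, :]).sum(axis=(0, 1))
    return spread / spread.sum()                # w_j

X = np.random.rand(10, 4)                       # 10 samples, 4 indexes
w = index_weights(X)
print(w, w.sum())                               # the weights sum to 1.0
```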

To enhance the stability of the clustering result, a determinate initial clustering center is desired before the clustering algorithm is performed. In this paper, the relation matrix R is constructed by employing the included-angle cosine formula (4), and the samples are classified by the threshold partition method. Though the classifying effect itself is not good, the resulting classification provides a fixed initial clustering center for the improved FKCM clustering algorithm.

$$r_{ij} = \frac{\sum_{k=1}^{p} |x_{ik}\, x_{jk}|}{\sqrt{\left(\sum_{k=1}^{p} x_{ik}^{2}\right)\left(\sum_{k=1}^{p} x_{jk}^{2}\right)}}, \quad i, j = 1,2,\cdots,n \qquad (4)$$

where $r_{ij}$ denotes the similarity between the $i$th and the $j$th samples.
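The initialization step can be sketched as follows, assuming the threshold partition is the usual λ-cut of the similarity relation: samples linked (transitively) by $r_{ij} \ge \lambda$ fall into one coarse class. The function names and the default λ are illustrative:

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Relation matrix R of (4): r_ij = sum_k |x_ik x_jk| / sqrt(...)."""
    num = np.abs(X) @ np.abs(X).T               # sum_k |x_ik| |x_jk|
    norm = np.sqrt((X ** 2).sum(axis=1))
    return num / np.outer(norm, norm)

def threshold_partition(R, lam=0.9):
    """Group samples whose similarity reaches the cut level lam,
    transitively via a union-find, yielding a coarse partition."""
    n = len(R)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]       # path compression
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if R[i, j] >= lam:
                parent[find(i)] = find(j)       # merge the two classes
    return np.array([find(i) for i in range(n)])
```

The class means of this coarse partition then serve as the determinate initial clustering centers.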

B. FKCM algorithm

However, the FCM clustering analysis is somewhat limited in real-world problems, where nonlinear clustering analysis would be highly desirable. An efficient way to obtain a nonlinear clustering algorithm is to first map the patterns of the input space $\Omega^p$ into some higher-dimensional feature space $\Omega^q$ using a mapping $\phi(\cdot)$; FCM can then be performed in this feature space. Once the kernel function is chosen, the Euclidean distance between $x_i$ and $x_j$ in the feature space is

$$\hat{d}_{ij}(x_i, x_j) = \left[K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)\right]^{1/2}, \quad i, j = 1,2,\cdots,n.$$
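With a concrete kernel, this distance needs no explicit $\phi$. A small sketch, assuming a Gaussian (RBF) kernel; the kernel choice and the width σ are assumptions for illustration:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)); phi stays implicit."""
    d2 = np.sum((x - y) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def feature_distance(xi, xj, kernel=rbf_kernel):
    """d_ij = [K(xi,xi) - 2K(xi,xj) + K(xj,xj)]^(1/2), the Euclidean
    distance between phi(xi) and phi(xj) in the feature space."""
    d2 = kernel(xi, xi) - 2.0 * kernel(xi, xj) + kernel(xj, xj)
    return np.sqrt(max(d2, 0.0))                # clamp tiny negative round-off
```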

Let $V$ be the clustering center matrix in the input space, $V = (v_1, v_2, \cdots, v_c)$, $v_i = (v_{i1}, v_{i2}, \cdots, v_{ip})$, $i = 1,2,\cdots,c$. Let $\hat{U}$ be the membership matrix in the feature space, $\hat{U} = (\hat{u}_1, \hat{u}_2, \cdots, \hat{u}_n)$, $\hat{u}_i = (\hat{u}_{i1}, \hat{u}_{i2}, \cdots, \hat{u}_{ic})$, $i = 1,2,\cdots,n$. Hence, the objective function of the FKCM clustering algorithm in the feature space is

$$\hat{J}_m(X; \hat{U}, \hat{D}) = \sum_{j=1}^{c}\sum_{i=1}^{n} \hat{u}_{ij}^{m}\, \hat{d}_{ij}^{2}, \quad 2 < c < n \qquad (5)$$
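Equation (5) transcribes directly; a one-line sketch, assuming `U` stores $\hat{u}_{ij}$ and `D` stores $\hat{d}_{ij}$ as n×c arrays:

```python
import numpy as np

def fkcm_objective(U, D, m=2.0):
    """J_m = sum_j sum_i u_ij^m * d_ij^2, the feature-space FCM cost (5)."""
    return np.sum((U ** m) * (D ** 2))
```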

The new clustering center vectors in the feature space are


$$\hat{v}_j = \phi(v_j) = \sum_{k=1}^{n} (\hat{u}_{kj})^m \phi(x_k) \Big/ \sum_{k=1}^{n} (\hat{u}_{kj})^m, \quad j = 1,2,\cdots,c \qquad (6)$$

Note that $\phi(x_k)$ still appears explicitly in (6). Unfortunately, the mapping function $\phi(\cdot)$ may not be known explicitly, and if the dimension of the feature space $\Omega^q$ is very high or infinite, the cluster centers cannot be computed from (6) directly. To get around this difficulty, the problem is reformulated so that it involves only dot products of the patterns $x_i$ $(i = 1,2,\cdots,n)$ in the feature space:

$$K(x_i, \hat{v}_j) = \phi(x_i)\cdot\phi(v_j) = \frac{\sum_{k=1}^{n} (\hat{u}_{kj})^m K(x_k, x_i)}{\sum_{k=1}^{n} (\hat{u}_{kj})^m} \qquad (7)$$

$$K(\hat{v}_j, \hat{v}_j) = \phi(v_j)\cdot\phi(v_j) = \frac{\sum_{k=1}^{n}\sum_{t=1}^{n} (\hat{u}_{kj})^m (\hat{u}_{tj})^m K(x_k, x_t)}{\left(\sum_{k=1}^{n} (\hat{u}_{kj})^m\right)^{2}} \qquad (8)$$

The membership degrees are then updated by

$$\hat{u}_{ij} = \frac{\left(1/\hat{d}_{ij}^{\,2}(x_i, \hat{v}_j)\right)^{1/(m-1)}}{\sum_{j=1}^{c}\left(1/\hat{d}_{ij}^{\,2}(x_i, \hat{v}_j)\right)^{1/(m-1)}} = \frac{\left(1/\left(K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right)\right)^{1/(m-1)}}{\sum_{j=1}^{c}\left(1/\left(K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right)\right)^{1/(m-1)}} \qquad (9)$$

When $\hat{d}_{ij}^{\,2}(x_i, \hat{v}_j) = 0$, $\hat{u}_{ij} = 1$ and $\hat{u}_{it} = 0$ for $t \in [1, j) \cup (j, c]$, where

$$\hat{d}_{ij}(x_i, \hat{v}_j) = \left[K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right]^{1/2}, \quad i = 1,2,\cdots,n;\ j = 1,2,\cdots,c \qquad (10)$$
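Equations (7)-(10) reduce one FKCM pass to Gram-matrix algebra. A sketch of a single update step, assuming a precomputed Gram matrix `K` with `K[i, k]` = $K(x_i, x_k)$ and current memberships `U` (n×c); all names are illustrative:

```python
import numpy as np

def fkcm_update(K, U, m=2.0, eps=1e-12):
    """One membership update via (7)-(10), using only the Gram matrix K.

    Returns the new memberships U (n x c) and squared distances D2 (n x c).
    """
    W = U ** m                                   # (u_kj)^m, n x c
    s = W.sum(axis=0)                            # sum_k (u_kj)^m, per cluster
    Kxv = (K @ W) / s                            # (7): K(x_i, v_j), n x c
    Kvv = np.einsum('kj,tj,kt->j', W, W, K) / s**2   # (8): K(v_j, v_j), c
    D2 = np.diag(K)[:, None] - 2.0 * Kxv + Kvv[None, :]  # (10), squared
    D2 = np.maximum(D2, 0.0)
    inv = (1.0 / np.maximum(D2, eps)) ** (1.0 / (m - 1.0))
    U_new = inv / inv.sum(axis=1, keepdims=True)          # (9)
    # special case d_ij = 0: assign the sample crisply to that cluster
    # (assumes at most one coincident center per sample)
    hit_i, hit_j = np.where(D2 <= eps)
    U_new[hit_i] = 0.0
    U_new[hit_i, hit_j] = 1.0
    return U_new, D2
```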

To acquire the optimal membership matrix $\hat{U}^*$ and the corresponding distance matrix $\hat{D}^*$, the sequence $|\hat{J}^{(l)} - \hat{J}^{(l-1)}|$ must converge, that is, $\lim_{l\to\infty} \hat{J}_m(X; \hat{U}, \hat{D})$ must exist [18]. Hence, the tolerance $\varepsilon$ is set to an arbitrarily small value; given the initial distance matrix and the initial membership matrix, the iterative algorithm proceeds by (5), (7), (8), (9), and (10). When $|\hat{J}^{(l)} - \hat{J}^{(l-1)}| < \varepsilon$, the iteration ceases, the constrained-optimal membership matrix $\hat{U}^{(l)}$ and the constrained-optimal distance matrix $\hat{D}^{(l)}$ are acquired, and finally the samples are classified in terms of the maximum-membership principle.

The improved FKCM clustering algorithm is obtained by performing the dot product between each pattern and the index-weight vector inside the kernel function of the FKCM clustering algorithm, so that the weights of (3) enter the kernel directly; a sketch of this weighted-kernel iteration follows. Furthermore, the statistic F in (11) can be utilized to determine the optimal number of classes. Since the statistic F is a conventional index for evaluating clustering validity and does not account for the nonlinear factors of the data set, the KF index in (12) is constructed from the kernel function to evaluate the clustering validity more effectively.
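Putting the pieces together, the following sketch iterates (7)-(10) with the $\varepsilon$ stopping rule and classifies by maximum membership. It assumes the index weighting amounts to scaling each pattern by the weight vector inside an RBF kernel (the exact kernel form is an assumption), and it reuses `fkcm_update` from the previous sketch; `U0` can come from the threshold partition above:

```python
import numpy as np

def weighted_gram(X, w, sigma=1.0):
    """Gram matrix of an RBF kernel on weight-scaled patterns w * x,
    i.e. the index weights enter the kernel's dot products."""
    Xw = X * w                                   # scale each index by its weight
    sq = (Xw ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] - 2.0 * Xw @ Xw.T + sq[None, :], 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fkcm(X, U0, w, m=2.0, eps=1e-6, max_iter=100):
    """Iterate (7)-(10) until |J^(l) - J^(l-1)| < eps, then classify each
    sample by the maximum-membership principle."""
    K = weighted_gram(X, w)
    U, J_prev = U0, np.inf
    for _ in range(max_iter):
        U, D2 = fkcm_update(K, U, m)             # sketch from the previous block
        J = np.sum((U ** m) * D2)                # objective (5)
        if abs(J_prev - J) < eps:
            break
        J_prev = J
    return U.argmax(axis=1), U                   # crisp labels + memberships
```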

