

$$w_j = \frac{\sum_{i=1}^{n}\sum_{k=1}^{n} |z_{ij} - z_{kj}|}{\sum_{j=1}^{p}\sum_{i=1}^{n}\sum_{k=1}^{n} |z_{ij} - z_{kj}|} \qquad (3)$$

where $w_j$ denotes the weight of the $j$th index and

$$z_{ij} = x_{ij} \Big/ \sum_{i=1}^{n} x_{ij}, \quad i = 1,2,\cdots,n;\ j = 1,2,\cdots,p.$$
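As a concrete illustration, the weighting scheme of (3) takes only a few lines. This is a minimal NumPy sketch; the names `index_weights`, `X`, `Z`, and `w` are illustrative, not from the paper:

```python
import numpy as np

def index_weights(X):
    """Compute the index weights w_j of (3) from an n-by-p data matrix X.

    Each column is first normalized to z_ij = x_ij / sum_i x_ij; the weight
    of index j is its total pairwise spread |z_ij - z_kj|, normalized over
    all p indexes so the weights sum to 1.
    """
    Z = X / X.sum(axis=0)                       # z_ij, one column per index
    # spread_j = sum_i sum_k |z_ij - z_kj| for every column j
    spread = np.abs(Z[:, None, :] - Z[None, :, :]).sum(axis=(0, 1))
    return spread / spread.sum()                # w_j

X = np.random.rand(10, 4)                       # 10 samples, 4 indexes
w = index_weights(X)
print(w, w.sum())                               # the weights sum to 1.0
```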

To enhance the stability of the clustering result, a determinate initial clustering center is desired before the clustering algorithm is performed. In this paper, the relation matrix R is constructed by employing the included-angle cosine formula (4), and the samples are classified by the threshold partition method. Though the classifying effect itself is not good, the resulting classification provides a fixed initial clustering center for the improved FKCM clustering algorithm.

$$r_{ij} = \frac{\sum_{k=1}^{p} |x_{ik}\, x_{jk}|}{\sqrt{\left(\sum_{k=1}^{p} x_{ik}^{2}\right)\left(\sum_{k=1}^{p} x_{jk}^{2}\right)}}, \quad i, j = 1,2,\cdots,n \qquad (4)$$

where $r_{ij}$ denotes the similarity between the $i$th and the $j$th samples.
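The initialization step can be sketched as follows, assuming the threshold partition is the usual λ-cut of the similarity relation: samples linked (transitively) by $r_{ij} \ge \lambda$ fall into one coarse class. The function names and the default λ are illustrative:

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Relation matrix R of (4): r_ij = sum_k |x_ik x_jk| / sqrt(...)."""
    num = np.abs(X) @ np.abs(X).T               # sum_k |x_ik| |x_jk|
    norm = np.sqrt((X ** 2).sum(axis=1))
    return num / np.outer(norm, norm)

def threshold_partition(R, lam=0.9):
    """Group samples whose similarity reaches the cut level lam,
    transitively via a union-find, yielding a coarse partition."""
    n = len(R)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]       # path compression
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if R[i, j] >= lam:
                parent[find(i)] = find(j)       # merge the two classes
    return np.array([find(i) for i in range(n)])
```

The class means of this coarse partition then serve as the determinate initial clustering centers.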

B. FKCM algorithm

However, the FCM clustering analysis is somewhat limited in real-world problems, where nonlinear clustering analysis would be highly desirable. An efficient way to obtain a nonlinear clustering algorithm is to first map the patterns of the input space $\Omega^p$ into some higher-dimensional feature space $\Omega^q$ using a mapping $\phi(\cdot)$; FCM can then be performed in this feature space. Once the kernel function is chosen, the Euclidean distance between $x_i$ and $x_j$ in the feature space is

$$\hat{d}_{ij}(x_i, x_j) = \left[K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)\right]^{1/2}, \quad i, j = 1,2,\cdots,n.$$
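With a concrete kernel, this distance needs no explicit $\phi$. A small sketch, assuming a Gaussian (RBF) kernel; the kernel choice and the width σ are assumptions for illustration:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)); phi stays implicit."""
    d2 = np.sum((x - y) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def feature_distance(xi, xj, kernel=rbf_kernel):
    """d_ij = [K(xi,xi) - 2K(xi,xj) + K(xj,xj)]^(1/2), the Euclidean
    distance between phi(xi) and phi(xj) in the feature space."""
    d2 = kernel(xi, xi) - 2.0 * kernel(xi, xj) + kernel(xj, xj)
    return np.sqrt(max(d2, 0.0))                # clamp tiny negative round-off
```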

Let $V$ be the clustering center matrix in the input space, $V = (v_1, v_2, \cdots, v_c)$, $v_i = (v_{i1}, v_{i2}, \cdots, v_{ip})$, $i = 1,2,\cdots,c$. Let $\hat{U}$ be the membership matrix in the feature space, $\hat{U} = (\hat{u}_1, \hat{u}_2, \cdots, \hat{u}_n)$, $\hat{u}_i = (\hat{u}_{i1}, \hat{u}_{i2}, \cdots, \hat{u}_{ic})$, $i = 1,2,\cdots,n$. Hence, the objective function of the FKCM clustering algorithm in the feature space is

$$\hat{J}_m(X; \hat{U}, \hat{D}) = \sum_{j=1}^{c}\sum_{i=1}^{n} \hat{u}_{ij}^{m}\, \hat{d}_{ij}^{2}, \quad 2 < c < n \qquad (5)$$
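Equation (5) transcribes directly; a one-line sketch, assuming `U` stores $\hat{u}_{ij}$ and `D` stores $\hat{d}_{ij}$ as n×c arrays:

```python
import numpy as np

def fkcm_objective(U, D, m=2.0):
    """J_m = sum_j sum_i u_ij^m * d_ij^2, the feature-space FCM cost (5)."""
    return np.sum((U ** m) * (D ** 2))
```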

The new clustering center vectors in the feature space are


$$\hat{v}_j = \phi(v_j) = \sum_{k=1}^{n} (\hat{u}_{kj})^m \phi(x_k) \Big/ \sum_{k=1}^{n} (\hat{u}_{kj})^m, \quad j = 1,2,\cdots,c \qquad (6)$$

Note that $\phi(x_k)$ still appears explicitly in (6). Unfortunately, the mapping function $\phi(\cdot)$ may not be known explicitly, and if the dimension of the feature space $\Omega^q$ is very high or infinite, the cluster centers cannot be computed from (6) directly. To get around this difficulty, the problem is reformulated so that it involves only dot products of the patterns $x_i$ $(i = 1,2,\cdots,n)$ in the feature space:

$$K(x_i, \hat{v}_j) = \phi(x_i)\cdot\phi(v_j) = \frac{\sum_{k=1}^{n} (\hat{u}_{kj})^m K(x_k, x_i)}{\sum_{k=1}^{n} (\hat{u}_{kj})^m} \qquad (7)$$

$$K(\hat{v}_j, \hat{v}_j) = \phi(v_j)\cdot\phi(v_j) = \frac{\sum_{k=1}^{n}\sum_{t=1}^{n} (\hat{u}_{kj})^m (\hat{u}_{tj})^m K(x_k, x_t)}{\left(\sum_{k=1}^{n} (\hat{u}_{kj})^m\right)^{2}} \qquad (8)$$

The membership degrees are then updated by

$$\hat{u}_{ij} = \frac{\left(1/\hat{d}_{ij}^{\,2}(x_i, \hat{v}_j)\right)^{1/(m-1)}}{\sum_{j=1}^{c}\left(1/\hat{d}_{ij}^{\,2}(x_i, \hat{v}_j)\right)^{1/(m-1)}} = \frac{\left(1/\left(K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right)\right)^{1/(m-1)}}{\sum_{j=1}^{c}\left(1/\left(K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right)\right)^{1/(m-1)}} \qquad (9)$$

When $\hat{d}_{ij}^{\,2}(x_i, \hat{v}_j) = 0$, $\hat{u}_{ij} = 1$ and $\hat{u}_{it} = 0$ for $t \in [1, j) \cup (j, c]$, where

$$\hat{d}_{ij}(x_i, \hat{v}_j) = \left[K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right]^{1/2}, \quad i = 1,2,\cdots,n;\ j = 1,2,\cdots,c \qquad (10)$$
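Equations (7)-(10) reduce one FKCM pass to Gram-matrix algebra. A sketch of a single update step, assuming a precomputed Gram matrix `K` with `K[i, k]` = $K(x_i, x_k)$ and current memberships `U` (n×c); all names are illustrative:

```python
import numpy as np

def fkcm_update(K, U, m=2.0, eps=1e-12):
    """One membership update via (7)-(10), using only the Gram matrix K.

    Returns the new memberships U (n x c) and squared distances D2 (n x c).
    """
    W = U ** m                                   # (u_kj)^m, n x c
    s = W.sum(axis=0)                            # sum_k (u_kj)^m, per cluster
    Kxv = (K @ W) / s                            # (7): K(x_i, v_j), n x c
    Kvv = np.einsum('kj,tj,kt->j', W, W, K) / s**2   # (8): K(v_j, v_j), c
    D2 = np.diag(K)[:, None] - 2.0 * Kxv + Kvv[None, :]  # (10), squared
    D2 = np.maximum(D2, 0.0)
    inv = (1.0 / np.maximum(D2, eps)) ** (1.0 / (m - 1.0))
    U_new = inv / inv.sum(axis=1, keepdims=True)          # (9)
    # special case d_ij = 0: assign the sample crisply to that cluster
    # (assumes at most one coincident center per sample)
    hit_i, hit_j = np.where(D2 <= eps)
    U_new[hit_i] = 0.0
    U_new[hit_i, hit_j] = 1.0
    return U_new, D2
```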

To acquire the optimal membership matrix $\hat{U}^*$ and the corresponding distance matrix $\hat{D}^*$, the sequence $|\hat{J}^{(l)} - \hat{J}^{(l-1)}|$ must converge, that is, $\lim_{l\to\infty} \hat{J}_m(X; \hat{U}, \hat{D})$ must exist [18]. Hence, the tolerance $\varepsilon$ is set to an arbitrarily small value; given the initial distance matrix and the initial membership matrix, the iterative algorithm proceeds by (5), (7), (8), (9), and (10). When $|\hat{J}^{(l)} - \hat{J}^{(l-1)}| < \varepsilon$, the iteration ceases, the constrained-optimal membership matrix $\hat{U}^{(l)}$ and the constrained-optimal distance matrix $\hat{D}^{(l)}$ are acquired, and finally the samples are classified in terms of the maximum-membership principle.

The improved FKCM clustering algorithm is obtained by performing the dot product between each pattern and the index-weight vector inside the kernel function of the FKCM clustering algorithm, so that the weights of (3) enter the kernel directly; a sketch of this weighted-kernel iteration follows. Furthermore, the statistic F in (11) can be utilized to determine the optimal number of classes. Since the statistic F is a conventional index for evaluating clustering validity and does not account for the nonlinear factors of the data set, the KF index in (12) is constructed from the kernel function to evaluate the clustering validity more effectively.
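Putting the pieces together, the following sketch iterates (7)-(10) with the $\varepsilon$ stopping rule and classifies by maximum membership. It assumes the index weighting amounts to scaling each pattern by the weight vector inside an RBF kernel (the exact kernel form is an assumption), and it reuses `fkcm_update` from the previous sketch; `U0` can come from the threshold partition above:

```python
import numpy as np

def weighted_gram(X, w, sigma=1.0):
    """Gram matrix of an RBF kernel on weight-scaled patterns w * x,
    i.e. the index weights enter the kernel's dot products."""
    Xw = X * w                                   # scale each index by its weight
    sq = (Xw ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] - 2.0 * Xw @ Xw.T + sq[None, :], 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fkcm(X, U0, w, m=2.0, eps=1e-6, max_iter=100):
    """Iterate (7)-(10) until |J^(l) - J^(l-1)| < eps, then classify each
    sample by the maximum-membership principle."""
    K = weighted_gram(X, w)
    U, J_prev = U0, np.inf
    for _ in range(max_iter):
        U, D2 = fkcm_update(K, U, m)             # sketch from the previous block
        J = np.sum((U ** m) * D2)                # objective (5)
        if abs(J_prev - J) < eps:
            break
        J_prev = J
    return U.argmax(axis=1), U                   # crisp labels + memberships
```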

