Journal of Software - Academy Publisher
882 JOURNAL OF SOFTWARE, VOL. 6, NO. 5, MAY 2011
$$w_j = \frac{\sum_{i=1}^{n}\sum_{k=1}^{n} |z_{ij} - z_{kj}|}{\sum_{j=1}^{p}\sum_{i=1}^{n}\sum_{k=1}^{n} |z_{ij} - z_{kj}|}$$

where $w_j$ denotes the weight of the $j$th index, and

$$z_{ij} = x_{ij} \Big/ \sum_{i=1}^{n} x_{ij} \quad (i = 1,2,\cdots,n;\ j = 1,2,\cdots,p) \qquad (3)$$
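As an illustrative sketch (not code from the paper), the deviation-based index weights above can be computed with NumPy; the sample matrix `X` and the helper name `index_weights` are assumptions made for this example:

```python
import numpy as np

def index_weights(X):
    """Deviation-based index weights: w_j is the total pairwise deviation
    of the normalized j-th index, divided by that deviation summed over
    all p indexes, so the weights sum to one."""
    Z = X / X.sum(axis=0)                      # z_ij = x_ij / sum_i x_ij
    # d[j] = sum_{i,k} |z_ij - z_kj| for each index j
    d = np.abs(Z[:, None, :] - Z[None, :, :]).sum(axis=(0, 1))
    return d / d.sum()

# Second index is constant, so all weight goes to the first index.
X = np.array([[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]])
w = index_weights(X)
```

An index that takes the same value on every sample contributes no deviation and therefore receives zero weight, which matches the intent of the formula: indexes that discriminate between samples are weighted more heavily.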
To enhance the stability of the clustering result, a determinate initial clustering center is desired before the clustering algorithm is performed. In this paper, the relation matrix R is constructed with the included-angle cosine formula (4), and the samples are classified by the threshold-partition method. Although this classification is coarse, it provides a determinate initial clustering center for the improved FKCM clustering algorithm.
$$r_{ij} = \frac{\sum_{k=1}^{p} |x_{ik} x_{jk}|}{\left(\sum_{k=1}^{p} x_{ik}^2\right)^{1/2} \left(\sum_{k=1}^{p} x_{jk}^2\right)^{1/2}} \quad (i, j = 1,2,\cdots,n) \qquad (4)$$

where $r_{ij}$ denotes the similarity between the $i$th and the $j$th samples.
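A minimal sketch of this initialization step (the function names, the grouping strategy, and the threshold value $\lambda$ are assumptions; the paper does not specify its partition procedure in detail):

```python
import numpy as np

def cosine_relation(X):
    """Included-angle cosine: r_ij = sum_k |x_ik x_jk| /
    sqrt(sum_k x_ik^2 * sum_k x_jk^2), giving an n x n relation matrix."""
    num = np.abs(X) @ np.abs(X).T
    norms = np.sqrt((X ** 2).sum(axis=1))
    return num / np.outer(norms, norms)

def threshold_partition(R, lam):
    """Crude lambda-cut grouping: each unlabeled sample starts a group and
    absorbs all later samples whose similarity to it is at least lam.
    Enough to seed determinate initial cluster centers."""
    n = R.shape[0]
    label = -np.ones(n, dtype=int)
    c = 0
    for i in range(n):
        if label[i] < 0:
            label[i] = c
            for j in range(i + 1, n):
                if label[j] < 0 and R[i, j] >= lam:
                    label[j] = c
            c += 1
    return label

X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
R = cosine_relation(X)
labels = threshold_partition(R, 0.95)
```

The mean of each resulting group can then serve as the fixed initial clustering center for the FKCM iteration.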
B. FKCM algorithm
However, FCM clustering analysis is somewhat limited in real-world problems, and nonlinear clustering analysis would be highly desirable. An efficient way to obtain a nonlinear clustering algorithm is to first map the patterns of the input space $\Omega^p$ into some higher-dimensional feature space $\Omega^q$ using a mapping $\varphi(\cdot)$; FCM can then be performed in this feature space. Once the kernel function is chosen, the Euclidean distance between $x_i$ and $x_j$ in the feature space is

$$\hat{d}_{ij}(x_i, x_j) = \left[K(x_i, x_i) - 2K(x_i, x_j) + K(x_j, x_j)\right]^{1/2}, \quad i, j = 1,2,\cdots,n.$$
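As a sketch of this kernel-induced distance (the Gaussian kernel and its width $\sigma$ are assumptions; the text has not yet fixed a kernel here):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """K(a, b) = exp(-||a - b||^2 / (2 sigma^2)), one common kernel choice."""
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def feature_distance(xi, xj, K=gaussian_kernel):
    """Feature-space Euclidean distance using only kernel evaluations:
    d^_ij = [K(xi, xi) - 2 K(xi, xj) + K(xj, xj)]^(1/2)."""
    return np.sqrt(K(xi, xi) - 2 * K(xi, xj) + K(xj, xj))

x = np.array([0.0, 0.0])
y = np.array([3.0, 0.0])
d = feature_distance(x, y)
```

Note that for any kernel with $K(x, x) = 1$, such as the Gaussian, the feature-space distance is bounded by $\sqrt{2}$, however far apart the inputs lie in the original space.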
Let $V$ be the clustering-center matrix in the input space, $V = (v_1, v_2, \cdots, v_c)$, $v_i = (v_{i1}, v_{i2}, \cdots, v_{ip})$, $(i = 1,2,\cdots,c)$. Let $\hat{U}$ be the membership matrix in the feature space, $\hat{U} = (\hat{u}_1, \hat{u}_2, \cdots, \hat{u}_n)$, $\hat{u}_i = (\hat{u}_{i1}, \hat{u}_{i2}, \cdots, \hat{u}_{ic})$, $(i = 1,2,\cdots,n)$. Hence, the objective function of the FKCM clustering algorithm in the feature space is

$$\hat{J}_m(X; \hat{U}, \hat{D}) = \sum_{j=1}^{c}\sum_{i=1}^{n} \hat{u}_{ij}^m \hat{d}_{ij}^2, \quad 2 < c < n \qquad (5)$$
The new clustering-center vectors in the feature space are
$$\hat{v}_j = \varphi(v_j) = \sum_{k=1}^{n} (\hat{u}_{kj})^m \varphi(x_k) \Big/ \sum_{k=1}^{n} (\hat{u}_{kj})^m, \quad j = 1,2,\cdots,c \qquad (6)$$
Unfortunately, the mapping function $\varphi(\cdot)$ appearing in (6) may not be known explicitly, and if the dimension of the feature space $\Omega^q$ is very high or infinite, it is difficult to solve for the objective function by (6). To get around this difficulty, the problem is reformulated to involve only the dot products of the patterns $x_i\ (i = 1,2,\cdots,n)$ in the feature space.
$$K(\hat{v}_j, x_i) = \varphi(v_j) \cdot \varphi(x_i) = \frac{\sum_{k=1}^{n} (\hat{u}_{kj})^m K(x_k, x_i)}{\sum_{k=1}^{n} (\hat{u}_{kj})^m}$$

$$K(\hat{v}_j, \hat{v}_j) = \varphi(v_j) \cdot \varphi(v_j) = \frac{\sum_{k=1}^{n}\sum_{t=1}^{n} (\hat{u}_{kj})^m (\hat{u}_{tj})^m K(x_k, x_t)}{\left(\sum_{k=1}^{n} (\hat{u}_{kj})^m\right)^2} \qquad (7)$$
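The two identities in (7) can be evaluated from the Gram matrix alone. A minimal sketch, assuming the function name `center_kernels` and a fuzzifier default of $m = 2$:

```python
import numpy as np

def center_kernels(K, U, m=2.0):
    """Given the Gram matrix K (n x n) and membership matrix U (n x c),
    return Kvx (c x n) holding K(v^_j, x_i) and Kvv (c,) holding
    K(v^_j, v^_j), using only dot products as in (7)."""
    W = U ** m                      # (u^_kj)^m, shape (n, c)
    s = W.sum(axis=0)               # sum_k (u^_kj)^m per cluster, shape (c,)
    Kvx = (W.T @ K) / s[:, None]    # sum_k W_kj K(x_k, x_i) / s_j
    # sum_{k,t} W_kj W_tj K(x_k, x_t) / s_j^2
    Kvv = np.einsum('kj,tj,kt->j', W, W, K) / s ** 2
    return Kvx, Kvv

# Crisp memberships reduce each center to a single sample, so
# K(v^_j, x_i) collapses back to the corresponding Gram entries.
K = np.array([[1.0, 0.5], [0.5, 1.0]])
U = np.eye(2)
Kvx, Kvv = center_kernels(K, U)
```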
$$\hat{u}_{ij} = \frac{\left(1/\hat{d}_{ij}^2(x_i, \hat{v}_j)\right)^{1/(m-1)}}{\sum_{t=1}^{c} \left(1/\hat{d}_{it}^2(x_i, \hat{v}_t)\right)^{1/(m-1)}} \qquad (8)$$

$$\hat{u}_{ij} = \frac{\left(1/\left(K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right)\right)^{1/(m-1)}}{\sum_{t=1}^{c} \left(1/\left(K(x_i, x_i) - 2K(x_i, \hat{v}_t) + K(\hat{v}_t, \hat{v}_t)\right)\right)^{1/(m-1)}} \qquad (9)$$
When $\hat{d}_{ij}^2(x_i, \hat{v}_j) = 0$, $\hat{u}_{ij} = 1$ and $\hat{u}_{it} = 0$ for $t \in [1, j) \cup (j, c]$, where

$$\hat{d}_{ij}(x_i, \hat{v}_j) = \left[K(x_i, x_i) - 2K(x_i, \hat{v}_j) + K(\hat{v}_j, \hat{v}_j)\right]^{1/2}, \quad i = 1,2,\cdots,n;\ j = 1,2,\cdots,c \qquad (10)$$
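The membership update with its zero-distance special case can be sketched as follows (the function name and the $10^{-12}$ zero tolerance are assumptions for the example):

```python
import numpy as np

def update_memberships(D2, m=2.0):
    """FCM-style update per (8): u^_ij proportional to (1/d^2_ij)^(1/(m-1)),
    normalized over the c clusters; a zero squared distance gives the
    sample full membership in that cluster and zero elsewhere."""
    n, c = D2.shape
    U = np.zeros((n, c))
    for i in range(n):
        zero = D2[i] <= 1e-12
        if zero.any():
            U[i, np.argmax(zero)] = 1.0        # u^_ij = 1, all others 0
        else:
            inv = (1.0 / D2[i]) ** (1.0 / (m - 1))
            U[i] = inv / inv.sum()
    return U

# Equidistant sample -> uniform memberships; zero distance -> crisp.
D2 = np.array([[1.0, 1.0], [0.0, 4.0]])
U = update_memberships(D2)
```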
To acquire the optimal membership matrix $\hat{U}^*$ and the corresponding distance matrix $\hat{D}^*$, the sequence $|\hat{J}^{(l)} - \hat{J}^{(l-1)}|$ must converge, that is, $\lim_{l\to\infty} \hat{J}_m(X; \hat{U}, \hat{D})$ must exist [18]. Hence, the tolerance $\varepsilon$ can be set to an arbitrarily small value; once the initial distance matrix and the initial membership matrix are known, the iterative algorithm proceeds by (5), (7), (8), (9), and (10). When $|\hat{J}^{(l)} - \hat{J}^{(l-1)}| < \varepsilon$, the iteration ceases, and the constrained-optimal membership matrix $\hat{U}^{(l)}$ and the constrained-optimal distance matrix $\hat{D}^{(l)}$ are obtained; finally, the samples are classified according to the maximum-membership principle. When the dot-product operation between each pattern and the index-weight vector is performed in the kernel function of the FKCM clustering algorithm, the improved FKCM clustering algorithm is obtained. Furthermore, the statistic F in (11) can be used to determine the optimal number of classes. The statistic F is a conventional index for evaluating clustering validity, but it does not consider the nonlinear factors of the data set; hence, the KF index, constructed using the kernel function and shown in (12), evaluates clustering validity more effectively.
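The full iteration can be sketched as below. This is an illustration under stated assumptions, not the paper's implementation: it uses a Gaussian kernel with random initial memberships, clips zero distances instead of applying the crisp rule below (10), and omits both the index-weighted kernel of the improved FKCM and the F/KF validity indexes of (11) and (12).

```python
import numpy as np

def fkcm(X, c, m=2.0, sigma=1.0, eps=1e-5, max_iter=100, seed=0):
    """Kernelized fuzzy c-means sketch: iterate (7)-(9) until
    |J^(l) - J^(l-1)| < eps, then label each sample by the
    maximum-membership principle."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))            # Gaussian Gram matrix
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=n)         # random initial memberships
    J_prev = np.inf
    for _ in range(max_iter):
        W = U ** m
        s = W.sum(axis=0)
        Kvx = (W.T @ K) / s[:, None]              # K(v^_j, x_i), eq. (7)
        Kvv = np.einsum('kj,tj,kt->j', W, W, K) / s ** 2
        # Squared feature-space distances, eq. (10); clip tiny negatives.
        D2 = np.maximum(np.diag(K)[:, None] - 2 * Kvx.T + Kvv[None, :], 0)
        inv = (1.0 / np.maximum(D2, 1e-12)) ** (1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)  # eqs. (8)/(9)
        J = ((U ** m) * D2).sum()                 # objective (5)
        if abs(J_prev - J) < eps:                 # |J^(l) - J^(l-1)| < eps
            break
        J_prev = J
    return U.argmax(axis=1), U                    # maximum-membership labels

X = np.vstack([np.zeros((5, 2)), 5.0 * np.ones((5, 2))])
labels, U = fkcm(X, c=2)
```

On two well-separated groups the iteration drives the memberships toward a crisp partition, after which the objective stops changing and the loop terminates.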