
Automatic Classification Using Neural Networks

Gamal A. M. Al-Shawadfi and Hindi A. Al-Hindi
College of Business and Economics,
King Saud University,
Al-Qasseem Branch, Al-Melaida,
Saudi Arabia

ABSTRACT

This paper proposes an artificial neural network (ANN) approach to perform linear and nonlinear classification of objects into several classes. The theoretical and practical aspects of the proposed approach are introduced, and its validity is evaluated by the rate of correct classifications. A Matlab macro program was written for automatic classification using an artificial neural network for linear and nonlinear classification problems. The network is designed, trained and tested with different sample sizes, and the results are compared to those obtained from the Fisher discriminant function. In contrast with classical classification procedures, ANNs do not require any prior assumptions about the type of data, the distributions or the variance-covariance matrices. The numerical results illustrate the capabilities of ANNs in solving linear and nonlinear classification problems.

Keywords: Automatic Classification, Artificial Neural Networks, Fisher Discriminant Function.

1. INTRODUCTION

Classification is a multivariate technique concerned with allocating new objects (or observations) into previously defined groups (populations). A distinction should be made between 'classification' and 'discriminant analysis'. Discriminant analysis is a multivariate technique concerned with separating distinct sets of objects; it is often employed on a one-time basis in order to investigate observed differences when causal relationships are not well understood (Johnson and Wichern, 1992). Classification, on the other hand, is the problem in which an object is assigned to one of several classes, and it usually requires supervised learning methods in which objects are assigned to known groups (Fausett, 1994).

There are many studies related to classification and neural networks. Wernecke et al. (1995) discussed the validation of classification trees. Arminger and Enache (1995) illustrated the relation between statistical models and artificial neural networks. More details on the fundamentals of neural networks, architectures, algorithms, applications and classification can be found in Fausett (1994). The classical classification approaches are found in textbooks; see, for example, Johnson and Wichern (1992) and Stevens (1992). Some related neural network concepts and applications are found, for example, in Elman (1993), Hertz et al. (1991), Nerrand et al. (1993) and Tsoi and Tan (1997).

In recent years, the use of artificial neural networks has increased for solving a wide range of problems, including pattern recognition, classification and functional approximation. An artificial neural network consists of a number of simple processing units, or neurons, and connections. Each processing unit is associated with two major functions: a summation function and an activation function. The summation function sums the inputs coming to the neuron from other neurons or from the outside environment. The activation function determines the activation level of the neuron based on the inputs it receives. Connections represent the links between nodes, and each connection is associated with a weight that reflects the strength of the relationship between the connected nodes. Training the network is the process of developing a function that maps input vectors to output vectors with minimum error, using a set of training examples that includes input and output data pairs.

For many practical cases, a feed-forward neural network with three layers is appropriate, because it has been proven to be an efficient system for representing nonlinear relationships between a set of input and output vectors to a high degree of accuracy. The first layer is the input layer, whose nodes receive the inputs from outside. The second layer is the hidden layer, whose nodes receive inputs from the input neurons and propagate them to the output nodes. The third layer is the output layer, whose nodes determine the outputs of the neural network.

In this paper, we propose a neural network approach to perform automatic classification and compare its performance with the classical classification method. With the classical method, all groups are assumed to have the same variance-covariance matrix and the distributions are assumed normal. In the absence of these two assumptions, linear or quadratic classification rules are inadequate (Johnson and Wichern, 1992). The neural network classification technique does not impose such assumptions.

The performance of the two classification techniques is illustrated with three data sets that use two variables in classifying objects into three categories. The first data set describes the case of linear classification. The second and third describe cases of nonlinear classification with varying degrees of nonlinearity.

The rest of the paper is organized as follows. The second section presents the proposed automatic classification method. The third section presents the simulated classification problems used to test the proposed approach. The fourth section concludes the paper.

2. PROPOSED ANN AUTOMATIC CLASSIFICATION METHOD

Let us start with a description of the kind of data for which the proposed classification method is appropriate. The data consist of objects and their corresponding descriptions. The objects may be quantitative, for example students' cumulative rates, or qualitative, for example documents, keywords, handwritten characters and species. Qualitative data may be replaced with quantitative data according to a suitable coding scheme. The classification method is then used to summarize and simplify the data.


Classification rules are developed from training samples. Usually, the data are separated into two subsets: the first, called the training set, is used to fit a suitable discriminant function, and the second, called the test set, is used to check the validity of the discriminant function for classification. In the ANN model, the data need to be normalized, usually into the range [0, 1], before training. In this research, the following equation is used to normalize the data:

x_i = 0.8 \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} + 0.1 \qquad (1)

where y_max and y_min are the maximum and minimum values of all observations, respectively.

Training in neural networks is the process of adjusting the weights of the network connections in order to obtain the desired outputs z_i given a set of inputs (x_i1, x_i2, ..., x_im). The discriminant function is:

z_i = f_i(x_{i1}, x_{i2}, \ldots, x_{im}) + u_i \qquad (2)

where z_i represents the classes (outputs), i = 1, 2, ..., k; f_i(x_{i1}, x_{i2}, ..., x_{im}) is a linear or nonlinear function in the input variables x_{i1}, x_{i2}, ..., x_{im}; and u_i represents the error between the input and the output values. Equation (2) represents a class of artificial neural networks which may be interpreted as a complex multivariate statistical model for the approximation of the unknown expectation function of a random variable z given the explanatory variables x_i, i = 1, 2, ..., m. This class of models represents a discriminant function.

The training process includes two stages: a forward and a backward pass. In the forward pass, the input neurons receive input examples and pass them to the hidden layer. Each neuron calculates its activation level using the summation function to sum the weighted inputs. This sum is then used by the activation function to determine the output of the neuron. The argument of the activation function is a linear combination of parameters and inputs. In practice, many types of transfer functions may be used, including the linear combination, the normal distribution, the logistic distribution, the hyperbolic tangent, the indicator function and threshold functions. The activation function selected here is a sigmoidal function. Sigmoid activation functions are easy to differentiate and make computation of the gradient easy. The sigmoid function has the form:

f_j(x_1, \ldots, x_n) = \frac{1}{1 + e^{-\left( \sum_{i=1}^{n} w_{ij} x_i + b_j \right)}} \qquad (3)

where f_j(x_1, ..., x_n) is the output, the x_i are the input observations, the w_{ij} are weights, and b_j is the bias term. The values of w and b should be initialized before the training phase. In the Matlab package, the function INITFF is used to determine initial values for w and b.

The outputs of the hidden neurons are used as inputs to the neurons in the output layer. Neurons in the output layer use the summation and activation functions to determine the outputs of the neural network. The generated outputs are then compared to the actual outputs, and the resulting errors are propagated back through the network in the backward pass of the training algorithm.

The weights w_{jk} that connect neurons in the hidden layer with neurons in the output layer are updated first. Then, the weights w_{ij} that connect neurons in the input layer with neurons in the hidden layer are updated. There are two strategies that can be used to update the connection weights. The first is to calculate the error for each training example and propagate this error back to update the weights. The second is to calculate the errors for all training examples and use their sum to update the weights. The iterative process of the back-propagation training algorithm continues until the error function reaches a predetermined level (say 0.001), or until the maximum number of iterations is reached. The error function is:

MSE = \frac{1}{n} \sum_{j} \left( c_j - c_j(NNC) \right)^2 \qquad (4)

where MSE is the mean of the sum of squares of the differences between the actual output c_j and the NN estimated output c_j(NNC).

Due to their capabilities, neural networks may be used to approximate continuous mapping functions. The following theorem of Kolmogorov shows the existence of such a mapping neural network.

Theorem: Any continuous function f(x_1, ..., x_n) of several variables defined on I^n, where n >= 2 and I = [0, 1], can be represented in the form:

f(x) = \sum_{j=1}^{2n+1} b_j\left( \sum_{i=1}^{n} w_{ij}(x_i) \right) \qquad (5)

where the b_j and w_{ij} are continuous functions of one variable and the w_{ij} are monotonic functions that do not depend on f.

The theorem states that “a feed-forward neural network with three layers of neurons (input, hidden and output units) can represent any continuous function exactly”; see Fausett (1994), pp. 328-329.

A good classification procedure should result in few misclassifications. The validity of the proposed NN classification method may be checked using the apparent error rate (APER). The apparent error rate is the fraction of observations in the test set that are misclassified by the discriminant function. It is calculated according to the equation:

APER = \frac{\sum_{i=1}^{k} n_i}{\sum_{i=1}^{k} N_i} \qquad (6)

where N_i is the number of observations in group i, i = 1, 2, ..., k, and n_i is the number of items from group i misclassified into a group other than i. Alternatively, we can measure the validity of the neural network classification procedure by calculating the rate of correct classification from the following equation:

CCR = 1 - \frac{\sum_{i=1}^{k} n_i}{\sum_{i=1}^{k} N_i} \qquad (7)

The correct classification rate (CCR) represents the fraction of items in the test set that are correctly classified. This measure does not depend on the form of the parent population, and it can be calculated for any classification procedure.

To do automatic classification of data in several classes, we designed a Matlab macro program, presented in the appendix. In the program we use essentially three functions from the neural network toolbox: INITFF, which initializes a feed-forward network of up to three layers; TRAINBPX, which trains a feed-forward network with fast back-propagation and can be invoked with 1, 2, or 3 sets of weights; and SIMUFF, which simulates a feed-forward network with up to 3 layers.


In summary, the steps of the proposed NN classification approach are:

1. Designing the network: Use equation (1) to scale the data into the (0, 1) interval, select a suitable discriminant function, determine the number of training epochs and the error level, and select initial values for the weights and biases.

2. Training the network: Split the data into two groups, and use the first group to train the network and estimate the parameters w_i, b_i of the discriminant function.

3. Checking the validity: Check the validity of the proposed discriminant function using the second group of data with equation (7), which gives the rate of correct classifications.

4. Classifying new observations: If step 3 is satisfied, use the proposed discriminant function to classify new observations into suitable groups; otherwise repeat the training process with another function and other weights to obtain a more efficient discriminant function.

The following sets of examples are used to compare the performance of neural networks with classical classification.

3. SOME EXAMPLES

Three different groups of data, with different relationships between the variables, are used in the comparison. The first group shows classification in the case of a linear relationship between inputs and outputs. The second and third groups show classification in the case of nonlinear relationships between the inputs and the outputs. In all examples, both the proposed neural network and the Fisher classification approaches are used in solving the problem of classifying objects into three categories according to some inputs. The inputs are two variables related linearly or nonlinearly to the output variable. For each group, different sample sizes are used to show the effect of increasing the sample size on classification accuracy. The selected sample sizes are 10, 30, 50, 70, 90, 110, 130 and 150. The validity of the classification results is tested by the rate of correct classifications (CCR).

The four steps of the proposed neural network classification mentioned in section 2 were applied here: designing the network, training the network, checking the validity, and classifying new observations.

3.1 The Case of Linear Classification

In this case, the output variable (or object) is classified into three categories according to two variables (inputs). The data samples are generated according to the following linear equation:

y_3 = 0.5 y_1 + 0.5 y_2 \qquad (8)

where y_1 and y_2 are the input variables and y_3 is the output. Our aim is to use the proposed NN procedure to find a function that allocates an observation of y_3 into class 1, 2 or 3 according to the values of y_1 and y_2.

In the Fisher classification procedure, sample discriminants are used to allocate an observation x into population k if:

\sum_{j=1}^{r} (y_j - \bar{y}_{kj})^2 = \sum_{j=1}^{r} \left( \hat{\lambda}_j' (\hat{x} - \bar{x}_k) \right)^2 \le \sum_{j=1}^{r} \left( \hat{\lambda}_j' (\hat{x} - \bar{x}_i) \right)^2 \qquad (9)


for all i ≠ k, where the λ_j are the eigenvectors of W^{-1} B_0 and r ≤ s.
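A compact sketch of this allocation rule in Matlab, following the appendix program (W is the within-groups matrix, B0 the between-groups matrix, xbar the k-by-2 matrix of group means, and xnew the new observation as a row vector; all names are illustrative):

[V,D] = eig(inv(W)*B0);              % eigenvectors lambda_j of W^-1*B0
sx = xnew*V;                         % discriminant scores of the new observation
sg = xbar*V;                         % discriminant scores of the group means
k = size(xbar,1);
d = sum((ones(k,1)*sx - sg).^2, 2);  % squared distances, as in equation (9)
[dmin,class] = min(d);               % allocate to the nearest group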

Table 1: The NN and classical classifications of objects into 3 classes, y_3 = 0.5 y_1 + 0.5 y_2

N        NN CCR    F. CCR
10       0.9000    0.8000
30       0.8333    0.6667
50       0.7400    0.7600
70       0.9000    0.7571
90       0.7778    0.9333
110      0.9727    1.0000
130      0.8385    0.9923
150      0.8600    0.9667
average  0.8528    0.8595

Table 1 shows the rate of correct classification (CCR) for each method. The first column gives the sample size, the second column the CCR for the neural network approach, and the third column the CCR for the Fisher classification method. As shown in the table, the rates of correct classification for the two methods are very close on average: the average rate of correct classification is 85.29% for the neural network approach and 85.95% for the classical classification approach.

3.2 The Case of Nonlinear Classification

For this case, data samples were generated to show the capabilities of the proposed neural network approach in solving nonlinear classification problems. Each new observation is classified into one of three categories according to two input variables. The data were generated according to the equation:

y_3 = 0.5 y_1 + 0.5 y_2^2 \qquad (10)

Table 2 shows the NN and Fisher classifications of objects into 3 classes for this nonlinear case. The overall performance of the neural network nonlinear classification approach is very good, in the sense that the average rate of correct classifications is 84.74%. Across the sample sizes, these rates fluctuate between 63.33% and 100%. On the other hand, the rates of the Fisher classification approach fluctuate between 60% and 84.29%.

Table 2: The NN and classical classification of objects into 3 classes, y_3 = 0.5 y_1 + 0.5 y_2^2

N        NN CCR    F. CCR
10       1.0000    0.7000
30       0.6333    0.6000
50       0.8400    0.8200
70       0.7000    0.8429
90       0.9556    0.7556
110      0.9200    0.7200
130      0.8700    0.8300
150      0.8600    0.8100
average  0.8474    0.7598

The other set of data samples for the nonlinear classification was generated according to the following equation:

y_3 = 0.5 y_1 + 0.5 y_2^3 \qquad (11)

From Table 3, we find that the performance of the NN nonlinear classification approach is extremely high: the average correct classification rate is 97.5%, and the rates fluctuate between 80% and 100%. On the other hand, the rates of the classical classification approach fluctuate between 72.86% and 96.67%, and the average rate is only 81.4%.


Table 3: The NN and classical classifications of objects into 3 classes, y_3 = 0.5 y_1 + 0.5 y_2^3

N        NN CCR    F. CCR
10       0.8000    0.8000
30       1.0000    0.9667
50       1.0000    0.8200
70       1.0000    0.7286
90       1.0000    0.7667
110      1.0000    0.7600
130      1.0000    0.8200
150      1.0000    0.8500
average  0.9750    0.8140

4. CONCLUSION

This paper proposes a neural network method for nonparametric automatic classification. The proposed method may be used to classify a sample of n objects into k classes according to a linear or nonlinear discriminant function. The discriminant function is obtained from a training process of the neural network; the selected discriminant function is the one which minimizes the difference between the actual and the estimated class values. The validity of the proposed discriminant function is measured by the rate of correct classifications (CCR), where the discriminant function is acceptable if the CCR exceeds some threshold (say 50%).

Classification using NNs has several advantages compared to the classical methods. One advantage is the possibility of constructing a nonlinear discriminant function for solving sophisticated classification problems. Using the proposed approach, very complex nonlinear approximation functions can be built from simple components; the complexity of the discriminant function depends on the number of layers as well as the number of units in the hidden layer (hidden units). Another advantage is that the method does not put any constraints on the population distribution or require the equivalence of the population variance-covariance matrices. The numerical examples support the suitability of the proposed approach for data classification.

The findings of the study draw several distinctions between neural network and classical classification: i) for linear classification, the performance of the two methods is equivalent; ii) for nonlinear classification, neural networks outperform the classical classification method; and iii) the relative performance of the neural network increases as the level of nonlinearity increases.

REFERENCES

Arminger, G. and Enache, D. (1995), “Statistical Models and Artificial Neural Networks”, in Proceedings of the 19th Annual Conference of the Gesellschaft fur Klassifikation e.V., University of Basel, H.-H. Bock and W. Polasek (eds.), Springer: Germany.

Cox, D. R. and Hinkley, D. V. (1979), Theoretical Statistics, Chapman and Hall: London, U.K.

Devillers, J. and Karcher, W. (1991), Applied Multivariate Analysis in SAR and Environmental Studies, Kluwer Academic Publishers: London, U.K.

Elman, J. L. (1993), “Learning and Development in Neural Networks: the Importance of Starting Small”, Cognition, 48, pp. 71-99.

Fausett, Laurene (1994), Fundamentals of Neural Networks: Architectures, Algorithms and Applications, Prentice Hall Inc.: New York.

Hertz, J. et al. (1991), Introduction to the Theory of Neural Computation, Lecture Notes of the Santa Fe Institute, Vol. 1, Addison-Wesley: Reading, MA.

Hildebrand, David K. and Ott, Lyman (1991), Statistical Thinking for Managers, Fourth Edition, Wadsworth Publishing Company: London, England.

Iman, Ronald L. and Conover, W. J. (1983), A Modern Approach to Statistics, John Wiley & Sons: New York, U.S.A.

Johnson, Richard A. and Wichern, Dean W. (1992), Applied Multivariate Statistical Analysis, Third Edition, Prentice Hall: Englewood Cliffs, New Jersey, U.S.A.

Matlab Reference Manual (1997), Version 5.3, Mathworks, U.S.A.

Naftaly, U.; Intrator, N. and Horn, D. (1997), “Optimal Ensemble Averaging of Neural Networks”, Network: Computation in Neural Systems, 8, pp. 283-296.

Nerrand, O.; Roussel-Ragot, P.; Personnaz, L.; Dreyfus, G. and Marcos, S. (1993), “Neural Networks and Nonlinear Adaptive Filtering: Unifying Concepts and New Algorithms”, Neural Computation, 5, pp. 165-199.

Sahay, Surottam N. et al. (1996), “Software Review”, The Journal of the Royal Economic Society, Vol. 106, Issue 439, pp. 1820-1829.

Stevens, James (1992), Applied Multivariate Statistics for the Social Sciences, Second Edition, Lawrence Erlbaum Associates Publishers: Hillsdale, New Jersey, U.S.A.

Tsoi, A. C. and Tan, S. (1997), “Recurrent Neural Networks: A Constructive Algorithm and its Properties”, Neurocomputing, 15, pp. 309-326.

Wernecke, K. D.; Possinger, K. and Kalb, G. (1995), “On the Validation of Classification Trees”, in Proceedings of the 19th Annual Conference of the Gesellschaft fur Klassifikation e.V., University of Basel, March 8-10, 1995, Bock and Polasek (eds.), Springer: U.S.A.

Zar, Jerrold H. (1984), Biostatistical Analysis, Second Edition, Prentice-Hall International, Inc.: London, U.K.



APPENDIX

% ... Matlab Macro Program for Automatic Classification of Data ...
% ... file name: train1394 ... output file: oo1394.mat ...
disp '... Examples of NN and Fisher Classification'
disp '... for Data Generated From Binomial Distribution'
diary ('oo1394')
clc;clear all;an=320;mm=an/40;aa(mm,4)=0;nn=15;
r1=binornd(nn,.5,an,1);r2=binornd(nn,.3,an,1);r3=.5*r1+.5*(r2);
for j = 1:mm
  j
  n = 20+40*(j-1);
  n1=0;n2=0;n3=0;nn1=0;nn2=0;nn3=0;
  c1 = round(n/2);
  if n < 100
    c2 = n-c1;
  else
    c2 = 50;
  end
  m = 5; k = 3; d(k)=0;
  ss(k,1)=0; ss0(k,1)=0; ss1(k,1)=0; st0(4)=0; st(k)=0; st1(4)=0;
  x1(k)=0;x2(k)=0;y1(c1)=0;y2(c1)=0;z3(c1)=0;
  r03=r3(1:c1); z1=r1(1:c1); z2=r2(1:c1);
  %1 ... INITIALIZATION OF DATA ...
  r = (max(r03) - min(r03))/3;
  r0 = min(r03)+r;
  for i = 1:c1
    if r03(i) < r0
      z3(i) = 1;
      n1 = n1+1;
      x1(1)=x1(1)+z1(i); x2(1)=x2(1)+z2(i);
    elseif r03(i) > r0+r
      z3(i) = 3;
      n3 = n3+1;
      x1(3)=x1(3)+z1(i); x2(3)=x2(3)+z2(i);
    else
      z3(i) = 2;
      n2 = n2+1;
      x1(2)=x1(2)+z1(i); x2(2)=x2(2)+z2(i);
    end
  end
  % ... Fisher method for classification ...
  x01 = ones(1,c1)*z1/c1; x02 = ones(1,c1)*z2/c1;
  x11 = x1(1)/n1; x21 = x1(2)/n2; x31 = x1(3)/n3;
  x12 = x2(1)/n1; x22 = x2(2)/n2; x32 = x2(3)/n3;
  xxx = [x11 x12; x21 x22; x31 x32];
  xx1 = [x11-x01; x21-x01; x31-x01];
  xx2 = [x12-x02; x22-x02; x32-x02];
  for i = 1:c1
    if r03(i) < r0
      y1(i) = z1(i)-x11;
      y2(i) = z2(i)-x12;
    elseif r03(i) > r0+r
      y1(i) = z1(i)-x31;
      y2(i) = z2(i)-x32;
    else
      y1(i) = z1(i)-x21;
      y2(i) = z2(i)-x22;
    end
  end
  x = [xx1 xx2];
  B0 = transpose(x)*x;
  y = [y1; y2];
  yy = y*transpose(y);
  ww = inv(yy);
  ww1 = ww*B0;
  [cv a] = eig(ww1);
  v01 = cv(:,1); v02 = cv(:,2);
  cc1 = transpose(v01)*yy*v01;
  cc2 = transpose(v02)*yy*v02;
  c01 = sqrt((n-3)/cc1);
  c02 = sqrt((n-3)/cc2);
  cv1 = v01*c01; cv2 = v02*c02;
  v = [cv1 cv2];
  % ... Proposed NN method for classification ...
  z3 = transpose(z3);
  z = [z1 z2 z3];
  z0 = (0.8*(z-ones(c1,1)*min(z))./(ones(c1,1)*(max(z)-min(z))))+0.1;
  z01 = transpose(z0(:,1:2)); z02 = transpose(z0(:,3));
  k1 = 'purelin'; k2 = 'logsig'; k3 = 'tansig';
  [w1,b1,w2,b2] = initff(z01,m,k2,z02,k2);
  [w1,b1,w2,b2,TE,TR] = trainbpx(w1,b1,k2,w2,b2,k2,z01,z02,[50,5000,.001,.001]);
  %2 ... comparison and testing phase ...
  zz1 = r1((c1+1):c1+c2,:); zz2 = r2((c1+1):c1+c2,:); rr03 = r3((c1+1):c1+c2,:);
  zzz = [zz1 zz2];
  v1 = zzz*v;
  v2 = xxx*v;
  %1 ... INITIALIZATION OF DATA ...
  rr = (max(rr03) - min(rr03))/3;
  rr0 = min(rr03)+rr;
  for i = 1:c2
    if rr03(i) < rr0
      zz3(i) = 1; nn1 = nn1+1;
    elseif rr03(i) > rr0+rr
      zz3(i) = 3; nn3 = nn3+1;
    else
      zz3(i) = 2; nn2 = nn2+1;
    end
  end
  zz3 = transpose(zz3);
  zz = [zz1 zz2 zz3];
  zz0 = (0.8*(zz-ones(c2,1)*min(zz))./(ones(c2,1)*(max(zz)-min(zz))))+0.1;
  zz01 = transpose(zz0(:,1:2));
  zz02 = transpose(zz0(:,3));
  for i = 1:c2
    d(1) = (v1(i,:)-v2(1,:))*transpose((v1(i,:)-v2(1,:)));
    d(2) = (v1(i,:)-v2(2,:))*transpose((v1(i,:)-v2(2,:)));
    d(3) = (v1(i,:)-v2(3,:))*transpose((v1(i,:)-v2(3,:)));
    [d1,dd1] = min(d);
    yf(i) = dd1;
    if dd1 < 1.5 & zz3(i)==1; st1(1)=st1(1)+1;
    elseif dd1 >= 1.5 & dd1 < 2.5 & zz3(i)==2; st1(2)=st1(2)+1;
    elseif dd1 >= 2.5 & zz3(i)==3; st1(3)=st1(3)+1;
    else
      st1(4)=st1(4)+1;
    end
    s = simuff(zz01(:,i),w1,b1,k2,w2,b2,k2);
    s1 = (((s-0.1)*(max(zz3)-min(zz3)))/0.8) + min(zz3);
    y0(i) = s1;
    if s1 < 1.5 & zz3(i)==1; st0(1)=st0(1)+1;
    elseif s1 >= 1.5 & s1 < 2.5 & zz3(i)==2; st0(2)=st0(2)+1;
    elseif s1 >= 2.5 & zz3(i)==3; st0(3)=st0(3)+1;
    else
      st0(4)=st0(4)+1;
    end
  end
  ss1 = (st0(1)+st0(2)+st0(3))/c2;
  ss2 = (st1(1)+st1(2)+st1(3))/c2;
  %3 ... Results ...
  aa(j,:) = [j c1 ss1 ss2];
  % [zz3 transpose(yf) transpose(y0)]
  clear b*; clear B*; clear c*; clear d*; clear n*;
  clear s*; clear v*; clear w*; clear x*; clear y*; clear z*;
end
disp ' HIT RATE RESULTS: Fisher class. & Neural class.'
aa
a1 = mean(aa)
save o1394;
diary off
