Rule Extraction from Support Vector Machine - Department of ...

Synopsis 

of the Ph.D. thesis on 

Rule Extraction from Support 

Vector Machine 

Submitted by 

Mohammad Abdul Haque Farquad 

Reg. No: 04MCPC03 

for the degree of 

Doctor of Philosophy 

Under the Guidance of 

Prof. S. Bapi Raju, 

University of Hyderabad, Hyderabad 

Dr. V. Ravi, 

IDRBT, Hyderabad 

Submitted to the 

Department of Computer and Information Sciences 

University of Hyderabad 

Hyderabad, Andhra Pradesh, India 


Abstract 

Although Support Vector Machines have been used to develop highly accurate 

classification and regression models in various real-world problem domains, the most 

significant barrier is that they generate models that are difficult to understand. The procedure

to convert these opaque models into transparent models is called rule extraction. This thesis 

investigates the task of extracting comprehensible models from trained SVMs, thereby 

alleviating this limitation. The primary contribution of the thesis is the proposal of various 

hybrid algorithms to overcome the significant limitations of SVM by taking a novel approach 

to the task of extracting comprehensible models. This thesis investigates various ways to 

extract the knowledge learnt by SVM during training. The basic contribution of the thesis is 

to extract rules using SVM and from SVM. During rule extraction using SVM, SVM is used 

as a pre-processor only, where only support vectors are extracted resulting in Case-SA 

dataset. During rule extraction from SVM, the trained SVM is used to predict the support vector

instances and the training instances, giving two further variants, Case-SP

and Case-P, respectively. Hence, the modified data is a replica of the knowledge learnt by

SVM during training. 

This thesis also investigates the efficiency of our proposed rule extraction approach in 

solving the Bankruptcy Prediction in Banks problem. Bankruptcy is a legally declared inability

or impairment of ability of an organisation to pay its creditors. Bankruptcy prediction in banks and corporate

firms is one of the most researched areas in the field of statistics and machine learning. Bank

management would be interested in the comprehensibility of the algorithms used for 

predictions. We extracted fuzzy rules for bankruptcy prediction problems using fuzzy rule 

based systems and the efficiency of the fuzzy rules is then compared with the rules extracted 

using Decision Tree. Further, this thesis investigates the efficiency of rules extracted using 

our proposed approaches to solve real time data mining problems. In real time data mining 

applications, either almost all or more than 90% of the instances belong to one class, while a 

very few instances belong to the other class which is usually the more important class. In that 

sense, the datasets are termed as unbalanced. The class imbalance problem has been an 

evolving topic of research in data mining. It is observed from the literature that machine 

learning techniques tend to be biased towards majority class, thus producing poor prediction 

accuracy over the minority class. We proposed a rule extraction approach to extract rules for 

solving these problems. Furthermore, this thesis presents, for the first time, a rule extraction

approach for solving regression problems. Adaptive Network based Fuzzy Inference System, Dynamic

Evolving Neuro-Fuzzy Inference System and Classification and Regression Tree are employed for

rule generation. Later, we propose modifications to the Active Learning Based Approach (Martens

et al., 2009), termed mALBA, in which extra instances are generated using various

distributions such as Uniform, Normal and Logistic. Data mining problems such as churn

prediction in bank credit card customers and fraud detection in insurance are solved using

mALBA.

1. Introduction 

Artificial neural networks (ANNs) and SVMs are amongst the most successful machine 

learning techniques used in the area of data mining. But, they produce black box models that 

are difficult to understand for the end user. These models do not explicitly tell the end user 

the knowledge learnt by them during the training phase. Predictive accuracy and the


comprehensibility are two main driving factors to evaluate any learning system. It is observed 

that the learning method which constructs the model with the best predictive accuracy is not 

necessarily the method that produces the most comprehensible model. This thesis explores

the following question: can we take the incomprehensible model produced by SVM, and 

closely approximate it in a language that better facilitates comprehensibility? 

1.1 Motivation 

The process of converting the opaque models (SVM in our research) into transparent 

models is often called rule extraction. Using the extracted rules, one can better understand

how a prediction is made. Rule extraction from SVMs follows in the footsteps

of the earlier effort to obtain human-comprehensible rules from ANNs in order to explain the

knowledge learnt by ANN during training. Much attention has been paid during the last decades

to finding effective ways of extracting rules from ANNs, whereas very little work has been reported

towards representing the knowledge learnt by SVM during training. 

1.2 Significance of rule extraction 

Andrews et al. (1995) presented the motivation behind rule extraction from neural 

networks. A brief overview of their study helps establish the aim and significance of rule

extraction from SVM techniques.

Extracted rules provide the user with an explanation capability for the opaque model from

which they are extracted. Gallant (1988) reported that rule extraction enabled a novice

user to gain more insight into the problem at hand. Davis et al. (1977) and Gilbert

(1989) argue that even limited explanation can positively influence the system

acceptance by the user.

Rule extraction procedures enable the transparency of the internal states of a system. 

Transparency means that internal states of the machine learning system are both 

accessible and can be interpreted unambiguously. Such capability is mandatory for

safety-critical applications such as air traffic control, operation of power plants and

medical applications.

Rule extraction improves the generalisation ability of the model. It is difficult to

determine if and when generalisation fails for specific cases even with evaluation

methods such as cross validation. By expressing learned knowledge as a set of rules, an

experienced user can anticipate or predict a generalisation failure. 

A learning system (i.e. rule extraction) might discover salient features in the input 

data whose importance was not previously recognised and new scientific theories can 

be induced (Craven and Shavlik, 1994). 

1.3 Rule Quality 

The quality of the extracted rules is a key measure of the success of the rule extraction 

algorithm. Four rule quality criteria were suggested for rule extraction algorithms (Andrews et

al. 1995; Tickle et al. 1998). They are rule accuracy, fidelity, consistency and 

comprehensibility. In this context, a rule set is considered to be accurate if it can correctly 

classify previously unseen examples. 

\[
\text{Accuracy} = \frac{\#\text{ of test patterns correctly classified by rules}}{\text{Total number of patterns on test data}} \times 100
\]

Similarly a rule set is considered to display a high level of fidelity if it can mimic the 

behaviour of the machine learning technique from which it was extracted. 

\[
\text{Fidelity} = \frac{\#\text{ of patterns where classification of rules agrees with the classification of SVM}}{\text{Total number of patterns on data}}
\]

An extracted rule set is deemed to be consistent if, under different training sessions, the

machine learning technique generates rule sets that produce the same classifications of

unseen examples. Finally the comprehensibility of a rule set is determined by measuring the 

size of the rule set (in terms of number of rules) and the number of antecedents per rule. 
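
The accuracy, fidelity and comprehensibility measures described above can be sketched as follows. This is only an illustrative sketch: the function names and the representation of a rule as a list of antecedent strings are our assumptions for the example, and fidelity is returned as a fraction, following the formula as printed.

```python
from typing import List, Sequence, Tuple


def accuracy(rule_preds: Sequence, actual: Sequence) -> float:
    """Percentage of test patterns that the rules classify correctly."""
    if len(rule_preds) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for r, a in zip(rule_preds, actual) if r == a)
    return 100.0 * correct / len(actual)


def fidelity(rule_preds: Sequence, svm_preds: Sequence) -> float:
    """Fraction of patterns on which the rules agree with the SVM."""
    if len(rule_preds) != len(svm_preds) or not svm_preds:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum(1 for r, s in zip(rule_preds, svm_preds) if r == s)
    return agree / len(svm_preds)


def comprehensibility(rules: List[List[str]]) -> Tuple[int, float]:
    """Number of rules and mean number of antecedents per rule."""
    n_rules = len(rules)
    mean_len = sum(len(r) for r in rules) / n_rules if n_rules else 0.0
    return n_rules, mean_len


# Toy labels, made up purely for illustration.
actual = [1, 0, 1, 1, 0]
svm_preds = [1, 0, 1, 0, 0]
rule_preds = [1, 0, 0, 0, 0]

print(accuracy(rule_preds, actual))      # rules vs. true labels
print(fidelity(rule_preds, svm_preds))   # rules vs. SVM predictions
```

Note that accuracy compares the rules with the true labels, whereas fidelity compares them with the SVM, so a rule set can have high fidelity and still inherit the SVM's mistakes.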

1.4 Experimental Setup 

Empirical analysis in this thesis is carried out in a slightly different fashion. We first

divided the dataset in an 80:20 ratio. The 20% portion is named the validation set and set aside

for later use. Then 10-fold cross validation is performed on the remaining 80% of the data for training

and extracting rules. Later, the efficiency of the rules is evaluated against the validation set.

Figure 1 presents the experimental setup followed throughout the research work presented in

this thesis. Within this class of research, this experimental setup is unique and is also

my contribution.

[Figure 1 (diagram): the full data set (100%) is split into 80% of the data for 10-fold cross validation (folds 1, 2, 3, ..., 10) and a 20% validation set. Rules extracted during 10-fold cross validation are later tested against the validation set.]

Figure 1: Experimental Setup Followed in this Thesis
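
The 80:20 hold-out followed by 10-fold cross validation can be sketched with index lists as below. This is a minimal sketch under our own assumptions: the helper names `holdout_split` and `k_folds` are hypothetical, the split is a plain random shuffle (the synopsis does not say whether stratification was used), and 150 is just an example dataset size.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(n: int, holdout: float = 0.2, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and set aside a `holdout` share as the validation set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val = int(round(n * holdout))
    return idx[: n - n_val], idx[n - n_val:]


def k_folds(indices: Sequence[int], k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Partition `indices` into k folds and return (train, test) index pairs."""
    folds = [list(indices[i::k]) for i in range(k)]
    pairs = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        pairs.append((train, test))
    return pairs


# Example: 150 instances -> 120 for cross validation, 30 kept for validation.
cv_part, validation = holdout_split(150)
splits = k_folds(cv_part, k=10)

for train_idx, test_idx in splits:
    # Train the SVM and extract rules on train_idx, check them on test_idx;
    # the rules from each fold are later evaluated on `validation`.
    pass
```

The validation indices never enter any fold, which is the point of the setup: rules extracted during cross validation are judged on data that played no part in training.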

2. Research Objective 

In this thesis, I present and evaluate novel algorithms for the task of extracting 

comprehensible descriptions from SVM. The hypothesis advanced by this research is that it is 

possible to develop algorithms for extracting symbolic descriptions from trained SVMs that: 

(i) Produce more comprehensible, high-fidelity descriptions of trained SVMs using 

fuzzy logic approaches. 

(ii) Apply various intelligent techniques with explanation capability viz.,

FRBS, CART, ANFIS, DENFIS and NBTree for rule generation.

(iii) Extend rule extraction to regression problems as well.

(iv) Scale to medium scale and unbalanced datasets.

Applications tested for solving classification problems include benchmark datasets viz.,

Iris, Wine and WBC; extended to Bankruptcy Prediction in Banks using Spanish, Turkish, US 


and UK banks data; and Analytical CRM applications viz., Churn Prediction in Bank Credit 

Card Customers and Insurance Fraud Detection. 

Regression datasets analysed during the research study in this thesis include Auto

MPG, Body Fat, Boston Housing, Forest Fires and Pollution. 

3. Rule Extraction from SVM 

Translucency refers to the extent to which the details of the internal structure of the underlying

model (an ANN in the original taxonomy) are utilized by the rule extraction algorithm. Based on the translucency criterion, rule

extraction techniques are classified into two major categories: decompositional and

pedagogical. A third category is eclectic (i.e. hybrid), which incorporates the elements of both

the decompositional and pedagogical approaches. 

3.1 Decompositional Approach 

A decompositional approach is closely intertwined with internal workings of SVM and 

its constructed hyperplane. Nunez et al. (2002) proposed a decompositional rule extraction 

approach wherein prototypes extracted from k-means clustering algorithm are combined with 

support vectors from SVM and then rules are extracted. k-means clustering algorithm is used 

to determine prototype vectors for each input class. An ellipsoid is defined in the input space 

combining these prototypes with support vectors and mapped to if-then rules. The main 

drawback of this algorithm is that the extracted rules are neither exclusive nor exhaustive 

which results in conflicting or missing rules for the classification of new data instances. 

RulExtSVM (Fu et al. 2004) was proposed for extracting if-then rules using intervals defined

by hyperrectangular forms, which are generated using the intersection of the support vectors

with the decision boundary. The disadvantage of this algorithm is that the hyperrectangles are

constructed based on the number of support vectors.

Hyper rectangle Rules Extraction (HRE) (Zhang et al. 2005) first constructs hyper 

rectangles according to the prototypes and the support vectors, then these hyper rectangles are 

projected onto coordinate axes and if-then rules are formed. Fung et al. (2005) proposed a

rule extraction technique similar to the SVM+Prototype approach of Nunez et al. (2002) but without the computationally

expensive clustering. Instead, the algorithm transforms the problem to a simpler, equivalent

variant and constructs hypercubes by solving linear programming problems. Each hypercube

is then transformed to a rule. Chaves et al. (2005) proposed a decompositional Fuzzy Rule

Extraction (FREx) approach, which applies triangular fuzzy membership functions and

determines the projection of the support vectors onto the coordinate axes. Then each support

vector is transformed into a fuzzy if-then rule.
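
The projection step shared by the hyperrectangle-based methods above, in which a region of input space is projected onto the coordinate axes and read off as an if-then rule, can be illustrated as follows. This is a simplified sketch and not the HRE or RulExtSVM algorithm itself: we assume the region is simply the axis-aligned bounding box of a few given points, and the feature names, class label and point values are made up for the example.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def bounding_intervals(points: Sequence[Sequence[float]]) -> List[Interval]:
    """Project the points onto each axis: one (low, high) interval per feature."""
    return [(min(col), max(col)) for col in zip(*points)]


def to_rule(intervals: Sequence[Interval], feature_names: Sequence[str], label: str) -> str:
    """Turn per-feature intervals into a crisp if-then rule."""
    conds = [f"{lo} <= {name} <= {hi}" for (lo, hi), name in zip(intervals, feature_names)]
    return "IF " + " AND ".join(conds) + f" THEN class = {label}"


def covers(intervals: Sequence[Interval], x: Sequence[float]) -> bool:
    """True if instance x falls inside the hyperrectangle."""
    return all(lo <= v <= hi for (lo, hi), v in zip(intervals, x))


# Hypothetical prototype and support vectors of one class (values are invented).
points = [(5.0, 1.4), (4.6, 1.0), (5.4, 1.7)]
box = bounding_intervals(points)
rule = to_rule(box, ["sepal_length", "petal_length"], "setosa")
print(rule)
```

Boxes built this way for different classes can overlap or leave gaps, which is one way to picture the conflicting or missing rules noted above for such methods.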

Barakat and Bradley (2007) proposed a modified sequential covering algorithm, termed

SQRex-SVM, to directly extract rules from support vectors. Rule set performance is then

evaluated using the true positives (TPs), false positives (FPs) and AUC. A Multiple

Kernel-Support Vector Machine (MK-SVM) (Chen et al. 2007) scheme is proposed for 

feature selection, rule extraction and prediction modelling and the extracted rules are tested 

for predicting the cancer tissue in gene expression data. It is observed that rules extracted 

using MK-SVM improve the explanation capacity of SVM. Recently, Martens et al. (2009)

proposed a new active learning-based approach (ALBA) to extract rules from SVM models.

ALBA makes use of the support vectors, which are typically close to the decision boundary, to

generate additional samples and extracts rules from all labelled samples of the trained SVM.


3.1.1 Gaps Observed 

It is observed from the literature that researchers have focused on extracting rules for 

solving benchmark classification problems only. Also, only the Decision Tree algorithm was

employed for generating rules.

The efficiency of fuzzy logic for extracting fuzzy rules from SVM was ignored in the

earlier research.

The importance of the support vectors extracted using SVM was totally ignored.

Generation of extra instances for small scale problems may improve the accuracy, but

the larger number of instances may also generate a larger number of

rules, which in turn affects the comprehensibility of the rules extracted. The efficiency of ALBA

(Martens et al., 2009) was analysed using benchmark and small datasets only.

3.1.2 Proposed approaches and contributions 

First, we proposed a novel hybrid fuzzy rule extraction approach by using SVM and 

Fuzzy Rule Based System in tandem. The proposed hybrid rule extraction approach consists 

of two major steps. In the first step, SVM is trained and support vectors are extracted,

resulting in the Case-SA dataset (i.e. SVs set with corresponding actual target values). Later,

in the second step, FRBS and DT are employed separately to generate rules. The proposed

approach is first applied on benchmark datasets viz., Iris and Wine and later it is tested in 

solving bankruptcy prediction in banks. Spanish, Turkish and US banks datasets are analysed 

and it is observed that the proposed hybrid fuzzy rule extraction approach generates not only 

fuzzy rules but also improves generalisation without compromising the accuracy of the 

system. It is observed that the proposed approach yielded the best accuracy of 92.31% with the Spanish

banks data, whereas our proposed approach stands second in the list of classifiers with 87.5%

accuracy using the Turkish banks data and 96.15% accuracy using the US banks data. Figure 2

presents the overall architecture of the approaches proposed during this thesis work. The feature

selection step is invoked only when the proposed approach mentions it. In this

approach, only full feature data is analysed.

3.2 Pedagogical Approach 

A pedagogical algorithm considers the trained model as a black box. Instead of looking 

at the internal structure, these algorithms directly extract rules which relate the inputs and 

outputs of the SVM. These techniques typically use the trained SVM model as an oracle to 

label or classify artificially generated training examples that are later used by a symbolic 

learning algorithm. The idea behind these techniques is the assumption that the trained model 

can better represent the data than the original data set. Trepan (Craven and Shavlik, 1996) 

and REX (Markowska-Kaczmar and Trelak, 2003) are some of the pedagogical approaches 

used for rule extraction from ANNs. 

3.2.1 Gaps Observed


Researchers argue that when the dataset is modified with the predictions of the SVM, 

the resulting modified data represents the knowledge of SVM. In this category SVM’s 

efficiency for predictions is mostly analysed by the researchers, whereas the efficiency of 

SVM for feature selection was totally ignored. 


Further, rule extraction from SVM for solving regression problems was never 

reported in literature. SVM’s efficiency for feature selection in regression problems was

also not studied in the earlier research.

3.2.2 Proposed approaches and contributions


Further, we proposed a hybrid rule extraction algorithm for solving both classification and

regression problems. Feature selection using SVM-RFE is first employed, then

the actual target values of the training instances are replaced by the predictions of the SVM/SVR

models, generating Case-P (i.e. training instances with corresponding predicted target values)

datasets. Later, rules are extracted using the Case-P dataset with reduced features.

For classification, benchmark problems viz., Iris, Wine and WBC and bankruptcy prediction

problems viz., Spanish, Turkish, US and UK banks are analysed. It is observed that reduced

features reduce the complexity of the system and increase the comprehensibility of the rules.

For regression analysis, the efficiency of the rules is evaluated on benchmark regression

problems. Empirical results show that the accuracy yielded using the proposed approach, i.e.

with fewer features, is better than the accuracy yielded using full feature data. It is

observed that far fewer rules are extracted using reduced features, which

results in better comprehensibility of the black box model from which they are extracted. The

architecture of the approaches proposed is shown in Figure 2.

[Figure 2 (diagram). Phase 1: the data set with full attributes passes through feature selection (SVM-RFE) to give a data set with reduced attributes; SVM/SVR is trained and support vectors are obtained. Phase 2: modified data sets are built (Case-SA, Case-SP, Case-P, Case-A). Phase 3: DT/FRBS/CART/ANFIS/DENFIS/NBTree generate rules, which produce predictions on the test set.]

Figure 2: Overall Architecture of the Proposed Rule Extraction Approaches

Note: Case-A represents the Training set with corresponding Actual target values. 

Case-P represents the Training set with corresponding Predicted target values. 

Case-SA represents the Support vector set with corresponding Actual target values. 

Case-SP represents the Support vector set with corresponding Predicted target values. 

The blue coloured process in the figure 2 represents our contributions. 
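
The four modified datasets defined in the note above can be assembled as in the following sketch. It is illustrative only: `build_case_datasets` is a hypothetical helper, the trained SVM is represented by any callable `svm_predict`, and the support-vector indices are assumed to be supplied by whatever SVM implementation is used.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Row = Tuple[float, ...]
Dataset = List[Tuple[Row, Any]]


def build_case_datasets(
    X: List[Row],
    y: List[Any],
    sv_indices: Sequence[int],
    svm_predict: Callable[[Row], Any],
) -> Dict[str, Dataset]:
    """Build Case-A, Case-P, Case-SA and Case-SP from training data and a trained model."""
    sv = sorted(set(sv_indices))
    return {
        "Case-A": [(x, t) for x, t in zip(X, y)],              # training set, actual targets
        "Case-P": [(x, svm_predict(x)) for x in X],            # training set, predicted targets
        "Case-SA": [(X[i], y[i]) for i in sv],                 # support vectors, actual targets
        "Case-SP": [(X[i], svm_predict(X[i])) for i in sv],    # support vectors, predicted targets
    }


# Toy stand-ins: a threshold function plays the role of the trained SVM and
# the support-vector indices are simply assumed.
X = [(0.1,), (0.4,), (0.6,), (0.9,)]
y = [0, 1, 1, 1]


def toy_svm(x: Row) -> int:
    return int(x[0] > 0.5)


cases = build_case_datasets(X, y, sv_indices=[1, 2], svm_predict=toy_svm)
print(cases["Case-SA"])
print(cases["Case-SP"])
```

Any of these datasets can then be handed to a rule generator in Phase 3. For regression the same construction applies with real-valued targets and an SVR in place of the SVM.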


We employed feature selection using SVM-RFE and empirical analysis is also carried out

using reduced feature data. It is observed that reduced feature data

yields fewer rules and shorter rules, resulting in

improved comprehensibility of the rules extracted.

3.3 Eclectic Approach 

Eclectic rule extraction techniques incorporate the elements of both the 

decompositional and pedagogical approaches (Andrews et al., 1995; Barakat and Diederich, 

2004 and 2005). A hybrid rule extraction technique is proposed by Barakat and Diederich 

(2004 and 2005). After developing the SVM model using training set, they used the 

developed model to predict the output class labels for training instances and support vectors. 

Later they used a decision tree for generating rules. The quality of the extracted rules is then

measured using AUC (Area Under the Receiver Operating Characteristic Curve) (Barakat and

Bradley, 2006). They extracted crisp rules from the data.

3.3.1 Gaps Observed


Efficiency of regression rules using SVM was not analyzed before and no rule 

extraction procedure was proposed to extract rules to solve regression problems. 

SVM’s efficiency for feature selection also was ignored in this category of the rule 

extraction approaches. The number of rules extracted using all the features of the dataset is 

huge, resulting in a less comprehensible system. Only benchmark problems were solved and

analyzed in the previous research. 

Efficiency of rules extracted from SVM for solving unbalanced, medium scale data 

mining problems was never studied or reported earlier. 

3.3.2 Proposed approaches and contributions


In this category, we proposed a hybrid rule extraction algorithm for solving regression 

problems. For extracting rules we employed CART, ANFIS and DENFIS algorithms 

separately. The proposed regression rule extraction approach consists of three major steps.

(i) SVR model is trained and support vectors are extracted. 

(ii) Actual target values of the extracted support vectors are then replaced by the 

corresponding predictions of the developed SVR model resulting in Case-SP 

dataset (i.e. SVs set with corresponding predicted target values). 

(iii) This modified data is then used to generate rules using CART, ANFIS and 

DENFIS. 

Various benchmark regression problems were solved to evaluate the efficiency of the 

proposed approach. Empirical study shows that the efficiency of the rules improved in the

form of lower error (i.e. higher accuracy) with the proposed approach. It is also observed that the

hybrid SVR+CART, SVR+ANFIS and SVR+DENFIS yielded better results compared to the 

stand alone CART, ANFIS and DENFIS. 
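
The thesis reports regression rule quality as root mean squared error (RMSE). The comparison between a hybrid and a stand-alone rule generator can be sketched as below; the prediction values are invented for illustration and are not results from the thesis.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between actual and predicted target values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up numbers: targets on a validation set and the outputs of two rule sets.
actual = [3.0, 5.0, 7.0]
standalone_preds = [2.0, 5.0, 9.0]   # e.g. rules from a stand-alone generator
hybrid_preds = [3.0, 4.0, 7.0]       # e.g. rules from an SVR-based hybrid

print(rmse(actual, standalone_preds))
print(rmse(actual, hybrid_preds))
```

A lower RMSE on the held-out validation set is what "better results" means for the regression experiments.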

Further, we proposed a novel hybrid approach of rule extraction to solve unbalanced, 

medium scale problems in data mining. The proposed approach is carried out in three steps. 

Feature selection using SVM-RFE is carried out during the first step. During the second step,

support vectors are extracted. Further, predictions for these SVs are obtained using the developed

SVM model and the corresponding actual target values are replaced by the predictions, 

resulting in Case-SP datasets. In the final step, the Case-SP (i.e. SVs set with

corresponding predicted target values) dataset is used to train NBTree and rules are

generated. The proposed approach is then applied to solve Churn Prediction in Bank Credit 

Card customers. As the problem at hand is unbalanced, we employed various balancing 

techniques viz., Undersampling, Oversampling, SMOTE and combination of Undersampling 

and Oversampling. Later, using this modified data, rules were extracted using NBTree. It is

observed that more generalised rules are obtained using NBTree. It is also observed that

rules extracted using NBTree are efficient for solving unbalanced and medium scale

problems. As noted, feature selection was performed using SVM-RFE (Guyon et al., 2002) during the

first step of the proposed approach, and the dataset with reduced features was used to

generate rules. It is observed that feature selection using SVM once again outperformed the

case where feature selection was not used. Figure 2 shows the overall architecture of the

approaches proposed. 
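
Two of the balancing techniques mentioned above, random undersampling and random oversampling, can be sketched as follows. This is a minimal sketch with assumptions of our own: both helpers equalise the class sizes exactly, whereas the balancing ratios used in the thesis are not stated in this synopsis, and SMOTE is not shown.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[Tuple[float, ...], int]


def _by_class(data: List[Sample]) -> Dict[int, List[Sample]]:
    """Group samples by class label."""
    groups: Dict[int, List[Sample]] = {}
    for sample in data:
        groups.setdefault(sample[1], []).append(sample)
    return groups


def undersample(data: List[Sample], seed: int = 0) -> List[Sample]:
    """Randomly keep only as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    groups = _by_class(data)
    n_min = min(len(g) for g in groups.values())
    out: List[Sample] = []
    for g in groups.values():
        out.extend(rng.sample(g, n_min))
    return out


def oversample(data: List[Sample], seed: int = 0) -> List[Sample]:
    """Randomly replicate samples of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    groups = _by_class(data)
    n_max = max(len(g) for g in groups.values())
    out: List[Sample] = []
    for g in groups.values():
        out.extend(g)
        out.extend(rng.choice(g) for _ in range(n_max - len(g)))
    return out


# Toy unbalanced data: 18 majority-class and 2 minority-class instances.
data = [((float(i),), 0) for i in range(18)] + [((100.0 + i,), 1) for i in range(2)]
under = undersample(data)
over = oversample(data)
print(len(under), len(over))
```

Balancing would be applied to the training portion only, so that the validation set keeps the original class distribution.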

Furthermore, we proposed an extension to ALBA (Martens et al., 2009) and called it

mALBA (modified ALBA). The proposed mALBA comprises three phases: a feature selection

phase, an active learning phase and a rule generation phase. During the feature selection phase,

SVM-RFE is employed for feature selection. The active learning phase of mALBA consists of

four steps:

(i) SVM model is trained and SVs are obtained.

(ii) Distance is calculated between SVs and training instances.

(iii) Extra instances are artificially generated using Uniform, Normal and Logistic distributions

separately.

(iv) The predictions of these generated instances are then obtained using the

trained SVM model and Case-P and Case-SP datasets are obtained.

Later, during the rule

generation phase, this modified data is used to train NBTree (Naive Bayes Tree) and rules are

generated. The application of the proposed mALBA is also extended to data mining problems

in finance viz., churn prediction in bank credit card customers and fraud detection in

insurance. The datasets analysed during this research study are medium scale in size and

highly unbalanced in nature. It is observed that our proposed mALBA extracted more

generalised rules on unbalanced datasets. Feature selection preceding mALBA resulted in

fewer rules, thereby improving comprehensibility. Figure 3 presents the overall

architecture of the mALBA approach.

[Figure 3 (diagram). Feature selection phase: the training set with full features is reduced with SVM-RFE to a training set with reduced features. Active learning phase (Steps 1 to 4): SVM is trained, support vectors are obtained, data is generated using mALBA, and the SVs plus generated data form the modified training set. Rule generation phase: NBTree is trained on the modified training set to give a tree / rules, which produce predictions on the test / validation set.]

Figure 3: Architecture of the proposed rule extraction approach

The blue coloured process in Figure 3 represents our contributions.
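
The active learning steps of mALBA (distance calculation, generation of extra instances near support vectors, and labelling by the trained SVM) can be sketched as below. The details are our assumptions and not the thesis implementation: the distance is taken as the per-feature mean absolute distance between training instances and support vectors, that distance is used directly as the noise scale, and the trained SVM is replaced by a toy callable.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, ...]


def mean_feature_distance(X: Sequence[Row], svs: Sequence[Row]) -> List[float]:
    """Per-feature mean absolute distance between training rows and support vectors."""
    n_feat = len(svs[0])
    total = [0.0] * n_feat
    for x in X:
        for s in svs:
            for j in range(n_feat):
                total[j] += abs(x[j] - s[j])
    return [t / (len(X) * len(svs)) for t in total]


def noise(rng: random.Random, scale: float, dist: str) -> float:
    """Draw a zero-centred perturbation from the chosen distribution."""
    if dist == "uniform":
        return rng.uniform(-scale, scale)
    if dist == "normal":
        return rng.gauss(0.0, scale)
    if dist == "logistic":
        u = rng.random()
        while u == 0.0:              # avoid log(0) in the inverse CDF
            u = rng.random()
        return scale * math.log(u / (1.0 - u))
    raise ValueError(f"unknown distribution: {dist}")


def generate_near_svs(
    X: Sequence[Row],
    svs: List[Row],
    svm_predict: Callable[[Row], int],
    n_new: int,
    dist: str = "uniform",
    seed: int = 0,
) -> List[Tuple[Row, int]]:
    """Create n_new instances around randomly chosen SVs and label them with the SVM."""
    rng = random.Random(seed)
    scale = mean_feature_distance(X, svs)
    generated = []
    for _ in range(n_new):
        base = rng.choice(svs)
        point = tuple(b + noise(rng, s, dist) for b, s in zip(base, scale))
        generated.append((point, svm_predict(point)))
    return generated


# Toy data; the lambda below merely stands in for a trained SVM.
X = [(0.0, 0.0), (1.0, 1.0), (0.4, 0.6), (0.6, 0.4)]
svs = [(0.4, 0.6), (0.6, 0.4)]
oracle = lambda x: int(x[0] + x[1] > 1.0)
extra = {d: generate_near_svs(X, svs, oracle, n_new=20, dist=d) for d in ("uniform", "normal", "logistic")}
```

The generated instances, together with the support vectors and their predicted labels, would then form the modified training set passed to NBTree in the rule generation phase.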

4. Organisation of the Thesis 

In this thesis, I present and evaluate novel algorithms for the task of extracting 

comprehensible descriptions from hard-to-understand learning systems i.e. SVM. The 

hypothesis advanced by this research is that it is possible to develop algorithms for extracting 

symbolic descriptions from trained SVMs.

Chapter 1: Introduction. This chapter provides the details about the issues involved in 

rule extraction. Rule extraction from SVM follows in the footsteps of rule extraction from

ANNs. The taxonomy proposed by Andrews et al. (1995) for rule extraction techniques in

general is presented which is also followed during the research presented in this thesis. This 

chapter also provides the details about the quality measure of the rules extracted from black 

box techniques in general. 

Chapter 2: Rule Extraction from SVM: an Introduction. This chapter provides 

background material for the rest of the thesis. Support Vector Machine and Support Vector 

Regression are first presented in detail. Literature survey of the rule extraction from SVM 

and the gaps/shortcomings identified during the survey are presented. This chapter also 

provides overviews of the various machine learning (intelligent) techniques used for rule

generation: Fuzzy Rule Based Systems (FRBS), Decision Tree (DT),

Classification and Regression Tree (CART), Adaptive Network based Fuzzy Inference 

Systems (ANFIS), Dynamic Evolving Neuro-Fuzzy Inference System (DENFIS) and Naive 

Bayes Tree (NBTree). 

Chapter 3: Fuzzy Rule Extraction using SVM for Solving Classification Problems. 

This chapter presents the proposed decompositional rule extraction approach using SVM and

analyses the advantage of fuzzy rule based classification systems over crisp systems.

The details of the proposed approach are described first, followed by the

empirical analysis. In the research study presented in this chapter, fuzzy rules are

extracted using the Case-SA dataset, i.e. the support vector set with corresponding actual target

values.

The proposed hybrid rule extraction approach is first tested on benchmark datasets viz.,

Iris and Wine, and is then applied to

bankruptcy prediction in banks. Spanish, Turkish and US banks datasets were used for this

study. It is observed that fuzzy rules provide better understanding and also outperform other 

techniques tested. 


Chapter 4: Rule Extraction from SVR for Solving Regression Problems. This 

chapter presents the first ever rule extraction approach from SVM for solving regression

problems. The proposed rule extraction approach is a decompositional approach, which is one 

of the main contributions of this thesis. Intelligent techniques such as ANFIS, DENFIS and 

CART are employed to extract rules. During this research study, various benchmark 

regression datasets viz., Auto MPG, Body Fat, Boston Housing, Forest Fires and Pollution, 

are analysed and the efficiency of the rules is evaluated in the form of RMSE i.e. root mean 

squared error. It is observed that the proposed hybrid rule extraction approach yielded better 

results and outperformed the stand alone ANFIS, DENFIS and CART. 

Chapter 5: Rule Extraction from SVM using Feature Selection. This chapter presents

a pedagogical rule extraction technique from SVM, which also uses SVM as the feature selection

algorithm. The actual target values of the training set are then replaced by the predictions

of SVM, resulting in the Case-P dataset. By employing the Case-P dataset with reduced feature data,

rules are extracted using CART, DT, ANFIS and DENFIS. Researchers argued that the 

knowledge of the trained SVM can be represented in the form of support vectors or the 

predictions of the developed SVM. This chapter presents a hybrid rule extraction approach 

where we argue that feature selection using SVM also represents the knowledge learnt by 

SVM during training. 

Using the proposed hybrid rule extraction approach for rule extraction, both 

classification and regression problems are solved and the empirical study is presented in this 

chapter. Datasets analysed for classification analysis are, benchmark datasets viz., Iris, Wine, 

WBC; Bankruptcy prediction datasets viz., Spanish, Turkish, US and UK banks data. Datasets 

analysed for regression analysis are, Auto MPG, Body Fat, Boston Housing, Forest Fires and 

Pollution. It is observed that datasets with reduced features tend to yield shorter rules and

fewer rules, resulting in improved comprehensibility.

Chapter 6: Rule Extraction from SVM for Data Mining on Unbalanced datasets. 

This chapter presents the proposed eclectic rule extraction technique, which is used to

analyze medium scale unbalanced datasets. In the proposed hybrid rule extraction

approach, feature selection using SVM is first employed. Later, support vectors are obtained

and the actual target values of the support vectors are then replaced by the corresponding 

predictions of the SVM resulting in Case-SP dataset. Case-SP dataset is then employed to 

train NBTree classifier and rules are extracted. The proposed rule extraction approach 

simplifies the problem with reduction in features (i.e. horizontal) and reduction in sample size 

in the form of support vectors (i.e. vertical). Dealing with such unbalanced datasets is an

emerging area of research in the computer science and statistics communities. This chapter also

presents the overview of the problem faced by unbalanced datasets and the approaches 

proposed to deal with unbalanced datasets in the literature. 

One of the most important financial problems analyzed during this research study is

related to customer relationship management (CRM), and the dataset analysed concerns

churn prediction in bank credit card customers. Empirical results show that using our

proposed hybrid rule extraction approach the complexity of the system is reduced and, in

the process, more comprehensible rules are extracted without compromising the accuracy of

the classifier.

Chapter 7: Modified Active Learning Based Approach for Rule Extraction from 

SVM. In this chapter a new modified active learning based approach for rule extraction from 

SVM is proposed, which is a decompositional approach. In this approach,

support vectors are extracted, the distance between the support vector set and the training set is

calculated, and artificial data is generated using various distributions viz., Uniform, Normal and Logistic.

The generated data is intended to lie near the support vector instances. mALBA is also preceded

by feature selection using SVM. This chapter presents the applications analysed during this 

research study. 
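
The idea of generating artificial data near the support vectors can be sketched as follows. This is only an illustration under stated assumptions, not the mALBA algorithm itself: the synthetic data, the way the average distance scales the noise (`scale`), the helper `generate_near_sv`, labelling both the training and artificial instances with SVM predictions, and the decision tree used as rule learner are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import pairwise_distances
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           weights=[0.9, 0.1], random_state=0)

svm = SVC(kernel="rbf", gamma="scale").fit(X, y)
X_sv = svm.support_vectors_

# Average distance between training instances and support vectors.
# Using it to scale the noise is an assumption of this sketch.
avg_dist = pairwise_distances(X, X_sv).mean()
scale = 0.1 * avg_dist / np.sqrt(X.shape[1])  # hypothetical scaling factor

def generate_near_sv(n_new, dist="normal"):
    """Perturb randomly chosen support vectors with noise drawn from `dist`."""
    base = X_sv[rng.integers(0, len(X_sv), size=n_new)]
    if dist == "normal":
        noise = rng.normal(0.0, scale, size=base.shape)
    elif dist == "logistic":
        noise = rng.logistic(0.0, scale, size=base.shape)
    else:
        raise ValueError(f"unsupported distribution: {dist}")
    return base + noise

# Artificial instances near the support vectors, labelled by the trained SVM.
X_art = np.vstack([generate_near_sv(200, "normal"),
                   generate_near_sv(200, "logistic")])
y_art = svm.predict(X_art)

# Train the rule learner on SVM-labelled training data plus artificial data.
X_aug = np.vstack([X, X_art])
y_aug = np.concatenate([svm.predict(X), y_art])
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_aug, y_aug)

fidelity = np.mean(tree.predict(X) == svm.predict(X))
print(f"artificial instances: {len(X_art)}, fidelity: {fidelity:.3f}")
```

Concentrating the extra instances around the support vectors gives the rule learner more examples close to the SVM decision boundary, which is where a surrogate model most needs them.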

Two important problems in finance were solved using the proposed approach, viz., 

Churn Prediction in Bank Credit Card Customers and Fraud Detection in Insurance. The 

datasets analysed are medium-scale and unbalanced. This chapter presents the benefits of 

the proposed approach in dealing with unbalanced problems occurring in banking and 

finance. 

Chapter 8: Overall Conclusions. This chapter presents the overall conclusions drawn 

from the various proposed hybrid rule extraction approaches. Conclusions for the 

approaches applied to classification problems, regression problems and data mining 

problems are presented separately. 

Research Publications out of the Thesis 

1. M.A.H. Farquad, V. Ravi and S.B. Raju, “Support vector regression based hybrid rule 

extraction methods for forecasting”, Expert Systems with Applications, 37(8), 

5577-5589, 2010. 

2. M.A.H. Farquad, V. Ravi and S.B. Raju, “Rule Extraction from Support Vector 

Machines: A Hybrid Approach for classification and regression problems”, 

International Journal of Information and Decision Sciences (IJIDS), 2010. (In Press) 

3. M.A.H. Farquad, V. Ravi and S.B. Raju, “Rule Extraction from Support Vector 

Machine using modified Active Learning Based Approach: An application to CRM”, 

Setchi et al. (Eds.): 14th International Conference on Knowledge-Based and 

Intelligent Information & Engineering Systems, KES 2010, Part I, LNAI 6276, pp. 

461–470, September 8-10, 2010, Cardiff, Wales, UK. 

4. M.A.H. Farquad, V. Ravi and S.B. Raju, “Support Vector Machine based Hybrid 

Classifiers and Rule Extraction Thereof: Application to Bankruptcy Prediction in 

Banks”, In Soria, E., Martín, J.D., Magdalena, R., Martínez, M., Serrano, A.J., 

editors, Handbook of Research on Machine Learning Applications and Trends: 

Algorithms, Methods and Techniques, Vol. II, pp. 404-426, 2010, IGI Global, USA. 

5. M.A.H. Farquad, V. Ravi and S.B. Raju, “Data Mining using Rules Extracted from 

SVM: an Application to Churn Prediction in Bank Credit Cards”, Presented in 12th 

International Conference on Rough Sets, Fuzzy Sets, Data Mining & Granular 

Computing (RSFDGrC’09), December 16-18, 2009, LNAI 5908, pp. 390-397, New 

Delhi, India. 

6. M.A.H. Farquad, V. Ravi and S.B. Raju, “Rule Extraction using Support Vector 

Machine Based Hybrid Classifier”, Presented in TENCON-2008, IEEE Region 10 

Conference, 19-21 November, Hyderabad, India, 2008. 

7. M.A.H. Farquad, V. Ravi and S. Bapi Raju, “Rule Extraction from SVM for Analytical 

CRM: an Application to Predict Churn in Bank Credit Cards”, Decision Support 

Systems. (Under Review) 

8. M.A.H. Farquad, V. Ravi and S. Bapi Raju, “Analytical CRM using SVM: a Modified 

Active Learning Based Rule Extraction approach”, Information Sciences. (Under Review). 

References 

Andrews, R. Diederich, J. and Tickle, A., “Survey and Critique of Techniques for Extracting 

Rules from Trained Artificial Neural Networks,” Knowledge Based Systems, vol. 8, no. 

6, pp. 373-389, 1995. 

Barakat, N.H. and Bradley, A.P., “Rule Extraction from Support Vector Machines: Measuring 

the Explanation Capability Using the Area under the ROC Curve”, Proceedings of the 

18th International Conference on Pattern Recognition (ICPR'06), Hong Kong, 2006. 

Barakat, N.H. and Bradley, A.P., “Rule Extraction from Support Vector Machines: A 

Sequential Covering Approach,” IEEE Trans. Knowledge and Data Eng., vol. 19, no. 6, 

pp. 729-741, June 2007. 

Barakat, N.H. and Diederich, J., “Learning-based Rule-Extraction from Support Vector 

Machines”, In proceedings of the 14th International Conference on Computer Theory 

and applications ICCTA'2004, Alexandria, Egypt, 2004. 

Barakat, N.H. and Diederich, J., “Eclectic Rule-Extraction from Support Vector Machines,” 

Int’l J. Computational Intelligence, vol. 2, no. 1, pp. 59-62, 2005. 

Breiman, L., Friedman, J., Olshen, R. and Stone, C., “Classification and Regression Trees”, 

Wadsworth and Brooks, 1984. 

Chaves, Ad.C.F., Vellasco, M.M.B.R. and Tanscheit, R., “Fuzzy Rule Extraction from 

Support Vector Machines”, Fifth International Conference on Hybrid Intelligent 

Systems, Rio de Janeiro, Brazil, November 06-09, 2005. 

Craven, M.W., “Extracting Comprehensible Models from Trained Neural Networks”, PhD 

thesis, Department of Computer Science, University of Wisconsin-Madison, 1996. 

Clark, P. and Niblett, T., “The CN2 Induction Algorithm”, Machine Learning, vol. 3, no. 4, 

pp. 261-283, 1989. 

Craven, M. and Shavlik, J., “Extracting Tree-Structured Representations of Trained 

Networks”, Advances in Neural Information Processing Systems, vol. 8, D. Touretzky, 

M. Mozer, and M. Hasselmo, eds., pp. 24-30, The MIT Press, 

citeseer.ist.psu.edu/craven96extracting.html, 1996. 

Fu, X., Ong, C.J., Keerthi, S., Hung, G.G. and Goh, L., “Extracting the Knowledge 

Embedded in Support Vector Machines”, In International Joint Conference on Neural 

Networks (IJCNN’04), Budapest, Hungary, 2004. 

Fung, G., Sandilya, S. and Rao, R., “Rule Extraction from Linear Support Vector Machines,” 

Proc. 11th ACM SIGKDD International Conference on Knowledge Discovery in Data 

Mining (KDD ’05), pp. 32-40, 2005. 

Markowska-Kaczmar, U. and Trelak, W., “Extraction of Fuzzy Rules from Trained Neural 

Network Using Evolutionary Algorithm,” Proc. European Symp. Artificial Neural 

Networks (ESANN ’03), pp. 149-154, 2003. 

Martens, D., Baesens, B. and Gestel, T.V., “Decompositional Rule Extraction from Support 

Vector Machines by Active Learning”, IEEE Transactions on Knowledge and Data 

Engineering, 21(2), 178-191, 2009. 

Martens, D., Baesens, B., Gestel, T.V. and Vanthienen, J., “Comprehensible credit scoring 

models using rule extraction from support vector machines”, European Journal of 

Operational Research, vol. 183, pp. 1466-1476, 2007. 

Martens, D., De Backer, M., Haesen, R., Snoeck, M., Vanthienen, J. and Baesens, B. 

“Classification with Ant Colony Optimization,” IEEE Trans. Evolutionary Computation, 

vol. 11, no. 5, pp. 651-665, 2007. 

Nunez, H., Angulo, C. and Català, A., “Rule Extraction from Support Vector Machines,” 

Proc. European Symp. Artificial Neural Networks (ESANN ’02), pp. 107-112, 2002. 

Nunez-Castro, H., Angulo-Bahon, C., Catala-Mallofre, A., “Rule Based Learning Systems 

from SVM and RBFNN”, TENDENCIAS DE LA MINERIA DE DATOS EN ESPAÑA. 

Red Española de Minería de Datos. 1 ed. pp. 13-24, 2004. (available online in English at 

http://www.lsi.us.es/redmidas/Capitulos/LMD02.pdf.) 

Quinlan, J., “C4.5: Programs for Machine Learning”, Morgan Kaufmann, 1993. 

Zhang, Y., Su, H., Jia, T. and Chu, J., “Rule Extraction from Trained Support Vector 

Machines”, Lecture Notes in Computer Science, Springer Berlin / Heidelberg, vol. 3518, 

pp. 61-70, 2005. 
