10.02.2013 Views

Classification of Intrusion Detection Dataset using machine learning ...

Classification of Intrusion Detection Dataset using machine learning ...

Classification of Intrusion Detection Dataset using machine learning ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

International Journal <strong>of</strong> Electronics and Computer Science Engineering 1044<br />

ISSN 2277-1956/V1N3-1044-1051<br />

Available Online at www.ijecse.org ISSN- 2277-1956<br />

<strong>Classification</strong> <strong>of</strong> <strong>Intrusion</strong> <strong>Detection</strong> <strong>Dataset</strong> <strong>using</strong><br />

<strong>machine</strong> <strong>learning</strong> Approaches<br />

Neethu B,<br />

Department <strong>of</strong> Computer Science, Amrita University<br />

Email id- 1 neethu_b_88@yahoo.co.in<br />

Abstract— The paper describes about a method <strong>of</strong> intrusion detection that uses <strong>machine</strong> <strong>learning</strong> algorithms. Here we<br />

discuss about the combinational use <strong>of</strong> two <strong>machine</strong> <strong>learning</strong> algorithms called Principal Component Analysis and Naive<br />

Bayes classifier. The dimensionality <strong>of</strong> the dataset is reduced by <strong>using</strong> the principal component analysis and the<br />

classification <strong>of</strong> the dataset in to normal and attack classes is done by <strong>using</strong> Naïve Bayes Classifier. The experiments were<br />

conducted on the intrusion detection dataset called KDD’99 cup dataset. The comparison <strong>of</strong> the results with and without<br />

dimensionality reduction is also done.<br />

Keywords: Naïve Bayes classifier, Principal component analysis<br />

I. INTRODUCTION<br />

With the growth <strong>of</strong> Internet there have seen a tremendous increase in the number <strong>of</strong> attacks, the intrusion detection<br />

system has become a main stream <strong>of</strong> information security. With the firewall we can provide some protection but<br />

they do not provide full protection [1] . The purpose <strong>of</strong> the intrusion detection system is to help computer systems to<br />

deal with attacks. There are two types <strong>of</strong> intrusion detection system based on the types <strong>of</strong> operations used to detect<br />

intrusions. Anomaly detection system creates a database <strong>of</strong> normal behavior and any deviations from the normal<br />

behavior are occurred an alert is triggered regarding the occurrence <strong>of</strong> intrusions. Misuse <strong>Detection</strong> system stores the<br />

predefined attack patterns in the database if a similar data and if similar situations occur it is classified as attack.<br />

Based on the source <strong>of</strong> data the intrusion detection system are classified to Host based IDS and Network based IDS.<br />

In network based IDS the individual packet flowing through the network are analyzed. The host based IDS analyzes<br />

the activities on the single computer or host.<br />

The main disadvantage <strong>of</strong> the misuse detection (signature detection) method is that it cannot detect novel attacks and<br />

variation <strong>of</strong> known attacks. Even if the novel attack is detected and its signature is developed there will be a<br />

substantial latency in the deployment <strong>of</strong> signature over the network. To avoid these drawbacks we go for anomaly<br />

based detection methods. With this approach, known and novel attacks can be detected. The problem is that it will<br />

generate more false alarms.<br />

Another challenge for the IDS is to generalize from the previously observed behavior to recognize similar future<br />

behavior. This method will aids the Anomaly detection system more compared to signature based detection module<br />

since it will be able to detect the future behavior that is identical to past abnormal behavior in order to avoid false<br />

positives.<br />

We developed an anomaly detection system that uses <strong>machine</strong> <strong>learning</strong> algorithms for <strong>learning</strong> the normal behavior<br />

<strong>of</strong> the data.<br />

II. RELATED WORKS<br />

In 1980, the concept <strong>of</strong> intrusion detection began with Anderson’s seminal paper [3]; he introduced a threat<br />

classification model that develops a security monitoring surveillance system based on detecting anomalies in user<br />

behavior.. In 2000, Valdes et al. [19] developed an anomaly based IDS that employed naïve Bayesian network to<br />

perform intrusion detecting on traffic bursts. In the same year, Shyu et al. [20] proposed an anomaly based intrusion<br />

detection scheme <strong>using</strong> principal components analysis (PCA), where PCA was applied to reduce the dimensionality<br />

<strong>of</strong> the audit data and arrive at a classifier that is a function <strong>of</strong> the principal components. In 2003, Yeung et al. [22]<br />

proposed an anomaly based intrusion detection <strong>using</strong> hidden Markov models that computes the sample likelihood <strong>of</strong><br />

an observed sequence <strong>using</strong> the forward or backward algorithm for identifying anomalous. A hybrid DTNB


<strong>Classification</strong> <strong>of</strong> <strong>Intrusion</strong> <strong>Detection</strong> <strong>Dataset</strong> <strong>using</strong> <strong>machine</strong> <strong>learning</strong> Approaches<br />

ISSN 2277-1956/V1N3-1044-1051<br />

1045<br />

approach is used in [38] by combining Decision Table (DT) with Naïve Bayes to design an efficient intrusion<br />

detection model. The authors used Neuro-Fuzzy techniques (NEFCLASS) and JRip classifier to reduce false alerts<br />

in [39]. They take SNORT alerts as input and <strong>learning</strong> from training in order to achieve their goal. All the above<br />

papers used the KDD Cup 1999 dataset for their experimentation.<br />

III. INTRUSION DETECTION SYSTEMS<br />

<strong>Intrusion</strong> detection system is s<strong>of</strong>tware that detects attacks on a network or computer system [3].<strong>Intrusion</strong> <strong>Detection</strong><br />

Systems are normally categorized into misuse detection and anomaly detection. In misuse detection systems,<br />

previous attack patterns are stored in a database. Any data similar to this data is classified as attack. Anomaly<br />

detection refers to statistical knowledge about normal activity. <strong>Intrusion</strong>s correspond to deviations from the normal<br />

activity <strong>of</strong> system. The anomaly detection system has high false positive/ negative alarm rate compared to misuse<br />

detection systems. However, it is more effective in detecting new attacks or deviation from the nominal usage.<br />

Furthermore, the IDS is classified based on the data source: Network IDS (NIDS) and Host-based IDS (HIDS)<br />

systems. The NIDS analyses the data through the network. The HIDS check for attacks or intrusions in the single<br />

host only. It does this by analyzing the audit logs in the system.<br />

Current intrusion detection systems are futile to cope with new, elegant and structured attacks, due to several<br />

practical and theoretical limitations. These limitations have lead many researchers to apply different <strong>machine</strong><br />

<strong>learning</strong> approaches for detecting anomalies.<br />

IV. MACHINE LEARNING<br />

One <strong>of</strong> the central problems for IDSs is to build effective behavior models or patterns to distinguish normal<br />

behaviors from abnormal behaviors by observing collected audit data. To solve this problem, earlier IDSs usually<br />

rely on security experts to analyze the audit data and construct intrusion detection rules manually. However, since<br />

the amount <strong>of</strong> audit data, including network traffic, process execution traces and user command data, etc., increases<br />

vary fast, it has become a time-consuming, tedious and even impossible work for human experts to analyze and<br />

extract attack signatures or detection rules from dynamic, huge volumes <strong>of</strong> audit data. Furthermore, detection rules<br />

constructed by human experts are usually based on fixed features or signatures <strong>of</strong> existing attacks, so it will be very<br />

difficult for these rules to detect deformed or even completely new attacks [41].<br />

Due to the above deficiencies <strong>of</strong> IDSs based on human experts, intrusion detection techniques <strong>using</strong> <strong>machine</strong><br />

<strong>learning</strong> have attracted more and more interests in recent years. Machine <strong>learning</strong> is a field <strong>of</strong> study which provides<br />

the computers with the ability <strong>of</strong> <strong>learning</strong> from previous experience. Machine <strong>learning</strong> is based heavily on statistical<br />

analysis <strong>of</strong> data and some algorithms can use patterns found in previous data to make decisions about new data.<br />

V. SYSTEM FRAMEWORK<br />

The proposed IDS in this paper in have a dimensionality reduction module <strong>using</strong> Principal Component Analysis<br />

(PCA) and a classification module <strong>using</strong> naïve bayes classifier for detection <strong>of</strong> intrusions. The input data used in the<br />

system is the intrusion detection dataset called KDD’99 cup dataset. The classifier is first trained with the labeled<br />

data and use the trained classifier for detection <strong>of</strong> intrusions in unlabeled testing data.<br />

VI. DIMENSIONALITY REDUCTION<br />

Effective input attributes selection from intrusion detection datasets thereby reducing the input data dimension is<br />

one <strong>of</strong> the important research challenges for constructing high performance IDS. Irrelevant and redundant attributes<br />

<strong>of</strong> intrusion detection dataset may lead to complex intrusion detection model as well as reduce detection accuracy.<br />

The attribute selection methods <strong>of</strong> data mining algorithms identify some <strong>of</strong> the important attributes for detecting<br />

anomalous network connections. The main advantage <strong>of</strong> doing dimensionality reduction is that it will reduce the


ISSN 2277-1956/V1N3-1044-1051<br />

IJECSE,Volume1,Number3<br />

Neethu B et al.<br />

memory requirement and increases the speed <strong>of</strong> execution thereby increases the overall performance. The<br />

dimensionality reduction here is done by <strong>using</strong> Principal Component Analysis (PCA).<br />

A Principal Component Analysis<br />

Principal Component Analysis is one <strong>of</strong> the most widely used dimensionality reduction techniques for data analysis<br />

and compression. It is based on transforming a relatively large number <strong>of</strong> variables into a smaller number <strong>of</strong><br />

uncorrelated variables by finding a few orthogonal linear combinations <strong>of</strong> the original variables with the largest<br />

variance. The first principal component <strong>of</strong> the transformation is the linear combination <strong>of</strong> the original variables with<br />

the largest variance; the second principal component is the linear combination <strong>of</strong> the original variables with the<br />

second largest variance and orthogonal to the first principal component and so on. In many data sets, the first several<br />

principal components contribute most <strong>of</strong> the variance in the original data set, so that the rest can be disregarded with<br />

minimal loss <strong>of</strong> the variance for dimension reduction <strong>of</strong> the data [42] .<br />

PCA reduces the amount <strong>of</strong> dimensions required to classify new data and produces a set <strong>of</strong> principal components,<br />

which are orthonormal eigenvalue/eigenvector pairs[13]. The steps for doing principal component analysis are given<br />

below.<br />

Algorithm<br />

1. Get input data<br />

2. Subtract the mean<br />

3. Calculate the covariance matrix<br />

4. Calculate the eigenvectors and eigenvalues <strong>of</strong> the covariance matrix.<br />

5. Sort the Eigen values in descending order.<br />

6. Calculate the feature vectors.<br />

Final Data=Row Feature vectors x Row Data Rowfeaturevector is the matrix in which eigenvectors in the<br />

columns transposed and RowDataAdjust is the mean adjusted input data<br />

The input to the principal component analysis program is the KDD’99 dataset. The dataset lists 41 attributes and<br />

their values. Some <strong>of</strong> the attributes seems redundant and not necessary for the calculation .So we select the relevant<br />

attributes for the detection <strong>of</strong> intrusion by <strong>using</strong> principal component analysis. The PCA will first calculate mean <strong>of</strong><br />

the data at each column. This mean will be subtracted from each <strong>of</strong> the data points to get a dataset with mean 0.<br />

With the mean subtracted dataset we will find the covariance matrix. For the covariance matrix we will find the<br />

Eigen vector and Eigen values by <strong>using</strong> the equation.<br />

Ax = x (1)<br />

And<br />

A –λI =0 (2)<br />

After getting the Eigen vectors and Eigen values. Sort the Eigen vectors in descending order corresponds to the<br />

Eigen value. The Eigen vector with the highest Eigen value represents the first principal component <strong>of</strong> the data. The


<strong>Classification</strong> <strong>of</strong> <strong>Intrusion</strong> <strong>Detection</strong> <strong>Dataset</strong> <strong>using</strong> <strong>machine</strong> <strong>learning</strong> Approaches<br />

K Eigen vectors with the highest Eigen values are selected. The dimensionality <strong>of</strong> subspace K is determined by the<br />

equation.<br />

= threshold (3)<br />

ISSN 2277-1956/V1N3-1044-1051<br />

VII.CLASSIFICATION<br />

1047<br />

Classifier construction is another important challenge to build efficient IDS. The classification accuracy <strong>of</strong> most<br />

existing data mining algorithms needs to be improved, because it is very difficult to detect several new attacks, as<br />

the attackers are continuously changing their attack patterns. The classifier will separate the input dataset to two<br />

classes: Normal and Anomaly. Here for the experiment naïve bayes classifier is used.<br />

A Naïve Bayes Classifier<br />

The Naive Bayes classifier is a supervised <strong>learning</strong> algorithm based largely <strong>of</strong>f <strong>of</strong> Bayes Theorem<br />

P(A|B) (4)<br />

According to this theorem, we can calculate the probability <strong>of</strong> event A conditioned on data B by first calculating the<br />

probability <strong>of</strong> the data B conditioned by event A multiplied by<br />

the probability <strong>of</strong> event A and normalized by the probability <strong>of</strong> the data B. In regards to intrusion detection, this<br />

means that we can calculate the probability that an attack is occurring based on some data by first calculating the<br />

probability that some previous data was part <strong>of</strong> that type <strong>of</strong> attack and then multiplying by the probability <strong>of</strong> that<br />

type <strong>of</strong> attack occurring.<br />

The algorithm then works as follows: The algorithm will calculate the prior probability for each class P(Ci) and<br />

conditional probabilities p(Ai|Cj) for each class. The prior probability <strong>of</strong> the class is computed by counting how<br />

many times the class occur in the dataset. For discrete attribute values Ai, prior probability P(Ai) is calculated <strong>using</strong><br />

the sample mean and for continues values, this is calculated via normal distribution. The conditional probability<br />

P(Ai|Cj) is calculated by counting how many times the attribute comes in the specific class. To classify a new<br />

example we need to find the probability <strong>of</strong> being new example ei in class Cj. To classify the example, the probability<br />

that ei is in a class is the product <strong>of</strong> the conditional probabilities for each attribute value with prior probability for<br />

that class.<br />

P(ei|Cj) = P(Cj) P(Ai|Cj) (5)<br />

The posterior probability P(Cj | ei) is then found for each class and the example classifies with the highest posterior<br />

probability for that example.<br />

VIII.EXPERIMENTAL RESULTS<br />

A. <strong>Dataset</strong><br />

The dataset used in the experiment is KDD99 cup dataset. The dataset has been widely used for the anomaly<br />

detection methods since 1999.The KDD Cup '99[13] dataset was created by processing the tcpdump portions <strong>of</strong> the<br />

1998 DARPA <strong>Intrusion</strong> <strong>Detection</strong> System (IDS) Evaluation dataset, created by Lincoln Lab under contract to<br />

DARPA . KDD training dataset consists <strong>of</strong> approximately 4,900,000 single connection vectors each <strong>of</strong> which<br />

contains 42 features and is labeled as either normal or an attack, with exactly one specific attack type [43].


ISSN 2277-1956/V1N3-1044-1051<br />

IJECSE,Volume1,Number3<br />

Neethu B et al.<br />

The KDD’99 dataset in ARFF (Attribute Relation File Format) format is taken as input, because it is necessary to<br />

have type information about each attribute which cannot be automatically deduced from the attribute values.<br />

B. Evaluation and Results<br />

Each data sample in KDD99 dataset represents attribute value <strong>of</strong> a class in the network data flow, and each class is<br />

labeled either as normal or as an attack. In total, 42 features have been used in KDD99 dataset 26] and each<br />

connection can be categorized into five main classes (normal class and anomaly classes (DOS, Probe, U2r,r2l)).<br />

The input attributes in KDD99 dataset [26] for each network connection that have either discrete or continuous<br />

values and divided into three groups. The first group <strong>of</strong> attributes is the basic features <strong>of</strong> network connection, which<br />

include the duration, prototype, service, number <strong>of</strong> bytes from source IP addresses or from destination IP addresses,<br />

and some flags in TCP connections. The second group <strong>of</strong> attributes in KDD99 is composed <strong>of</strong> the content features <strong>of</strong><br />

network connections and the third group is composed <strong>of</strong> the statistical features that are computed either by a time<br />

window or a window <strong>of</strong> certain kind <strong>of</strong> connections.<br />

The dimensionality reduction process conducted <strong>using</strong> principal component analysis reduces the feature size <strong>of</strong> the<br />

dataset from 42 to 16. The classification step <strong>using</strong> naïve bayes classifier is conducted by <strong>using</strong> the data mining tool<br />

called WEKA (Waikato Environment for Knowledge Analysis). The naïvebayes Updatable classifier in the WEKA<br />

is used for doing the experiment. The experiment is conducted with 40 instances. The result <strong>of</strong> the classification is<br />

given in the figure.<br />

Figure 1: <strong>Classification</strong><br />

The comparison <strong>of</strong> the performance <strong>of</strong> the classification <strong>of</strong> the system with the simple Naïve Bayes Classifier is<br />

given below<br />

Figure 2: Performance Comparison


<strong>Classification</strong> <strong>of</strong> <strong>Intrusion</strong> <strong>Detection</strong> <strong>Dataset</strong> <strong>using</strong> <strong>machine</strong> <strong>learning</strong> Approaches<br />

ISSN 2277-1956/V1N3-1044-1051<br />

1049<br />

Though the system has high false positive rate compared to the system that uses Naïve Bayes classifier only, it has<br />

less memory requirement and high execution speed. The time for execution <strong>of</strong> the two systems is shown in the<br />

figure 3.From the Figure 3 we can see that the execution time <strong>of</strong> the system is comparatively less compared to the<br />

system that uses no dimensionality reduction<br />

Figure 3: Execution Time<br />

The accuracy <strong>of</strong> the system is calculated to be 89%. The detailed accuracy for each class in the proposed system is<br />

given in the figure 4.<br />

Figure 4: Accuracy by Class<br />

IX. CONCLUSION AND FUTURE WORK<br />

The paper described a Network IDS framework that is a combination <strong>of</strong> Naïve Bayes and Principal Component<br />

Analysis algorithm. The framework builds the patterns <strong>of</strong> the network services over data sets labeled by the services.<br />

With the built patterns, the framework detects attacks in the datasets <strong>using</strong> the naïve Bayes Classifier algorithm.<br />

Compared to the neural network based approach, our approach achieve higher detection rate, less time consuming<br />

and has low cost factor. However, it generates somewhat more false positives.<br />

As a naïve Bayesian network is a restricted network that has only two layers and assumes complete independence<br />

between the information nodes. This poses a limitation to this research work. In order to alleviate this problem so as<br />

to reduce the false positives, active platform or event based classification may be thought <strong>of</strong> <strong>using</strong> Bayesian<br />

network. We continue our work in this direction in order to build an efficient intrusion detection model. The<br />

performance evaluation with KDD99 cup benchmark dataset is also done. The result obtained is encouraging as the<br />

approach is faster compared to some existing systems. As a future work, we would like to extent the system to real


ISSN 2277-1956/V1N3-1044-1051<br />

IJECSE,Volume1,Number3<br />

Neethu B et al.<br />

time data capture and online detection <strong>of</strong> intrusions. This approach can be extended to include more types <strong>of</strong> novel<br />

attacks in real time.<br />

REFERRENCES<br />

1. <strong>Intrusion</strong> <strong>Detection</strong> system: host based and Network based IDS,Harley Kozushko<br />

2. Richard Heady, George Luger, Arthur Maccabe, and Mark Servilla, “The Architecture <strong>of</strong> a Network Level <strong>Intrusion</strong><br />

<strong>Detection</strong> System,” Technical report, University <strong>of</strong> New Mexico, 1990.<br />

3. James P. Anderson, “Computer Security Threat Monitoring and Surveillance,” Technical report, James P. Anderson Co.,<br />

Fort Washington, Pennsylvania. April 1980.<br />

4. Dorothy E. Denning, “An <strong>Intrusion</strong> <strong>Detection</strong> Model,” IEEE Transaction on S<strong>of</strong>tware Engineering, SE-13(2), 1987, pp.<br />

222-232.<br />

5. Mukkamala S., Sung A. H. and Abraham A., “<strong>Intrusion</strong> <strong>Detection</strong> <strong>using</strong> Ensemble <strong>of</strong> S<strong>of</strong>t Computing paradigms,” In<br />

Proceedings <strong>of</strong> the 3 rd International Conference on Intelligent Systems Design and Applications, Springer Verlag Germany,<br />

2003, pp. 209-217.<br />

6. Commission <strong>of</strong> the European Communities, “Information Technology Security Evaluation Criteria,” Version 1.1.1991.<br />

7. MIT Lincoln Laboratory, http://www.ll.mit.edu/IST/idaval/<br />

8. Marcus A. Malo<strong>of</strong>, and Ryszard S. Michalski, “Incremental <strong>learning</strong> with partial instance memory,” In Proceedings <strong>of</strong><br />

Foundations <strong>of</strong> Intelligent Systems: 13th International Symposium, ISMIS 2002, volume 2366 <strong>of</strong> Lecture Notes in<br />

Artificial Intelligence, Springer-Verlag, 2002,pp. 16-27.<br />

9. M. M. Sebring, E. Shellhouse, M. E. Hanna, and R. Alan Whitehurst. “Expert systems in intrusion detection: A case<br />

study”, In Proceedings <strong>of</strong> the 11th National Computer Security Conference, Baltimore, Maryland,<br />

10. W.K. Lee, S.J.Stolfo. “A data mining framework for building intrusion detection model”, In: Gong L., Reiter M.K. (eds.):<br />

Proceedings <strong>of</strong> the IEEE Symposium on Security and Privacy. Oakland, CA: IEEE Computer Society Press, 1999.<br />

11. W.K. Lee, et al., "Mining audit data to build intrusion detection models", In Proc. Int. Conf. Knowledge Discovery and<br />

Data Mining (KDD'98), pp.66-72, 1998.<br />

12. J. Ryan, M-J. Lin, R. Miikkulainen. “<strong>Intrusion</strong> detection with neural networks”, In Proceedings <strong>of</strong> AAAI-97 Workshop on<br />

AI Approaches to Fraud <strong>Detection</strong> and Risk Management, 1997.<br />

12. Selecting Features for <strong>Intrusion</strong> <strong>Detection</strong>: A Feature Relevance Analysis on KDD 99 <strong>Intrusion</strong> <strong>Detection</strong> <strong>Dataset</strong>s H.<br />

Güneş Kayacık, A. Nur Zincir-Heywood, Malcolm I. Heywood Dalhousie University, Faculty <strong>of</strong> Computer Science<br />

13. A Detailed Analysis <strong>of</strong> the KDD CUP 99 Data Set ,Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A. Ghorbani<br />

14. N. B.Amor, S.Benferhat, and Z. Elouedi. “Naive Bayes vs decision trees in intrusion detection systems”, In Proc. 2004<br />

ACM Symp. on Applied Computing, 2004.<br />

15. Feature selection <strong>using</strong> principal component analysis ,Fengxi Song, Zhongwei Guo, Dayong Mei ,Department <strong>of</strong><br />

Automation and Simulation New Star Research Inst. <strong>of</strong> Applied Tech. in Hefei City Hefei, China.<br />

16. Correlation-based feature selection for intrusion detection design ,te-shun chou, kang k. yen, and jun luo, department <strong>of</strong><br />

electrical and computer engineering florida international university Miami.<br />

17. A tutorial on principal component analysis, Lindsay I Smith.<br />

18. ADAM:A test bed for exploring the use <strong>of</strong> datamining in intrusion detection SIGMOD, vol30, no.4, pp 15-24, 2001.<br />

19. Tomas Abraham, “IDDM: INTRUSION <strong>Detection</strong> <strong>using</strong> Data Mining Techniques”, Technical report DTSO electronics<br />

and surveillance research laboratory,Sailsbury.<br />

20. Wenke Lee and Salvatore J.Stolfo, "A Framework for constructing features and models for intrusion detection systems”,<br />

ACM transactions on Information and system security (TISSEC), vol.3, Issue 4, Nov 2000.<br />

21. S.chavan, K.Shah, N.Dave, S.Mukherjee, A.Abraham, and S.Sanyal, "Adaptive neuro-fuzzy <strong>Intrusion</strong> detection syatems”,<br />

ITCC, Vol 1, 2004.<br />

22. Z. Zhang, J. Li, C.N. Manikapoulos, J.Jorgenson, J.ucles, "HIDE: A hierarchical network intrusion detection system <strong>using</strong><br />

statistical pre-processing and neural network classification”, IEEE workshop proceedings on Information assurance and<br />

security, 2001, pp.85-90.<br />

23. Roy-I Chang, Liang-Bin Lai, et al, "<strong>Intrusion</strong> detection by back propagation network with sample query and attribute<br />

query”, International Journal <strong>of</strong> computational Intelligence Research,Vol..3, no.1, 2007, pp 6-10.<br />

24. S. Axelsson, "The base rate fallacy and its implications for the difficulty <strong>of</strong> <strong>Intrusion</strong> detection”, Proc. Of 6th.ACM<br />

conference on computer and communication security 1999.<br />

25. R.Puttini, Z.marrakchi, and L. Me, "Bayesian classification model for Real time intrusion detection”, Proc. <strong>of</strong> 22nd.<br />

International workshop on Bayesian inference and maximum entropy methods in science and engineering, 2002.<br />

26. An Introduction to the WEKA Data Mining System,Zdravko Markov Central Connecticut State University<br />

markovz@ccsu.edu ,Ingrid Russell University <strong>of</strong> Hartford.


<strong>Classification</strong> <strong>of</strong> <strong>Intrusion</strong> <strong>Detection</strong> <strong>Dataset</strong> <strong>using</strong> <strong>machine</strong> <strong>learning</strong> Approaches<br />

ISSN 2277-1956/V1N3-1044-1051<br />

1051<br />

27. An evaluation <strong>of</strong> <strong>machine</strong> <strong>learning</strong> techniques in intrusion detection By Christina lee.<br />

28. Dimensionality Reduction by Manoranjan Dash.<br />

29. Wei Fan, “Cost-Sensitive, Scalable and Adaptive Learning <strong>using</strong> Ensemble-based Methods,” PhD thesis, Columbia<br />

University, 2001.<br />

30. M.A. Malo<strong>of</strong> and R.S. Michalski, “A partial memory incremental <strong>learning</strong> methodology and its applications to computer<br />

intrusion detection,” Reports <strong>of</strong> the Machine Learning and Inference Laboratory MLI 95-2, Machine Learning and<br />

Inference Laboratory, George Mason University, 1995.<br />

31. Kenneth A. Kaufman, Guido Cervone, and Ryszard S. Michalski, “An application <strong>of</strong> Symbolic Learning to <strong>Intrusion</strong><br />

<strong>Detection</strong>: Preliminary Result from the LUS Methodology,” Reports <strong>of</strong> the Machine Learning and Inference Laboratory<br />

MLI 03-2, Machine Learning and Inference Laboratory, George Mason University, 2003.<br />

32. V. Venkatechalam and S. selvan, Performance comparison <strong>of</strong> intrusion detection system classification <strong>using</strong> various<br />

feature reduction techniques. International journal <strong>of</strong> simulation. Vol. 9, No. 1, pp.30-39, 2008.<br />

33. M. Panda and M.R.Patra, Network intrusion detection <strong>using</strong> Naïve Bayes . International journal <strong>of</strong> computer science and<br />

network security, Vol. 7, No. 12, pp. 258- 263, 2007.<br />

34. M.Panda and M.R. Patra, Bayesian belief network <strong>using</strong> genetic local search for detecting network intrusions. International<br />

journal <strong>of</strong> secure digital information age (IJSDIA), Vol. 1, No. 1, pp. 34-44, 2009.<br />

35. Sung –Hae Jun and Kyung –whan oh, An evolutionary support vector <strong>machine</strong> for intrusion detection. Asian journal <strong>of</strong><br />

information technology, Vol. 5, No. 7, pp. 778-783, 2006.<br />

36. N. B. Annur, H.Sallehudin, A. Gani and O.zakari. identifying false alarm for network intrusion detection system <strong>using</strong><br />

hybrid data mining decision tree. Malaysian journal <strong>of</strong> computer science, Vol.21, No. 2, pp.101-115, 2008.<br />

37. M. Panda and M.R. Patra. A semi-Naïve Bayesian method for detecting network intrusions. LNCS, Vol. 5863, pp. 614-<br />

621, 2009.<br />

38. P.Gaonjur, N. Z. Tarapore and S.G.Pokale. <strong>using</strong> neuro-fuzzy techniques to reduce false alerts in intrusion detection. In:<br />

Proceedings <strong>of</strong> International conference on Computer Networks and Security, India, pp. 1-6, 2008. IEEE Press.<br />

39. Adaptive <strong>Intrusion</strong> <strong>Detection</strong> based on <strong>machine</strong> <strong>learning</strong> :Feature selection ,Classifier construction and sequential pattern<br />

prediction, Xing Xu<br />

40. Identifying <strong>Intrusion</strong>s In Computer Networks Based On Principal Component Analysis, Wei Wang, Roberto Battiti.<br />

41. A Detailed Analysis <strong>of</strong> the KDD CUP 99 Data Set, Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A. Ghorbani.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!