JST Vol. 21 (1) Jan. 2013 - Pertanika Journal - Universiti Putra ...

More documents

Recommendations

Info

Yuhanis Yusof and Mohammed Hayel Refai To explain the CuA discovery of rules and classifier development, consider data shown in Table 1. The table consists of two attributes, A 1 (a 1, b 1, c 1) and A 2 (a 2, b 2, c 2), and two class labels (l 1, l 2). Assume that Minsupp = 30% and Minconf = 80%, the frequent one, and two items for data depicted in Table 1 are shown in Table 2, along with the associated support and confidence values. The frequent items (in bold) in Table 2 denote those that pass the confidence and support thresholds, which are later converted into rules. Finally, the classifier is constructed using a subset of these rules. The support threshold is the key to success in CuA. However, for certain applications, rules with large confidence value are ignored because they do not have enough support. Traditional CuA algorithms such as CBA (Liu et al., 1998) and MCAR (Thabtah & Cowling, 2007) use one support threshold to control the number of rules derived and may be unable to capture high confidence rules that have low support. In order to explore a large search space and to capture as many high confidence rules as possible, such algorithms commonly tend to set a very low support threshold, which may lead to problems such as generating low statistically support rules and a large number of candidate rules which require high computational time and storage. In response to these issues, one approach that suspends the support and uses only the confidence threshold to rule generation has been proposed by Wang et al. (2001). This confidence-based approach aims to extract all the rules in a dataset that passes the Minconf threshold. Other approaches are to extend some of the existing algorithms such as CBA and CMAR. These extensions resulted in a new approach called multiple supports (Baralis et al., 2008) that considers class distribution frequencies in a dataset and also assigns a different support threshold to each class. This assignment is done by distributing the global support TABLE 1: Training Dataset RowNo Attribute 1 Attribute 2 class 1 a 1 a 2 l 1 2 a 1 a 2 l 2 3 a 1 b 2 l 1 4 a 1 b 2 l 2 5 b 1 b 2 l 2 6 b 1 a 2 l 1 7 a 1 b 2 l 1 8 a 1 a 2 l 1 9 c 1 c 2 l 2 10 a 1 a 2 l 1 TABLE 2: Potential Classifier for Data in Table 1 Frequent Items Rule Condition Rule class Supp Conf l 1 4/10 4/5 l 1 5/10 5/7 208 Pertanika J. Sci. & Technol. 21 (1): 283 - 298 (2013)
Association Rule Mining threshold to each class corresponding to its number of frequencies in the training dataset, and thus, considers the generation of rules for the class labels with low frequencies in the training data. An approach called MCAR (Thabtah, 2005) that employs vertical mining was proposed, in which during the first training data scan, the frequent items of size one are determined and their appearances in the training data (rowIds) are stored inside an array in a vertical format. Any item that fails to pass the support threshold is discarded. The rowIds hold useful information that can be used laterally during the training step in order to compute the support threshold by intersecting the rowIds of any disjoint items of the same size. It should be noted that the proposed algorithm of MMCAR also utilises vertical mining in discovering and generating the rules. Next section describes the proposed algorithm based on an example. MATERIAL AND METHODS MMCAR goes through three main phases: Training, Construction of classifier, and Forecasting of new cases as shown in Fig.1. During the first phase, it scans the input data set to find frequent items in the form of size 1. These items are called one-item. The algorithm repeatedly joins them to produce frequent two-items, and so forth. It should be noted that any item that appears in the input dataset less than the MinSupp threshold will be discarded. Once all the frequent items of all sizes are discovered, the algorithm will check for the items confidence values. Only if the confidence value is larger than the MinConf threshold, it will then become a CAR. Otherwise, the item gets deleted. The next step is to sort the rules according to certain measures and to choose a subset of the complete set of CARs to form the classifier. Details on the proposed MMCAR are given in the next subsections. CARs Discovery and Production MMCAR uses an intersection method based on the so-called Tid-list to compute the support and confidence values of the item values. The Tid-list of an item represents the number of rows in the training dataset in which an item has occurred. Thus, by intersecting the Tid-lists of two disjoint items, the resulting set denotes the number of rows, in which the new resulting item has appeared in the training dataset, and the cardinality of the resulting set represents the new item support value. The proposed algorithm goes over the training dataset only once to count the frequencies of one-items, from which it discovers those that pass the MinSupp. During the scan, the frequent one-items are determined, and their appearances in the input data (Tid-lists) are stored inside a data structure in a vertical format. Meanwhile, any item that does not pass the MinSupp will be removed. Then, the Tid-lists of the frequent one-item are used to produce the candidate twoitem by simply intersecting the Tid-lists of any two disjoint one-items. Consider for instance, the frequent attribute values (size 1) (, l 1) and (, l 1) that are shown in Table 2 can be utilised to produce the frequent item (size 2) (, l 1) by intersecting their Tid-lists, i.e. (1,3,7,8,10) and (1,6,8,10) within the training dataset (Table 1). The result of the above intersection is the set (1,8,10) in which its cardinality equals 3, and denotes the support value of the new attribute value (, l 1). Now, since this attribute value support is larger than Pertanika J. Sci. & Technol. 21 (1): 283 - 298 (2013) 209
Page 2 and 3:
Journal of Science & Technology Jou
Page 4 and 5:
International Advisory Board Adarsh
Page 6 and 7:
ut bovine milk allergy by far is th
Page 8 and 9:
Selected Articles from UPM-Malaysia
Page 11 and 12:
ISSN: 0128-7680 © 2013 Universiti
Page 13 and 14:
Aspect of Fatigue Analysis of Compo
Page 15 and 16:
Page 17 and 18:
Page 19 and 20:
Page 21 and 22:
Page 23 and 24:
Page 25 and 26:
Page 27 and 28:
Application of Anthropometric Dimen
Page 29 and 30:
Page 31 and 32:
Page 33 and 34:
Page 35 and 36:
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
Different Media Formulation on Bioc
Page 43 and 44:
Page 45:
Page 48 and 49:
Heng, S. C., Ibrahim, Z. B., Suleim
Page 50 and 51:
Page 52 and 53:
Page 54 and 55:
CONCLUSION Heng, S. C., Ibrahim, Z.
Page 56 and 57:
Y. Yusuf, J. M. Juoi, Z. M. Rosli,
Page 58 and 59:
Page 60 and 61:
Page 62 and 63:
Surface Roughness Y. Yusuf, J. M. J
Page 64 and 65:
DISCUSSION Y. Yusuf, J. M. Juoi, Z.
Page 66 and 67:
Page 69 and 70:
Page 71 and 72:
Modelling of Motion Resistance Rati
Page 73 and 74:
The Empirical Method Modelling of M
Page 75 and 76:
Page 77 and 78:
Page 79 and 80:
Page 81 and 82:
Page 83 and 84:
Page 85 and 86:
Page 87 and 88:
Assessment of Heavy Metal Pollution
Page 89 and 90:
TABLE 1: Size of mussels in average
Page 91 and 92:
Page 93 and 94:
TABLE 5: Heavy metal concentrations
Page 95 and 96:
Page 97 and 98:
Page 99 and 100:
Page 101 and 102:
Page 103 and 104:
Page 105 and 106:
Page 107 and 108:
Page 109 and 110:
Physico-Chemical and Electrical Pro
Page 111 and 112:
Page 113 and 114:
Page 115 and 116:
TABLE 1: (continued) Physico-Chemic
Page 117 and 118:
Page 119:
Page 122 and 123:
Norhashila Hashim et al. effectiven
Page 124 and 125:
Statistical Analysis Norhashila Has
Page 126 and 127:
Norhashila Hashim et al. Post-hoc a
Page 128 and 129:
Norhashila Hashim et al. Cubero, S.
Page 130 and 131:
S. Ismail and K. A. Mohd Atan 4 4 p
Page 132 and 133:
S. Ismail and K. A. Mohd Atan The f
Page 134 and 135:
Lemma 1.1 The equation S. Ismail an
Page 136 and 137:
CONCLUSION S. Ismail and K. A. Mohd
Page 138 and 139:
INTRODUCTION Tajau, R. et al. A hig
Page 140 and 141:
Tajau, R. et al. the acrylate’s c
Page 142 and 143:
Tajau, R. et al. double bond) and C
Page 144 and 145:
Tajau, R. et al. Ulanski, P., & Ros
Page 146 and 147:
Aji, I. S., Zinudin, E. S., Khairul
Page 148 and 149:
Aji, I. S., Zinudin, E. S., Khairul
Page 150 and 151:
CONCLUSION Aji, I. S., Zinudin, E.
Page 152 and 153:
D. Bachtiar, S. M. Sapuan, E. S. Za
Page 154 and 155:
Page 156 and 157:
TABLE 1: (continue) 4 D. Bachtiar,
Page 158 and 159:
Page 160 and 161:
Page 162 and 163:
N. Maizatul, I. Norazowa, W. M. Z.
Page 164 and 165:
Page 166 and 167:
Page 168 and 169: REFERENCES N. Maizatul, I. Norazowa
Page 171 and 172: ISSN: 0128-7680 © 2013 Universiti
Page 173 and 174: CBIR Using Colour and Shape Fused F
Page 181 and 182: Toward Automatic Semantic Annotatin
Page 195 and 196: Issues on Trust Management in Wirel
Page 199 and 200: Trust Value MF-ID 1 0.8 0.6 0.4 0.2
Page 205 and 206: A Negation Query Engine for Complex
Page 213 and 214: REFERENCES A Negation Query Engine
Page 217: The Problem Association Rule Mining
Page 221 and 222: Association Rule Mining must be sor
Page 223 and 224: Association Rule Mining On the othe
Page 225: Association Rule Mining Quinlan, J.
Page 228 and 229: Che Mustapha Yusuf, J., Mohd Su’u
Page 238 and 239: Az Azrinudin Alidin and Fabio Crest
Page 250 and 251: Yogan Jaya Kumar, Naomie Salim, Ahm
Page 256 and 257: REFERENCES Yogan Jaya Kumar, Naomie
Page 258 and 259: Hasiah Mohamed@Omar, Azizah Jaafar
Page 268 and 269:
Hasiah Mohamed@Omar, Azizah Jaafar
Page 271 and 272:
Page 273 and 274:
The Role of Similarity Measurement
Page 275 and 276:
Page 277 and 278:
Page 279 and 280:
Page 281 and 282:
TABLE 5: Winners MRS The Role of Si
Page 283:
REFEREES FOR THE PERTANIKA JOURNAL
Page 286 and 287:
Guidelines for Authors Publication
Page 288 and 289:
table should be prepared on a separ
Page 290 and 291:
The Journal’s review process What
Page 293 and 294:
Usability of Educational Computer G
show all

JST Vol. 21 (1) Jan. 2013 - Pertanika Journal - Universiti Putra ...

Create successful ePaper yourself

Delete template?

Save as template?