JST Vol. 21 (1) Jan. 2013 - Pertanika Journal - Universiti Putra ...

More documents

Recommendations

Info

Tianyong Hao and Yingying Qu The structural pattern of the example sentence is “the(gov [N1]) [N1](gov be) of(gov N1) [N2](gov N1) be(gov fin) [N3](gov be)”. Therefore, the pattern can be matched with the whole pattern set. Suppose there is a frequent pattern matched and its covered sentences include “The benefits of red wine are reducing coronary heart diseases, maintains the immune system, polyphenols, resveratrol, flavonoids, anti-bacterial activity, anti-stress”. The relation is then applied in the sentence, and thus, a relation is learned automatically, as follows: By this way, all the possible relations related to the concepts in our domain sentences could be acquired, especially using the previous semantic bank. Therefore, the semantic annotations of concepts with their relations can be used to enrich or built a domain specific knowledge base. PRELIMINARY EXPERIMENTS In this study, the automatic knowledge acquisition method was implemented in a Windowsform application. The domain was selected as “heart diseases” from Yahoo! Answers (2012) as of 10/25/2007, which was used as corpus since it contained huge amount of data (4,483,032 items) and also tagged with rated answers. These human-based answers are favourable domain text source. Furthermore, the dataset contained a large amount of metadata, i.e., which answer was selected as the best answer, as well as the category and sub-category that were assigned to this question. Therefore, the questions with their best answers were selected in the “Heart Diseases” category as our domain dataset, containing 19,622 items in total. The example data in the domain are shown in Fig.3. After pre-processing the corpus into sentences, Minipar was applied to automatically identify the concepts and dependency relations. In this experiment, the focus was mainly placed on recognizing nouns and noun phrases, in which the latter has higher priority since the noun phrases can convey more specific semantic information. After that, the system extracted all the possible structural patterns. With the sequence-based Apriori algorithm, the frequent patterns are mined and extracted. WordNet is further applied to annotate the nouns and nouns phrases as the format of unit annotation and context annotation. In order to annotate a sentence more accurately, all the semantic labels and items were recorded with their occurrences into a semantic bank. These statistical data were further used to assist annotation on specific sentences. 178 Pertanika J. Sci. & Technol. 21 (1): 283 - 298 (2013)
Toward Automatic Semantic Annotating and Pattern Mining for Domain Knowledge Acquisition Fig. 3: Question Examples in “Heart Diseases” category Fig.4: The extracted frequent patterns by different support values From a total of 19,622 question items, 10.3 words were allocated for each question on average. Using the proposed pattern extraction method, 56,118 structural patterns were finally extracted in total, indicating that 2.86 patterns for each sentence on average. With different values of the support threshold, the number of the extracted frequent patterns changed dramatically, and the distribution is shown in Fig. 4. Since it was difficult to find annotated knowledge resources in this domain as the training data, 100 sentences were randomly selected from the domain dataset and they were also manually annotated for this preliminary experiment. On these training data, the structural patterns and concepts were firstly extracted automatically by sentence labelling. After that, all the concepts and related semantic labels that are automatically annotated by the semantic bank and WordNet were randomly selected. The relations in the training data were also annotated and applied into the sentences where the matched patterns had been generated from. Based on the training data, there were a total of 272 structural patterns extracted. With regard to the pattern matching on all the other sentences in the whole dataset, 1,929 new concepts and 632 relations were finally acquired in total, and the results are shown in Table 4. The preliminary Pertanika J. Sci. & Technol. 21 (1): 283 - 298 (2013) 179
Page 2 and 3:
Journal of Science & Technology Jou
Page 4 and 5:
International Advisory Board Adarsh
Page 6 and 7:
ut bovine milk allergy by far is th
Page 8 and 9:
Selected Articles from UPM-Malaysia
Page 11 and 12:
ISSN: 0128-7680 © 2013 Universiti
Page 13 and 14:
Aspect of Fatigue Analysis of Compo
Page 15 and 16:
Page 17 and 18:
Page 19 and 20:
Page 21 and 22:
Page 23 and 24:
Page 25 and 26:
Page 27 and 28:
Application of Anthropometric Dimen
Page 29 and 30:
Page 31 and 32:
Page 33 and 34:
Page 35 and 36:
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
Different Media Formulation on Bioc
Page 43 and 44:
Page 45:
Page 48 and 49:
Heng, S. C., Ibrahim, Z. B., Suleim
Page 50 and 51:
Page 52 and 53:
Page 54 and 55:
CONCLUSION Heng, S. C., Ibrahim, Z.
Page 56 and 57:
Y. Yusuf, J. M. Juoi, Z. M. Rosli,
Page 58 and 59:
Page 60 and 61:
Page 62 and 63:
Surface Roughness Y. Yusuf, J. M. J
Page 64 and 65:
DISCUSSION Y. Yusuf, J. M. Juoi, Z.
Page 66 and 67:
Page 69 and 70:
Page 71 and 72:
Modelling of Motion Resistance Rati
Page 73 and 74:
The Empirical Method Modelling of M
Page 75 and 76:
Page 77 and 78:
Page 79 and 80:
Page 81 and 82:
Page 83 and 84:
Page 85 and 86:
Page 87 and 88:
Assessment of Heavy Metal Pollution
Page 89 and 90:
TABLE 1: Size of mussels in average
Page 91 and 92:
Page 93 and 94:
TABLE 5: Heavy metal concentrations
Page 95 and 96:
Page 97 and 98:
Page 99 and 100:
Page 101 and 102:
Page 103 and 104:
Page 105 and 106:
Page 107 and 108:
Page 109 and 110:
Physico-Chemical and Electrical Pro
Page 111 and 112:
Page 113 and 114:
Page 115 and 116:
TABLE 1: (continued) Physico-Chemic
Page 117 and 118:
Page 119:
Page 122 and 123:
Norhashila Hashim et al. effectiven
Page 124 and 125:
Statistical Analysis Norhashila Has
Page 126 and 127:
Norhashila Hashim et al. Post-hoc a
Page 128 and 129:
Norhashila Hashim et al. Cubero, S.
Page 130 and 131:
S. Ismail and K. A. Mohd Atan 4 4 p
Page 132 and 133:
S. Ismail and K. A. Mohd Atan The f
Page 134 and 135:
Lemma 1.1 The equation S. Ismail an
Page 136 and 137:
CONCLUSION S. Ismail and K. A. Mohd
Page 138 and 139: INTRODUCTION Tajau, R. et al. A hig
Page 140 and 141: Tajau, R. et al. the acrylate’s c
Page 142 and 143: Tajau, R. et al. double bond) and C
Page 144 and 145: Tajau, R. et al. Ulanski, P., & Ros
Page 146 and 147: Aji, I. S., Zinudin, E. S., Khairul
Page 148 and 149: Aji, I. S., Zinudin, E. S., Khairul
Page 150 and 151: CONCLUSION Aji, I. S., Zinudin, E.
Page 152 and 153: D. Bachtiar, S. M. Sapuan, E. S. Za
Page 156 and 157: TABLE 1: (continue) 4 D. Bachtiar,
Page 162 and 163: N. Maizatul, I. Norazowa, W. M. Z.
Page 168 and 169: REFERENCES N. Maizatul, I. Norazowa
Page 171 and 172: ISSN: 0128-7680 © 2013 Universiti
Page 173 and 174: CBIR Using Colour and Shape Fused F
Page 181 and 182: Toward Automatic Semantic Annotatin
Page 187: Toward Automatic Semantic Annotatin
Page 195 and 196: Issues on Trust Management in Wirel
Page 199 and 200: Trust Value MF-ID 1 0.8 0.6 0.4 0.2
Page 205 and 206: A Negation Query Engine for Complex
Page 213 and 214: REFERENCES A Negation Query Engine
Page 217 and 218: The Problem Association Rule Mining
Page 219 and 220: Association Rule Mining threshold t
Page 221 and 222: Association Rule Mining must be sor
Page 223 and 224: Association Rule Mining On the othe
Page 225: Association Rule Mining Quinlan, J.
Page 228 and 229: Che Mustapha Yusuf, J., Mohd Su’u
Page 238 and 239:
Az Azrinudin Alidin and Fabio Crest
Page 240 and 241:
Page 242 and 243:
Page 244 and 245:
Page 246 and 247:
Page 248 and 249:
Page 250 and 251:
Yogan Jaya Kumar, Naomie Salim, Ahm
Page 252 and 253:
Page 254 and 255:
Page 256 and 257:
REFERENCES Yogan Jaya Kumar, Naomie
Page 258 and 259:
Hasiah Mohamed@Omar, Azizah Jaafar
Page 260 and 261:
Page 262 and 263:
Page 264 and 265:
Page 266 and 267:
Page 268 and 269:
Page 271 and 272:
Page 273 and 274:
The Role of Similarity Measurement
Page 275 and 276:
Page 277 and 278:
Page 279 and 280:
Page 281 and 282:
TABLE 5: Winners MRS The Role of Si
Page 283:
REFEREES FOR THE PERTANIKA JOURNAL
Page 286 and 287:
Guidelines for Authors Publication
Page 288 and 289:
table should be prepared on a separ
Page 290 and 291:
The Journal’s review process What
Page 293 and 294:
Usability of Educational Computer G
show all

JST Vol. 21 (1) Jan. 2013 - Pertanika Journal - Universiti Putra ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?