Abstract-Band - Fakultät für Informatik, TU Wien - Technische ...
... enough to model functional dependencies in DLs. We therefore extend
pIdCs to tree-based identification constraints. Using these,
we investigate redundancies in DL KBs. The main contribution of this thesis is
a definition of the Description Logic Normal Form (DLNF), which is a faithful
extension of BCNF to DLs. In addition, we present a direct mapping
from relational schemas to DL KBs. We show that a relational
schema is in BCNF if and only if the direct mapping
of this schema to a DL KB is in DLNF.
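
The relational side of this equivalence can be illustrated with a small check. The following is a minimal sketch in Python, not the method of the thesis: it tests one relation schema against a given set of functional dependencies using attribute closure. The names `closure` and `is_bcnf` and the example schema R(A, B, C) are illustrative assumptions. DLNF and the direct mapping to DL KBs are not defined in this abstract and are not modelled here.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each a frozenset of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already implied, so is the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(schema, fds):
    """Check BCNF for a single relation: every non-trivial given FD
    must have a superkey of the schema as its left-hand side."""
    schema = set(schema)
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, never a violation
        if not schema <= closure(lhs, fds):
            return False  # lhs is not a superkey
    return True


# Illustrative schema R(A, B, C); the dependencies are made up for the example.
R = {"A", "B", "C"}
key_only = [(frozenset("A"), frozenset("BC"))]            # A -> BC
transitive = [(frozenset("A"), frozenset("B")),            # A -> B
              (frozenset("B"), frozenset("C"))]            # B -> C

print(is_bcnf(R, key_only))
print(is_bcnf(R, transitive))
```

In the second example, B determines C without being a superkey of R, which is the kind of redundancy BCNF rules out.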
Iraklis Kordomatis
Machine Learning Algorithms for Visual Pattern Detection on Web Pages
Degree programme: Master's programme Information & Knowledge Management
Supervisor: Univ.Prof. Dr. Reinhard Pichler
This thesis tackles the question of how to robustly identify web objects across
different sites. TAMCROW introduces a novel approach that exploits visually
perceivable characteristics of a web object and its surrounding objects. This approach
is entirely independent of textual labels and hence has the noteworthy
advantage of being language-agnostic. Another main advantage of the visual
detection approach is sample parsimony: fewer examples are required for the
learning process to learn how to find certain web objects on previously
unknown pages. Moreover, visual cues are crucial for human perception
and, as a consequence, for the usability of a web page. Web
designers therefore create web pages that are coherent with human perception
in order to achieve high usability. Supervised machine learning techniques are
applied to the object identification process. The knowledge is limited to features
representing the visual appearance of the different web objects. An additional
question is whether the role of a web object can be predicted from its visual
appearance, which is formally a classification problem. Within the scope of this
master's thesis, the following machine learning techniques are investigated in
detail: logistic regression, k-nearest-neighbor, classification trees (in particular,
Quinlan's C4.5) and support vector machines. The evaluation results,
presented in the evaluation results chapter, indicate that the approach
developed within the TAMCROW project is very fruitful. The workflow for web
object identification is evaluated in different scenarios. These scenarios
include searches for train, bus and flight connections as well as for accommodations.
K-page cross-validation is used as the evaluation technique. Mean
precision is chosen as the performance measure, since it best fits the
scenarios used. The results are significant for all classification techniques.
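
To make the evaluation setup more concrete, the following is a minimal Python sketch under stated assumptions, not the TAMCROW implementation. It assumes that page-wise cross-validation means holding out all objects of one page per fold, uses a plain k-nearest-neighbor classifier as one of the named techniques, and averages precision over the folds. The feature vectors (width and height in pixels), the labels and all sample values are invented for illustration, as are the function names.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training samples closest to `query`.

    `train` is a list of (features, label) pairs.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def precision(y_true, y_pred, positive):
    """Fraction of predictions of `positive` that are correct (0.0 if none)."""
    hits = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not hits:
        return 0.0
    return sum(t == positive for t in hits) / len(hits)


def page_wise_mean_precision(samples, positive, k=3):
    """Hold out one page per fold and average precision over all folds.

    `samples` is a list of (page_id, features, label) triples.
    """
    pages = sorted({page for page, _, _ in samples})
    scores = []
    for held_out in pages:
        train = [(f, l) for p, f, l in samples if p != held_out]
        test = [(f, l) for p, f, l in samples if p == held_out]
        y_pred = [knn_predict(train, f, k) for f, _ in test]
        y_true = [l for _, l in test]
        scores.append(precision(y_true, y_pred, positive))
    return sum(scores) / len(scores)


# Invented toy data: (page, (width_px, height_px), role of the web object).
samples = [
    ("page1", (200, 30), "input"), ("page1", (210, 28), "input"),
    ("page1", (80, 30), "button"),
    ("page2", (190, 32), "input"), ("page2", (205, 30), "input"),
    ("page2", (75, 28), "button"),
    ("page3", (195, 29), "input"), ("page3", (215, 31), "input"),
    ("page3", (85, 32), "button"),
]

print(page_wise_mean_precision(samples, positive="input", k=3))
```

Splitting by page rather than by individual object keeps objects from the same page out of both the training and the test set, which matches the stated goal of finding objects on previously unknown pages.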