
The aggregated classifier $C: X \rightarrow Y$ assigns to a point $x$ the class on which most of the $K$ classifiers agree:

$$
C(x) = \arg\max_{y} \sum_{k=1}^{K} \mathbf{1}\!\left(C^k(x) = y\right) \qquad (3.26)
$$

The hope is, thus, that the aggregation of all the classifiers will give better classification results than training a single classifier on the whole dataset.
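As a minimal illustration of Eq. (3.26), the sketch below (not from the original text; the function name is hypothetical) aggregates the predictions of $K$ already-trained classifiers by majority vote:

```python
import numpy as np

def aggregate_vote(predictions):
    """Majority vote over K classifiers, as in Eq. (3.26).

    predictions: array of shape (K, n_points) with the class label
    predicted by each of the K classifiers for every point.
    Returns, for each point, the label receiving the most votes.
    """
    predictions = np.asarray(predictions)
    labels = np.unique(predictions)
    # For each candidate label y, count how many classifiers predict y.
    votes = np.stack([(predictions == y).sum(axis=0) for y in labels])
    # arg max over y of the number of agreeing classifiers.
    return labels[np.argmax(votes, axis=0)]
```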

If the classification method bases its estimation on the average of the data, then for a very heterogeneous dataset with local structures, such an approach may fail to represent these local structures properly unless the classifier is given enough granularity to encapsulate the non-linearities³. A set of classifiers trained on smaller subsets of the data has a better chance of capturing these local structures. However, training classifiers on only a small subset of the data at hand may also degrade performance, as they may fail to extract the generic features of the data (because they see too small a set) and instead focus on noise or on particularities of each subset. This is a common problem in machine learning, and in classification in particular: one must find a tradeoff between generalizing (hence having far fewer parameters than original datapoints) and representing all the local structures in the data.
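To make this tradeoff concrete, here is a short, hypothetical bagging sketch (not part of the original text): each classifier sees only a bootstrap subset of the data, and the predictions are then combined with the majority vote of Eq. (3.26). Decision trees are used purely as an example of a base classifier.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_bagged_classifiers(X, y, K=10, subset_size=None, seed=0):
    """Train K classifiers, each on a bootstrap subset of the data."""
    rng = np.random.default_rng(seed)
    n = len(X)
    subset_size = subset_size or n
    classifiers = []
    for _ in range(K):
        idx = rng.integers(0, n, size=subset_size)   # draw with replacement
        clf = DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx])
        classifiers.append(clf)
    return classifiers

# Usage, combined with the majority vote of Eq. (3.26):
# classifiers = train_bagged_classifiers(X_train, y_train, K=20)
# preds = np.stack([c.predict(X_test) for c in classifiers])
# y_hat = aggregate_vote(preds)
```

A smaller `subset_size` gives each classifier a more local view of the data, at the risk of overfitting the particularities of its subset, which is precisely the tradeoff discussed above.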

3.2.3.2 Boosting / AdaBoost

While bagging creates a set of K classifiers in parallel, boosting creates the classifiers sequentially and uses each previously created classifier to boost the training of the next classifier.

Principle:<br />

A weight is associated with each datapoint of the training set, as well as with each classifier. The weights associated with the datapoints are adapted at each iteration step to reflect how well each datapoint is predicted by the global classifier: the less well a datapoint is classified, the larger its associated weight. This way, poorly classified data are given more influence on the measure of the error and are more likely to be selected to train the new classifier created at each iteration. As a result, they should be better estimated by this new classifier.

Similarly, the classifiers are weighted when combined to form the final classifier, so as to reflect their classification power: the poorer the classification power of a given classifier on the training set associated with it, the less influence that classifier is given in the final classification.
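To make these two kinds of weights concrete, the sketch below implements one round of the standard (discrete) AdaBoost update for labels in {+1, −1}; the function and variable names are illustrative assumptions and not taken from the text.

```python
import numpy as np

def adaboost_round(h, X, y, v):
    """One boosting round.

    h: a trained classifier, callable, returning predictions in {+1, -1}
    v: current datapoint weights (positive, summing to 1)
    Returns the weight alpha of this classifier and the updated datapoint weights.
    """
    pred = h(X)
    eps = np.clip(np.sum(v * (pred != y)), 1e-12, 1 - 1e-12)  # weighted error
    alpha = 0.5 * np.log((1 - eps) / eps)                     # classifier weight
    # Increase the weight of misclassified points, decrease it otherwise.
    v = v * np.exp(-alpha * y * pred)
    v = v / v.sum()                                           # renormalize
    return alpha, v

# In standard AdaBoost, the final classifier is the weighted linear
# combination sign( sum_k alpha_k * C_k(x) ).
```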

Algorithm:<br />

Let us consider a binary classification problem with $X = \{x^i, y^i\}_{i=1}^{M}$, $x^i \in \mathbb{R}^N$, $y^i \in \{+1, -1\}$. Let $v^i$, $i = 1, \dots, M$, be the weights associated with each datapoint. Usually these are uniformly distributed to start with, and hence one builds a first set of $l$ classifiers $C^k$, $k = 1, \dots, l$, by drawing uniformly from the whole set of data points.

The final classifier is composed of a linear combination of the classifiers, so that each data point $x$ is then classified according to the function:

³ For instance, in the case of the multi-layer perceptron, one can add several hidden neurons and achieve optimal performance. However, in this case one may argue that each neuron in the hidden layer is a sort of sub-classifier (especially when using the threshold function as activation function for the output of the hidden neurons). Similarly, in the case of Support Vector Machines, one may increase the number of support vector points until reaching an optimal description of all local non-linearities. In the limit, one may take all points as support vectors.

