MACHINE LEARNING TECHNIQUES - LASA


9.4.2.1 Maximum Likelihood

Machine learning techniques often assume that the form of the distribution function is known and that only its parameters must be optimized to best fit a set of observed datapoints. These parameters are then determined through maximum-likelihood optimization.

The principle of maximum likelihood consists of finding the optimal parameters of a given distribution by maximizing the likelihood function of these parameters or, equivalently, by maximizing the probability of the data given the model and its parameters.

The method of maximum likelihood is a method of point estimation that uses, as an estimate of an unobservable population parameter, the member of the parameter space that maximizes the likelihood function. If $\mu$ is the unobserved parameter and $X = x$ an observed outcome, the likelihood of $\mu$ given this observation is:

$$L_{X=x}(\mu) = P(X = x \mid \mu) \qquad (8.38)$$

The value of $\mu$ that maximizes $L(\mu)$ is the maximum-likelihood estimate of $\mu$. To find the maximum, one computes the derivative of $L$ and sets it to zero: $\frac{\partial L}{\partial \mu} = 0$. However, it is often much simpler to compute the derivative of the logarithm of the likelihood function, the log-likelihood.
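As a worked example (a standard derivation, assuming $N$ i.i.d. samples $x_1, \ldots, x_N$ from a Gaussian with known variance $\sigma^2$; this example is not part of the original text), the logarithm turns the product of densities into a sum, and setting the derivative to zero recovers the sample mean:

$$\log L(\mu) = \sum_{i=1}^{N} \log p(x_i \mid \mu) = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2 + \text{const}$$

$$\frac{\partial \log L}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{N} (x_i - \mu) = 0 \;\Rightarrow\; \hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i$$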

Let $X = \{x_1, x_2, \ldots, x_N\}$ be the dataset of observed instances of the variable $X$, generated by a distribution parameterized by the unknown $\mu$, i.e., $p(x \mid \mu)$. $\mu$ can be estimated by maximum likelihood:

$$\hat{\mu} = \arg\max_{\mu} \, p(X \mid \mu) = \arg\max_{\mu} \prod_{i=1}^{N} p(x_i \mid \mu) \qquad (8.39)$$

In most problems, it is not possible to find an analytical expression for $\hat{\mu}$. One then has to estimate the parameter through an iterative procedure called EM.
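To illustrate Eq. (8.39) in practice, here is a minimal numerical sketch (not from the original text) that maximizes the log-likelihood of a 1-D Gaussian with a generic optimizer. The Gaussian is chosen only because its closed-form estimates (sample mean and standard deviation) make the result easy to check; all names and the synthetic data are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Observed dataset {x_1, ..., x_N}, drawn here from a known Gaussian
rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=1.5, size=500)

# Negative log-likelihood of the parameters given the data (Eq. 8.39);
# the log of the product over samples becomes a sum of log-densities.
def neg_log_likelihood(params, data):
    mean, log_std = params  # optimize log(std) so the std stays positive
    return -np.sum(norm.logpdf(data, loc=mean, scale=np.exp(log_std)))

result = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(X,))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(mu_hat, sigma_hat)      # close to the sample mean and std
print(X.mean(), X.std())      # closed-form maximum-likelihood estimates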

9.4.2.2 EM-Algorithm

EM is an algorithm for finding maximum-likelihood estimates of the parameters of probabilistic models in which the model depends on unobserved latent variables. EM alternates between an expectation (E) step, which computes the expected value of the latent variables, and a maximization (M) step, which computes the maximum-likelihood estimates of the parameters given the data, with the latent variables set to their expected values.

A basic intuition of the EM-algorithm goes as follows (a numerical sketch follows the list):

• Guess an initial $\hat{\mu}$. (Initialization)
• Using the current $\hat{\mu}$, obtain the expectation of the complete-data likelihood $L(\mu)$. (E-step)
• Find (and update) $\hat{\mu}$ to maximize this expectation. (M-step)
• Alternate the E- and M-steps until convergence.
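The following is a minimal numerical sketch of this loop (not from the original text), applied to a two-component 1-D Gaussian mixture, where the unobserved latent variable of each point is the component that generated it. The choice of model and all names here are illustrative assumptions.

import numpy as np
from scipy.stats import norm

# Synthetic data from two Gaussians; the component labels are the
# unobserved latent variables that EM reasons about.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

# Initialization: guess initial parameters (means, stds, mixing weights)
means = np.array([-1.0, 1.0])
stds = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

for _ in range(100):
    # E-step: expected responsibility of each component for each point
    dens = weights * norm.pdf(X[:, None], loc=means, scale=stds)  # (N, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: parameters maximizing the expected complete-data likelihood
    Nk = resp.sum(axis=0)
    means = (resp * X[:, None]).sum(axis=0) / Nk
    stds = np.sqrt((resp * (X[:, None] - means) ** 2).sum(axis=0) / Nk)
    weights = Nk / len(X)

print(means, stds, weights)  # approach (-2, 3), (1, 0.5), (0.6, 0.4)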

