MACHINE LEARNING TECHNIQUES - LASA


01.11.2014

1 Introduction
1.1 What is Machine Learning? - Definitions
1.1.1 ML Resources
1.2 What is Learning?
1.2.1 Taxonomy of Learning Algorithms
1.2.2 Other important terms in machine learning
1.2.3 Key features for a good learning system
1.2.4 Exercise
1.3 Best Practices in ML
1.3.1 Training, validation and testing sets
1.3.2 Crossvalidation
1.3.3 Performance Measures in ML
1.3.4 In Practice
1.4 Focus of this course

2 Methods for Correlation Analysis: PCA, CCA, ICA
2.1 Principal Component Analysis
2.1.1 Dimensionality Reduction
2.1.2 Solving PCA as an optimization under constraint problem
2.1.3 PCA limitations
2.1.4 Projection Pursuit
2.1.5 Probabilistic PCA
2.2 Canonical Correlation Analysis
2.2.1 CCA for more than two variables
2.2.2 Limitations
2.3 Independent Component Analysis
2.3.1 Illustration of ICA
2.3.2 Why Gaussian variables are forbidden
2.3.3 Definition of ICA
2.3.4 Whitening
2.3.5 ICA Ambiguities
2.3.6 ICA by maximizing non-gaussianity
2.4 Further Readings

3 Clustering and Classification
3.1 Clustering Techniques
3.1.1 Hierarchical Clustering
3.1.2 K-means clustering
3.1.3 Soft K-means
3.1.4 Clustering with Mixtures of Gaussians
3.1.5 Gaussian Mixture Models
3.2 Linear Classifiers
3.2.1 Linear Discriminant Analysis
3.2.2 Fisher Linear Discriminant
3.2.3 Mixture of linear classifiers (boosting and bagging)
3.3 Bayes Classifier
3.4 Linear classification with Gaussian Mixture Models
3.5 Further Readings

4 Regression Techniques
4.1 Linear Regression
4.2 Partial Least Square Methods
4.3 Probabilistic Regression
4.4 Gaussian Mixture Regression
4.4.1 One Gaussian Case
4.4.2 Multi-Gaussian Case

5 Kernel Methods
5.1 The kernel trick
5.2 Which kernel, when?
5.3 Kernel PCA
5.4 Kernel CCA
5.5 Kernel ICA
5.6 Kernel K-Means
5.7 Support Vector Machines
5.7.1 Support Vector Machine for Linearly Separable Datasets
5.7.2 Support Vector Machine for Non-linearly Separable Datasets
5.7.3 Non-Linear Support Vector Machines
5.7.4 ν-SVM
5.8 Support Vector Regression
5.8.1 ν-SVR
5.9 Gaussian Process Regression
5.9.1 What is a Gaussian Process
5.9.2 Equivalence of Gaussian Process Regression and Gaussian Mixture Regression
5.9.3 Curse of dimensionality, choice of hyperparameters
5.10 Gaussian Process Classification

6 Artificial Neural Networks
6.1 Applications of ANN
6.2 Biological motivation
6.2.1 The Brain as an Information Processing System
6.2.2 Neural Networks in the Brain
6.2.3 Neurons and Synapses
6.2.4 Synaptic Learning
6.2.5 Summary
6.3 Perceptron
6.3.1 Learning rule for the Perceptron
6.3.2 Information Theory and the Neuron
6.4 The Backpropagation Learning Rule
6.4.1 The Adaline
6.4.2 The Backpropagation Network
6.4.3 The Backpropagation Algorithm
6.5 Willshaw net
6.6 Hebbian Learning

© A.G.Billard 2004 – Last Update March 2011

