Gradient Descent and the Nelder-Mead Simplex Algorithm
Optimisation and Search:
Gradient Descent and the Nelder-Mead Simplex Algorithm
Why minimise a function numerically?

[Figure: a black box takes inputs a and b and produces an output y = f(a,b) — the function itself is unknown.]
Why minimise a function numerically?
Background: linear regression

Straight line: f(x) = α₁x + α₂

Error εᵢ between f(xᵢ) given by the model and yᵢ from the data:

    εᵢ(α₁, α₂) = f(xᵢ) − yᵢ = α₁xᵢ + α₂ − yᵢ

Task: find the parameters α₁ and α₂ that minimise the sum of squared errors:

    E(α₁, α₂) = Σᵢ₌₁ᴺ εᵢ(α₁, α₂)² = Σᵢ₌₁ᴺ (α₁xᵢ + α₂ − yᵢ)²
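As a concrete check of the definition above, here is a minimal sketch that evaluates E(α₁, α₂) for the straight-line model; the data points are made-up illustrative values:

```python
# Sum of squared errors E(a1, a2) = sum_i (a1*x_i + a2 - y_i)**2
# for the straight-line model f(x) = a1*x + a2.

def sum_squared_errors(a1, a2, xs, ys):
    """Evaluate E(a1, a2) over the data points (xs, ys)."""
    return sum((a1 * x + a2 - y) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                    # lies exactly on y = 2x + 1

print(sum_squared_errors(2.0, 1.0, xs, ys))  # perfect fit: E = 0
print(sum_squared_errors(1.0, 0.0, xs, ys))  # worse fit: larger E
```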
Why minimise a function numerically?
Non-linear regression

• Linear regression:
  ▫ The fitting function is linear with respect to the parameters
  → can be solved analytically (see Wikipedia)
• Non-linear regression:
  ▫ The fitting function is non-linear with respect to the parameters (e.g. f(x, α₁, α₂) = sin(α₁x) + cos(α₂x))
  → often no analytical solution
  → numerical optimisation or direct search
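The analytical solution mentioned above can be sketched for the straight-line case: setting the partial derivatives of E with respect to α₁ and α₂ to zero gives the normal equations, solved here in closed form (the data values are illustrative assumptions):

```python
# Closed-form least-squares fit of a straight line f(x) = a1*x + a2,
# obtained from dE/da1 = 0 and dE/da2 = 0 (the normal equations).

def fit_line(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    a2 = (sy - a1 * sx) / n                         # intercept
    return a1, a2

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]        # roughly y = 2x + 1 with noise
a1, a2 = fit_line(xs, ys)
print(a1, a2)
```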
Gradient Descent: Example

[Figure: surface plot of the error E(α₁, α₂) over the (α₁, α₂) plane.]
Gradient Descent
1. Choose initial parameters α₁ and α₂
2. Calculate the gradient of E at the current parameters
3. Step in the direction of the negative gradient (downhill), with a step size proportional to the magnitude of the gradient
   → you get new parameters α₁ and α₂
4. Check whether the parameters have changed by more than a certain threshold
5. If yes, go to 2; otherwise terminate
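The steps above can be sketched as follows for the linear-regression error E(α₁, α₂); the learning rate, threshold, and data points are illustrative assumptions, not values from the slides:

```python
# Gradient descent on E(a1, a2) = sum_i (a1*x_i + a2 - y_i)**2.

def gradient(a1, a2, xs, ys):
    """Partial derivatives of E with respect to a1 and a2."""
    g1 = sum(2 * (a1 * x + a2 - y) * x for x, y in zip(xs, ys))
    g2 = sum(2 * (a1 * x + a2 - y) for x, y in zip(xs, ys))
    return g1, g2

def gradient_descent(xs, ys, a1=0.0, a2=0.0, lr=0.01, tol=1e-9, max_iter=100_000):
    for _ in range(max_iter):
        g1, g2 = gradient(a1, a2, xs, ys)      # step 2: calculate the gradient
        new_a1 = a1 - lr * g1                  # step 3: step *against* the
        new_a2 = a2 - lr * g2                  #         gradient (downhill)
        # step 4/5: terminate once the parameters barely change
        if abs(new_a1 - a1) < tol and abs(new_a2 - a2) < tol:
            return new_a1, new_a2
        a1, a2 = new_a1, new_a2
    return a1, a2

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]        # exactly y = 2x + 1
a1, a2 = gradient_descent(xs, ys)
print(a1, a2)                    # converges close to (2, 1)
```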
Gradient Descent: Example

[Figure sequence: successive gradient-descent steps on the surface E(α₁, α₂), converging toward the minimum.]
Nelder-Mead Simplex Algorithm
(for functions of 2 variables)
1. Pick 3 parameter combinations → a triangle (the simplex)
2. Evaluate the function for those combinations
   → f_h, f_s, f_l: highest, second-highest and lowest point
3. Update the triangle using the best of the transformations in the figure
4. Check for the end condition
5. Go to 2 or terminate
Nelder-Mead Algorithm: Update Rules

Reflect the highest point through the centroid of the other two vertices to obtain the reflected value f_r, then:
• f_l ≤ f_r < f_s: accept the reflected point
• f_r < f_l: best value so far → try an expansion
• f_s ≤ f_r < f_h: contract towards the reflected point
• f_r ≥ f_h: no improvement → contract towards the highest point, or shrink the triangle towards the lowest point
Nelder-Mead Algorithm: Example

[Figure sequence: successive simplex updates on the surface E(α₁, α₂), the triangle reflecting and contracting toward the minimum.]