01.02.2014 Views

GNUPlot Manual

22 FIT (gnuplot 4.0)

(branch) for each data set are selected by using a 'pseudo-variable', e.g., either the dataline number (a 'column' index of -1) or the datafile index (-2), as the second independent variable.

Example: Given two exponential decays of the form z=f(x), each describing a different data set but having a common decay time, estimate the values of the parameters. If the datafile has the format x:z:s, then

        f(x,y) = (y==0) ? a*exp(-x/tau) : b*exp(-x/tau)
        fit f(x,y) 'datafile' using 1:-1:2:3 via a, b, tau

For a more complicated example, see the file "hexa.fnc" used by the "fit.dem" demo.

Appropriate weighting may be required since unit weights may cause one branch to predominate if there is a difference in the scale of the dependent variable. Fitting each branch separately, using the multibranch solution as initial values, may give an indication as to the relative effect of each branch on the joint solution.
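
The comparison described above might be scripted roughly as follows. This is only a sketch under several assumptions: the two branches are stored in 'datafile' as two blocks separated by a single blank line (so that pseudo-column -1 is 0 for the first block and 1 for the second), `every :::n::n` is used to select block n, and the starting values and the helper names tau_joint, tau_0, tau_1, f0 and f1 are hypothetical.

```gnuplot
# Hypothetical starting values; choose ones suited to your data.
a = 10.0; b = 5.0; tau = 2.0

# Joint (multi-branch) fit, as in the example above.
f(x,y) = (y==0) ? a*exp(-x/tau) : b*exp(-x/tau)
fit f(x,y) 'datafile' using 1:-1:2:3 via a, b, tau
tau_joint = tau

# Branch 0 alone, starting from the joint solution.
f0(x) = a*exp(-x/tau)
fit f0(x) 'datafile' every :::0::0 using 1:2:3 via a, tau
tau_0 = tau

# Branch 1 alone, restarting from the joint decay time.
tau = tau_joint
f1(x) = b*exp(-x/tau)
fit f1(x) 'datafile' every :::1::1 using 1:2:3 via b, tau
tau_1 = tau

# Compare the joint decay time with the single-branch results.
print tau_joint, tau_0, tau_1
```

If one single-branch value of tau stays close to the joint value while the other moves away, that is a hint that the first branch dominated the joint solution and that the weights deserve a second look.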

22.6 Starting values

Nonlinear fitting is not guaranteed to converge to the global optimum (the solution with the smallest sum of squared residuals, SSR), and can get stuck at a local minimum. The routine has no way to determine that; it is up to you to judge whether this has happened.

fit may, and often will, get "lost" if started far from a solution, where SSR is large and changing slowly as the parameters are varied, or it may reach a numerically unstable region (e.g., too large a number causing a floating point overflow) which results in an "undefined value" message or gnuplot halting.

To improve the chances of finding the global optimum, you should set the starting values at least roughly in the vicinity of the solution, e.g., within an order of magnitude, if possible. The closer your starting values are to the solution, the less chance of stopping at another minimum. One way to find starting values is to plot data and the fitting function on the same graph and change parameter values and replot until reasonable similarity is reached. The same plot is also useful to check whether the fit stopped at a minimum with a poor fit.
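
The plot-and-adjust procedure could look something like the sketch below. It assumes a hypothetical 'datafile' in the x:z:s format used earlier in this section and a single exponential decay as the model; all numerical values are placeholders.

```gnuplot
f(x) = a*exp(-x/tau)

# First guess (hypothetical values): plot data and model together.
a = 10.0; tau = 5.0
plot 'datafile' using 1:2:3 with errorbars, f(x)

# Adjust by eye and redraw until the curve roughly follows the data.
a = 4.0; tau = 2.0
replot

# Fit from the hand-tuned values, then redraw to inspect the result.
fit f(x) 'datafile' using 1:2:3 via a, tau
replot
```

Because replot re-evaluates f(x) with the current values of a and tau, the final replot shows the fitted curve over the data.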

Of course, a reasonably good fit is not proof that there is no "better" fit (in either a statistical sense, characterized by an improved goodness-of-fit criterion, or a physical sense, with a solution more consistent with the model). Depending on the problem, it may be desirable to fit with various sets of starting values, covering a reasonable range for each parameter.
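
As an illustration only, two runs from different starting points might be compared like this. The model, the hypothetical 'datafile' (x:z:s format) and the starting values are assumptions, as are the helper variables a_1, tau_1, a_2 and tau_2.

```gnuplot
f(x) = a*exp(-x/tau)

# Run 1: a hypothetical starting point.
a = 1.0; tau = 1.0
fit f(x) 'datafile' using 1:2:3 via a, tau
a_1 = a; tau_1 = tau

# Run 2: a different hypothetical starting point.
a = 100.0; tau = 20.0
fit f(x) 'datafile' using 1:2:3 via a, tau
a_2 = a; tau_2 = tau

# Compare where the two runs ended up.
print a_1, tau_1
print a_2, tau_2
```

If the two runs end at clearly different parameter values, compare the final sums of squared residuals that fit reports for each run before trusting either result.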

22.7 Tips

Here are some tips to keep in mind to get the most out of fit. They’re not very organized, so you’ll have to read them several times until their essence has sunk in.

The two forms of the via argument to fit serve two largely distinct purposes. The via "file" form is best used for (possibly unattended) batch operation, where you just supply the startup values in a file and can later use update to copy the results back into another (or the same) parameter file.
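
A batch run along these lines might be sketched as follows. The file names 'start.par' and 'result.par', the model and the data file are hypothetical; the parameter file is assumed to contain one "name = value" line per parameter.

```gnuplot
# Assumed contents of the hypothetical parameter file 'start.par':
#     a   = 10.0
#     tau = 5.0

f(x) = a*exp(-x/tau)

# Read the starting values from the file and fit those parameters.
fit f(x) 'datafile' using 1:2:3 via 'start.par'

# Write the fitted values to a second file, leaving 'start.par' as is.
update 'start.par' 'result.par'
```

Writing to a second file keeps the original starting values available for a later rerun.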

The via var1, var2, ... form is best used interactively, where the command history mechanism may be used to edit the list of parameters to be fitted or to supply new startup values for the next try. This is particularly useful for hard problems, where a direct fit to all parameters at once won’t work without good starting values. To find such, you can iterate several times, fitting only some of the parameters, until the values are close enough to the goal that the final fit to all parameters at once will work.
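
One possible staging of such a fit is sketched below, assuming a hypothetical decay model with an offset c and a hypothetical 'datafile' in x:z:s format. Which parameters to release first depends on the problem; the order shown is only an example.

```gnuplot
f(x) = a*exp(-x/tau) + c

# Rough starting values (hypothetical).
a = 10.0; tau = 5.0; c = 0.0

# Stage 1: fit only the amplitude; tau and c keep their current values.
fit f(x) 'datafile' using 1:2:3 via a

# Stage 2: release the decay time as well.
fit f(x) 'datafile' using 1:2:3 via a, tau

# Stage 3: final fit of all parameters at once.
fit f(x) 'datafile' using 1:2:3 via a, tau, c
```

Each stage starts from the values left behind by the previous one, since fit stores its result in the same variables.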

Make sure that there is no mutual dependency among parameters of the function you are fitting. For example, don’t try to fit a*exp(x+b), because a*exp(x+b)=a*exp(b)*exp(x). Instead, fit either a*exp(x) or exp(x+b).
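
The dependency can be checked numerically. The sketch below evaluates the degenerate form for two different parameter pairs and then fits the single-parameter form instead; the data file and the starting value are hypothetical.

```gnuplot
g(x) = a*exp(x+b)

# Two different parameter pairs that describe the same curve,
# since 1*exp(x+log(2)) = 2*exp(x):
a = 2.0; b = 0.0
print g(1.0)
a = 1.0; b = log(2.0)
print g(1.0)              # same value as above, up to rounding

# A well-posed alternative with a single free parameter:
f(x) = a*exp(x)
a = 1.0                   # hypothetical starting value
fit f(x) 'datafile' using 1:2:3 via a
```

Since the data cannot distinguish between such pairs, a fit of g(x) via a, b has no unique solution.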

A technical issue: the parameters must not be too different in magnitude. The larger the ratio of the largest to the smallest absolute parameter value, the slower the fit will converge. If the ratio is close to or above the inverse of the machine floating point precision, it may take next to forever to converge, or refuse to converge at all. You will have to adapt your function to avoid this, e.g., replace 'parameter' by '1e9*parameter' in the function definition, and divide the starting value by 1e9.
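
As a sketch of this rescaling, consider a hypothetical model whose amplitude a is of order 1 while its angular frequency w is of order 1e9. The model, the data file and all values are assumptions made for illustration.

```gnuplot
# Badly scaled version: a is of order 1, w of order 1e9.
#     f(x) = a*sin(w*x)
#     a = 1.0; w = 2.0e9

# Rescaled version: the factor 1e9 moves into the function and the
# starting value of w is divided by 1e9, so both parameters are of order 1.
f(x) = a*sin(1e9*w*x)
a = 1.0; w = 2.0
fit f(x) 'datafile' using 1:2:3 via a, w

# Recover the frequency in its original units.
print 1e9*w
```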
