10.07.2015 Views

Using R for Introductory Statistics : John Verzani

The initial parameter guesses are often found by doing some experimental plots. These can be done quickly using the curve() function with the argument add=TRUE, as illustrated in the examples. When we model with a given function, it helps to have a general understanding of how the parameters change the graph of the function. For example, the parameters in the exponential model, written

$$y = N e^{-r(t - t_0)},$$

may be interpreted by $t_0$ being the place where we want time to begin counting, N the initial amount at this time, and r the rate of decay. For this model, the mean of the data decays to 1/e, or roughly 1/3, of its value in 1/r units of time.

Some models have self-starting functions programmed for them. These typically start with SS. A list can be found with the command apropos("SS"). These functions do not need starting values.

■ Example 12.5: Yellowfin tuna catch rate

The data set yellowfin (UsingR) contains data on the average number of yellowfin tuna caught per 100 hooks in the tropical Indian Ocean for various years. This data comes from a paper by Myers and Worm (see ?yellowfin) that uses such numbers to estimate the decline of fish stocks (biomass) since the advent of large-scale commercial fishing. The authors fit the exponential decay model with some threshold to the data.

We can repeat the analysis using R. First, we plot (Figure 12.2).

> plot(count ~ year, data=yellowfin)

A scatterplot is made, as the data frame contains two numeric variables. The count variable does seem to decline exponentially to some threshold. We try to fit the model

$$Y = N\left(e^{-r(t-1952)}(1-d) + d\right) + \varepsilon.$$

(Instead of $\beta_i$ we give the parameters letter names.)

To fit this in R, we define a function for the mean

> f = function(t, N, r, d) N*(exp(-r*(t-1952))*(1-d)+d)

We need to find some good starting points for nls().
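Before choosing starting values, it can help to check numerically how each parameter shapes the mean function. The following is a minimal sketch, not part of the original example: the values N = 10, r = 1/5, and d = 0.2 and the helper name frac are hypothetical, chosen only to illustrate the behavior.

```r
# Mean function from the text: decay from N toward the threshold N * d
f <- function(t, N, r, d) N * (exp(-r * (t - 1952)) * (1 - d) + d)

# Hypothetical illustrative values (not estimates from the yellowfin data)
N <- 10; r <- 1/5; d <- 0.2

f(1952, N, r, d)         # at t = 1952 the mean equals N
f(1e4, N, r, d)          # far in the future the mean levels off at N * d

# Share of the decaying part (the amount above the threshold) that
# remains after 1/r units of time; this equals exp(-1), about 0.37
frac <- (f(1952 + 1/r, N, r, d) - N * d) / (N - N * d)
frac
```

Read this way, N is the height of the curve at the start, N * d is the level at which it flattens out, and 1/r is the time it takes for the gap between the two to shrink to roughly one third.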
The value of N=6 seems about right, as this is the starting value when t=1952. The value r is a decay rate. It can be estimated by how long it takes for the data to decay to roughly 1/3 of its starting value. We guess about 10 years, so we start with r=1/10. Finally, d is the threshold as a proportion of N, which seems to be .6/6 = .10.

We plot the function with these values to see how well they fit.

> curve(f(x, N=6, r=1/10, d=0.1), add=TRUE)

The fit is good (the solid line in Figure 12.2), so we expect nls() to converge with these starting values.

> res.yf = nls(count ~ f(year, N, r, d), start=c(N=6, r=1/10, d=.1),
+ data=yellowfin)
> res.yf
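For readers without the UsingR package at hand, the same workflow can be tried on simulated data. This is only a sketch under stated assumptions: the data frame sim, the "true" values N = 6.5, r = 0.12, d = 0.08, the noise level, and the year range 1952 to 1999 are all hypothetical and are not taken from the yellowfin data. The second fit uses the self-starting model SSasymp() from the stats package, which describes the same curve in a different parameterization (Asym = N*d, R0 = N, lrc = log r) and so needs no starting values.

```r
set.seed(42)  # reproducible noise

# Mean function from the text
f <- function(t, N, r, d) N * (exp(-r * (t - 1952)) * (1 - d) + d)

# Simulated stand-in data: hypothetical "true" values plus Gaussian noise
year <- 1952:1999
sim <- data.frame(
  year  = year,
  count = f(year, N = 6.5, r = 0.12, d = 0.08) + rnorm(length(year), sd = 0.2)
)

# Fit with nls(), starting from the same guesses used in the text
res.sim <- nls(count ~ f(year, N, r, d),
               start = c(N = 6, r = 1/10, d = 0.1), data = sim)
est <- coef(res.sim)
est

# Same curve via the self-starting model SSasymp(); no start argument
res.ss <- nls(count ~ SSasymp(year - 1952, Asym, R0, lrc), data = sim)
ss <- coef(res.ss)

# Convert the SSasymp() parameters back to N, r, and d for comparison
c(N = ss[["R0"]], r = exp(ss[["lrc"]]), d = ss[["Asym"]] / ss[["R0"]])
```

Both fits minimize the same sum of squares, so the converted SSasymp() estimates should agree closely with those from the hand-chosen starting values.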
