
Using R for Introductory Statistics : John Verzani

11.2.2 Comparing multiple differences

When analysis of variance is performed with lm(), the output contains numerous statistical tests. The F-test that is performed uses the null hypothesis that β_2 = β_3 = … = β_k = 0 against the alternative that one or more differ from 0. That is, that one or more of the treatments has an effect compared to the reference level. The marginal t-tests that are performed are two-sided tests with the null hypothesis that β_i = β_1; one is done for each i = 2, …, k. These test whether any of the additional treatments has a different effect from the reference one when controlled by the other variables. However, we may wish to ask other questions about the various parameters. For example, comparisons not covered by the standard output are "Do β_2 and β_3 differ?" and "Are β_1 and β_2 half of β_3?" We show next how to handle simultaneous pairwise comparisons of the parameters, such as the first comparison.

If we know ahead of time that we are looking for a pairwise difference, then a simple t-test is appropriate (as in the case where we are considering just two independent samples). However, if we look at the data and then decide to test whether the second and third parameters differ, then our t-test is shaky. Why? Remember that any test is correct only with some probability, even if the model is correct. This means that sometimes tests fail, and the more tests we perform, the more likely it is that one or more will fail. When we look at the data, we are essentially performing lots of tests, so there is more chance of failing.

In this case, to be certain that our t-test has the correct significance level, we adjust it to include all the tests we can possibly consider.
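How quickly the chance of a spurious result grows can be made concrete. As a small illustration (not from the text): if m independent tests are each run at level α when every null hypothesis is true, the probability of at least one false rejection is 1 − (1 − α)^m.

```r
## Family-wise error rate for m independent level-alpha tests
## when all null hypotheses are true:
## P(at least one false rejection) = 1 - (1 - alpha)^m
alpha <- 0.05
m <- 1:10
fwer <- 1 - (1 - alpha)^m
round(fwer, 3)  # climbs from 0.050 for one test to about 0.401 for ten
```

Even ten comparisons push the chance of some false rejection from 5% to roughly 40%, which is why an adjustment is needed.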
This adjustment can be done by hand with the simple, yet often overly conservative, Bonferroni adjustment. This method uses a simple probability bound to ensure the proper significance level.

However, with R it is straightforward to perform Tukey's generally more useful and powerful "honest significant difference" test. This test covers all pairwise comparisons at one time by simultaneously constructing confidence intervals of the type

    x̄_i − x̄_j ± q* (s/√2) √(1/n_i + 1/n_j).    (11.6)

The values x̄_i are the sample means for the i-th level, s is the estimate of the common standard deviation, and q* is the quantile for a distribution known as the studentized range distribution. This choice of q* means that all these confidence intervals hold simultaneously with probability 1 − α.

This procedure is implemented in the TukeyHSD() function, as illustrated in the next example.

■ Example 11.7: Difference in takeoff times at the airport

We investigate the takeoff times for various airlines at Newark Liberty airport. As with other busy airports, Newark is characterized by long delays on the runway due to requirements that plane departures be staggered. Does this affect all the airlines equally? Without suspecting that any one airline is favored, we can perform a simultaneous pairwise comparison to investigate.

First, we massage the data in ewr (UsingR) so that we have two variables: one to keep track of the time and the other a factor indicating the airline.
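Before working through the ewr example, the basic call pattern can be sketched with a built-in data set (chickwts is used here only because it needs no extra package; it is not the text's example). Note that TukeyHSD() expects a model fit with aov() rather than lm():

```r
## Tukey's honest significant difference on a one-way layout.
## chickwts records chick weights under 6 different feed supplements.
res <- aov(weight ~ feed, data = chickwts)
TukeyHSD(res)          # simultaneous 95% CIs for all 15 pairwise differences
## plot(TukeyHSD(res)) # optional: plot the intervals
```

A pairwise difference is flagged at the 5% level exactly when its interval excludes 0, and because of the choice of q*, all the intervals hold simultaneously.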
