Stat-403/Stat-650 : Intermediate Sampling and Experimental Design ...


Statistical Significance Testing

[Fig. 1. Results of a test that failed to reject the null hypothesis that a mean equals 0. Shaded areas indicate regions for which the hypothesis would be rejected. (A) suggests the null hypothesis may well be false, but the sample was too small to indicate significance; there is a lack of power. (B) suggests the data truly were consistent with the null hypothesis.]

Power Analysis

Power analysis is an adjunct to hypothesis testing that has become increasingly popular (Peterman 1990, Thomas and Krebs 1997). The procedure can be used to estimate the sample size needed to have a specified probability (power = 1 - β) of declaring as significant (at the α level) a particular difference or effect (effect size). As such, the process can usefully be applied when designing a survey or experiment (Gerard et al. 1998). Its use is sometimes recommended to ascertain the power of the test after a study has been conducted and nonsignificant results obtained (The Wildlife Society 1995). The notion is to guard against wrongly declaring the null hypothesis to be true. Such retrospective power analysis can be misleading, however. Steidl et al. (1997:274) noted that power estimated with the data used to test the null hypothesis and the observed effect size is meaningless, as a high P-value will invariably result in low estimated power. Retrospective power estimates may be meaningful if they are computed with effect sizes different from the observed effect size. Power analysis programs, however, assume the input values for effect and variance are known, rather than estimated, so they give misleadingly high estimates of power (Steidl et al. 1997, Gerard et al. 1998). In addition, although statistical hypothesis testing invokes what I believe to be 1 rather arbitrary parameter (α or P), power analysis requires three of them (α, β, effect size). For further comments see Shaver (1993:309), who termed power analysis "a vacuous intellectual game," and who noted that the tendency to use criteria, such as Cohen's (1988) standards for small, medium, and large effect sizes, is as mindless as the practice of using the α = 0.05 criterion in statistical significance testing. Questions about the likely size of true effects can be better addressed with confidence intervals than with retrospective power analyses (e.g., Steidl et al. 1997, Steiger and Fouladi 1997).
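To make the contrast between prospective and retrospective power analysis concrete, here is a minimal sketch in Python using statsmodels' TTestIndPower for a two-sample t-test setting. All numbers (d = 0.5, d = 0.15, n = 30) are illustrative assumptions, not values from the studies cited above.

    # Prospective vs. retrospective power analysis, two-sample t-test.
    # Effect sizes are Cohen's d; all numbers are illustrative.
    from statsmodels.stats.power import TTestIndPower

    power_calc = TTestIndPower()

    # Prospective use (design stage): sample size per group needed to
    # detect a hypothesized effect of d = 0.5 with power 0.80 at alpha = 0.05.
    n_per_group = power_calc.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
    print(f"n per group for 80% power: {n_per_group:.1f}")  # about 64

    # Retrospective use (the misleading case): power computed from the
    # *observed* effect size of a nonsignificant study. A small observed
    # effect (hence a high P-value) mechanically yields low estimated
    # power, so this adds nothing beyond the P-value itself.
    observed_d = 0.15  # hypothetical observed effect size
    retro_power = power_calc.solve_power(effect_size=observed_d,
                                         alpha=0.05, nobs1=30)
    print(f"retrospective power: {retro_power:.2f}")  # low by construction

The second calculation illustrates Steidl et al.'s point: because the observed effect size and the P-value are two views of the same data, the "power" it reports is determined by the nonsignificant result and cannot provide independent evidence that the null hypothesis is true.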
