Stat-403/Stat-650 : Intermediate Sampling and Experimental Design
Statistical Significance Testing

Biological Versus Statistical Significance

Many authors make note of the distinction between statistical significance and subject-matter (in our case, biological) significance. Unimportant differences or effects that do not attain significance are okay, and important differences that do show up significant are excellent, for they facilitate publication (Table 1). Unimportant differences that turn out significant are annoying, and important differences that fail statistical detection are truly depressing. Recalling our earlier comments about the effect of sample size on P-values, the two outcomes that please the researcher suggest the sample size was about right (Table 2). The annoying unimportant differences that were significant indicate that too large a sample was obtained. Further, if an important difference was not significant, the investigator concludes that the sample was insufficient and calls for further research. This schizophrenic nature of the interpretation of significance greatly reduces its value. (A short simulation at the end of this section illustrates the point.)

Table 1. Reaction of investigator to results of a statistical significance test (after Nester 1996).

Practical importance           Statistical significance
of observed difference     Not significant     Significant
Not important              Happy               Annoyed
Important                  Very sad            Elated

Table 2. Interpretation of sample size as related to results of a statistical significance test.

Practical importance           Statistical significance
of observed difference     Not significant     Significant
Not important              n okay              n too big
Important                  n too small         n okay

Other Comments on Hypothesis Tests

Statistical hypothesis testing has received an enormous amount of criticism, and for a rather long time. In 1963, Clark (1963:466) noted that it was "no longer a sound or fruitful basis for statistical investigation." Bakan (1966:436) called it "essential mindlessness in the conduct of research." The famed quality guru W. Edwards Deming (1975) commented that the reason students have problems understanding hypothesis tests is that they may be trying to think. Carver (1978) recommended that statistical significance testing should be eliminated; it is not only useless, it is also harmful because it is interpreted to mean something else. Guttman (1985) recognized that "In practice, of course, tests of significance are not taken seriously." Loftus (1991) found it difficult to imagine a less insightful way to translate data into conclusions. Cohen (1994:997) noted that statistical testing of the null hypothesis "does not tell us what we want to know, and we so much want to know what we want to know that, out of desperation, we nevertheless believe that it does!" Barnard (1998:47) argued that "... simple P-values are not now used by the best statisticians." These examples are but a fraction of the comments made by statisticians and users of statistics about the role of statistical hypothesis testing. While many of the arguments against significance tests stem from their misuse, rather than their intrinsic values (Mulaik et al. 1997), I believe
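To make the sample-size point behind Table 2 concrete, here is a minimal sketch (an addition to this handout, not part of the source article; it assumes Python with NumPy and SciPy available). It fixes a trivially small true difference between two groups and runs a two-sample t-test at increasing sample sizes: the difference stays biologically negligible throughout, but with enough observations the P-value alone declares it significant.

```python
# Hypothetical illustration: hold the true difference fixed at a biologically
# negligible value and let only the sample size grow.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
true_difference = 0.05  # shift in standard-deviation units; practically unimportant

for n in (20, 200, 2000, 20000):
    control = rng.normal(loc=0.0, scale=1.0, size=n)
    treated = rng.normal(loc=true_difference, scale=1.0, size=n)
    t_stat, p_value = stats.ttest_ind(treated, control)
    print(f"n per group = {n:6d}  "
          f"observed diff = {treated.mean() - control.mean():+.3f}  "
          f"P = {p_value:.4f}")

# At small n the trivial difference is "not significant"; at large enough n it
# typically crosses P < 0.05, the "n too big" cell of Table 2: statistical
# significance without practical importance.
```

The t-test here is only a stand-in for whatever analysis a study actually uses; the same sample-size behavior holds for any significance test of a fixed, nonzero effect.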
