Stat-403/Stat-650 : Intermediate Sampling and Experimental Design ...

More documents

Recommendations

Info

Statistical Significance TestingThe Insignificance of StatisticalSignificance TestingWhat is Statistical Hypothesis Testing?Four basic steps constitute statistical hypothesis testing. First, one develops a null hypothesis about somephenomenon or parameter. This null hypothesis is generally the opposite of the research hypothesis,which is what the investigator truly believes and wants to demonstrate. Research hypotheses may begenerated either inductively, from a study of observations already made, or deductively, deriving fromtheory. Next, data are collected that bear on the issue, typically by an experiment or by sampling. (Nullhypotheses often are developed after the data are in hand and have been rummaged through, but that'sanother topic.) A statistical test of the null hypothesis then is conducted, which generates a P-value.Finally, the question of what that value means relative to the null hypothesis is considered. Severalinterpretations of P often are made.Sometimes P is viewed as the probability that the results obtained were due to chance. Small values aretaken to indicate that the results were not just a happenstance. A large value of P, say for a test that µ = 0,would suggest that the mean actually recorded was due to chance, and µ could be assumed to be zero(Schmidt and Hunter 1997).Other times, 1-P is considered the reliability of the result, that is, the probability of getting the sameresult if the experiment were repeated. Significant differences are often termed "reliable" under thisinterpretation.Alternatively, P can be treated as the probability that the null hypothesis is true. This interpretation is themost direct one, as it addresses head-on the question that interests the investigator.These 3 interpretations are what Carver (1978) termed fantasies about statistical significance. None ofthem is true, although they are treated as if they were true in some statistical textbooks and applicationspapers. Small values of P are taken to represent strong evidence that the null hypothesis is false, butworkers demonstrated long ago (see references in Berger and Sellke 1987) that such is not the case. Infact, Berger and Sellke (1987) gave an example for which a P-value of 0.05 was attained with a sample65http://www.npwrc.usgs.gov/resource/1999/statsig/stathyp.htm (1 of 7) [12/6/2002 3:10:59 PM]
Statistical Significance Testingof n = 50, but the probability that the null hypothesis was true was 0.52. Further, the disparity between Pand Pr[H 0 | data], the probability of the null hypothesis given the observed data, increases as samplesbecome larger.In reality, P is the Pr[observed or more extreme data | H 0 ], the probability of the observed data or datamore extreme, given that the null hypothesis is true, the assumed model is correct, and the sampling wasdone randomly. Let us consider the first two assumptions.What are More Extreme Data?Suppose you have a sample consisting of 10 males and three females. For a null hypothesis of a balancedsex ratio, what samples would be more extreme? The answer to that question depends on the samplingplan used to collect the data (i.e., what stopping rule was used). The most obvious answer is based on theassumption that a total of 13 individuals were sampled. In that case, outcomes more extreme than 10males and 3 females would be 11 males and 2 females, 12 males and 1 female, and 13 males and nofemales.However, the investigator might have decided to stop sampling as soon as he encountered 10 males.Were that the situation, the possible outcomes more extreme against the null hypothesis would be 10males and 2 females, 10 males and 1 female, and 10 males and no females. Conversely, the investigatormight have collected data until 3 females were encountered. The number of more extreme outcomes thenare infinite: they include 11 males and 3 females, 12 males and 3 females, 13 males and 3 females, etc.Alternatively, the investigator might have collected data until the difference between the numbers ofmales and females was 7, or until the difference was significant at some level. Each set of more extremeoutcomes has its own probability, which, along with the probability of the result actually obtained,constitutes P.The point is that determining which outcomes of an experiment or survey are more extreme than theobserved one, so a P-value can be calculated, requires knowledge of the intentions of the investigator(Berger and Berry 1988). Hence, P, the outcome of a statistical hypothesis test, depends on results thatwere not obtained, that is, something that did not happen, and what the intentions of the investigatorwere.Are Null Hypotheses Really True?P is calculated under the assumption that the null hypothesis is true. Most null hypotheses tested,however, state that some parameter equals zero, or that some set of parameters are all equal. Thesehypotheses, called point null hypotheses, are almost invariably known to be false before any data arecollected (Berkson 1938, Savage 1957, Johnson 1995). If such hypotheses are not rejected, it is usuallybecause the sample size is too small (Nunnally 1960).66http://www.npwrc.usgs.gov/resource/1999/statsig/stathyp.htm (2 of 7) [12/6/2002 3:10:59 PM]
Page 1 and 2:
Stat-403/Stat-650 : Intermediate Sa
Page 3:
Chapter 1Statistics - why the badim
Page 8 and 9:
Problems With Using MicrosoftExcel
Page 10 and 11:
The next example shows the help sup
Page 12 and 13:
Knusel, Leo, “On the Accuracy of
Page 14 and 15:
The latter tool is but one of many
Page 16 and 17: Is It Practical To Use Excel For St
Page 22 and 23: Using Excel for StatisticsStatistic
Page 24 and 25: Using Excel for StatisticsTipsWarni
Page 26 and 27: Using Excel for Statistics2. Adding
Page 28 and 29: Using Excel for StatisticsThe field
Page 30 and 31: Using Excel for StatisticsIn this e
Page 32 and 33: Using Excel for Statisticsbehaves l
Page 34 and 35: Using Excel for StatisticsThe resul
Page 36 and 37: Using Excel for Statisticsare liste
Page 38 and 39: DFID Guides Introduction PageStatis
Page 40 and 41: DFID Guides Introduction Pagetel: +
Page 42 and 43: BIOMETRICSINFORMATIONIndex of Pamph
Page 44 and 45: Index of Pamphlet TopicsExperimenta
Page 46 and 47: Index of Pamphlet TopicsRegression2
Page 48 and 49: Chapter 6How do I interpret a p-val
Page 50 and 51: 2Type I error in the data exceeds w
Page 52 and 53: 4ssssssssssssssssssssssssssssssssss
Page 61 and 62: Chapter 8The Insignificance ofStati
Page 63 and 64: Statistical Signigicance TestingEco
Page 65: Statistical Signigicance TestingThe
Page 69 and 70: Statistical Significance Testingapp
Page 71 and 72: Statistical Significance TestingBio
Page 73 and 74: Statistical Significance TestingThe
Page 75 and 76: Statistical Significance Testinginf
Page 77 and 78: Statistical Significance TestingHom
Page 79 and 80: Statistical Significance Testingsug
Page 81 and 82: Statistical Significance Testingres
Page 83 and 84: Statistical Significance TestingThe
Page 85 and 86: http://www.npwrc.usgs.gov/resource/
Page 87 and 88: http://www.npwrc.usgs.gov/resource/
Page 89: Chapter 9Designing EnvironmentalFie
Page 102 and 103: 101
Page 104 and 105: 103
Page 106 and 107: 105
Page 108 and 109: 107
Page 110 and 111: 109
Page 112 and 113: Chapter 10Power analysis in wildlif
Page 114 and 115: 113
Page 116 and 117:
115
Page 118 and 119:
117
Page 120 and 121:
119
Page 122 and 123:
121
Page 124 and 125:
123
Page 126 and 127:
125
Page 128 and 129:
127
Page 130 and 131:
129
Page 132 and 133:
131
Page 134 and 135:
133
Page 136 and 137:
135
Page 138 and 139:
137
Page 140 and 141:
139
Page 142 and 143:
141
Page 144 and 145:
143
Page 146 and 147:
145
Page 148 and 149:
147
Page 150 and 151:
Chapter 12Pseudo-replication revisi
Page 152 and 153:
151
Page 154 and 155:
153
Page 156:
155
show all

Stat-403/Stat-650 : Intermediate Sampling and Experimental Design ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?