Statistical Power - People.stat.sfu.ca
STATISTICAL POWER ANALYSIS
IN WILDLIFE RESEARCH
November 19, 2011
Back to the basics:
α: probability of a false positive
(detecting an effect that doesn't exist).
β: probability of a false negative
(failing to detect an effect when it actually exists).
Power = 1 – β:
probability of correctly rejecting a false null hypothesis.
Practical definition: the PROBABILITY of detecting
an effect when the effect actually exists.
Background
Interrelated components: target power (1 – β),
α, sample size, and effect size:
• Probability of correctly detecting an effect.
• Probability of incorrectly detecting an effect.
• Sample size.
• Minimum response size that is considered biologically
significant.

Examples of the mutual relationship:
Target power = 0.8 with α = 0.05
Target power = 0.9 with α = 0.10
Background
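The mutual relationship among these components can be sketched numerically. A minimal illustration, using the normal approximation for a two-sided two-sample test (this helper is my own, not from the slides):

```python
from scipy.stats import norm

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test
    (normal approximation; the exact t-based answer differs slightly)."""
    z_crit = norm.ppf(1 - alpha / 2)
    # Noncentrality: standardized effect scaled by sqrt(n/2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Raising alpha raises power for the same effect size and n,
# mirroring the 0.8/0.05 vs 0.9/0.10 trade-off above:
print(round(approx_power(0.4, 50, alpha=0.05), 2))
print(round(approx_power(0.4, 50, alpha=0.10), 2))
```

Holding effect size and n fixed, a looser α buys power; the slide's paired targets show the same trade-off.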
• Effect: "minimum response size that is considered
biologically significant."
• Statistical and biological significance are different.
• Biologically trivial differences may be statistically
significant with large sample sizes and high power.
• Biologically important differences may not be
statistically significant if power is low.
Background
Effect: magnitude of response, in original units
(e.g., an increase in fish concentration of 20 fish/m²).

Effect size: standardized effect, as a proportion or percentage
(if sd = 50 fish/m², effect size = 20/50 = 0.4, i.e., 40%).

Effect size (alternate form): percent difference from the mean value
(if mean = 60 fish/m², 72 fish/m² = 20% increase;
if mean = 60 fish/m², 45 fish/m² = 25% decrease).

Power to detect large effects is always greater
than power to detect small effects.
Background
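Both effect-size conventions above can be computed directly. A small sketch using the slide's fish-density numbers (the helper names are mine):

```python
def standardized_effect(delta, sd):
    """Standardized effect size: raw effect divided by the standard deviation."""
    return delta / sd

def percent_change(mean, new_value):
    """Effect expressed as a percent difference from the mean."""
    return 100.0 * (new_value - mean) / mean

print(standardized_effect(20, 50))   # 0.4, i.e. 40%
print(percent_change(60, 72))        # 20.0 (% increase)
print(percent_change(60, 45))        # -25.0 (% decrease)
```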
PPA: conducted before the experiment is carried out.
Goal: to improve the research design to increase the
probability of detecting biologically significant effects.

• Determine the probability that an effect size of
interest will be detected with a given sample size.
• Determine the sample size necessary to achieve
acceptably high power.
Prospective Power Analysis (PPA)
• Set a meaningful effect size, α, and sample size.
• Compute the range of power values produced by combinations of these parameters.
Prospective Power Analysis
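The second kind of prospective calculation, sample size for a target power, can be sketched with the usual normal-approximation formula (illustrative helper, not from the slides):

```python
import math
from scipy.stats import norm

def n_per_group(effect_size, power=0.8, alpha=0.05):
    """Approximate per-group n for a two-sided two-sample test
    at a given standardized effect size (normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Smaller effects demand much larger samples:
print(n_per_group(0.4))  # roughly 99 per group
print(n_per_group(0.2))  # roughly four times as many
```

Halving the effect size roughly quadruples the required n, which is why the minimum biologically significant effect must be chosen before the experiment.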
Objective in research design: minimize experimental
error and maximize the precision of parameter estimates.

Error reduction = increase in statistical power.

Choices that influence the power of the experiment:
• Range of treatment levels selected.
• Number and type of experimental units.
• Assignment of treatments to experimental units.
Power and Research Design
Typical constraints:
• Maximum number of replicates.
• Range of treatment levels.

Power can be increased cheaply by:
• Blocking.
• Measuring related information (covariates).
• Efficient experimental design.
Power and Research Design
Example: effect of people camping near nests on the time
eagles spend with their nestlings.
Treatments: 100 m and 500 m; effect size: 20%;
α = 0.1; power = 0.2.
Test: two-tailed t-test for independent samples.
Result: null hypothesis not rejected (t = 0.54, df = 52,
p = 0.59, observed effect = 4.5%, se = 4.1).

Problem: eagle nesting behaviour changes rapidly as
nestlings mature (not accounted for).
Power and Research Design
• Change to a crossover (paired) design:
treatment and control are both applied to the same
experimental unit (nest).
• Eliminates variability due to nestling age.
• Null hypothesis rejected (t = 2.19, df = 26, p = 0.038).
• Eagle behaviour changes when people camp near
their nests.
• Pooled sd in the CRD: 29.8; sd in the paired design: 10.7,
even though the sample size is half.
Power and Research Design
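The variance reduction from pairing can be seen in a quick simulation; the numbers here are invented for illustration and only echo the structure of the eagle example:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 28  # nests

# Each nest has its own baseline (e.g., nestling-age effects)...
nest_baseline = rng.normal(60, 25, size=n)
# ...plus smaller within-nest noise around each measurement.
control = nest_baseline + rng.normal(0, 10, size=n)
treated = nest_baseline + 5 + rng.normal(0, 10, size=n)  # true effect = 5

pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
paired_sd = (treated - control).std(ddof=1)

# Differencing within nests removes the shared baseline variability,
# so the paired sd is far smaller than the pooled sd.
print(pooled_sd > paired_sd)
```

The within-nest difference cancels the shared nest baseline, which is exactly why the paired eagle analysis had a much smaller sd despite half the sample size.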
RPA: conducted after the experiment has taken place.

If a null hypothesis is not rejected, there are two
possible reasons:
• No real effect existed.
• There is an effect, but it was not detected.

Type II error?
Retrospective Power Analysis (RPA)
Power is calculated using sample size, α, and the
observed effect size... but so is p!

Observed ("retrospective") power is determined by the
p value of the test, so it says nothing about the true
power of a test that failed to reject.
Retrospective Power Analysis (RPA)
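The circularity can be made explicit: for a z-test, "observed power" is a function of the p value alone, so it adds nothing the p value doesn't already say. A sketch (the helper is my own):

```python
from scipy.stats import norm

def observed_power(p, alpha=0.05):
    """'Retrospective' power computed by plugging the observed effect
    back in (z-test form). Note the only data input is p itself."""
    z_obs = norm.ppf(1 - p / 2)      # test statistic implied by p
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf(z_obs - z_crit) + norm.cdf(-z_obs - z_crit)

# A p exactly at the threshold always gives observed power near 0.5,
# and a larger p always gives lower observed power:
print(round(observed_power(0.05), 2))
print(observed_power(0.59) < observed_power(0.05))
```

Since p and observed power move in lockstep, computing the latter after a non-significant test cannot distinguish "no effect" from "effect missed".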
• Alternative to retrospective power analysis:
provide a range of plausible effect sizes.
• Confidence intervals provide information about
the true size of an effect instead of just
"statistically different from 0".
• The same factors that reduce power (low α,
small sample size, high sample variability) also
increase the width of confidence intervals.
Confidence Intervals and Power
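The link between interval width and the power-reducing factors can be sketched with a t-interval for a mean (illustrative helper, not from the slides):

```python
import math
from scipy.stats import t

def ci_half_width(sd, n, conf=0.95):
    """Half-width of a two-sided t confidence interval for a mean."""
    return t.ppf((1 + conf) / 2, df=n - 1) * sd / math.sqrt(n)

# Smaller samples and higher variability widen the interval,
# just as they lower power:
print(ci_half_width(50, 10) > ci_half_width(50, 50))
print(ci_half_width(50, 30) > ci_half_width(20, 30))
```

A design precise enough to give a narrow interval around the effect is also a design with high power, so reporting the interval conveys both the estimate and its reliability.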
• If the cost of an environmental effect could be great,
the consequences of a false negative (Type II) error
may outweigh those of a false positive (Type I) error.
• Example: can we harvest timber without adversely
affecting songbird populations?
• Typical null hypothesis: timber harvesting has no effect.
Conduct a low-power test, fail to reject the null.
Conclusion: no effect!
• (Wrong) assumptions:
cost of Type I error > cost of Type II error;
failure to reject = acceptance.
Consequences of Type I and Type II errors
Points to remember:
• Hypothesis testing has been overused.
• Practical (biological) importance is preferable
to statistical significance.
• Confidence intervals are better suited to conveying
practical importance.
• "Is there a significant effect?" should become: "What is
the magnitude of the effect?"
Points To Remember
THANK YOU!

(Now go and design
good experiments...)