10.07.2015 Views

Using R for Introductory Statistics : John Verzani



8.6 Two-sample tests of center

A physician may be interested in knowing whether the time to recover for one surgery is shorter than that for another surgery. A taxicab driver might wish to know whether the time to travel one route is faster than the time to travel another. A consumer group might wish to know whether gasoline prices are similar in two different cities. Or a government agency might want to know whether consumer spending is similar in two different states.

All of these questions could be approached by taking random samples from the respective populations and comparing them. We consider the situation when the question at issue can be boiled down to a comparison of the centers of the two populations. We can use a significance test to compare centers in the same manner as we compare two population proportions. However, as there are more possibilities for the types of populations considered, there are more test statistics to consider.

Suppose $X_i$, $i = 1, \dots, n_x$ and $Y_j$, $j = 1, \dots, n_y$ are random samples from the two populations of interest. A significance test to compare the centers of their parent distributions would use the hypotheses

$$H_0: \mu_x = \mu_y, \qquad H_A: \mu_x < \mu_y,\ \mu_x > \mu_y,\ \text{or}\ \mu_x \neq \mu_y. \tag{8.3}$$

A reasonable test statistic depends on the assumptions placed on the parent populations. If the populations are normally distributed or nearly so, and the samples are independent of each other, then a t-test can be used. If the populations are not normally distributed, then a nonparametric Wilcoxon test may be appropriate. If the samples are not independent but paired off in some way, then a paired test might be called for.

8.6.1 Two sample tests of center with normal populations

Suppose the two samples are independent with normally distributed populations. As $\bar{x}$ and $\bar{y}$ estimate $\mu_x$ and $\mu_y$ respectively, the value of $\bar{x} - \bar{y}$ should be a good estimate for $\mu_x - \mu_y$. We can use this to form a test statistic. Both sample means have normally distributed sampling distributions. A natural test statistic is then

$$T = \frac{(\bar{x} - \bar{y}) - E(\bar{x} - \bar{y})}{SE(\bar{x} - \bar{y})}.$$

Under $H_0$, the expected value of the difference is 0. The standard error is found from the formula for the standard deviation, which is based on the independence of the samples:

$$SD(\bar{x} - \bar{y}) = \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}.$$

As with confidence intervals, the estimate used for the population variances depends on an assumption of equal variances.
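The statistic described above can be sketched in R as follows. This is a minimal illustration, not the book's own example: the vectors `x` and `y` hold made-up recovery times, and the choice to estimate each population variance separately (no equal-variance assumption) is an assumption made here for concreteness. Base R's `t.test()` is used only as a cross-check on the hand computation.

```r
# Hypothetical recovery times in days for two surgeries (made-up data)
x <- c(12, 15, 11, 14, 13, 16, 12, 15)
y <- c(16, 14, 18, 17, 15, 19, 16)

# Standard error of the difference in sample means, estimating each
# population variance by its own sample variance (assumption: the
# variances are not taken to be equal)
se <- sqrt(var(x) / length(x) + var(y) / length(y))

# Observed value of the test statistic; under H_0 the expected
# value of the difference is 0
t_obs <- (mean(x) - mean(y) - 0) / se

# The built-in two-sample t-test with H_A: mu_x < mu_y and no
# equal-variance assumption
res <- t.test(x, y, alternative = "less", var.equal = FALSE)

# The hand-computed statistic should agree with the one from t.test()
print(all.equal(t_obs, unname(res$statistic)))
print(res$p.value)
```

If equal variances were assumed instead, `var.equal = TRUE` would make `t.test()` use a pooled variance estimate, and the hand-computed standard error would need to change to match.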
