10.07.2015 Views

Using R for Introductory Statistics : John Verzani


The sample median is found in R with the median() function. For example:

  > bar = c(50,60,100,75,200)        # bar patrons worth in 1000s
  > bar.with.gates = c(bar,50000)    # after Bill Gates enters
  > mean(bar)
  [1] 97
  > mean(bar.with.gates)             # mean is sensitive to large values
  [1] 8414
  > median(bar)
  [1] 75
  > median(bar.with.gates)           # median is resistant
  [1] 87.5

The example shows that a single large value can change the value of the sample mean considerably, whereas the sample median is much less influenced by the large value. Statistics that are not greatly influenced by a few values far from the bulk of the data are called resistant statistics.

Visualizing the mean and median from a graphic

Figure 2.8 shows how to visualize the mean and median from a strip chart (and, similarly, from a stem-and-leaf plot). The strip chart implicitly orders the data from smallest to largest; to find the median we look for the middle point. This is done by counting. When there is an even number, an average is taken of the two middle ones.

The formula for the mean can be interpreted using the physics formula for a center of mass. In this view, the mean is the balancing point of the strip chart when we imagine the points as equal weights on a seesaw.

With this intuition, we can see why a single extremely large or small data point can skew the mean but not the median.
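The counting rule and the balancing-point view can both be checked on the bar data. The following is a minimal sketch, not code from the book: the helper name median.by.counting is hypothetical, introduced here only for illustration, and it assumes a numeric vector with no missing values.

```r
# Data from the example above: bar patrons' worth, in 1000s
bar = c(50, 60, 100, 75, 200)
bar.with.gates = c(bar, 50000)

# Hypothetical helper (not part of R): find the median by counting.
# Sort the data, then take the middle value, or the average of the
# two middle values when the number of observations is even.
median.by.counting = function(x) {
  x = sort(x)
  n = length(x)
  if (n %% 2 == 1) {
    x[(n + 1) / 2]
  } else {
    mean(x[c(n / 2, n / 2 + 1)])
  }
}

median.by.counting(bar)             # agrees with median(bar)
median.by.counting(bar.with.gates)  # agrees with median(bar.with.gates)

# Balancing point: the deviations from the mean sum to zero
sum(bar - mean(bar))
sum(bar.with.gates - mean(bar.with.gates))  # zero up to rounding error
```

Adding the value 50000 moves the balancing point a long way, because its distance from the other points enters the sum of deviations directly. The median only counts positions, so the new point shifts the middle by half a position.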
