10.07.2015

Using R for Introductory Statistics : John Verzani



Which type of car has the largest spread? Repeat using the variable Driver.deaths+Other.deaths. Which has the largest spread? Are there any trends?

4.27 The data set Orange contains data on the growth of five orange trees. Use xyplot() to make a scatterplot of the variable circumference modeled by age for each level of Tree. Are the growth patterns similar?

4.28 The data set survey (MASS) contains survey information on students.

1. Make scatterplots using xyplot() of the writing-hand size (Wr.Hnd) versus nonwriting-hand size (NW.Hnd), broken down by gender (Sex). Is there a trend?

2. Make boxplots using bwplot() of Pulse for the four levels of Smoke broken down by gender (Sex). Are there any trends? Differences?

Do you expect any other linear relationships among the variables?

4.5 Types of data in R

(This section may be skipped initially. It is somewhat technical and isn't used directly in the remainder of the text.)

The basic structures for storing data in R are data vectors for univariate data, matrices and data frames for rectangular data, and lists for more general needs. Each data vector must contain the same type of data. The basic types are numeric, logical, and character.

Many objects in R also have a class attribute given by the class() function. It is the class of an object that is used by R to give different meanings to generic functions, such as plot() and summary().

4.5.1 Factors

Factors should also be considered to be another storage type, as they are handled differently than a data vector. Recall that factors keep track of categorical data, as can a data vector, yet unlike other data types, their values can come only from the specified levels of the factor.
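The points above about basic types, class(), and the restriction of factor values to the specified levels can be checked at the console. The following is a small sketch, not taken from the text; the variable names (num, lgl, chr, mixed, f) are made up for illustration.

```r
# A data vector holds a single basic type; class() reports it
num <- c(1.5, 2, 3)
lgl <- c(TRUE, FALSE)
chr <- c("a", "b")
class(num)    # "numeric"
class(lgl)    # "logical"
class(chr)    # "character"

# Mixing types in one vector coerces every entry to a common type
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"

# A factor has its own class, and its values must be among its levels
f <- factor(c("a", "b", "a"))
class(f)      # "factor"
levels(f)     # "a" "b"

# Assigning a value that is not a level produces a warning and stores NA
f[2] <- "z"
is.na(f[2])   # TRUE
```

Because generic functions such as summary() dispatch on the class, summary(num) and summary(f) report different things: a numeric summary for the first, a table of counts per level for the second.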
Manipulating the levels requires knowing how to do a few things: creating a factor, listing the levels, adding a level, dropping levels, and ordering levels.

Creating factors

Factors are made with factor() or as.factor(). These functions coerce the data into a factor and define the levels of the new factor. For example:

  > x = 1:3; fac = letters[1:3]
  > factor(x)                     # default uses sorted order
  [1] 1 2 3
  Levels: 1 2 3
  > factor(fac)                   # same with characters
  [1] a b c
  Levels: a b c
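The other operations named above (listing, adding, dropping, and ordering levels) might look as follows. This is only a sketch using base R functions, not the text's own examples; the objects f, f2, and o are hypothetical names chosen for illustration.

```r
f <- factor(c("a", "b", "c"))

# Listing the levels
levels(f)                        # "a" "b" "c"

# Adding a level: extend the levels attribute
levels(f) <- c(levels(f), "d")
levels(f)                        # "a" "b" "c" "d"

# Dropping levels: subsetting keeps unused levels until they are dropped
f2 <- f[f != "b"]
levels(f2)                       # still "a" "b" "c" "d"
f2 <- droplevels(f2)             # factor(f2) would also work
levels(f2)                       # "a" "c"

# Ordering levels: give the order explicitly instead of the sorted default
o <- factor(c("lo", "hi", "mid"),
            levels = c("lo", "mid", "hi"), ordered = TRUE)
o[1] < o[2]                      # TRUE, since "lo" comes before "hi"
```

Passing levels= to factor() is one way to override the default sorted order shown in the example above; ordered = TRUE additionally lets the levels be compared with < and >.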
