10.07.2015

Using R for Introductory Statistics : John Verzani



4.18 The data set morley contains measurements of the speed of light. Make side-by-side boxplots of the Speed variable for each experiment recorded by Expt. Are the centers similar or different? Are the spreads similar or different?

4.19 For the data set PlantGrowth, make boxplots of the weight variable for each level of group. Are the centers similar or different? Are the spreads similar or different?

4.20 For the data set InsectSprays, make boxplots of the count variable for levels C, D, and E. Hint: These can be found with a command such as

> spray %in% c("C", "D", "E")

Use this with the model notation and the argument subset= when making the boxplots.

4.21 The pairs() function also has a model-formula interface. We can redo Example 4.4 with the command

> pairs(~ gestation + age + wt + inc, data = babies,
+       subset = gestation < 999 & age < 99 & inc < 98)

For the UScereal (MASS) data set, use the formula interface to make a scatterplot matrix of the variables calories, carbo, protein, fat, fibre, and sugars. Which relationships show a linear trend?

4.4 Lattice graphics

In Figure 4.2 various colors and plotting characters were used to show whether the third variable, smoke, affected the relationship between gestation time and birth weight. As we noted, the figure was a bit crowded for this approach. A better solution would be to create a separate scatterplot for each level of the third variable. These graphs can be made using the lattice graphics package.

The add-on package lattice is modeled after Cleveland’s Trellis graphics concepts and uses the newer, low-level plotting commands available in the grid package. Both should be part of a standard R installation (lattice as a recommended package, grid as part of base R).

The graphics shown below are useful and easy to create. Many other usages are possible.
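As a minimal sketch of the one-scatterplot-per-level idea, the following uses lattice's xyplot() with a conditioning term placed after the | in the model formula. The built-in ToothGrowth data set stands in for the babies data here (an assumption, made so the example runs without extra packages); its factor supp plays the role of the third variable.

```r
# lattice is a recommended package, so it normally ships with R.
library(lattice)

# ToothGrowth is a built-in data set, used here as a stand-in (assumption)
# for the babies data discussed in the text.
#   len  : tooth length (response)
#   dose : vitamin C dose (predictor)
#   supp : delivery method, a factor with levels OJ and VC (third variable)
p <- xyplot(len ~ dose | supp, data = ToothGrowth,
            xlab = "Dose", ylab = "Tooth length")

# A lattice call returns a "trellis" object; printing it draws the plot,
# with one panel per level of supp.
print(p)
```

Because supp is a factor with two levels, the plot has two panels that share common axes, which makes the groups easier to compare than overlaid colors or plotting characters.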
If the package is loaded (with library(lattice)), a description of lattice graphics is available in the help system under ?Lattice. The help page ?xyplot also contains extensive documentation.

The basic idea is that the graphic consists of a number of panels. Each panel corresponds to some value(s) of a conditioning variable. The lattice graphing functions are called using the model-formula interface. The formulas have the format

response ~ predictor | condition

The response variable is not always present. For univariate graphs, such as histograms, it is not given; for bivariate graphs, such as scatterplots, it is. The optional condition variable is either a factor or a numeric value. If it is a factor, there is a separate panel for each level. If it is numeric, “shingles” are created that split up the range of the variable to
