10.07.2015 Views

Using R for Introductory Statistics : John Verzani

what the data set provides, as with ?lynx. See Table 1.3 for more details on data() and library().

Accessing the variables in a data set: $, attach(), and with()

A data set can store a single variable, such as lynx, or several variables, such as the survey data set in the MASS package. Usually, data sets that store several variables are stored as data frames. This format combines many variables in a rectangular grid, like a spreadsheet, where each column is a different variable, and usually each row corresponds to the same subject or experimental unit. This conveniently allows us to have all the data vectors together in one object.

The different variables of a data frame typically have names, but initially these names are not directly accessible as variables. We can access the values by name, but we must also include the name of the data frame. The $ syntax can be used to do this, as in

    > library(MASS)             # load package. Includes geyser
    > names(geyser)             # what are variable names of geyser
    [1] "waiting"  "duration"   # or ?geyser for more detail
    > geyser$waiting            # access waiting variable in geyser
    [1] 80 71 57 80 75 77 60 86 77 56 81 50 89 54 90 …

Table 1.3 library() and data() usage

    library()                 List all the installed packages.
    library(pkg)              Load the package pkg. Use the lib.loc= argument to
                              load a package from a nonprivileged directory.
    data()                    List all available data sets in loaded packages.
    data(package="pkg")       List all data sets for this package.
    data(ds)                  Load the data set ds.
    data(ds, package="pkg")   Load the data set from package.
    ?ds                       Find help on this data set.
    update.packages()         Contact CRAN and interactively update installed
                              packages.
    install.packages(pkg)     Install the package named pkg. This gets the
                              package from CRAN. Use the lib= argument to specify
                              a nonprivileged directory for installation. The
                              contriburl=… allows us to specify other servers to
                              find the package.

Alternately, with a bit more typing, the data can be referred to using index notation, as with geyser[["waiting"]]. Both these styles use the syntax for a list discussed in Chapter 4.

Having to type the data frame name each time we reference a variable can be cumbersome when multiple references are performed. There are several ways to avoid this.
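As a minimal sketch of the two access styles described above, the following loads geyser with the data(ds, package="pkg") form from Table 1.3 and checks that $ and [[ ]] return the same vector. It assumes the MASS package is installed; the variable names by_dollar and by_index are illustrative only and do not come from the text. The last two lines show the repetition that becomes cumbersome, and how with(), named in the section heading, can avoid it.

```r
# Load only the geyser data set from MASS, without attaching the package
# (assumes MASS is installed)
data(geyser, package = "MASS")

# The two access styles return the same vector
by_dollar <- geyser$waiting        # $ syntax
by_index  <- geyser[["waiting"]]   # index notation
identical(by_dollar, by_index)

# Repeating the data frame name for every variable gets tedious
mean(geyser$waiting) / mean(geyser$duration)

# with() evaluates the expression using the columns of geyser as variables
with(geyser, mean(waiting) / mean(duration))
```

Both of the last two expressions compute the same ratio; the with() form simply avoids retyping geyser for each variable.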
