10.07.2015 Views

Using R for Introductory Statistics : John Verzani


A list is a more general type of storage than a data frame. Think of a list as a collection of components. Each component can be any R object, such as a vector, a data frame, a function, or even another list. In particular, a data frame is a list with top-level components given by equal-length data vectors. A list is a very flexible type of data object. Many of the functions in R, such as lm(), have return values that are lists, although only selected portions may be displayed when the return value is printed.

Lists can be used to store variables of different lengths, such as the cancer (UsingR) data set, but it is usually more convenient to store such data in a data frame with two columns: one column recording the measurements and the other a factor recording which variable the value belongs to.

As a data frame is a special type of list, it can be accessed as either a matrix or a list.

4.2.1 Creating a data frame or list

Data frames are created with the data.frame() function, and lists are made with the list() function. Data frames are also returned by read.table() and read.csv().

For example:

> x=1:2                  # define x
> y=letters[1:2]         # y=c("a","b")
> z=1:3                  # z has 3 elements, x,y only 2
> data.frame(x,y)        # rectangular, cols are variables
  x y
1 1 a
2 2 b
> data.frame(x,y,z)      # not all the same size.
Error in data.frame(x, y, z) : arguments imply differing number of rows: 2, 3

Data frames must have variables of the same length.

Lists are created using the function list().

> list(x,y,z)
[[1]]
[1] 1 2

[[2]]
[1] "a" "b"

[[3]]
[1] 1 2 3

The odd-looking numbers that appear with the command list(x, y, z) specify where the values are stored. The first one, [[1]], says this is for the first top-level component of the list, which is a data vector. The following [1] refers to the first entry of this data vector.
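The points above about unequal-length variables, two-column storage, and list-valued return values can be tried out with a small sketch. The `groups` list and its numbers are hypothetical values made up for illustration (they are not the cancer data), and using stack() for the conversion is one possible approach, not necessarily the one the text has in mind.

```r
# Hypothetical measurements for illustration only (not the cancer data set).
groups <- list(a = c(1.2, 3.4, 2.2), b = c(5.1, 4.8))

# A list can hold components of different lengths.
lens <- sapply(groups, length)

# stack() converts a named list of vectors into a two-column data frame:
# `values` holds the measurements, `ind` is a factor naming the source variable.
long <- stack(groups)

# A data frame is itself a list, so both access styles work.
long[["values"]]     # list-style access to a component
long[2, "values"]    # matrix-style access by row and column

# lm() returns a list; printing shows only part of it,
# while names() reveals all of its components.
fit <- lm(values ~ ind, data = long)
names(fit)
```

Calling data.frame(groups$a, groups$b) on the same values would fail with the "differing number of rows" error, which is why the two-column layout is the more convenient form here.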
