10.07.2015 Views

Using R for Introductory Statistics : John Verzani



Multivariate data

One difference that isn't apparent in the output, when using data.frame() to create a data frame, is that character variables, such as y, are coerced to be factors, unless insulated with I(). This coercion isn't done by list(). This can cause confusion when trying to add new values to the variable.

Adding names to a data frame or list

Just like data vectors, both data frames and lists can have a names attribute. These are found and set by the names() function or when we define the object. The names of a list refer to the top-level components. For data frames, these top-level components are the variables. In the above examples, the command data.frame(x,y) assigns names of x and y automatically, but the list() function does not. If we want to define names when using data.frame() or list() we can use a name=value format, as in

> list(x.name=x,"y name"=y) # quotes may be needed
$x.name
[1] 1 2

$"y name"
[1] "a" "b"

The names() function can be used to retrieve or assign the names. When assigning names it is used on the left side of the assignment (when using the equals sign). For example:

> eg=data.frame(x,y) # store the data frame
> names(eg) # the current names
[1] "x" "y"
> names(eg) = c("x.name","y name") # change the names
> names(eg) # names are changed
[1] "x.name" "y name"

Data frames can also have their column names accessed with the function colnames() and their rows named with rownames(). Both can be set at the same time with dimnames(). The row names must be unique, and it is recommended that column names be also. These functions are applicable to matrix-like objects.

The size of a data frame or list

Data frames represent a number of variables, each with the same number of entries. The ewr (UsingR) data set is an example. As this data is matrix-like, its size is determined by the number of rows and columns.
The dim() function returns the size of matrix-like objects:

> dim(ewr) # number of rows and columns
[1] 46 11
> dim(ewr)[2] # number of cols is 2nd
[1] 11

Row and column sizes may also be found directly with nrow() and ncol().

A list need not be rectangular. Its size is defined by the number of top-level components in it. This is found with the function length(). As data frames are lists whose
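
The coercion, naming, and size functions described in this section can be tried together in a short session. The following is a minimal sketch, not taken from the text: it uses the toy vectors x and y instead of the ewr data set so that no package is needed, and the object names as_factor, insulated, and ragged are illustrative. It writes assignment with <- where the text uses the equals sign. It also sets stringsAsFactors explicitly, on the assumption that the reader's R version may differ from the one the text describes: the default was TRUE before R 4.0.0 and is FALSE from R 4.0.0 on.

```r
# Toy data matching the x and y used in the text
x <- c(1, 2)
y <- c("a", "b")

# Coercion: stringsAsFactors is given explicitly because its default
# differs between R versions (TRUE before 4.0.0, FALSE from 4.0.0 on)
as_factor <- data.frame(x, y, stringsAsFactors = TRUE)
insulated <- data.frame(x, y = I(y), stringsAsFactors = TRUE)
class(as_factor$y)   # "factor"
class(insulated$y)   # "AsIs": I() kept the character values

# Names: data.frame() picks them up automatically, list() does not
eg  <- data.frame(x, y)
lst <- list(x, y)
names(eg)            # "x" "y"
names(lst)           # NULL

# Assigning column names, then row names
names(eg) <- c("x.name", "y name")
rownames(eg) <- c("first", "second")
dimnames(eg)         # a list: row names, then column names

# Size: rows and columns for the data frame, components for the list
dim(eg)              # 2 2
nrow(eg)             # 2
ncol(eg)             # 2
ragged <- list(a = 1:3, b = "z")   # components of unequal length
length(ragged)       # 2
length(eg)           # 2, the number of variables
```

Note that length() applied to a data frame counts its variables, not its rows, which agrees with ncol().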
