10.07.2015 Views

Using R for Introductory Statistics : John Verzani



Univariate data

The table() function will produce this type of output:

> table(central.park.cloud)
central.park.cloud
        clear partly.cloudy        cloudy
           11            11             9

2.1.2 Barplots

Categorical data is also summarized in a graphical manner. Perhaps most commonly, this is done with a barplot (or bar chart). A barplot in its simplest usage arranges the levels of the variable in some order and then represents their frequency with a bar of a height proportional to the frequency.

In R, barplots are made with the barplot() function. It uses a summarized version of the data, often the result of the table() function.* The summarized data can be either frequencies or proportions. The resulting graph will look the same, but the scales on the y-axis will differ.

■ Example 2.2: A first barplot

Twenty-five students are surveyed about their beer preferences. The categories to choose from are coded as (1) domestic can, (2) domestic bottle, (3) microbrew, and (4) import. The raw data is

3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1

Let's make a barplot of both frequencies and proportions. We first use scan(), instead of c(), to read in the data. Then we plot (Figure 2.1) in several ways. The last two graphs have different scales. For barplots, it is most common to use the frequencies.

* In version 1.9.0 of R one must coerce the resulting table into a data vector to get the desired plot. This can be done with the command t(table(x)).

> beer=scan()
1: 3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1
26:
Read 25 items
> barplot(beer)                       # this isn't correct
> barplot(table(beer),                # frequencies
+   xlab="beer", ylab="frequency")
> barplot(table(beer)/length(beer),   # proportions
+   xlab="beer", ylab="proportion")

The + symbol is the continuation prompt. Just like the usual prompt, >, it isn't typed but is printed by R.

The barplot on the left in Figure 2.1 is not correct, as the data isn't summarized. As well, the barplot() function hasn't labeled the x- and y-axes, making it impossible for the reader to identify what is being shown. In the subsequent barplots, the extra arguments xlab= and ylab= are used to label the x- and y-axes.
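The same example can be sketched as a script rather than typed at the prompt. This is a minimal illustration, not the book's own code: it enters the 25 responses with c() instead of scan() so that it runs non-interactively, and the names beer.counts, beer.props, and beer.labels are introduced here for illustration only. The label text is taken from the category descriptions in Example 2.2.

```r
# Beer preference codes from Example 2.2 (25 responses)
beer <- c(3, 4, 1, 1, 3, 4, 3, 3, 1, 3, 2, 1, 2,
          1, 2, 3, 2, 3, 1, 1, 1, 1, 4, 3, 1)

# Summarize first: table() counts how often each code occurs
beer.counts <- table(beer)

# Proportions: divide the counts by the sample size
# (base R's prop.table(beer.counts) should give the same values)
beer.props <- beer.counts / length(beer)

# Illustrative labels for codes 1 to 4, in the order listed in the example
beer.labels <- c("domestic can", "domestic bottle", "microbrew", "import")

# Same bar shapes in both plots; only the y-axis scale differs
barplot(beer.counts, names.arg = beer.labels,
        xlab = "beer", ylab = "frequency")
barplot(beer.props, names.arg = beer.labels,
        xlab = "beer", ylab = "proportion")
```

Passing the raw vector beer straight to barplot() would instead draw 25 bars, one per response, which is the mistake the first plot in Figure 2.1 illustrates. The names.arg= argument is optional; without it the bars are labeled with the codes 1 to 4.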
