10.07.2015 Views

Using R for Introductory Statistics : John Verzani


3.3 Relationships in numeric data

There are many scientific relationships between numeric variables. Among them: distance equals rate times time; pressure is proportional to temperature; and demand is inverse to supply. Many relationships are not precisely known, prompting an examination of the data. For instance, is there a relationship between a person’s height and weight?

If a bivariate data set has a natural pairing, such as (x_1, y_1), …, (x_n, y_n), then it likely makes sense for us to investigate the data set jointly, as a two-way table does for categorical data.

3.3.1 Using scatterplots to investigate relationships

A scatterplot is a good place to start when investigating a relationship between two numeric variables. A scatterplot plots the values of one data vector against another as points (x_i, y_i) in a Cartesian plane.

The plot() function will make a scatterplot. A basic template for its usage is

plot(x, y)

where x and y are data vectors containing the paired data. The plot() function is used to make many types of plots, including density plots, as seen. For scatterplots, there are several options to plot() that can adjust how the points are drawn, whether the points are connected with lines, etc. We show a few examples and then collect them in Table 3.7.

■ Example 3.2: Home values Buying a home has historically been a good investment. Still, there are expenses. Typically, a homeowner needs to pay a property tax in proportion to the assessed value of the home. To ensure some semblance of fairness, the assessed values should be updated periodically. In Maplewood, New Jersey, properties were reassessed in the year 2000 for the first time in 30 years. The data set homedata (UsingR) contains values for 150 randomly chosen homes. A scatterplot of assessed values should show a relationship, as homes that were expensive in 1970 should still have been expensive in 2000. We can use this data set to get an insight into the change in property values for these 30 years.

The scatterplot is made after loading the data set.

> attach(homedata)
> plot(y1970, y2000) # make the scatterplot
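The plot(x, y) template can also be tried without any add-on package. The following is a minimal sketch, not the book's own example: it assumes the built-in women data set (heights and weights of 15 women) as a stand-in for paired data, and it uses the pch and type arguments as two examples of the options mentioned above. The last part repeats the home-values plot only if the UsingR package happens to be installed, using with() in place of attach(); that substitution is a stylistic assumption, not something the text prescribes.

```r
# Paired data from the built-in 'women' data set (an illustrative stand-in)
x <- women$height   # heights, in inches
y <- women$weight   # weights, in pounds

# The template requires the two vectors to be paired, so lengths must match
stopifnot(length(x) == length(y))

# Basic scatterplot of the points (x_i, y_i)
plot(x, y)

# Adjust how the points are drawn: filled circles and axis labels
plot(x, y, pch = 16, xlab = "Height (in)", ylab = "Weight (lb)")

# Connect the points with line segments ("b" draws both points and lines)
plot(x, y, type = "b")

# The home-values scatterplot, run only when UsingR is available
if (requireNamespace("UsingR", quietly = TRUE)) {
  data(homedata, package = "UsingR")   # load the data set by name
  with(homedata, plot(y1970, y2000))   # same plot, without attach()
}
```

Using with() keeps the column names y1970 and y2000 local to the one call, so nothing is left attached to the search path afterwards.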
