12.01.2015 Views

RESEARCH METHOD COHEN ok


REGRESSION ANALYSIS

we know or assume values of the other variable(s)’ (Cohen and Holliday 1996: 88). It is a way of modelling the relationship between variables. We concern ourselves here with simple linear regression and multiple regression (see http://www.routledge.com/textbooks/9780415368780 – Chapter 24, file SPSS Manual 24.7).

Simple linear regression

In simple linear regression the model includes one explanatory variable (the independent variable) and one explained variable (the dependent variable) (see http://www.routledge.com/textbooks/9780415368780 – Chapter 24, file 24.15.ppt). For example, we may wish to see the effect of hours of study on levels of achievement in an examination, so as to see how much an examination mark will improve for a given number of hours of study. Hours of study is the independent variable and level of achievement is the dependent variable. In the example in Box 24.29, the independent variable is placed on the vertical axis and the dependent variable on the horizontal axis. In that example we have taken 50 cases of hours of study and student performance, and have constructed a scatterplot to show the distributions (SPSS performs this function at the click of two or three keys). We have also constructed a line of best fit (SPSS will do this easily) to indicate the relationship between the two variables. The line of best fit is the closest straight line that can be constructed to take account of variance in the scores; it strives to have the same number of cases above it and below it and to make each point as close to the line as possible. For example, one can see that some scores are very close to the line and others are some distance away. There is a formula for its calculation, but we do not explore that here.
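For readers who want to see what such a calculation can look like, the following is a minimal sketch. It assumes that the line of best fit is the ordinary least-squares line (the usual choice in statistical packages). The function name `fit_line` and the five data points are hypothetical, invented purely for illustration; they are not the 50 cases in Box 24.29.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    x: values of the independent variable (e.g. hours of study)
    y: values of the dependent variable (e.g. examination marks)
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5]
marks = [30, 41, 58, 70, 81]

slope, intercept = fit_line(hours, marks)
print(f"slope = {slope:.1f} marks per hour, intercept = {intercept:.1f} marks")
```

The sketch treats marks as the dependent variable and hours as the independent variable, whichever axis each is drawn on.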

Box 24.29
A scatterplot with the regression line

[Scatterplot: hours of study (vertical axis) against level of achievement (horizontal axis), with the line of best fit]

One can observe that the greater the number of hours spent in studying, generally the greater is the level of achievement. This is akin to correlation. The line of best fit indicates not only that there is a positive relationship, but also that the relationship is strong (the slope of the line is quite steep). However, where regression departs from correlation is that regression provides an exact prediction of the value – the amount – of one variable when one knows the value of the other. One could read off the level of achievement, for example, if one were to study for two hours (43 marks out of 80) or for four hours (72 marks out of 80), of course taking no account of variance. To help here, scatterplots (e.g. in SPSS) can include grid lines (see Box 24.30).
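Reading a value off the line amounts to evaluating a straight-line equation. As a rough sketch, the two read-off values quoted above (two hours giving 43 marks, four hours giving 72 marks) can be used to recover the line that passes through them. The function names `line_through` and `predict` are hypothetical, and the resulting slope and intercept are only what those two read-off points imply; they are not fitted coefficients reported for the data in Box 24.29.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points must have different x values")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# The two (hours, marks) pairs read off the line in the text
slope, intercept = line_through((2, 43), (4, 72))


def predict(hours):
    """Marks predicted by the line for a given number of hours of study."""
    return intercept + slope * hours


print(slope, intercept)  # 14.5 14.0
print(predict(3))        # 57.5
```

On this reading, each additional hour of study corresponds to about 14.5 more marks, and three hours of study would be read off as roughly 57.5 marks, again taking no account of variance.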

It is dangerous to predict outside the limits of the line; simple regression is to be used only to calculate values within the limits of the actual line, and not beyond it. One can observe, also, that though it is possible to construct a straight line of best fit (SPSS does this automatically), some of the data points lie close to the line and some lie a long way from the line; the distances of the data points from the line are termed the residuals, and these would have to be commented on in any analysis (there is a statistical calculation to address this but we do not go into it here).
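The two cautions in this paragraph can be illustrated with a short sketch. It assumes a line has already been obtained; the slope and intercept used here, the data points, and the function names `residuals` and `predict_within_range` are all hypothetical and chosen only to keep the arithmetic easy to follow. Each residual is taken as the observed value minus the value the line predicts, and prediction is refused outside the range of the observed independent variable.

```python
def residuals(x, y, slope, intercept):
    """Observed y minus the y predicted by the line, for each case."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def predict_within_range(x_new, x_observed, slope, intercept):
    """Predict y for x_new, but only inside the observed range of x."""
    lowest, highest = min(x_observed), max(x_observed)
    if not lowest <= x_new <= highest:
        raise ValueError(
            f"{x_new} lies outside the observed range {lowest} to {highest}"
        )
    return intercept + slope * x_new


# Hypothetical data and a hypothetical line, for illustration only
hours = [1, 2, 3, 4, 5]
marks = [30, 41, 58, 70, 81]
slope, intercept = 13.0, 17.0

res = residuals(hours, marks, slope, intercept)
print(res)  # [0.0, -2.0, 2.0, 1.0, -1.0]

print(predict_within_range(3, hours, slope, intercept))  # 56.0
# predict_within_range(6, hours, slope, intercept) would raise ValueError
```

Small residuals indicate cases lying close to the line; large ones indicate cases for which the line predicts poorly.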

The point where the line strikes the vertical axis is named the intercept. We return to this later, but at this stage we note that the line does not go through the origin but starts a little way up the vertical
