
VALIDITY AND RELIABILITY IN TESTS

in question. Content validity is achieved by making professional judgements about the relevance and sampling of the contents of the test to a particular domain. It is concerned with coverage and representativeness rather than with patterns of response or scores. It is a matter of judgement rather than measurement (Kerlinger 1986). Content validity will need to ensure several features of a test (Wolf 1994): (a) test coverage (the extent to which the test covers the relevant field); (b) test relevance (the extent to which the test items are taught through, or are relevant to, a particular programme); (c) programme coverage (the extent to which the programme covers the overall field in question).

Criterion-related validity is where a high correlation coefficient exists between the scores on the test and the scores on other accepted tests of the same performance: this is achieved by comparing the scores on the test with one or more variables (criteria) from other measures or tests that are considered to measure the same factor. Wolf (1994) argues that a major problem facing test devisers addressing criterion-related validity is the selection of a suitable criterion measure. He cites the example of the difficulty of selecting a suitable criterion of academic achievement in a test of academic aptitude. The criterion must be: relevant (and agreed to be relevant); free from bias (i.e. where external factors that might contaminate the criterion are removed); reliable – precise and accurate; and capable of being measured or achieved.
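Since criterion-related validity is expressed as a correlation between two sets of scores, it can be computed directly. Below is a minimal sketch, assuming two hypothetical arrays of scores (new_test and criterion_test) for the same ten testees; the variable names and data are illustrative, not drawn from the source.

```python
import numpy as np

# Hypothetical scores for ten testees on the new test and on an
# accepted criterion test of the same performance (illustrative data).
new_test = np.array([52, 61, 47, 70, 66, 58, 74, 49, 63, 55], dtype=float)
criterion_test = np.array([50, 65, 45, 72, 63, 60, 78, 46, 61, 57], dtype=float)

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between test and criterion,
# i.e. the criterion-related validity coefficient.
r = np.corrcoef(new_test, criterion_test)[0, 1]
print(f"criterion-related validity coefficient r = {r:.2f}")
```

A coefficient close to 1 indicates that the new test ranks testees much as the accepted criterion does; how high is 'high enough' remains a matter of judgement.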

Construct validity (e.g. the clear relatedness of a test item to its proposed construct/unobservable quality or trait, demonstrated by both empirical data and logical analysis and debate, i.e. the extent to which particular constructs or concepts can account for performance on the test): this is achieved by ensuring that performance on the test is fairly explained by particular appropriate constructs or concepts. As with content validity, it is not based on test scores, but is more a matter of whether the test items are indicators of the underlying, latent construct in question.


In this respect construct validity also subsumes content and criterion-related validity. It is argued (Loevinger 1957) that, in fact, construct validity is the queen of the types of validity because it is subsumptive and because it concerns constructs or explanations rather than methodological factors. Construct validity is threatened by under-representation of the construct, i.e. the test is too narrow and neglects significant facets of a construct, and by the inclusion of irrelevancies – excess reliable variance.
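One common first empirical check on whether items behave as indicators of a single underlying construct is the corrected item-total correlation (fuller analyses would use factor analysis). The sketch below is illustrative only: it simulates item responses from a hypothetical latent trait rather than using real data, and all names are assumptions, not from the source.

```python
import numpy as np

# Simulate 100 testees: a latent trait plus noisy responses to five
# items intended to measure that one construct (illustrative data).
rng = np.random.default_rng(0)
latent = rng.normal(size=100)
items = np.column_stack(
    [latent + rng.normal(scale=0.8, size=100) for _ in range(5)]
)

# Corrected item-total correlation: each item against the total of
# the remaining items. Low or negative values flag items that may
# not be indicators of the underlying construct.
total = items.sum(axis=1)
for j in range(items.shape[1]):
    rest = total - items[:, j]
    r = np.corrcoef(items[:, j], rest)[0, 1]
    print(f"item {j + 1}: corrected item-total r = {r:.2f}")
```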

Concurrent validity is where the results of the test concur with results on other tests or instruments that are testing/assessing the same construct/performance – similar to predictive validity but without the time dimension. Concurrent validation can be undertaken at the same time as another instrument is administered, rather than after some time has elapsed.

Face validity is where, superficially, the test appears – at face value – to test what it is designed to test.

Jury validity is an important element in construct validity, requiring agreement on the conceptions and operationalization of an unobservable construct.

Predictive validity is where results on a test accurately predict subsequent performance – akin to criterion-related validity.
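A minimal sketch of how predictive validity might be examined, assuming hypothetical aptitude scores taken at entry and achievement scores for the same testees a year later; the variable names, data, and the use of a simple least-squares line are illustrative assumptions, not the source's method.

```python
import numpy as np

# Hypothetical entry aptitude scores and achievement scores for the
# same ten testees one year later (illustrative data).
aptitude = np.array([48, 55, 62, 70, 53, 66, 59, 75, 44, 68], dtype=float)
achievement = np.array([51, 58, 60, 74, 50, 69, 57, 80, 47, 65], dtype=float)

# Predictive validity coefficient: correlation between the earlier
# test and the later performance it is meant to predict.
r = np.corrcoef(aptitude, achievement)[0, 1]

# A least-squares line for predicting later achievement from the
# earlier aptitude score.
slope, intercept = np.polyfit(aptitude, achievement, 1)
print(f"predictive validity r = {r:.2f}")
print(f"predicted achievement = {slope:.2f} * aptitude + {intercept:.2f}")
```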

Consequential validity is where the inferences that can be made from a test are sound.

Systemic validity (Frederiksen and Collins 1989) is where programme activities both enhance test performance and enhance performance of the construct that is being addressed in the objective. Cunningham (1998) gives an example of systemic validity: if the test and the objective of vocabulary performance lead testees to increase their vocabulary, then systemic validity has been addressed.

To ensure test validity, then, the test must demonstrate fitness for purpose as well as addressing the several types of validity outlined above. The most difficult for researchers to address, perhaps, is construct validity, for it argues for agreement on the definition and

