RESEARCH METHOD COHEN ok
RELIABILITY IN QUALITATIVE RESEARCH
reliability of quantitative research. Purists might argue against the legitimacy, relevance or need for this in qualitative studies.
In qualitative research reliability can be regarded as a fit between what researchers record as data and what actually occurs in the natural setting that is being researched, i.e. a degree of accuracy and comprehensiveness of coverage (Bogdan and Biklen 1992: 48). This is not to strive for uniformity; two researchers who are studying a single setting may come up with very different findings but both sets of findings might be reliable. Indeed Kvale (1996: 181) suggests that, in interviewing, there might be as many different interpretations of the qualitative data as there are researchers. A clear example of this is the study of the Nissan automobile factory in the United Kingdom, where Wickens (1987) found a ‘virtuous circle’ of work organization practices that demonstrated flexibility, teamwork and quality consciousness, whereas the same practices were investigated by Garrahan and Stewart (1992), who found a ‘vicious circle’ of exploitation, surveillance and control respectively. Both versions of the same reality coexist because reality is multilayered. What is being argued for here is the notion of reliability through an eclectic use of instruments, researchers, perspectives and interpretations, echoing the comments earlier about triangulation (see also Eisenhart and Howe 1992).
Brock-Utne (1996) argues that qualitative research, being holistic, strives to record the multiple interpretations of, intention in and meanings given to situations and events. Here the notion of reliability is construed as dependability (Lincoln and Guba 1985: 108–9; Anfara et al. 2002), recalling the earlier discussion on internal validity.
For them, dependability involves member checks (respondent validation), debriefing by peers, triangulation, prolonged engagement in the field, persistent observations in the field, reflexive journals, negative case analysis, and independent audits (identifying acceptable processes of conducting the inquiry so that the results are consistent with the data). Audit trails enable the research to address the issue of confirmability of results, in terms of process and product (Golafshani 2003: 601). These are a safeguard against the charge levelled against qualitative researchers, namely that they respond only to the ‘loudest bangs or the brightest lights’.
Dependability raises the important issue of respondent validation (see also McCormick and James 1988). While dependability might suggest that researchers need to go back to respondents to check that their findings are dependable, researchers also need to be cautious in placing exclusive store on respondents, for, as Hammersley and Atkinson (1983) suggest, they are not in a privileged position to be sole commentators on their actions. Bloor (1978) suggests three means by which respondent validation can be addressed: researchers attempt to predict what the participants’ classifications of situations will be; researchers prepare hypothetical cases and then predict respondents’ likely responses to them; and researchers take back their research report to the respondents and record their reactions to that report.
The argument rehearses the paradigm wars discussed in the opening chapter: quantitative measures are criticized for combining sophistication and refinement of process with crudity of concept (Ruddock 1981) and for failing to distinguish between educational and statistical significance (Eisner 1985); qualitative methodologies, while possessing immediacy, flexibility, authenticity, richness and candour, are criticized for being impressionistic, biased, commonplace, insignificant, ungeneralizable, idiosyncratic, subjective and short-sighted (Ruddock 1981). This is an arid debate; rather the issue is one of fitness for purpose. For our purposes here we need to note that criteria of reliability in quantitative methodologies differ from those in qualitative methodologies. In qualitative methodologies reliability includes fidelity to real life, context- and situation-specificity, authenticity, comprehensiveness, detail, honesty, depth of response and meaningfulness to the respondents.
VALIDITY AND RELIABILITY
Validity and reliability in interviews
In interviews, inferences about validity are made too often on the basis of face validity (Cannell and Kahn 1968), that is, whether the questions asked look as if they are measuring what they claim to measure. One cause of invalidity is bias, defined as ‘a systematic or persistent tendency to make errors in the same direction, that is, to overstate or understate the “true value” of an attribute’ (Lansing et al. 1961). One way of validating interview measures is to compare the interview measure with another measure that has already been shown to be valid. This kind of comparison is known as ‘convergent validity’. If the two measures agree, it can be assumed that the validity of the interview is comparable with the proven validity of the other measure.
Perhaps the most practical way of achieving greater validity is to minimize the amount of bias as much as possible. The sources of bias are the characteristics of the interviewer, the characteristics of the respondent, and the substantive content of the questions. More particularly, these will include: the attitudes, opinions and expectations of the interviewer; a tendency for the interviewer to see the respondent in his or her own image; a tendency for the interviewer to seek answers that support preconceived notions; misperceptions on the part of the interviewer of what the respondent is saying; and misunderstandings on the part of the respondent of what is being asked.
Studies have also shown that race, religion, gender, sexual orientation, status, social class and age in certain contexts can be potent sources of bias, i.e. interviewer effects (Lee 1993; Scheurich 1995). Interviewers and interviewees alike bring their own, often unconscious, experiential and biographical baggage with them into the interview situation.
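The comparison behind ‘convergent validity’ described in this section can be illustrated with a short calculation. The sketch below is only one possible way of checking whether two measures ‘agree’: it assumes that both the interview measure and the already validated measure yield numerical scores for the same respondents, and it uses the Pearson correlation coefficient as the index of agreement, which the text itself does not prescribe. The function name `pearson_r` and the scores are hypothetical and purely illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a measure has no variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six respondents (invented data, not from any study).
interview_scores = [12, 15, 9, 20, 17, 11]   # the interview measure
validated_scores = [14, 16, 10, 22, 18, 12]  # a measure already shown to be valid

r = pearson_r(interview_scores, validated_scores)
print(f"correlation between the two measures: r = {r:.2f}")
```

A coefficient close to 1 would suggest that the two measures rank respondents similarly. How high a value is needed before agreement can be claimed is a matter of judgement for the researcher, and a high correlation says nothing about sources of bias that the two measures share.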
Indeed Hitchcock and Hughes (1989) argue that because interviews are interpersonal, humans interacting with humans, it is inevitable that the researcher will have some influence on the interviewee and, thereby, on the data. Fielding and Fielding (1986: 12) make the telling comment that even the most sophisticated surveys only manipulate data that at some time had to be gained by asking people! Interviewer neutrality is a chimera (Denscombe 1995).
Lee (1993) indicates the problems of conducting interviews perhaps at their sharpest, where the researcher is researching sensitive subjects, i.e. research that might pose a significant threat to those involved (be they interviewers or interviewees). Here the interview might be seen as an intrusion into private worlds, or the interviewer might be regarded as someone who can impose sanctions on the interviewee, or as someone who can exploit the powerless; the interviewee is in the searchlight that is being held by the interviewer (see also Scheurich 1995). Indeed Gadd (2004) reports that an interviewee may reduce his or her willingness to ‘open up’ to an interviewer if the dynamics of the interview situation are too threatening, taking the role of the ‘defended subject’. The issues also embrace transference and counter-transference, which have their basis in psychoanalysis. In transference the interviewees project onto the interviewer their feelings, fears, desires, needs and attitudes that derive from their own experiences (Scheurich 1995). In counter-transference the process is reversed.
One way of controlling for reliability is to have a highly structured interview, with the same format and sequence of words and questions for each respondent (Silverman 1993), though Scheurich (1995: 241–9) suggests that this is to misread the infinite complexity and open-endedness of social interaction. Controlling the wording is no guarantee of controlling the interview.
Oppenheim (1992: 147) argues that wording is a particularly important factor in attitudinal questions rather than factual questions. He suggests that changes in wording, context and emphasis undermine reliability, because it ceases to be the same question for each respondent. Indeed he argues that error and bias can stem from alterations to wording, procedure, sequence, recording and rapport, and that training for interviewers is essential to