13.07.2015 Views

Calibrating tests to the CEFR


The European Association for Quality Language Services

Calibrating tests to the CEFR

A simplified guide for EAQUALS Members on calibrating entry or progress tests to the CEFR

© EAQUALS: The European Association for Quality Language Services
Email: info@eaquals.org Internet: www.eaquals.org


Contents

Introduction
External Validation
Summary Chart
The Step-by-step Procedure

EAQUALS SECRETARIAT: P.O. Box 95, Budapest, H-1301 Hungary
Registered Office: Via Torrebianca 18 34132 Trieste, Italy


Introduction

This guide provides simple procedures to calibrate a placement or progress test to the CEFR. It is not a guide for developing examinations or for relating the results from existing tests or examinations to the CEFR by determining “cut-scores” that define a minimum performance for a particular level. A manual is provided by the Council of Europe for that purpose [1].

In order to give information about a learner’s level of CEFR proficiency the test needs to:
• be developed to a detailed, balanced specification that relates to the CEFR;
• have a broad coverage (not the grammar and vocabulary done this month but the grammar and vocabulary most important at, for example, Level B1);
• be appropriate to the context and type of learners concerned.

It is perfectly legitimate to calibrate an existing test to the CEFR levels. This guide gives a simple way to externally validate the cut-off scores on the test for each CEFR level through correlating and referencing results to an anchor test already calibrated to the CEFR and/or to teacher judgements with criteria.

[1] Council of Europe (2009) “Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR): A Manual.” Available on www.coe.int/lang. North, B. and Jones J. (2009) “Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR): Further material on standard setting through scaling and teacher judgement.” Available on www.coe.int/lang.


External Validation

Although empirical validation is mainly something examination providers are concerned with, schools can also apply basic principles in a simple way. Data analysis for external validation can be carried out satisfactorily with nothing more elaborate than the simple correlation functions in Microsoft Excel and a couple of Microsoft Word tables. Examination providers will no doubt wish to go into more detail, and the CEFR Manual written for them also provides a Reference Supplement should they need advice on how to do so.

External validation is the independent corroboration, through the use of an external criterion, of the claim that a test or assessment procedure reports in terms of CEFR levels. The main problem with undertaking such a study is collecting data. All learners operating as subjects need to take both the assessment procedure under study and the assessment used as an external criterion. Examination providers will want to conduct studies with large populations, but in a school a sample of 50 learners would be more than sufficient (30 is an absolute minimum).

The technique can be used to plot entry, progress or exit test scores against an external criterion. The test result may concern:
(a) a particular skill, e.g. “Your linguistic knowledge is Level B2”;
(b) a global result, e.g. “You are Level B2.” This might be the result of averaging results from sub-tests.

There are four alternatives for the external criterion:
(a) Results in examinations calibrated to the CEFR or on a test previously calibrated to the CEFR. It could be results from a suite of successive ALTE examinations situated at different levels. If the assessment under study tests a particular skill or aspect, then the test (or suite of tests) used as an external criterion will have to test that skill too; if the assessment under study is a global one, then the external criterion could be the global result in CEFR terms from an examination or suite of examinations (e.g. Cambridge exams).
(b) Judgements made by teachers for the skill concerned could be used, provided they know the learners’ work well and that they have been systematically trained in a standardised interpretation of the CEFR levels (Activities 4, 5 & 6).
(c) More formal school assessments of speaking and writing in relation to the CEFR criteria for standardisation training in the appendices to the EAQUALS Standardisation Pack. These could be averaged to give a “global” result. Results will be better if two teachers, each of whom has attended standardisation staff seminars, act as the raters, and if they have to negotiate a final grade.
(d) Self-assessments could also be used. Again, self-judgements will be more accurate if the learners have been trained in relation to the CEFR levels, perhaps in comparing their own level to performances in standardisation training videos.

The best thing to do is to collect data from as many sources as possible and hope one of them gives a good correlation.


Summary Chart

Activity: Validating CEFR Levels reported by Entry, Progress or Exit Tests

Materials:
o Exam results across a range of CEFR levels from a suite of examinations linked to the CEFR (e.g. Cambridge) or from a test reporting across a number of CEFR levels
o CEFR Assessment procedures implemented for speaking and writing after Standardisation training as in Activities 2-4 (Speaking) and 9 (Writing) above.
o Data for 50 or more learners, with as wide a range of level as possible:
  - TEST score for each student on the local test
  - EXAM result for each student on the examination or suite of examinations concerned
  - ASSESSMENTS results for each student from assessment of speaking (with CEFR Table 3 criteria grid) and writing (with CEFR writing criteria grid)
  - SELF-assessed CEFR level from CEFR Global self-assessment scale or Portfolio self-assessment grid
  - TEACHER informal assessment with CEFR scales and/or Portfolio checklists (Definitely not advisable on extensive courses with large classes)
o Microsoft Excel & Word

Steps:
1. Prepare the data.
   a. Convert exam results to numbers. Example: with Cambridge, a KET pass = A2 = 2; a PET pass = B1 = 3, etc. With Cambridge exams, also try separately using the actual grades, treating an “A” at one level as the same as a “C” at the next level (i.e. as a “pass” at the next CEFR level); treat D/E as if it is a “pass” at the previous CEFR level.
   b. Create a “global” CEFR assessment result for each student by averaging the CEFR Speaking/Writing Assessment results.
2. Correlate all these results to the test results and see which gives the highest correlation; use that as the external criterion. Calculate the correlation (Excel function CORREL) of test to converted exam scores and to self-assessments; if the correlation is less than 0.7, collect more data on students at higher and at lower levels to extend the range of level; if it is still less than 0.7, give up. If the correlation to self-assessments is higher than the correlation to the exams, then repeat the procedure using self-assessments as the criterion. Then compare the cut-offs from the exam with the cut-offs from self-assessments and make a compromise.
3. Create a “Scatterplot” (Excel: Insert Graph). Print copies of the plot and draw by hand a line of “best fit.”
4. Create provisional cut-off scores between CEFR levels from the Scatterplot.
5. Use the cut-off scores to make a CEFR “Decision Table.”
6. Calculate the percentage of matching classifications.
7. Use the “Decision Table” to fine-tune the “cut-off” scores in order to increase this percentage of agreement.
8. Create a new “Decision Table.”
9. Calculate the percentage of agreement. Repeat the cycle if necessary.
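For schools that prefer a script to a spreadsheet, the data preparation in Step 1 of the chart might look like the following minimal Python sketch. It is an illustration only, not part of the EAQUALS procedure: the coding extends the Cambridge example (A2 = 2, B1 = 3) to the other levels by assumption, and the student records and the `global_assessment` helper are hypothetical.

```python
# Numeric codes for CEFR levels. A2 = 2 and B1 = 3 follow the Cambridge
# example in the chart; the remaining codes are an assumed extension.
CEFR_TO_NUMBER = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}


def global_assessment(speaking: str, writing: str) -> float:
    """Average the Speaking and Writing CEFR results into one 'global' number."""
    return (CEFR_TO_NUMBER[speaking] + CEFR_TO_NUMBER[writing]) / 2


# Hypothetical records: (exam level, speaking level, writing level)
students = [("A2", "A2", "A2"), ("B1", "B1", "A2"), ("B2", "B2", "B1")]

exam_numbers = [CEFR_TO_NUMBER[exam] for exam, _, _ in students]
global_numbers = [global_assessment(s, w) for _, s, w in students]

print(exam_numbers)    # [2, 3, 4]
print(global_numbers)  # [2.0, 2.5, 3.5]
```

Keeping the coding in one dictionary makes it easy to try the alternative codings the chart suggests, such as treating a grade "A" as a pass at the next level.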


Procedure

Step 1

First convert examination results to CEFR levels, and convert/combine school grades to give a single CEFR result for each candidate. Then convert the CEFR levels to numbers, e.g. A1 = 1, A2 = 2, A2+ = 3, B1 = 4, B1+ = 5, etc.

Step 2: Correlation

The next step is to enter the data in Microsoft Excel and calculate a correlation coefficient between the different external measures available and the results from the assessment under study, in order to identify which external measure is best suited as the external criterion. Your school Speaking/Writing assessment result is also an external measure.
- Enter each student as a row in an Excel sheet.
- Enter the assessment under study as the first column.
- Enter the potential external criteria as further columns.
- Place the cursor in an empty cell underneath the column with the first potential external criterion.
- Select “Insert” “Function” and select or type in CORREL. This activates the MS Excel wizard.
- It asks you for “Array 1”, so click on the first cell of the data for the assessment under study and drag the cursor down to the last.
- Repeat for “Array 2” with one of the potential external criteria.
- Press “OK” and hope for a high number.

A weak correlation coefficient (e.g. 0.40) shows a weak relationship between the two assessments. There is little point in proceeding further. There are all sorts of reasons why your correlation might be low, but the better your test and the better the external criterion is at assessing the same thing, the higher the correlation will be. A strong correlation (e.g. 0.80) shows a very promising relationship between the two assessments. The technical significance of the size of the correlation coefficient depends upon the number of subjects.

If you have no correlation reaching 0.70, then the relationship, whilst it may be significant, is not really very strong. Solution: collect the same data on learners with higher and lower levels, add that to your table and try again. Adding more data around the top and the bottom of the ability range of the learners concerned extends the reporting scale and therefore almost automatically increases the correlation. If doing this does not increase the correlation it is best to give up, because either your test or your external criterion is just not very good. If you do have some correlations above 0.7 and you can think of a plausible explanation for why this potential external criterion is operating better than the others, then adopt that measure as your external criterion and proceed to Step 3.

Step 3: Creating a Scatterplot

A correlation shows that there is a relationship. The next question is what precisely that relationship is. A table can be constructed that compares the CEFR level classification for each person on the assessment under study and on the assessment taken as a criterion measure.
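To make the Step 2 calculation concrete, here is a minimal Python sketch of the same check. It computes the Pearson correlation coefficient, which is what Excel's CORREL returns, and then picks the candidate criterion with the highest value. The scores, the criterion names and the `pearson` helper are hypothetical; only the 0.7 threshold comes from the guide.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    # Fails with ZeroDivisionError if either list has no variation at all.
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical data: one test score and two candidate criteria per student.
test_scores = [21, 35, 48, 60, 72, 85]
criteria = {
    "exam": [1, 2, 2, 3, 4, 4],
    "self": [2, 1, 3, 2, 4, 3],
}

correlations = {name: pearson(test_scores, values) for name, values in criteria.items()}
best = max(correlations, key=correlations.get)

for name, r in correlations.items():
    verdict = "usable" if r >= 0.7 else "too weak"
    print(f"{name}: r = {r:.2f} ({verdict})")
print("Best candidate for the external criterion:", best)
```

With real data the lists would hold one value per learner, ideally 50 or more as recommended earlier.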


First, however, we need to be able to set some provisional “cut-off scores” to at least tentatively convert scores on the test under study to CEFR levels. If the test was developed following academic testing conventions known as “standard-setting” then such cut-off scores will already exist. The current procedure is then a way of validating the cut-off scores developed in that way. If, however, the test has been developed by a school to test a certain level, but no scores have been allocated to different levels, we will first have to “guesstimate” some cut-offs as a working hypothesis in order to proceed further.

[Figure 1: An example scatterplot with fictional data. Y axis: The Examination (Criterion), 0 to 5; X axis: My Test, 0 to 100.]

To do this, one needs to compare the scores achieved on the test under study to the CEFR levels reported by the external criterion, as first popularised by Brendan Carroll [2]. Figure 1 shows a fictitious example. CEFR levels reported by an examination (external criterion) are shown up the left-hand axis (the Y axis), and scores on the test under study are shown along the bottom on the X axis. Each point represents a student. This particular plot was created with the chart “Scatterplot” in MS Excel, but it could also be done on graph paper. The correlation, calculated with MS Excel as described in Step 2, is 0.74.

[2] Carroll, B. J. (1980): Communicative Testing, Oxford, Pergamon Press: 61-64.

Step 4: Setting Provisional Cut-off Scores

Now a line of “best fit” is drawn through the dots. Don’t lose time over this; it’s just a starting point. If it turns out that you drew the line in the wrong place, it can be corrected later in Step 7. Figure 2 shows an example. This was drawn in MS Word (having copied the chart from Excel), but it could be done by hand on graph paper.

[Figure 2: A line of “best fit”. Y axis: The Examination (Criterion), 0 to 5; X axis: My Test, 0 to 100.]

The next step is to draw a vertical line from each point at which the line of best fit touches the horizontal lines (the reporting levels of the external criterion) down to the axis along the bottom with the test scores. This is shown in Figure 3. These lines “cut” the X-axis. Anyone getting a score between the point where the line of best fit cuts the horizontal line for “1” (representing A1) and the point where it cuts the horizontal line for “2” (representing A2) is A1 according to the test under study. These are the people who score at or above 20 (the “cut-score” or “cut-off score” for A1) and below 38, the cut-score for A2. Here we have three people, scoring 21, 23 and 32. Test and external criterion both agree they are A1.

Figure 3 would thus provide provisional cut-off scores approximately as follows:
Level 1 (e.g. A1): 20-37
Level 2 (e.g. A2): 38-56
Level 3 (e.g. B1): 57-75
Level 4 (e.g. B2): 76-(100)
Level 5 (e.g. C1): We only had one person at this level so we really cannot say.
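Once provisional cut-off scores exist, turning a test score into a level is a simple lookup. The sketch below is one possible way to do it in Python, using the four cut-scores listed above; the `classify` function and the label "below A1" are illustrative names, and no C1 boundary is included because the fictional data do not support one.

```python
from bisect import bisect_right

# Lowest score that earns each level, taken from the provisional list above.
CUT_SCORES = [20, 38, 57, 76]
LEVELS = ["below A1", "A1", "A2", "B1", "B2"]


def classify(score, cut_scores=CUT_SCORES):
    """Return the level whose score band contains the given test score."""
    # bisect_right counts how many cut-scores the learner has reached.
    return LEVELS[bisect_right(cut_scores, score)]


for score in (12, 21, 23, 32, 38, 60, 80):
    print(score, classify(score))
```

Because the cut-scores are passed as a parameter, the same function can be reused later with adjusted values, for example `classify(33, [15, 33, 57, 76])`.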


[Figure 3: Marking provisional “cut-scores”. Y axis: The Examination (Criterion), 0 to 5; X axis: My Test, 0 to 100.]

Step 5: Creating a Decision Table

The result gives us one person below Level 1 (A1) and 5 people at Level 1 (A1), two of whom, however, were classified as Level 2 (A2) by the examination used as an external criterion. At Level 2 (between the cut-scores of 38 and 57) we have 5 people, 2 of whom were classified at Level 3 (B1) by the examination. At the next level things start to go a bit wonky (something not confined to fictional data). Our test has two people at Level 3, but the examination put them at Levels 1 and 2 respectively.

The other two people the examination put at Level 3, our test has at Level 4. This relationship between the two sets of scores can be expressed in what is called a “Decision Table” [3], as shown in Table 1.

There are 18 dots on Figures 1, 2 and 3 (our fictional data set) and therefore one sees a total of 18 in the bottom right-hand corner of Table 1.

The number of people placed at each level by the examination (vertical axis) is shown in the right-hand column above the number 18; the number placed at each level by the test (horizontal axis) is shown along the bottom of the table. The examination placed 5 people at Level 1 (A1), 3 of whom the test also placed there. These 3 are in the shaded square.

[3] The Decision Table in this form was recommended by Norman Verhelst (Cito) in the Council of Europe Manual for examination authorities. It is also known as a Bivariate Table or a Cross-classification Table.
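A Decision Table of this kind is just a count of learners for every combination of criterion level and test level. As a rough illustration, the following Python sketch builds one from a list of pairs; the pairs themselves and the `decision_table` helper are hypothetical, and the level labels follow the layout described above (with "0" standing for "below A1" on the test).

```python
from collections import Counter

TEST_LEVELS = ["0", "A1", "A2", "B1", "B2", "C1"]   # columns
CRITERION_LEVELS = ["A1", "A2", "B1", "B2", "C1"]   # rows


def decision_table(pairs):
    """Cross-classify (criterion level, test level) pairs into rows of counts."""
    counts = Counter(pairs)  # missing combinations count as 0
    return [[counts[(crit, test)] for test in TEST_LEVELS] for crit in CRITERION_LEVELS]


# Hypothetical pairs, one per learner: (examination level, test level)
pairs = [("A1", "A1"), ("A1", "A1"), ("A1", "A2"),
         ("A2", "A1"), ("A2", "A2"), ("B1", "B2")]

table = decision_table(pairs)
print("     ", *(f"{level:>3}" for level in TEST_LEVELS), "Total")
for level, row in zip(CRITERION_LEVELS, table):
    print(f"{level:>5}", *(f"{n:>3}" for n in row), f"{sum(row):>5}")
```

The row totals printed on the right correspond to the numbers the examination placed at each level.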


                    Test
Criterion (Exams)   0     A1    A2    B1    B2    C1    Total
A1                  -     3*    1     1     -     -     5
A2                  1     2     2*    1     -     -     6
B1                  -     -     2     0*    2     -     4
B2                  -     -     -     -     2*    -     2
C1                  -     -     -     -     1     0*    1
Total               1     5     5     2     5     0     18

(* marks the shaded diagonal, where test and examination agree)

Table 1: Decision Table (fictional data)

Step 6: Calculating the Proportion of Common Classifications

Shading shows agreement between test and examination. The number of matching classifications is determined by adding together those learners shown in the shaded diagonal, as in Table 2.

A1    A2    B1    B2    C1    Total
3  +  2  +  0  +  2  +  0  =  7

Table 2: Common classifications to CEFR levels in Table 1

7 out of 18 = 39% common classifications.

Step 7: Fine-tuning cut-scores

Looking at Table 1 again, there is one point that does suggest an imbalance: 3 of the people whom the examination placed at Level 2 (A2) have come out at only Level 1 (A1) or below on the test. Possibly the cut-point for Level 2 (A2) is set too high. Certainly there are two learners, who scored 33 and 35 respectively, who just missed the cut-score of 38. Perhaps we should lower the cut-scores for A1 and A2 and then collect some more data on low-level learners to check.

Adjusting cut-scores in this way would give us cut-scores as follows:
Level 1 (e.g. A1): 15-32
Level 2 (e.g. A2): 33-56
Level 3 (e.g. B1): 57-75
Level 4 (e.g. B2): 76-(100)
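The Step 6 figure can be checked mechanically. The Python sketch below enters the counts of Table 1 as a nested list, with blank cells as 0, and adds up the diagonal; the variable and function names are illustrative, and the cell positions are the ones implied by the row and column totals of the table.

```python
TEST_LEVELS = ["0", "A1", "A2", "B1", "B2", "C1"]   # columns
CRITERION_LEVELS = ["A1", "A2", "B1", "B2", "C1"]   # rows

# Table 1 (fictional data): one row per criterion level.
TABLE_1 = [
    [0, 3, 1, 1, 0, 0],  # A1
    [1, 2, 2, 1, 0, 0],  # A2
    [0, 0, 2, 0, 2, 0],  # B1
    [0, 0, 0, 0, 2, 0],  # B2
    [0, 0, 0, 0, 1, 0],  # C1
]


def common_classifications(table):
    """Return (matching classifications, total learners) for a Decision Table."""
    matches = sum(
        row[TEST_LEVELS.index(level)]          # cell where test level == criterion level
        for level, row in zip(CRITERION_LEVELS, table)
    )
    total = sum(sum(row) for row in table)
    return matches, total


matches, total = common_classifications(TABLE_1)
print(f"{matches} out of {total} = {matches / total:.0%}")  # 7 out of 18 = 39%
```

The lookup by level name, rather than by position, is needed because the test has an extra "0" column that the criterion does not.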


Step 8: Adjust Decision Table

The revised cut-off scores can be plotted in a new Decision Table as in Table 3. The picture shown by Table 3 does now look more balanced. Obviously the fictional learner who failed to reach A1 on the test had an exceptionally bad day. If we were to do a sophisticated statistical analysis of the test we would no doubt discover that their responses were inconsistent. Otherwise our problem is the lack of “hits” at B1. But it is balanced.

                    Test
Criterion (Exams)   0     A1    A2    B1    B2    C1    Total
A1                  -     3*    1     1     -     -     5
A2                  1     -     4*    1     -     -     6
B1                  -     -     2     0*    2     -     4
B2                  -     -     -     -     2*    -     2
C1                  -     -     -     -     1     0*    1
Total               1     3     7     2     5     0     18

(* marks the shaded diagonal, where test and examination agree)

Table 3: Decision Table (fictional data) - revised

Step 9: Recalculating the Proportion of Common Classifications

As Table 3 shows, we now have 9 matched classifications, or 50%. This is a very respectable result.

A1    A2    B1    B2    C1    Total
3  +  4  +  0  +  2  +  0  =  9

Table 4: Common classifications to CEFR levels in Table 3

Finally, let us repeat the procedure with a real example.

Table 5 shows how teacher judgements were used to confirm provisional cut-off points between levels on the Eurocentres Scale of Language Proficiency for an item bank testing systemic knowledge of German. Both sets of results are presented in terms of levels on the Eurocentres scale, not the CEFR. The study is illustrated here only as a published methodological example undertaken in a language school [4]. The results on the two assessments are simply plotted against each other in a Microsoft Word table. Again it helps to shade the diagonal line, which is where all the learners would be if the correlation between the two assessments was absolutely perfect.

[4] North, B. (2000): Linking Language Assessments: an example in a low stakes context. System 28, 555-577.
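The cycle in Steps 7 to 9 of the fictional example (adjust the cut-scores, rebuild the table, recalculate agreement) can be sketched in Python as follows. This is an illustration under stated assumptions: the scores 21, 23, 32, 33 and 35 are mentioned in the text, but all other scores are invented so that the data set reproduces the counts of Tables 1 and 3, and the `agreement` function is a hypothetical helper.

```python
from bisect import bisect_right


def agreement(records, cut_scores):
    """Share of learners whose test level equals their criterion level.

    Levels are numbers: 0 = below A1, 1 = A1, 2 = A2, 3 = B1, 4 = B2, 5 = C1.
    The test level is the number of cut-scores the learner has reached.
    """
    hits = sum(1 for score, criterion in records
               if bisect_right(cut_scores, score) == criterion)
    return hits / len(records)


# (test score, criterion level). Mostly invented scores, see the note above.
records = [
    (21, 1), (23, 1), (32, 1), (45, 1), (60, 1),
    (10, 2), (33, 2), (35, 2), (40, 2), (50, 2), (65, 2),
    (48, 3), (55, 3), (78, 3), (80, 3),
    (82, 4), (90, 4),
    (95, 5),
]

provisional = [20, 38, 57, 76]   # cut-scores from Step 4
revised = [15, 33, 57, 76]       # cut-scores after Step 7

print(f"provisional: {agreement(records, provisional):.0%}")  # provisional: 39%
print(f"revised:     {agreement(records, revised):.0%}")      # revised:     50%
```

Trying a further set of cut-scores is then a one-line change, which makes it easy to repeat the cycle when more data on low-level learners have been collected.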


Despite an extremely high correlation of 0.93, only 28 of the 68 subjects (41%) have actually received exactly the same level on the 9 Eurocentres levels. The reason why so few learners have received the same classification concerns the number of levels on the scale. The Eurocentres scale splits each CEFR level other than A1 and C2 into two. Thus Levels 2 & 3 make A2; Levels 4 & 5 make B1; Levels 6 & 7 make B2; and Levels 8 & 9 make C1. It is easier to get a higher number of correct classifications with a smaller number of levels. Using finer levels (as Eurocentres do) may be pedagogically meaningful, but in a testing context it requires a higher degree of decision power.

Rows: Itembanker level, 9 down to 1. Columns: Teacher Judgement: Sprachkenntnisse, 1 to 9. Non-empty cell counts for each row, in order of increasing teacher-judgement level:
Itembanker 9: 2
Itembanker 8: 1
Itembanker 7: 8, 3
Itembanker 6: 2, 8
Itembanker 5: 1, 8, 2, 1
Itembanker 4: 4, 4
Itembanker 3: 2, 6, 5
Itembanker 2: 1, 1, 5
Itembanker 1: 4

Table 5: Referencing an Item Bank to Teacher Judgements

This problem of weaker decision power and a greater number of misclassifications with a larger number of reporting levels is the reason “high stakes” assessment (examinations) tends to use broad levels. This is also why the CEFR and the European Language Portfolio have only 6 criterion levels.

Whatever set of finer level classifications may be used locally, only the official 6 CEFR levels A1-C2 should be used when aligning assessments to the CEFR. Table 6 therefore shows a “Decision Table” using only CEFR levels as recommended.
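The effect of the number of levels on agreement can be illustrated with a few lines of Python. The mapping of Eurocentres Levels 2 to 9 follows the pairing described above; treating Level 1 as A1 is an assumption, and the five learner pairs and the `share_agreeing` helper are invented for illustration.

```python
# Eurocentres level -> CEFR level. Levels 2-9 follow the pairing in the text;
# Level 1 = A1 is assumed.
EUROCENTRES_TO_CEFR = {
    1: "A1",
    2: "A2", 3: "A2",
    4: "B1", 5: "B1",
    6: "B2", 7: "B2",
    8: "C1", 9: "C1",
}


def share_agreeing(pairs, convert=lambda level: level):
    """Share of (test level, teacher level) pairs that agree after conversion."""
    hits = sum(1 for test, teacher in pairs if convert(test) == convert(teacher))
    return hits / len(pairs)


# Hypothetical (item bank level, teacher judgement) pairs on the 9-level scale.
pairs = [(4, 5), (6, 6), (3, 2), (7, 8), (1, 1)]

fine = share_agreeing(pairs)                             # 9 Eurocentres levels
broad = share_agreeing(pairs, EUROCENTRES_TO_CEFR.get)   # CEFR levels only
print(fine, broad)  # 0.4 0.8
```

Pairs such as (4, 5) disagree on the fine scale but fall inside the same CEFR level, which is why reporting in broad levels yields more matching classifications from the same data.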


demonstrated above. And the extent to which two different measures correlate will always be limited. Correlations of 0.7-0.8 between two good, validated tests or examinations are normal. The equivalences published by Educational Testing Service on the equivalent scores for TOEFL and TOEIC were long based on a study showing a correlation of only 0.75 between the two tests. TOEFL and TOEIC have different styles and test language in different domains. Other pairs of tests will use different item types. Performance tests (teacher rating of a sample) assess something different to knowledge tests (teacher marking of language items). Teachers, over a period of time, will see things that tests, taken one morning, do not.

Apart from such “legitimate” reasons why results from different tests will only relate in a limited fashion, there are all sorts of other reasons. Testing is an imperfect science. Learners are not invariable objects: atmosphere, emotional mood, amount of sleep, digestion and other contextual factors all affect assessment results.

Therefore, if one can achieve a modest success in independently proving a statistical relationship between an assessment and the CEFR levels by following logical steps in a principled manner as suggested, then one should be more than satisfied.
