
CHALLENGING ASSESSMENT

BOOK OF ABSTRACTS OF THE FOURTH BIENNIAL
EARLI/NORTHUMBRIA ASSESSMENT CONFERENCE 2008

Edited by
Marja van den Heuvel-Panhuizen
Olaf Köller




CHALLENGING ASSESSMENT – BOOK OF ABSTRACTS OF THE FOURTH BIENNIAL
EARLI/NORTHUMBRIA ASSESSMENT CONFERENCE 2008

Edited by
Marja van den Heuvel-Panhuizen
Olaf Köller

Editorial assistance
Monika Lacher

Humboldt-Universität zu Berlin
Institut zur Qualitätsentwicklung im Bildungswesen (IQB)
Unter den Linden 6
10099 Berlin
Germany

This Book of Abstracts is also available for download as a PDF file at
http://www.iqb.hu-berlin.de/veranst/enac2008?reg=r_11

2008

Printed by
Breitfeld Vervielfältigungsservice, Berlin, Germany

Copyright © 2008 left to the Authors
All rights reserved

ISBN 978-3-00-025471-0

Fourth Biennial EARLI/Northumbria Assessment Conference 2008
August 27 – 29, 2008
Hosted by IQB, Humboldt University Berlin
Conference Venue: Seminaris Seehotel Potsdam/Berlin, Germany



PREFACE

This Book of Abstracts presents recent research in the field of assessment and evaluation. In total, the volume contains 124 contributions consisting of the abstracts of 3 plenary lectures, 31 symposium papers, 42 papers, 26 roundtable papers, and 22 posters. All these contributions have been brought together by the Fourth Biennial EARLI/Northumbria Assessment Conference 2008.

The contributions cover a rich variety of topics that reflect the conference themes:
• Standards-based assessment
• E-assessment
• Measuring and modelling performances
• Consequences and contexts of assessment
• Learning-oriented assessment.

The book opens with the abstracts of the three invited plenary lectures. Two of them highlight the connection between assessment on the one hand, and instruction and learning on the other. The third plenary lecture addresses the rights of children in assessment. The three invited symposia give a view on the socio-cultural perspective of assessment, on new psychometric developments in test design, and on advances in e-assessment. With these invited contributions, the International Conference Committee is leaving its mark on the 2008 conference.

In addition, the ICC chose Challenging Assessment as the title for this conference in order to signify that assessment is a very complex and complicated domain of research, which demands maximum input of energy and know-how from those involved in it. At the same time, the title indicates that assessment is a fascinating area to work in. As a matter of fact, this is where all human development and learning can become visible. It is our job to reveal the traces of growth and the results of education and other learning environments. However, this is only half of our work. Besides generating this knowledge, making it accessible to all stakeholders in a meaningful and productive way is equally important.

May this Book of Abstracts inspire its readers’ thoughts and actions towards further progress in the field of assessment.

Marja van den Heuvel-Panhuizen (Conference President)
Olaf Köller (Director IQB)
Berlin, August 2008



TABLE OF CONTENTS

Preface   iii
Table of contents   iv
Introduction   xvii
   EARLI/Northumbria Assessment Conference 2008   xvii
   International Conference Committee ENAC 2008   xvii
   The review process of ENAC 2008   xviii

Plenary Lectures   3

Eckhard Klieme
   Assessment, grading, and instruction: Understanding the context of educational measurement   5
Ruth Leitch
   Improving Children’s Rights in Assessment: issues, challenges and possibilities   6
Dylan Wiliam
   When is assessment learning-oriented?   7

Invited Symposia   9

Using socio-cultural perspectives to understand and change assessment in post-compulsory education   11
   Organiser: Liz McDowell
   David James
      Getting beyond the individual and the technical: How a cultural approach offers knowledge to transform assessment   12
   Kathryn Ecclestone
      Straitjacket or springboard?: the strengths and weaknesses of using a socio-cultural understanding of the effects of formative assessment on learning   13
   John Pryor, Barbara Crossouard
      Formative assessment: the discursive construction of identities   14


Measuring language skills by means of C-tests: Methodological challenges and psychometric properties   15
   Organisers: Alexander Robitzsch and Olaf Köller
   Chair: Olaf Köller
   Thomas Eckes
      Constructing a calibrated item bank for C-test   16
   Johannes Hartig, Claudia Harsch
      Gaining substantive information from local dependencies between C-test items   17
   Alexander Robitzsch, Ina Karius, Daniela Neumann
      C-tests for German Students: Dimensionality, Validity and Psychometric Perspectives   18

Moving forward with e-assessment   19
   Organiser / Chair: Denise Whitelock
   Discussant: Kari Smith
   Cornelia Ruedel
      The Future of E-Assessment: E-Assessment as a Dialog   20
   Jim Ridgway, Sean McCusker, James Nicholson
      Alcohol and a Mash-up: Understanding Student Understanding   21
   Sally Jordan
      E-assessment for learning? The potential of short free-text questions with tailored feedback   22

Symposia   23

Portfolios in Higher Education in three European countries – Variations in Conceptions, Purposes and Practices   25
   Organiser: Olga Dysthe
   Chair: Nicola Reimann
   Discussant: Anton Havnes
   Elizabeth Hartnell-Young
      Learning opportunities through the processes of eportfolio development   26
   Olga Dysthe, Knut Steinar Engelsen
      The Disciplinary Content Portfolio in Norwegian Higher Education – How and Why?   27
   Wil Meeus, Peter van Petegem
      Portfolio diversity in Belgian (Flemish) Higher Education – A comparative study of eight cases   28


Aims, values and ethical considerations in group work assessment   29
   Organiser: Lorraine Foreman-Peck
   Julia Vernon
      Involuntary Free Riding – how status affects performance in a group project   30
   Julie Jones, Andrew Smith
      Facilitating Group work: leading or empowering?   31
   Tony Mellor, Jane Entwistle
      Marginalised students in group work assessment: ethical issues of group formation and the effective support of such individuals   32

Multidimensional measurement models of students' competencies   33
   Organiser: Johannes Hartig
   Markus Wirtz, Timo Leuders, Marianne Bayrhuber, Regina Bruder
      Evaluation of non-unidimensional item contents using diagnostic results from Rasch-analysis   34
   Olga Kunina, Oliver Wilhelm, André A. Rupp
      Modelling multidimensional structure via cognitive diagnosis models: Theoretical potentials and methodological limitations for practical applications   35
   Jana Höhler, Johannes Hartig
      Modelling Specific Abilities for Listening Comprehension in a Foreign Language with a Multidimensional IRT Model   36

Recent Developments in Computer-Based Assessment: Chances for the measurement of Competence   37
   Organiser: Thomas Martens
   Thibaud Latour, Raynald Jadoul, Patrick Plichart, Judith Swietlik-Simon, Lionel Lecaque, Samuel Renault
      Enlarging the range of assessment modalities using CBA: New challenges for generic (web-based) platforms   38
   Frank Goldhammer, Thomas Martens, Johannes Naumann, Heiko Rölke, Alexander Scharaf
      Developing stimuli for electronic reading assessment: The hypertext-builder   39
   Johannes Naumann, Nina Jude, Frank Goldhammer, Thomas Martens, Heiko Roelke, Eckhard Klieme
      Component skills of electronic reading competence   40


Issues in High-Stakes Performance-based Assessment of Clinical Competence   41
   Organiser: Godfrey Pell
   Chair/Discussant: Trudie Roberts
   David Blackmore
      Lessons Learned from Administering a National OSCE for Medical Licensure   42
   Sydney Smee
      Quality Assurance through the OSCE Life Cycle   43
   Godfrey Pell, Richard Fuller
      Investigating OSCE Error Variance when measuring higher level competencies   44
   Katharine Boursicot, Trudie Roberts, Jenny Higham, Jane Dacre
      Beyond checklist scoring – clinicians’ perceptions of inadequate clinical performance   45

Towards (quasi-) experimental research on the design of peer assessment   46
   Organiser: Dominique Sluijsmans
   Marjo van Zundert, Dominique Sluijsmans, Jeroen van Merriënboer
      The effects of peer assessment format and task complexity on learning and measurements   47
   Dominique Sluijsmans, Jan-Willem Strijbos, Gerard Van de Watering
      Modelling the impact of individual contributions on peer assessment during group work in teacher training: In search of flexibility   48
   Jan-Willem Strijbos, Susanne Narciss, Mien Segers
      Peer feedback in academic writing: How do feedback content, writing ability-level and gender of the sender affect feedback perception and performance?   49

Assessment in kindergarten classes: experiences from assessing competences in three domains   50
   Organiser: Marja van den Heuvel-Panhuizen
   Chair/Discussant: Kees de Glopper
   Coosje van der Pol, Helma van Lierop-Debrauwer
      A picture book-based tool for assessing literary competence in 4 to 6-year olds   51
   Aletta Kwant, Jan Berenst, Kees de Glopper
      Assessing the social-emotional development of young children by means of storytelling and questions   52


   Sylvia van den Boogaard, Marja van den Heuvel-Panhuizen
      Assessing mathematical abilities of kindergartners: possibilities of a group-administered multiple-choice test   53

Papers   55

Linda Allin, Lesley Fishwick
   Ethical Dilemmas: ‘Insider’ action research into Higher Education assessment practice   57
Mandy Ashgar
   Reciprocal Peer Coaching as a Formative assessment strategy: Does it assist students to self regulate their learning   58
Beth Black
   Using an adapted rank-ordering method to investigate January versus June awarding standards   59
Sue Bloxham, Liz Campbell
   Generating dialogue in coursework feedback: exploring the use of interactive coversheets   60
Saul Alejandro Contreras Palma
   Reforming practice or modifying Reforms? The science teacher’s responses to MBE and to assessment teaching in Chile   61
Bronwen Cowie, Alister Jones, Judy Moreland, Kathrin Otrel-Cass
   Expanding student involvement in Assessment for Learning: A multimodal approach   62
Julian Ebert
   Assessment Center Method to Evaluate Practice-Related University Courses   63
Astrid Birgitte Eggen
   Democracy, Assessment and Validity. Discourses and practices concerning evaluation and assessment in an era of accountability   64


Kerry Harman, Erik Bohemia
   Using Assessment for Learning: exploring student learning experiences in a design studio module   65
Christine Harrison, Paul Black, Jeremy Hodgen, Bethan Marshall, Natasha Serret
   Chasing Validity – The Reality of Teacher Summative Assessments   66
Anton Havnes
   There is a bigger story behind. An analysis of mark average variation across Programmes   67
Anton Havnes
   Course design and the Law of Unintended Consequences: Reflections on an assessment regime in a UK “new” University   68
Mark Hoeksma, Judith Janssen, Wilfried Admiraal
   Reliability and validity of the assessment of web-based video portfolios: Consequences for teacher education   69
Jenny Hounsell, Dai Hounsell
   Diversity in patterns of assessment across a university   70
Gordon Joughin
   Learning-oriented assessment: A critical review of foundational research   71
Patrick Lai
   Implementing standards-based assessment in Universities: Issues, Concerns and Recommendations   72
Uwe Maier
   Test-based School Reform and the Quality of Performance Feedback: A comparative study of the relationship between mandatory testing policies and teacher perspectives in two German states   73
Thomas Martens, Frank Goldhammer
   Motivational aspects of complex item formats   74
Michael McCabe
   Remarkable Pedagogical Benefits of Reusable Assessment Objects for STEM Subjects   75


Fiona Meddings, Christine Dearnley, Peter Hartley
   Demystifying the assessment process: using protocol analysis as a research tool in higher education   76
Catherine Montgomery, Kay Sambell
   Challenging the formality of assessment: a student view of ‘Assessment for Learning’ in Higher Education   77
Patrice O'Brien, Mei Kuin Lai
   Secondary students’ motivation to complete written dance examinations   78
Michelle O'Doherty
   Mind the gap: assessment practices in the context of UK widening participation   79
Raphaela Oehler, Alexander Robitzsch
   Measuring writing skills in large-scale assessment: Treatment of student non-responses for Multifaceted-Rasch-Modeling   80
Susan Orr
   Collaborating or fighting for the marks? Students’ experiences of group assessment in the creative arts   81
Ron Pat-El, M. Segers, P. Vedder, H. Tillema
   Constructing a new assessment for learning questionnaire   82
Ruth Pilkington
   Assessing Professional Learning: the challenge of the UK Professional Standards Framework   83
Margaret Price, Karen Handley, Berry O'Donovan
   Feedback – all that effort but what is the effect?   84
Ana Remesal
   Student teachers on assessment: First year conceptions   85
Mary Richardson
   Testing our citizens. How effective are assessments of citizenship in England?   86
Andreas Saniter, Rainer Bremer
   Standards in vocational education   87


Lydia Schaap, H.G. Schmidt
   Why do some students stop showing progress on progress tests?   88
Lee Shannon, Lin Norton, Bill Norton
   Contextualising Assessment: The Lecturer's Perspective   89
Pou-seong Sit, Kwok-cheung Cheung
   Learning to read: Modeling and assessment of early reading comprehension of the 4-year-olds in Macao kindergartens   90
Anne Kristin Sjo, Knut Steinar Engelsen, Kari Smith
   Assessment in action – Norwegian secondary-school teachers and their assessment activities   91
Kari Smith
   How do student teachers and mentors assess the Practicum?   92
Margit Stein
   Assessment of competencies of apprentices   93
Janet Strivens, Cathal O'Siochru
   Academics’ epistemic beliefs about their discipline and implications for their judgements about student performance in assessments   94
Dineke Tigelaar, Jan van Tartwijk, Fred Janssen, Ietje Veldman, Nico Verloop
   Techniques for trustworthiness as a way to describe teacher educators' Assessment processes   95
Marjo van Zundert
   Peer Assessment for Learning: a State-of-the-art in Research and Future Directions   96
Denise Whitelock
   Investigating the Pedagogical Push and Technological Pull of Computer Assisted Formative Assessment   97
Oliver Wilhelm, Ulrich Schroeders, Maren Formazin, Nina Bucholtz
   Strict Tests of Equivalence for and Experimental Manipulations of Tests for Student Achievement   98
98


Roundtable papers   99

Morten Asmyhr
   Why the moderate levels of inter-assessor reliability of student essays?   101
Simon Barrie, C. Hughes, C. Smith
   Approaches to the assessment of graduate attributes in higher education   102
David Boud
   Assessment for learning in and beyond courses: a national project to challenge university assessment practice   103
Kwok-cheung Cheung, Pou-seong Sit
   Electronic reading assessment: The PISA approach for the international comparison of reading comprehension   104
Wendy Clark, Jackie Adamson
   Developing the autonomous lifelong learner: tools, tasks and taxonomies   105
Gilian Davison, Craig McLean
   Assessing the Art of Diplomacy? Learners and Tutors perceptions of the use of Assessment for Learning (AfL) in non-vocational education   106
Luc De Grez, Martin Valcke, Irene Roozen
   Assessment of oral presentation skills in higher education   107
Christine Dearnley, Jill Taylor, Catherine Coates
   Mobile Assessment of Practice Learning: An Evaluation from a Student Perspective   108
Margaret Fisher, Tracey Proctor-Childs
   How reliable is the assessment of practice, and what is its purpose? Student perceptions in Health and Social Work   109
Richard Fuller, Matthew Homer, Godfrey Pell
   Measuring variance and improving the reliability of criterion based assessment (CBA): towards the perfect OSCE   110
Concha Furnborough
   Learning through assessment and feedback: implications for adult beginner distance language learners   111


Stuart Hepplestone
   Secret scores: Encouraging student engagement with useful feedback   112
Therese Nerheim Hopfenbeck
   Large-Scale Assessment and Learning-Oriented Assessment: Like Water and Oil or new Possibilities for Future Research Directions?   113
Sally Jordan, Philip Butcher, Arlëne Hunter
   Online interactive assessment for open learning   114
Per Lauvas, Gunnars Bjølseth, Anton Havnes
   Can inter-assessor reliability be improved by deliberation?   115
Paulette Luff, Gilian Robinson
   Sketchbooks and Journals: a tool for challenging assessment?   116
Michal Nachshon, Amira Rom
   Evaluating the use of popular science articles for assessing high school students   117
Berry O'Donovan, Margaret Price
   Supporting student intellectual development through assessment design: debating ‘how’?   118
Berry O'Donovan, Margaret Price
   Assessment contexts that underpin student achievement: demonstrating effect   119
Ann Ooms, Timothy Linsey, Marion Webb
   In-classroom use of mobile technologies to support formative assessment   120
Jon Robinson, David Walker
   The Devil's Triad: the symbiotic link between Assessment, Study Skills and Key Employability Skills   121
Ann Karin Sandal, Margrethe H. Syversen, Ragne Wangensteen, Kari Smith
   Learning-oriented assessment and students’ experiences   122
Mark Schofield
   Connecting Research Behaviours with Quality Enhancement of Assessment: Eliciting Developmental Case Studies by Appreciative Enquiry   123


Elias Schwieler, Stefan Ekecrantz
   Conceptions of assessment in higher education: A qualitative study of scholars as teachers and researchers   124
Thomas Stern
   Innovative Assessment Practice and Teachers’ Professional Development: Some Results of Austria’s IMST-Project   125
Dineke Tigelaar, Mirjam Bakker, Nico Verloop
   Characteristics of an effective approach for formative assessment of teachers’ competence development   126

Posters   127

Andy Bell, Kevin Rowley
   Predictive indicators of academic performance at degree level   129
Christian Bokhove
   Online Formative Assessment for Algebra   130
Barbara Brockbank, Sally Jordan, Tom Mitchell
   Investigating the use of short answer free-text e-assessment questions with instantaneous tailored feedback   131
Nina Bucholtz, Maren Formazin, Oliver Wilhelm
   Contextualized reasoning with written and audiovisual material: Same or different?   132
Tobias Diemer, Harm Kuper
   Effects of Large Scale Assessments in Schools: How Standard-Based School Reform Works   133
Greet Fastré, Marcel van der Klink, Dominique Sluijsmans, Jeroen van Merriënboer
   Support in Self-assessment in Secondary Vocational Education   134
Merrilyn Goos, Clair Hughes, Ann Webster-Wright
   The confidence levels of course/subject coordinators in undertaking aspects of their assessment responsibilities   135


Stuart Hepplestone
   Useful feedback and flexible submission: Designing and implementing innovative online assignment management   136
Rosario Hernandez
   The challenge of engaging students with feedback   137
Dai Hounsell, Chun Ming Tai, Rui Xu
   Towards More Integrative Assessment   138
Clair Hughes
   Using a framework adapted from Systemic Functional Linguistics to enhance the understanding and design of assessment tasks   139
Anders Jonsson
   The use of transparency in the "Interactive examination" for student teachers   140
Ulrich Keller, Monique Reichert, Gilbert Busana, Romain Martin
   School monitoring in Luxembourg: computerized tests and automated results reporting   141
Marjolijn Peltenburg, Marja van den Heuvel-Panhuizen
   Mathematical power of special needs students   142
Glynis Pickworth, M. van Rooyen, T.J. Avenant
   Quality Assurance review of clinical assessment: How does one close the loop?   143
Margaret Price, Karen Handley, Berry O'Donovan
   Feedback: What’s in it for me?   144
Ana Remesal, Manuel Juárez, José Luis Ramírez
   From students’ to teachers’ collaboration: a case study of the challenges of e-teaching and assessing as co-responsibility   145
Jon Robinson, David Walker
   Symbiotic relationships: Assessment for Learning (AfL), study skills and key employability skills   146


Petra Scherer
   Assessing low achievers’ understanding of place value – consequences for learning and instruction   147
Revital Tal
   Using a course forum to promote learning and assessment for learning in environmental education   148
Mirabelle Walker
   Learning-oriented feedback: a challenge to assessment practice   149
David Webb
   Progressive Formalization as an Interpretive Lens for Increasing the Learning Potentials of Classroom Assessment   150

Author Index   151
Address list of presenters   157


INTRODUCTION

EARLI/Northumbria Assessment Conference 2008

The EARLI/Northumbria Assessment Conference (ENAC) is now held for the fourth time. It is a conference series established jointly by the EARLI Special Interest Group on Assessment and Evaluation, and Northumbria University in Newcastle, United Kingdom. ENAC conferences are held biennially, in the years between the – equally biennial – full-scale EARLI conferences.

The EARLI/Northumbria Assessment Conference started at Northumbria University in 2002. In 2004, the second conference took place in Norway, organised by the University of Bergen. The third conference returned to Northumbria University.

The present conference continues the tradition of the EARLI/Northumbria Assessment Conferences by providing a forum for participants to exchange ideas in an inspiring professional environment and a pleasant and comfortable venue.

The EARLI/Northumbria Assessment Conference 2008 is hosted by IQB (Institut zur Qualitätsentwicklung im Bildungswesen) of Humboldt University and organised in collaboration with the International Conference Committee.

International Conference Committee ENAC 2008

Marja van den Heuvel-Panhuizen (Conference President)
   IQB, Humboldt University Berlin, Germany
   Freudenthal Institute, Utrecht University, the Netherlands
Olaf Köller (Director of IQB)
   IQB, Humboldt University Berlin, Germany
Dietlinde Granzer
   IQB, Humboldt University Berlin, Germany
Liz McDowell
   Northumbria University, UK
Kay Sambell
   Northumbria University, UK
Nicola Reimann
   Northumbria University, UK
Jim Ridgway (EARLI SIG)
   University of Durham, UK
Denise Whitelock (EARLI SIG)
   Open University, UK
Anton Havnes
   Oslo University College, Norway
Kari Smith (EARLI SIG)
   University of Bergen, Norway


The review process of ENAC 2008

The review process was organised by Dietlinde Granzer. In total, we received 196 submissions for the Fourth Biennial EARLI/Northumbria Assessment Conference 2008. The submissions included the 500-word abstracts of 170 papers (40 of them as part of a symposium), 11 roundtable papers and 15 posters. Every submission was anonymously peer-reviewed by two reviewers out of a group of experts selected by the International Conference Committee. In case the two reviewers had entirely different opinions about a submission, a third reviewer was consulted.

The main criterion in the review was whether the quality of a submission was high enough, in general, and with respect to the proposed presentation format in particular.

All in all, the quality of the submissions was very high. Because of the large number of high-quality proposals for paper presentations and symposia, the ICC re-allocated some of those proposals to round tables and posters. The final decisions about acceptance, rejection, or re-allocation to another presentation format were in the hands of the ICC. The total acceptance rate was almost 65%.

The ICC thanks the following people for their help in the review process:

Bremerich-Vos, Albert, Universität Duisburg-Essen (GERMANY)
Brna, Paul, Educational Consultancy in Technology Enhanced Learning (UK)
Clegg, Karen, University of York (UK)
Granzer, Dietlinde, Humboldt University Berlin (GERMANY)
Havnes, Anton, University of Bergen (NORWAY)
Higgins, Steve, Durham University (UK)
Köller, Olaf, Humboldt University Berlin (GERMANY)
Lauvas, Per, Fellesadministrasjonen (NORWAY)
McCusker, Sean, Durham University (UK)
McDowell, Liz, Northumbria University (UK)
Montgomery, Catherine, Northumbria University (UK)
Reimann, Nicola, Northumbria University (UK)
Reiss, Kristina, Ludwig-Maximilians-Universität München (GERMANY)
Ridgway, Jim, Durham University (UK)
Ruedel, Cornelia, University of Zürich (SWITZERLAND)
Rust, Christopher, Oxford Brookes University (UK)
Sambell, Kay, Northumbria University (UK)
Smith, Kari, University of Bergen (NORWAY)
Van den Heuvel-Panhuizen, Marja, Utrecht University/Humboldt University Berlin (NL/GER)
Webb, David, University of Colorado at Boulder (USA)
Whitelock, Denise, The Open University (UK)
Wilhelm, Oliver, Humboldt University Berlin (GERMANY)
Winkley, John, Becta (UK)


ABSTRACTS



Plenary Lectures



Assessment, grading, and instruction:
Understanding the context of educational measurement

Eckhard Klieme, German Institute for International Educational Research (DIPF), Germany

Both institutional effectiveness and individualized, adaptive education depend on the availability of sophisticated instruments to measure and model student learning. However, “one size of assessment does not fit all” (Pellegrino et al., 2001, p. 222). These authors called for multidisciplinary research activities focusing on three facets: “(1) development of cognitive models of learning that can serve as the basis for assessment design, (2) research on new statistical measurement models and their applicability, (3) research on assessment design” (p. 284). Similarly, a recently started research program in Germany covers four key areas: “the development of theoretical models of competence (1), the construction of psychometric models (2), the construction of measurement instruments for the empirical assessment of competencies (3), and research on the use of diagnostic information (4)” (Koeppen, Hartig, Klieme & Leutner, 2008).

Given that outstanding improvements have been made in recent years with regard to cognitive modelling, psychometrics, and test design, the fourth area seems to be the one which is least understood in educational research. While a lot of – mostly critical – research has been done on the effects of high-stakes, standards-based assessment systems, few researchers have studied the many varieties of educational assessment that take place in the context of everyday classroom teaching and the huge impact these practices have on student learning.

Teachers make observations of students’ understanding and performance in a variety of ways: in classroom dialogue, homework assignments, and formal tests. These procedures should permit diagnosis on an individual level, in terms of understanding students’ individual solution paths, misconceptions, etc. Appropriate individual feedback is crucial to support the subsequent learning process. A number of research questions arise in this context: What kind of diagnostic information is best understood by students, and what kind by teachers? How well can teachers evaluate individual learning processes? What factors influence teachers’ grading decisions? What models of competence do teachers rely on – implicitly or explicitly? How well founded and how helpful is the individual student feedback provided by the teacher? And how do all these processes interact with newly implemented assessment systems?

After an overview of attempts at “instructionally sensitive assessment”, the paper will present two studies investigating everyday classroom practices in detail, based on a sample of math lessons from Germany and Switzerland. The first study examines teacher judgments about student achievement in terms of the grades awarded. It examines to which degree the grades awarded reflect different dimensions of students’ achievement and learning behavior. It also explores whether assessment and instruction are indeed aligned in the classroom, that is, whether teachers’ grading is aligned with their instruction. In the second study, we analyze how teacher evaluation affects students’ subsequent learning processes. This study utilizes feedback given to students by the teacher within classroom interaction as an indicator for the communication of student evaluation, and investigates the impact of two types of feedback, evaluative and informational, on student learning and motivation.



Improving Children’s Rights in Assessment:
issues, challenges and possibilities

Ruth Leitch, Queen's University Belfast, United Kingdom

Much valuable work has been achieved in recent years concerning the educational benefits of consulting with children and young people on teaching and learning (Flutter & Rudduck, 2004) and as a means of assuring children their rights in education. There has been significantly less research on improving children’s rights in assessment, despite a growing interest in the relationship between assessment and social justice. Rights in relation to the assessment of students’ learning or performance do not expressly exist in legislation in most jurisdictions, although they are enshrined in international treaties such as the United Nations Convention on the Rights of the Child (UNCRC, 1989).

This presentation will contribute a children’s rights perspective to issues of assessment, focusing specifically on the legal implications and imperatives of Article 12 of the UNCRC. Data illuminating various issues will be derived from a recent ESRC/TLRP qualitative research project that consulted pupils on aspects of assessment policy and practice, including the introduction of annual pupil profiles and assessment for learning (AfL) in the Northern Ireland context (Leitch et al., 2008). A conceptual model based on a critical legal interpretation of Article 12 (Lundy, 2007) will be unpacked to illustrate some of the opportunities and obstacles afforded by students being involved more fully in the assessment of their learning. The presentation will conclude by arguing that if we are truly committed to improving children’s rights in relation to assessment, there must be a concerted approach to awareness raising on the obligations of children’s rights at all levels within the education system as part of a democratic culture shift – and Article 12 (UNCRC) is a valuable place to start.

References

Flutter, J. & Rudduck, J. (2004) Consulting Pupils: What's in it for Schools? London: RoutledgeFalmer.
Leitch, R., Gardner, J., Mitchell, S., Lundy, L., Galanouli, D. & Odena, O. (2008) Consulting Pupils on the Assessment of their Learning. ESRC/TLRP Research Briefing Number 36, March 2008, http://www.tlrp.org/pub/research/html
Lundy, L. (2007) ‘Voice is not enough’: The implications of Article 12 of the UNCRC for Education. British Educational Research Journal, Vol 33, No 6, 927-942.
UNCRC (1989) United Nations Convention on the Rights of the Child. UN General Assembly Resolution 44/25. New York: United Nations.



When is assessment learning-oriented?

Dylan Wiliam, University of London, United Kingdom

Educational assessments are conducted in a variety of ways and their outcomes can be used for a range of purposes. There are differences in who decides what is to be assessed, who carries out the assessment, where the assessment takes place, how the resulting responses made by students are scored and interpreted, and what happens as a result (Black and Wiliam, 2004). In particular, each of these can be the responsibility of the learners themselves, those who teach the students, or, at the other extreme, all the processes can be carried out by an external agency. Cutting across these differences, there are also differences in the functions that assessments serve.

Assessments can be used to support judgments about the quality of educational programs or institutions (what might be termed the evaluative function). They can be used to describe the achievements of individuals, either for the purpose of certifying that they have reached a particular level of performance or competence, or for making predictions about their future capabilities (what might be termed the summative function). And assessment can be used to support learning (what might be termed the formative function).

In this talk, I will suggest that an assessment functions formatively only when evidence about student achievement elicited by the assessment is interpreted and used to make decisions about the next steps in learning that are likely to be better, or better founded, than the decisions that would have been made in the absence of that evidence.

I will further suggest that learning-oriented assessment involves five key strategies, which serve to connect assessment to other important educational processes:
• Clarifying, understanding, and sharing learning intentions
• Engineering effective classroom discussions, tasks and activities that elicit evidence of learning
• Providing feedback that moves learners forward
• Activating students as learning resources for one another
• Activating students as owners of their own learning

Examples of each of these strategies will be given, and the presentation will conclude by offering a set of priorities for the design of learning-oriented assessments.



Invited Symposia

ENAC 2008 9


10 ENAC 2008


Invited Symposium: Socio-cultural perspectives

Using socio-cultural perspectives to understand and change assessment in post-compulsory education

Organiser: Liz McDowell, University of Northumbria, United Kingdom

This symposium focuses on the application of socio-cultural perspectives to the day-to-day practices of assessment in post-compulsory education. The importance of the research is that despite significant changes in theories of learning and teaching and in perspectives on the societal goals for educational systems, shifts in assessment thinking and practices have lagged behind these changes (Shepard, 2000). This remains true despite a number of powerful arguments making the case for a change of vision from a ‘testing culture’ to an ‘assessment culture’ (Wolf et al., 1991).

Assessment as testing, based in the scientific measurement approach (Hager & Butler, 1996), has retained its behaviourist roots much more strongly than contemporary teaching practices. Constructivist approaches should give students a more active role as participants in assessment rather than victims of the assessor (Dochy & McDowell, 1997), and there has been considerable growth in interest in formative assessment (Black & Wiliam, 1998). Nevertheless, in practice, assessment tends to be seen in a technicist way as a decontextualised and narrow type of activity, a system designed to direct students, formatively, towards performances that are summatively validated and represented by grades awarded. Hence there is considerable emphasis on effective techniques and the design of constructively aligned systems (Biggs, 2003) that channel students into the desired assessment performances.

Socio-cultural approaches take a much broader view of assessment, recognising that it is a socially and contextually located set of practices. It can be seen as a structure with a complex of activities, influences and outcomes experienced by the actors within it, chiefly teachers and students, situated within a broader social, historical and cultural context.

The symposium presenters draw upon research studies which have produced new insights into assessment practices. Their collective work represents a cumulative and integrated body of evidence pointing to the value of socio-cultural understandings of assessment and their utility in improving practice in classrooms, lecture rooms and examination halls. Each paper draws on a wide range of evidence but presents data from recent research studies in post-compulsory education. Assessment is viewed in terms of: its meaning for individuals with their personal histories and developing identities; its meaning at the collective level, that is, the ways that assessment is constructed in classrooms, courses, and institutions; and its longer-term consequences (Boud & Falchikov).

The research teams have worked with teachers to make links between new understandings of assessment and local assessment practices. Some have used action research approaches to engage teachers and involve them in the development of theory and practice. Each paper will challenge symposium participants to problematise aspects of assessment thinking and practice that may have been taken for granted and will offer ways of accommodating new understandings in assessment practice.



Invited Symposium: Socio-cultural perspectives / Paper 1:
Getting beyond the individual and the technical:
How a cultural approach offers knowledge to transform assessment

David James, University of the West of England, United Kingdom

This paper argues that a socio-cultural perspective on assessment is an urgent and practical necessity in higher education. It is important to understand how and to what extent assessment practices (a) attempt to serve contradictory purposes, and (b) determine conceptions and practices of learning beyond those desired by tutors and students. It is also crucial to appreciate how much scope tutors have for beneficial interventions, and why.

The paper begins by setting out some tools by which this might be achieved. It draws upon the methods and outcomes of the Transforming Learning Cultures in Further Education project, which formed part of the national UK Teaching and Learning Research Programme. The project was the largest ever independent study of practices in further education. Its aims were to deepen understanding of the complexities of learning, to weigh up strategies for improvement, and to set in place a lasting capacity for productive enquiry amongst FE professionals. To this end, the study involved over 1000 questionnaires, 600 interviews, extensive shadowing and tutor diaries. It was informed by a range of theoretical sources, of which Dewey (Biesta and Burbules, 2003) and Bourdieu (e.g. Bourdieu, 1998; Grenfell and James, 1998) were prominent. The outcomes of the study included a tool for understanding tutor interventions, some ‘principles of procedure’ for improvement, and a new ‘cultural’ theory (Hodkinson, Biesta and James, 2007). One recurrent theme in the analysis is how assessment regimes and events embody – sometimes whilst concealing – strong notions of learning and teaching. Related to this, the study demonstrated how and why individual tutors often had limited capacity to make significant improvements on their own, sometimes despite sterling efforts, and how the same intervention could have positive or negative effects depending on the specific setting.

A method for interrogating learning cultures whilst keeping assessment as a core focus is presented and applied to HE practice. This research approach raises questions such as: what conception of learning is inherent in particular ways of writing learning outcomes, or in the use of academic credit, or in certain assessment events and marking regimes? Are there conceptions of learning that are rhetorically important but then marginalized in assessment practices? The approach avoids the pretence that assessment is fundamentally a technical matter (James, 2000) and argues that the idea of constructive alignment (e.g. Biggs, 2003) is ‘too good to be true’. Instead, the paper offers a cultural view of assessment practices that takes account of power, interests, relationships, and interactions. The view advocated is compatible with the humanistic concerns in the earlier seminal work of Heron (1988) and Boud (e.g. 1990), but combines their insights with a fresh ‘take’ on the capacity of (and scope for) tutors to act. The paper argues that understanding a learning culture provides a route to realism about worthwhile and possible change to assessment events and regimes.



Invited Symposium: Socio-cultural perspectives / Paper 2:
Straitjacket or springboard?: the strengths and weaknesses of using a socio-cultural understanding of the effects of formative assessment on learning

Kathryn Ecclestone, Oxford Brookes University, United Kingdom

Research into formative assessment in schools and higher education has pointed to a variety of techniques that supporters claim will raise achievement, engage students with learning and promote more democratic, transparent assessment practices. Yet a major research project exploring formative assessment in further and adult education shows that techniques, in themselves, are neither progressive nor unprogressive (see Ecclestone 2002, 2008; Davies and Ecclestone, 2007; Marshall and Drummond, 2006). Instead, work by James and Biesta and colleagues shows the usefulness of a socio-cultural understanding of learning (see, for example, James and Biesta 2007). The paper explores how a socio-cultural approach illuminates the subtle ways in which different learning cultures within the same institution or course can produce formative assessment that leads either to instrumental compliance or to deep, sustainable engagement. Sometimes learning cultures encourage instrumental assessment as a springboard to deeper forms of learning; sometimes instrumental assessment acts as a straitjacket on learning.

This paper draws on recent empirical studies that have explored the links between policy for formative assessment, espoused theoretical principles and the reality of day-to-day practices in different contexts. It examines how teachers’ and students’ ideas about formative assessment practices cannot be divorced from the learning cultures which both shape those ideas and practices and which, in turn, are shaped by them. This illuminates tensions between instrumental and sustainable formative practice and shows possibilities for affecting practice.

However, there is also a danger that a socio-cultural understanding over-emphasises the discursive effects of formative assessment on identities and the navigation of power and relationships within assessment practices. A focus on the effects of assessment on identities, whilst important, can divert attention from the quality of educational outcomes for students.

The paper aims to make proposals about how a socio-cultural understanding of formative assessment helps teachers influence their practice in positive ways, with specific examples from recent activities and discussions with teachers in further and adult education.



Invited Symposium: Socio-cultural perspectives / Paper 3:
Formative assessment: the discursive construction of identities

John Pryor, University of Sussex, United Kingdom
Barbara Crossouard, University of Sussex, United Kingdom

This paper relates to recent work in higher education with professional doctorate students. It builds on empirical research conducted in a number of educational contexts over the past 14 years (e.g. Torrance and Pryor 1998, 2001; Pryor and Crossouard 2008; Crossouard 2008). In these studies the crucial importance of issues of student and teacher identity to learning cultures, and therefore to the nature and consequences of formative assessment, has emerged as an increasingly important theme.

The analysis draws on social theory whereby identity is not seen in terms of the individualized psychological self but more in terms of identity embedded in social processes and practices (see Hey, 2006). This is related to a sociocultural perspective on learning as happening through dialogic processes of identity construction and performance, so that it involves ‘becoming a different person [where] identity, knowing and social membership entail one another’ (Lave & Wenger, 1991, p. 53). The data on doctoral students were derived from observation of formative assessment, discourse analysis of online texts, including peer discussion forum interactions and tutor email feedback, and exploration of student perceptions through in-depth interviews. This yielded a close focus on the processes of formative assessment. It was supplemented by insider perspectives generated by the fact that the researchers were the main tutor during the part of the doctoral programme under study and a doctoral candidate. Thus the project was able to include an element of action research to develop and evaluate different aspects in more detail and explicitly ground the findings in practice.

Our conclusions are that issues of identity, power and culture are part of the complexity of learning. These issues may act as barriers, but formative assessment offers opportunities for what might be described as an explicit meta-discourse which may also enhance learning. Thus power differentials emerge as potentially productive when different identity positions – assessor, teacher, practitioner, learner, disciplinary expert, critic – are deliberately invoked by the tutor. Similarly, student identities, both as students and in relation to their past and future lives, can be deliberately invoked. Within this play of identities the disciplinary norms against which students’ performances are judged (the rules of the game) may be highlighted. Engagement with subject matter alongside identity thus has special potential for formative assessment as a means of promoting equity in education.

This work places formative assessment at the heart of higher education practice, and a key implication is that its potency should not be underestimated. Despite its complexity, we suggest ways in which the play of identities can be incorporated into the practice of teaching and learning.



Invited Symposium: C-tests

Measuring language skills by means of C-tests:
Methodological challenges and psychometric properties

Organisers: Alexander Robitzsch, Olaf Köller, IQB, Humboldt University Berlin, Germany
Chair: Olaf Köller, IQB, Humboldt University Berlin, Germany

C-tests are widely used to assess overall language skills, both in foreign languages and in the mother tongue. These tests usually consist of texts interrupted by gaps that have to be filled in by examinees. The individual gaps within each text are typically the smallest unit of analysis and can be treated as individual test items. The analytical focus of the C-test assessment, however, is usually not the individual item but the overall level of achievement (i.e., the number of closed gaps). While there is broad consensus that these measures are quite reliable and valid, there are several methodological challenges associated with them. In particular, when IRT models are applied to these tests, different strategies can be used. Some authors (e.g., Eckes, 2007) recommend the application of polytomous IRT models, in which all gaps of one text build one rating scale ranging from zero up to the number of gaps. Other authors, however, prefer analyzing each gap as a single item and then applying Rasch testlet models, in which the dependencies among items are included in the model. All three papers in the proposed symposium focus on this issue.

The first paper, provided by Eckes, clearly prefers polytomous IRT models when analyzing C-tests. The appropriateness of this approach is shown on C-test results from approximately 5,000 examinees from 116 countries, all of whom have been working on C-tests measuring German as a foreign language.

The authors of the second paper (Hartig & Harsch) argue that a more adequate strategy might be applying testlet models to C-tests. In this approach more is learnt about specific dependencies among gaps. This approach is illustrated by means of data from the German DESI large-scale study, in which foreign language skills of about 10,000 9th graders were assessed with C-tests.

The third paper, by Robitzsch, Karius, and Neumann, offers an extended solution of the Hartig and Harsch approach. In their study, which was part of the German national assessment program, the authors propose a more detailed testlet model which models the dependency of the gaps hierarchically, i.e., items are nested within sentences and sentences are nested within C-tests. Furthermore, the authors analyze relationships of C-tests with tests on other language skills. In summary, the papers widen our understanding both of how to model responses in C-tests and of what C-tests typically measure.
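To make the contrast between the two modelling strategies concrete, the following minimal sketch (not drawn from any of the symposium papers; the simulated data, array shapes, and the crude correlation check are illustrative assumptions) shows how dichotomous gap scores can either be kept as individual items or collapsed into text-level polytomous super-item scores, and how within-text dependence might be glanced at via inter-gap correlations.

```python
import numpy as np

# Illustrative data only: responses of n examinees to a C-test with several
# texts, each text containing a fixed number of gaps scored 0/1.
rng = np.random.default_rng(0)
n_examinees, n_texts, gaps_per_text = 500, 4, 20
ability = rng.normal(size=n_examinees)
# A text-specific effect induces dependence among gaps of the same text.
text_effect = rng.normal(scale=0.5, size=(n_examinees, n_texts))

responses = []
for t in range(n_texts):
    gap_difficulty = rng.normal(size=gaps_per_text)
    logit = ability[:, None] + text_effect[:, [t]] - gap_difficulty[None, :]
    prob = 1 / (1 + np.exp(-logit))
    responses.append((rng.random((n_examinees, gaps_per_text)) < prob).astype(int))

# Strategy A: every gap is a separate dichotomous item (n_texts * gaps_per_text items).
item_matrix = np.concatenate(responses, axis=1)

# Strategy B: each text becomes one polytomous "super-item" whose score is the
# number of correctly closed gaps (0 .. gaps_per_text), as in rating scale /
# partial credit analyses of C-tests.
super_items = np.stack([r.sum(axis=1) for r in responses], axis=1)

# Rough check of local dependence: mean correlation between gaps of the same
# text vs. gaps of different texts. Markedly higher within-text values suggest
# that treating gaps as locally independent items is questionable.
corr = np.corrcoef(item_matrix, rowvar=False)
within, between = [], []
for i in range(item_matrix.shape[1]):
    for j in range(i + 1, item_matrix.shape[1]):
        (within if i // gaps_per_text == j // gaps_per_text else between).append(corr[i, j])
print("mean within-text gap correlation :", round(float(np.mean(within)), 3))
print("mean between-text gap correlation:", round(float(np.mean(between)), 3))
print("super-item score matrix shape    :", super_items.shape)
```

With the data arranged this way, the super-item scores of Strategy B are what a rating scale or partial credit model would be fitted to, while keeping the gap-level matrix of Strategy A calls for a testlet-type model that absorbs the within-text dependence.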



Invited Symposium: C-tests / Paper 1:<br />

Constructing a calibrated item bank for C-test<br />

Thomas Eckes, TestDaF Institute, Germany<br />

C-tests are gap-filling tests that measure general language proficiency. In terms of efficient<br />

construction of C-tests, high-quality test development, and flexible test administration,<br />

including web-based testing, it is imperative to make use of a calibrated item bank, that is,<br />

an item bank in which parameter estimates for all items in the bank have been placed on<br />

the same difficulty scale. When constructing a calibrated item bank for C-tests, two major<br />

issues arise: (a) choosing an IRT model for item calibration and linking, and (b) choosing a<br />

design for collection of item-banking data.<br />

Regarding the first issue, it is important to realize that gaps within a given text are locally<br />

dependent to a significant degree. As a consequence, texts should not be analyzed on the<br />

level of individual gaps, but should rather be construed as super-items (item bundles,<br />

testlets), with item values corresponding to the number of gaps within a given text; that is,<br />

each text should be viewed as a polytomous item. Accordingly, Rasch models such as<br />

Andrich’s rating scale model or Masters’ partial credit model would seem appropriate.<br />

With respect to the data collection issue, one widely used design is the common-item<br />

nonequivalent groups (CING) design. In this design, various test forms are linked through a<br />

set of common items. The groups are not considered to be equivalent. Alternatively, the<br />

randomly-equivalent groups design could be employed. Examinees are randomly assigned<br />

the form to be administered; linkage between the forms is achieved by assuming that the<br />

different groups of examinees taking different forms are equivalent in ability.<br />
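As a rough, hypothetical illustration (not the onDaF data pipeline; session and text identifiers are invented), item-banking data collected under such a CING design can be arranged in a single sparse person-by-text score matrix for concurrent calibration:

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_sessions = 3                                   # illustration; the study used 23 sessions
anchors = ["T_A1", "T_A2"]                       # two texts common to all sessions
all_texts = anchors + [f"T_{s}_{k}" for s in range(n_sessions) for k in range(8)]

records = []
for s in range(n_sessions):
    session_texts = anchors + [f"T_{s}_{k}" for k in range(8)]   # 10 texts per session
    for p in range(50):                                          # 50 examinees per session
        for t in session_texts:
            # polytomous super-item score: number of correctly restored gaps (0-20)
            records.append((f"S{s}_P{p}", t, int(rng.integers(0, 21))))

long = pd.DataFrame(records, columns=["person", "text", "score"])
wide = long.pivot(index="person", columns="text", values="score").reindex(columns=all_texts)

# 'wide' contains structural missingness by design; fitting a polytomous Rasch
# model (e.g. the rating scale model) to this matrix places all texts on one
# difficulty scale because the anchor texts overlap across sessions.
print(wide.shape)

Because every session shares the two anchor texts, concurrent estimation over the whole matrix yields item parameters on a common scale.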

In the present paper, I report on an ongoing study aiming at the construction of a large<br />

calibrated item bank for use with an Internet-delivered C-test, the “Online Placement Test of<br />

German as a Foreign Language” (onDaF; www.ondaf.de). Building on research into the<br />

suitability of various polytomous Rasch models to the analysis of C-tests (Eckes, 2007), the<br />

rating scale model was employed for item calibration. Adopting a CING design, item-banking

data were collected in a series of 23 different test sessions, covering a total of<br />

4,842 participants from 116 countries. In each session a set of 10 texts was administered,<br />

two of which were common to all sets. Reliability indices per set ranged from .94 to .98.<br />

Texts showing unsatisfactory model fit or DIF were eliminated. The remaining 174 texts<br />

were put on the same difficulty scale through a concurrent estimation procedure.<br />

Combined with a carefully designed client-server architecture, the Rasch-measurement<br />

approach to item banking currently provides the basis for a highly flexible administration of<br />

the onDaF at licensed test centers throughout the world. When taking the onDaF, each<br />

examinee is presented with a unique set of eight texts; that is, texts are drawn from the item<br />

bank according to a linear-on-the-fly test delivery model. In each instance, test assembly is<br />

subject to the constraints of increasing text difficulty and variation in text topic. Responses are scored automatically, and test results are reported to examinees immediately after they complete the test.
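A minimal sketch of what such constrained linear-on-the-fly assembly could look like (all identifiers, topics and the selection heuristic are hypothetical, not the onDaF engine):

import random

random.seed(0)
topics = ["travel", "science", "work", "culture", "health", "sport", "media", "food"]
# toy item bank: (text_id, Rasch difficulty, topic)
bank = [(f"T{i:03d}", random.uniform(-2, 2), random.choice(topics)) for i in range(174)]

def assemble(bank, n=8):
    """Pick one text per difficulty band, easy to hard, avoiding repeated topics."""
    ranked = sorted(bank, key=lambda t: t[1])
    bands = [ranked[i * len(ranked) // n:(i + 1) * len(ranked) // n] for i in range(n)]
    chosen, used_topics = [], set()
    for band in bands:
        candidates = [t for t in band if t[2] not in used_topics] or band
        pick = random.choice(candidates)
        chosen.append(pick)
        used_topics.add(pick[2])
    return chosen            # ordered by increasing difficulty across bands

print([(t[0], round(t[1], 2), t[2]) for t in assemble(bank)])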



Invited Symposium: C-tests / Paper 2:<br />

Gaining substantive information from local dependencies between C-test items<br />

Johannes Hartig, German Institute for International Educational Research (DIPF), Germany<br />

Claudia Harsch, IQB, Humboldt University Berlin, Germany<br />

The C-test is used as a screening tool to assess global performance levels in written language competence. The individual gaps within each text of the C-test are the smallest units of analysis and can be treated as individual test items. Typically, however, the focus of the C-test assessment is not the individual item but the overall level of achievement (i.e., the number of correctly completed gaps). A technical argument against an analysis on the item level is that the solutions of individual gaps partially depend on the solutions of the remaining gaps within the same text. This means that if individual gaps are treated as items, local independence of these items, as presumed in most measurement models, is a rather unlikely assumption.

For this reason, many authors prefer to analyze performance in C-tests on the text level and<br />

not on the level of individual gaps, e.g. by treating each text as a separate “super-item”. In

contrast to this approach, this paper will focus on the substantive information that can be<br />

gained about the C-test and the underlying language competencies if performance is<br />

analyzed on the item level. We use the dependencies on text level and between individual<br />

gaps to derive information about characteristics of texts and gaps that determine students’<br />

solution processes. The aim of the study is to predict these dependencies by using a priori<br />

defined text and item characteristics.<br />

Statistical and graphical methods to examine item dependencies on the text and item level are presented. Dependencies on the text level can be estimated within a Rasch testlet model, assuming an additional latent dimension for each text over and above the common underlying ability dimension. Dependencies between individual gaps can be analyzed graphically based on the correlations between residuals from Rasch analyses within each text. These methods are applied to data from a large-scale assessment of the English language competencies of German 9th graders. Statistics for local dependencies are estimated on the text and on the item level. Results of the Rasch testlet model show substantial amounts of text-specific variance, indicating general dependencies between gaps within the same text. The analysis of residuals yields strong dependencies between a few specific item pairs, while in some texts almost no marked dependencies are found on the item level. Dependencies on the text level as well as on the item level can partly be explained by text and item characteristics. For instance, the deletion frequency of gaps seems to affect dependencies on the text level, and dependencies between items can be found for gaps within the same phrases. The results widen our understanding of the C-test construct; we discuss whether this knowledge can be used to systematically construct C-tests with specific properties.
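To make the residual-based screening for locally dependent gap pairs concrete, here is a rough sketch under simplified assumptions (simulated data, crude ability and difficulty estimates standing in for proper Rasch estimates; not the authors' code):

import numpy as np

rng = np.random.default_rng(1)
n_persons, n_gaps = 500, 20

# simulated 0/1 gap responses for one C-test text (placeholder for real data)
theta = rng.normal(0, 1, n_persons)
beta = rng.normal(0, 1, n_gaps)
p_true = 1 / (1 + np.exp(-(theta[:, None] - beta[None, :])))
X = (rng.random((n_persons, n_gaps)) < p_true).astype(int)

# crude ability/difficulty estimates from marginal proportions, standing in
# for the Rasch estimates used in the study
theta_hat = np.log((X.mean(1) + 0.01) / (1 - X.mean(1) + 0.01))
beta_hat = -np.log((X.mean(0) + 0.01) / (1 - X.mean(0) + 0.01))

P = 1 / (1 + np.exp(-(theta_hat[:, None] - beta_hat[None, :])))
resid = (X - P) / np.sqrt(P * (1 - P))           # standardized residuals

R = np.corrcoef(resid, rowvar=False)             # gap-by-gap residual correlations
np.fill_diagonal(R, 0)
print(np.argwhere(np.triu(R > 0.2)))             # gap pairs with marked dependence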



Invited Symposium: C-tests / Paper 3:<br />

C-tests for German Students:<br />

Dimensionality, Validity and Psychometric Perspectives<br />

Alexander Robitzsch, IQB, Humboldt University Berlin, Germany<br />

Ina Karius, IQB, Humboldt University Berlin, Germany<br />

Daniela Neumann, IQB, Humboldt University Berlin, Germany<br />

C-tests are integrative tests which are designed to test a person’s command of language by<br />

making use of the principle of reduced redundancy. It is assumed that language is<br />

redundant in a way that allows successful communication even though flaws in the transmission of language (lack of clarity, ambiguity, noise) may impede understanding. The

addressee of a message is able to reconstruct the form and meaning of morphologically<br />

incomplete words supported by the local and global context of the message, provided that<br />

he or she is familiar with the vocabulary, the grammatical rules and the cultural background<br />

of the language used.<br />

In Germany, the educational standards for the subject “German” (mother tongue) are<br />

supposed to ensure that every student is able to fully participate in written and spoken<br />

interaction. In the course of measuring these educational standards, items were developed to assess the students' competences in this area. A total of 1,700 students from all secondary school types, ranging from grade level eight to ten (14-17 years old), were tested in reading and listening comprehension, writing, orthography and language use. The assessment part on language use contained, among other items, C-tests. Altogether, ten different C-tests were used in a sample of 560 students. Every student completed four C-tests, resulting in a completely balanced multi-matrix sampling design.

In many cases the dimensionality of C-tests is assessed by regarding each C-test as one super-item that requires all blanks to be completed. We focus on a dimensional analysis on the item level and use NOHARM and DIMTEST to assess the essential dimensionality of the C-test construct. Because the local stochastic independence assumed by the Rasch model does not hold, we propose a more detailed testlet model which models dependency hierarchically, i.e., items are nested within sentences and sentences are nested within C-tests.
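One way to write such a hierarchically nested testlet model (our notation, not necessarily the authors' exact parameterization) is, for person $p$ answering gap $i$ in sentence $s$ of C-test text $t$,

$$\operatorname{logit}\, P(X_{ptsi} = 1) = \theta_p - \beta_i + u_{pt} + v_{pts}, \qquad u_{pt} \sim N(0, \sigma_t^2), \quad v_{pts} \sim N(0, \sigma_{ts}^2),$$

so that local dependence is absorbed by random effects at the text level ($u_{pt}$) and, nested within texts, at the sentence level ($v_{pts}$).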

This paper estimates this multilevel item response model on the item level. In addition, variances of local stochastic dependency are explained by linguistic characteristics of the material and by student properties such as grade level and school track. To provide a

deeper understanding of validity, we study relationships of C-test results with subdomains of<br />

language use, orthography and listening comprehension by using a confirmatory factor<br />

analysis.<br />

This contribution gives insight into the C-test construct for native speakers with regard to<br />

dimensionality; it provides detailed validity evidence and finally proposes an alternative<br />

psychometric scaling model.<br />



Invited Symposium: E-Assessment<br />

Moving forward with e-assessment<br />

Organiser: Denise Whitelock, The Open University, United Kingdom<br />

Discussant: Kari Smith, University of Bergen, Norway<br />

Technology is increasingly being used to support assessment, but its effectiveness in this<br />

area of learning is still open to question. In part, this is because there is an awkward tension<br />

between assessment and constructivist approaches to learning. Even when an entire<br />

learning experience has been designed to be constructivist and learner-centred, formal,<br />

summative assessment sits uneasily with this constructivist pedagogy; it is ‘out there’ and is<br />

not part of the process of constructing knowledge. Constructivism is used as an approach<br />

for getting better performance on conventional measures, rather than as a radical<br />

philosophy about the nature of knowledge and its acquisition. When assessment is<br />

embedded within constructivist pedagogy, learners quickly adopt strategies that optimise<br />

their cognitive load, typically guessing what is expected of them rather than constructing<br />

their own conceptual frameworks.<br />

This symposium scrutinises the laudable aims of harnessing technology enhanced<br />

assessment to help shape learners as independent thinkers, making their own judgments<br />

and decisions about their learning process in partnership with their tutors. Assessment and<br />

learning need to be properly linked. As Elton and Johnston said, “if one changes the method<br />

of teaching, but keeps the assessment unchanged, one is very likely to fail.” And Rowntree:<br />

“if we wish to discover the truth about an educational system, we must look into its<br />

assessment procedures.” Despite decades of innovation in learning theory and technology<br />

and many different approaches to the problem of building conversational relationships in<br />

education, assessment is still the core of the problem.<br />

Assessment systems define the nature of subjects, and what is worth knowing, and act as<br />

gatekeepers to progress in education and careers, synchronising understanding between<br />

an individual and the world they live in. This symposium will discuss how e-assessment can overcome the barriers imposed upon both students and tutors in developing ICT-based systems that are tuned to the current net generation of learners.



Invited Symposium: E-Assessment / Paper 1:<br />

The Future of E-Assessment: E-Assessment as a Dialog<br />

Cornelia Ruedel, University of Zurich, Switzerland<br />

E-Assessment has become more and more popular over the last decade. While its impact in the German-speaking countries has been limited compared to the advances in the USA and the UK, universities in Switzerland, Germany and Austria have now realised E-Assessment's potential. The exam system in the German-speaking countries is traditionally dominated by a combination of written exams, assignments and oral presentations. The Bologna Reform, with its modularisation of university courses, has put a burden on the exam system and has prompted a rethink towards new assessment methods. These new assessment forms should take all the criticism of the traditional system into account in terms of validity, reliability and fairness. The usual practice of marking in the German-speaking countries is not really transparent: there is no tradition of having an external examiner for written exams, and only the lecturer is responsible for the marking process. Students therefore sometimes have an uneasy feeling about the whole system.

Generally, the students’ view of assessment is quite different from the lecturers’ view. The<br />

assessment is at the heart of the student learning but it is not at the heart of teaching.<br />

Assessment should become a classroom element which motivates, encourages and<br />

stimulates student learning. E-Assessment offers all these in a variety of maintainable<br />

solutions, like self-tests, e-portfolios and peer assessment.<br />

This paper will discuss future possibilities for E-Assessment as a more holistic approach that mixes learning, teaching and assessing. This means moving away from the usual 'snapshot' assessment towards assessment over a period of time, in order to avoid students' exam anxiety and the occasional blackout or blip. This approach of continuous

assessment is only feasible with the electronic delivery and the use of Virtual Learning<br />

Space. This space should enable the students to be in control of their own learning and<br />

even their private notes and reports. The Virtual Learning Environments are too static at the<br />

moment and do not offer the freedom students require to network with their peers.<br />

Furthermore, the learning space should allow the flexibility that the students can decide<br />

when and where they are ready to take the test, which would lead to a self-organised<br />

assessment.<br />

Students could take the formative assessments as many times as they want, their progress<br />

would be recorded and would contribute to the final mark. In-depth and targeted feedback<br />

could guide the students so that they can learn from their own mistakes / misconceptions so<br />

this would help them to develop their reflective thinking skills. Here, E-Assessment would<br />

play the major role because is it possible to assess softer skills too since researching,<br />

validating data from different sources and working in a team are becoming more important<br />

on the wider opening job market. New forms of collaborative assessment techniques will be<br />

established where wikis and blogs are only the beginning. E-Assessment will be the new<br />

approach where the students’ expectations meet the teachers’ requirements.<br />



Invited Symposium: E-Assessment / Paper 2:<br />

Alcohol and a Mash-up: Understanding Student Understanding<br />

Jim Ridgway, University of Durham, United Kingdom<br />

Sean McCusker, University of Durham, United Kingdom<br />

James Nicholson, University of Durham, United Kingdom<br />

Informed citizenship depends on the ability of citizens to understand and reason from evidence. In<br />

the UK at least, school statistics focuses on the mastery of technique, rather than on interpretation<br />

of results (Ridgway, McCusker & Nicholson, 2007). The techniques themselves focus on the

analysis of univariate and bivariate data. As a consequence, school statistics is largely useless in<br />

dealing with any data sets students might encounter in their lives outside school.<br />

There is a large literature on the problems that students and adults have with simple concepts,<br />

such as interpreting static 2D graphs, and tabular information (e.g. Batanero, et al., 1994). One<br />

might predict that working with multivariate data would be impossible for people with no statistical<br />

training. However, empirical explorations (e.g. Ridgway, McCusker & Nicholson, 2006) show that computer-based three-variable tasks are no more difficult for 12-14 year olds than are 2D paper-based tasks. Du Feu (2005) has shown that much younger children can work meaningfully with

multivariate data displays that they have created in the form of tactile graphs built from LEGO®.<br />

The SMART Centre has designed a number of software ‘shells’ in Macromedia Flash® that run<br />

on web browsers, and that facilitate the display of MV data (http://www.dur.ac.uk/smart.centre/). A<br />

variety of displays is available that allows up to six variables to be displayed under user control. An earlier study (Ridgway, Nicholson, & McCusker, 2008) was based in 13 classes of pupils aged 12-14 years, covering the range of abilities typical in their school. Resources were

created on topics that included alcohol use, drug use, and sexually transmitted infections, using<br />

data from large scale surveys, together with curriculum materials designed to provoke<br />

understanding of MV data. Classroom observations showed that young pupils across the<br />

attainment range can engage with and understand complex messages in MV data.<br />

The study to be reported here presents students with a mashup comprising recent survey data on<br />

alcohol use presented in an interactive display, and links to recent newspaper articles on alcohol<br />

consumption by young people (e.g. “Young girls drink nearly twice as much alcohol as they did<br />

7 years ago” Daily Mail). Students are asked to critique the articles in the light of the data. We<br />

believe that the ability to read critically in the light of evidence is a core literacy, and a fundamental<br />

requirement for informed citizenship. We will report the findings from this study in detail, along<br />

with a list of core heuristics that are essential when exploring MV data. We will also present<br />

examples from student work that illustrate key aspects of statistical literacy, and questions that<br />

are useful to diagnose student conceptions and misconceptions.<br />

References<br />

Batanero, C., Godino, J. D., Vallecillos, A., Green, D., & Holmes, P. (1994). Errors and difficulties in understanding elementary statistical concepts. International Journal of Mathematical Education in Science and Technology, 25(4), 527-547.
du Feu, C. (2005). Bluebells and bias, stitchwort and statistics. Teaching Statistics, 27(2), 34-36.

Ridgway, J., Nicholson, J. R., & McCusker, S. (2007). Teaching statistics despite its applications. Teaching<br />

Statistics, 29(2), 44-48.<br />

Ridgway, J., Nicholson, J., and McCusker, S. (2008, in press). Reconceptualising ‘Statistics’ and<br />

‘Education’. In C. Batanero (ed.). Statistics Education in School Mathematics: Challenges for<br />

Teaching and Teacher Education. Springer.<br />



Invited Symposium: E-Assessment / Paper 3:<br />

E-assessment for learning?<br />

The potential of short free-text questions with tailored feedback<br />

Sally Jordan, The Open University, United Kingdom<br />

A number of literature reviews have identified conditions under which assessment supports student<br />

learning (e.g. Gibbs and Simpson, 2004). Two common themes are assessment’s ability to<br />

motivate and engage students, and the role of feedback. However, if feedback is to be effective, it<br />

must be more than a transmission of information from teacher to learner. The student must<br />

understand the feedback sufficiently well to be able to learn from it i.e. to ‘close the gap’ between<br />

their current level of understanding and the level expected by the teacher (Ramaprasad, 1983).<br />

The work described is one of a number of projects in an ‘E-assessment for learning’ initiative at<br />

the Centre for the Open Learning of Mathematics, Science, Computing and Technology<br />

(COLMSCT) at the UK Open University. Most of the projects make use of the OpenMark e-assessment

system, which offers students multiple attempts at each question, with the amount of<br />

feedback provided increasing at each attempt. The provision of multiple attempts with increasing<br />

feedback is designed to give the student an opportunity to act on the feedback to correct his or<br />

her work immediately and the tailored feedback is designed to simulate a ‘tutor at the student’s<br />

elbow’ (Ross et al., 2006).

The current project has extended the range of e-assessment questions offered to students via<br />

OpenMark to include those requiring free-text answers of up to around a sentence in length. The<br />

answer matching is written with an authoring tool provided by Intelligent Assessment<br />

Technologies Ltd. (Mitchell et al., 2002) which uses the natural language processing technique of<br />

information extraction and incorporates a number of processing modules aimed at providing<br />

accurate marking without undue penalty for poor spelling and grammar. A significant feature of<br />

the project has been the use of student responses to developmental versions of the questions,<br />

themselves delivered online, to improve the answer matching.<br />
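As a toy illustration only (not the OpenMark system or the Intelligent Assessment Technologies engine; the question, patterns and feedback texts are invented), the basic idea of short free-text matching with multiple attempts and increasing feedback can be sketched as follows:

import re

ANSWER_PATTERNS = [r"\bevaporat\w*", r"\bcondens\w*"]       # hypothetical target concepts
FEEDBACK = [
    "Not quite. Think about what happens to the water before clouds can form.",
    "Hint: two processes are involved; one turns liquid water into vapour.",
    "The expected answer mentions evaporation followed by condensation.",
]

def mark(response):
    # crude answer matching: every target pattern must appear in the response
    return all(re.search(p, response.lower()) for p in ANSWER_PATTERNS)

def attempt_loop(responses):
    # up to three attempts, releasing more detailed feedback after each failure
    for attempt, response in enumerate(responses[:3]):
        if mark(response):
            return f"Correct on attempt {attempt + 1}."
        print(FEEDBACK[attempt])
    return "Full answer released after the final attempt."

print(attempt_loop(["clouds just appear", "the water evaporates", "it evaporates and then condenses"]))

The real answer matching uses information extraction rather than simple pattern matching, but the attempt-and-feedback loop conveys how students get an immediate chance to act on the feedback.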

Evaluation has included an investigation into student reaction to questions of this type and their use<br />

of the feedback provided. A human-computer marking comparison has shown the computer’s<br />

marking to be indistinguishable from, or more accurate than, that of six course tutors. Reasons for this will

be discussed. The two facets of the evaluation are linked; if students are to engage with the<br />

questions and to learn from the feedback provided, the marking must be accurate. Also, although<br />

most students like the questions and are impressed by the sophisticated answer matching, others<br />

appear to find multiple-choice questions less demanding and to be more trusting of human markers.<br />

The purpose of these interactive computer marked assignments (iCMAs) is to provide students<br />

with instantaneous feedback, pacing, and an opportunity to monitor their own progress and to<br />

discuss this with their tutor if appropriate. The iCMAs are complemented by tutor marked<br />

assignments.<br />

References<br />

Gibbs, G. and Simpson, C. (2004) Conditions under which assessment supports students’ learning.<br />

Learning and Teaching in Higher Education, 1:3-31.<br />

Mitchell, T., Russell, T., Broomhead, P. and Aldridge, N. (2002) Towards robust computerised marking of<br />

free-text responses. 6th International CAA Conference, Loughborough,UK.<br />

http://www.caaconference.com/pastconferences/2002/proceedings/Mitchell_t1.pdf<br />

Ramaprasad, A. (1983) On the definition of feedback, Behavioral Science, 28:4-13.<br />

Ross, S.M., Jordan, S.E. and Butcher, P.G. (2006) Online instantaneous and targeted feedback for remote learners. In C. Bryan and K.V. Clegg (eds) Innovative Assessment in Higher Education. London: Routledge: 123-131.



Symposia<br />





Symposium: Portfolios in Higher Education<br />

Portfolios in Higher Education in three European countries –<br />

Variations in Conceptions, Purposes and Practices<br />

Organiser: Olga Dysthe, University of Bergen, Norway<br />

Chair: Nicola Reimann, Northumbria University, United Kingdom<br />

Discussant: Anton Havnes, University of Bergen, Norway<br />

Portfolio assessment has been introduced in most countries both as an alternative<br />

assessment tool and as a tool for learning. The term ‘portfolio’, however, is used in many different ways. This variation in portfolio conceptions is often presented as an advantage, in the sense that the portfolio is a very versatile tool that can be adapted to fit a wide array of purposes and contexts. But it does create confusion for students who may

encounter very different practices under the same name. In a Europe with extended<br />

educational mobility, we think it is timely to investigate and discuss whether there are any<br />

patterns of use that can be distinguished between countries (or: as characteristics in each<br />

country), and/or whether differences follow disciplinary and professional lines and thus cut<br />

across borders. Specific questions that will be raised in the discussion: Is there a need for a<br />

more unified understanding or definition of portfolio, and if so, is it possible? Is there a need<br />

for a clarificatory framework? The ‘collection-selection-reflection’ framework has been<br />

widely used, but is it useful in all contexts? What is useful for students, and what is needed<br />

in European or wider international fora where portfolios are discussed and researched?<br />

In this symposium we will present research from three countries that give some indication of<br />

how portfolios are used in higher education in Belgium, Norway and England, even though<br />

we fully realise that the picture in each country is even more varied than we are able to<br />

show.<br />

In England portfolios have developed from several directions, broadly summarised as<br />

learner-specific and subject-specific. This presentation, based on a study conducted in 2007, will focus on e-portfolios growing out of the introduction of Personal Development Planning, which

can be seen as a formative assessment approach.<br />

In Norway portfolios were introduced as an alternative form of assessment in connection<br />

with a major reform of higher education implemented from 2002. This led to a proliferation<br />

of what can be called “disciplinary content portfolios”. A national survey of portfolio practices<br />

is presented and differences in understanding, content, use and assessment of portfolios<br />

between disciplines are discussed.

A research study from the Flemish part of Belgium highlights substantial differences between portfolio applications in different higher education courses. A framework was developed to

compare portfolios from eight different courses in colleges and at universities.<br />

Interpretations of the portfolio concept are divergent and a portfolio standard is absent.<br />

All the contributions are based on research studies but the implications and the questions<br />

raised are practice-related. The presenters represent three European countries, UK,<br />

Norway and Belgium.<br />



Symposium: Portfolios in Higher Education / Paper 1:<br />

Learning opportunities through the processes of eportfolio development<br />

Elizabeth Hartnell-Young, The University of Nottingham, United Kingdom<br />

In Higher Education in the UK, the use of e-portfolios has developed from several directions,<br />

which can be broadly summarised as learner-specific and subject-specific. The first is often<br />

known as Personal Development Planning, which can be seen as a formative assessment<br />

approach, while the other is in high stakes summative assessment, in discrete subject areas,<br />

and increasingly in competency-based contexts in fields such as medicine. Consequently the<br />

form and content of the e-portfolios, and the issues that arise in their use, differ.<br />

England’s e-Strategy intends that learners will have ‘a digital space that is personalised,

that remembers what the learner is interested in and suggests relevant web sites, or alerts<br />

them to courses and learning opportunities that fit their needs’. As well as using such<br />

spaces in schools, colleges and universities, the intention is to enable the development of<br />

‘electronic portfolios that learners can carry on using throughout life.’<br />

This paper is based on a study conducted by the author in the UK in 2007 which considered<br />

the uses of e-portfolios in school, further education colleges, universities and the National<br />

Health Service. It concluded that e-portfolio systems include repositories and a range of<br />

tools for storing and organising material for planning, reflecting, and giving and receiving<br />

feedback. The processes undertaken are opportunities for learning, while the collections in<br />

repositories build up over time, allowing selections to be offered to various audiences and<br />

assessors. Thus they can support both formative assessment (ongoing assessment for<br />

learning), and allow relevant selections to be presented for summative assessment<br />

(assessment of learning). At present, however, users have little sense of the concept of a<br />

lifelong e-portfolio.<br />

E-portfolio development can commence from one of many starting points such as online<br />

reflective practice, or planning or capturing evidence. These processes form part of an ‘e-portfolio

culture’: a way of thinking about personal and collaborative learning over a longer<br />

period of time than a specific course. Fragments of life experience from individual subjects<br />

and artefacts must be made more coherent as expressions of identity. However, because<br />

almost all e-portfolio products are licensed by institutions rather than individuals, they are<br />

neither portable nor interoperable, thus creating potential roadblocks on the lifelong journey.<br />

Assessment is the formal means by which eportfolios or their disaggregated contents are<br />

judged, whether by self assessment, peer assessment, tutor assessment, or university<br />

admissions officers. There are also other contexts, such as employment applications, in<br />

which audiences must recognise, acknowledge and value material in new forms. Rowntree<br />

(1977) suggested five dimensions of assessment: Why assess? What to Assess? How to<br />

assess? How to interpret? and How to respond? The last two questions are particularly<br />

apposite in light of increasing use of digital images, animations and so on, and with<br />

increasing attention being paid to e-assessment as a means of judging the outcomes of an<br />

individual’s learning experiences.<br />

References<br />

Rowntree, D. (1977). Assessing Students: How shall we know them? London: Harper.<br />



Symposium: Portfolios in Higher Education / Paper 2:<br />

The Disciplinary Content Portfolio in Norwegian Higher Education – How and Why?<br />

Olga Dysthe, University of Bergen, Norway<br />

Knut Steinar Engelsen, Stord/Haugesund University College, Norway<br />

Considerable changes in assessment have taken place in Norway after 2002, in the wake of<br />

a major reform of higher education inspired by the Bologna Declaration. While ‘portfolio’ was<br />

an unknown concept for most teachers and students in higher education in Norway five years<br />

ago, an evaluation report about the reform documented that considerable changes had taken<br />

place and that portfolio assessment was now used in all types of educational institutions and<br />

across disciplines (Dysthe et al 2006). The empirical basis for this paper is a nationwide<br />

survey study of portfolio practices conducted in 2006, supplemented by case studies.<br />

The aim of the research study was to get an overview of portfolio practices in Norway. We will<br />

mention some of the findings and describe characteristic aspects of ‘the disciplinary content<br />

portfolio’ and systematic differences between different types of educational institutions as well<br />

as between disciplines within each institution. The main question we raise is what<br />

conceptions of portfolios and what practices are considered useful and under what conditions.<br />

Our survey, based on a randomized selection from all public universities and university colleges in Norway, was conducted in the spring of 2006. The purpose was to map

assessment practices across different institutions and disciplines, focusing on issues like<br />

types and number of portfolio assignments, the use of feedback, final assessment formats<br />

and the use of evaluation criteria. Teacher attitudes towards the usefulness of portfolios in<br />

relation to student and teacher workload were also investigated. The informants in both<br />

surveys were professors and lecturers responsible for portfolio-assessed courses. The survey data were analysed using standard statistical methods.

We found that portfolio systems varied from advanced reflection-based models which<br />

included multiple text types and flexible feedback-practices to portfolios that consisted of<br />

factual texts with rudimentary feedback procedures and no reflective texts. We found<br />

systematic variations between professional educational institutions and universities, but also<br />

between ‘soft’ and ‘hard’ disciplines within the same institutions. Portfolio practices were<br />

diverse and a common understanding seemed lacking; a finding that may be due to the<br />

early stage of implementation and the complex motivation for initiating change. Feedback<br />

was considered very important, but even when peer feedback was being used, training in<br />

how to give feedback or discussion of quality criteria were not common.<br />

The Norwegian disciplinary content portfolio falls under the category that Hartnell-Young<br />

calls “subject-specific” but tries to combine formative and summative assessment. It is often<br />

digital but differs from the learner-specific e-portfolio described by Hartnell-Young by not<br />

aiming at building up repositories over time or presentation for an out-of-class audience.<br />

We base our discussion on socio-cultural perspectives on learning and focus particularly on<br />

how macro level policy decisions in Norway have affected the use of portfolios for<br />

assessment, and how disciplinary cultures at department level have shaped both the<br />

conceptual understanding and practical use of portfolios at meso level. A question arising<br />

from this is how portfolios in different European countries are influenced by different<br />

sociocultural contexts.<br />



Symposium: Portfolios in Higher Education / Paper 3:<br />

Portfolio diversity in Belgian (Flemish) Higher Education –<br />

A comparative study of eight cases<br />

Wil Meeus, University of Antwerp, Belgium<br />

Peter Van Petegem, University of Antwerp, Belgium<br />

An international literature study on portfolio in higher education led to a timeline<br />

distinguishing between four modes of implementation (Meeus et al, 2006). These range<br />

from the use of portfolio in admissions to higher education, during the higher education<br />

course, on entry into the profession and for ongoing professional development. In this study<br />

we focus on portfolios used during higher education courses.<br />

There are a large number of portfolio applications in use in higher education courses in the<br />

Flemish part of Belgium. Although practitioners and scholars talk about portfolios as a<br />

standard concept, most of the portfolios they refer to seem to be very different. This study<br />

investigates the diversity of portfolio applications used within higher education in Flanders.<br />

The research questions are: In what way do portfolio applications differ within the large area<br />

of higher education courses? Can enough commonality be detected to claim the existence<br />

of a standard portfolio concept?<br />

Eight portfolio applications were randomly selected, all in different higher education courses<br />

in Flanders: elementary teacher education, secondary teacher education, graphic and<br />

digital media, speech therapy, podiatry, nursing, academic teacher education, and physical<br />

education. The first six courses are organized at colleges, the last two at universities. A<br />

comparative framework was developed while gathering information on the portfolio<br />

applications. Source triangulation was used combining document analyses, interviews with<br />

portfolio supervisors, and focus groups with students.<br />

The comparative framework defines sixteen different characteristics within five categories:<br />

phase of implementation, function, ingredients, ICT-format and mode of supervision. All<br />

eight portfolios differ remarkably. No two portfolios have identical characteristics. This leads<br />

to the conclusion that general pronouncements on portfolio in higher education are<br />

problematic given the divergent interpretations of the concept. A clear description of the<br />

portfolio characteristics should be part of all scholarly papers on portfolio if conclusions are<br />

meant to be meaningful.<br />

References<br />

Meeus, W., Van Petegem, P., Van Looy, L. (2006). Portfolio in Higher Education: Time for a Clarificatory<br />

Framework. International Journal of Teaching and Learning in Higher Education, 17(2), 127-135.<br />



Symposium: Group work assessment<br />

Aims, values and ethical considerations in group work assessment<br />

Organiser: Lorraine Foreman-Peck, The University of Northampton, United Kingdom<br />

In this symposium ‘group work’ is understood as assignments carried out by students,<br />

largely independently of the tutor and usually outside normal class contact time. Group work<br />

is used extensively in higher education in the UK, in a variety of ways, from assignments<br />

that are relatively short to those forming the major part of the course. They are pervasive,<br />

not only because they are seen as providing an educationally valuable learning experience,<br />

but also because they are generally believed to develop skills useful to employers.<br />

The requirement for group work assessment to be fair and transparent is problematic (e.g.<br />

Race 2001). This is most evident with group dynamics: in these instances students may fail to work optimally together and then undergo a negative and damaging experience. The

literature suggests that these cases occur regularly, but may affect only a minority of<br />

students in any one cohort (e.g. Parsons 2004), and are dealt with in an ad hoc and opaque<br />

manner.<br />

Proposed solutions to group dysfunction are usually technical, focussing for example on the<br />

validity and reliability of different methods of allocating marks (e.g. Magin 2001). However<br />

this approach does not appear to address a host of value questions that commonly arise,<br />

such as: ‘Is it right to allow students to evict underperforming students from their groups?’;<br />

‘Ought equal marks for unequal contributions be given?’; ‘Should a whole group fail as a<br />

consequence of plagiarism committed by one student?’<br />

These and other questions point to the need to conceptualise and contextualise in more<br />

depth: the practice of fair group work assessment, the principles and rules by which groups

should abide, the extent and nature of tutor facilitation, and the prevention of dysfunctional<br />

group dynamics. These issues are explored by the symposium participants in the context of<br />

their own practices.<br />

Participants in the symposium belong to a team of tutors from the Universities of<br />

Northampton and Northumbria who have been working together on group work practice

since October 2007. Their approach is action research as a form of ‘practical philosophy’<br />

(Elliott 2007). This involves tutors in identifying and clarifying ethical challenges in their own<br />

teaching and evaluating possible solutions based on defensible educational values. From<br />

these coordinated case studies, grounded insights into group work practice across a range<br />

of disciplines are derived, along with suggested institutional policy recommendations.<br />



Symposium: Group work assessment / Paper 1:<br />

Involuntary Free Riding – how status affects performance in a group project<br />

Julia Vernon, The University of Northampton, United Kingdom<br />

Many studies (Maguire and Edmondson 2001, Mills 2003, Gupta 2004, Greenan et al<br />

1997) note the positive effects on learning which groupwork may engender, and others

(Knight 2004, Hand 2001) discuss the existence of phenomena such as social loafing<br />

(Latané et al.,1979), free-riding (Albanese & Van Fleet, 1985) and the inequity of workload.<br />

The findings of Whyte (1943), Cottrell (1972), Webb (1992) and Ingleton (1995) show how<br />

the effects of group dynamics can have a positive or negative effect on the performance of<br />

group members.<br />

In this action research case study, we attempt to counter the negative effect that working in<br />

a group may have on the self-esteem and confidence of individuals, when they are teamed<br />

with students considerably more skilled than themselves. It focuses on a prolonged<br />

groupwork project, part of a Level 5 undergraduate course in Business Computing, which<br />

simulates a web development consultancy company, and involves team members taking on<br />

a variety of roles, in which they demonstrate different skills.<br />

As the project proceeds, it is noted that the status of individuals within the groups becomes<br />

polarised. The high status individuals have a strong sense of ownership, and a decreasing<br />

degree of trust in the work of others. Low-status members are inclined to defer to their<br />

team-mates and to draw back from expressing opinion in decision-making situations, or<br />

giving explanations of their own work. It is suggested in this paper that students who have<br />

every intention of contributing fully to the project, nevertheless, through these effects, find<br />

themselves in a position of being involuntary ‘free-riders’.<br />

In this research measures were introduced to support the groups and individuals, and to<br />

counter the negative effects noted. These actions took place around the middle of the<br />

period of the project, when in previous years there has been a lull in group activity, and<br />

problems have arisen. A facilitated session was arranged for the students, to bring issues of<br />

groupwork into the open, and develop strategies to improve group cohesion. Discussions<br />

followed from this and students were counselled individually to trace areas of difficulty. A<br />

formative assessment was introduced in the form of an individual presentation to the group,<br />

where each student explained their role and how they were carrying it out.<br />

Revisiting issues, after students have been able to experience them first-hand, resulted in a<br />

much more thoughtful response than when discussed early in the project. In addition the<br />

requirement to prove individual contribution brought about some task re-negotiation.<br />

Dominant members were seen to rein back in some aspects of the control they had<br />

exercised, while submissive members pushed themselves to take a lead on an important<br />

part of the project. Crucially, awareness of group dynamics had increased and was seen in<br />

less simplistic terms. The measures taken had the effect of alleviating the effects noted,<br />

supporting positive help-giving and knowledge transfer within the group, and allowing<br />

members to contribute more fully to the group task.<br />



Symposium: Group work assessment / Paper 2:<br />

Facilitating Group work: Leading or empowering?<br />

Julie Jones, The University of Northampton, United Kingdom<br />

Andrew Smith, The University of Northampton, United Kingdom<br />

Year two students of the Foundation Degree in Learning and Teaching study a module

relating to special educational needs/inclusion. Assessment is through a collaborative group<br />

project and a personal project diary with a reflective statement. We felt that the assessment<br />

strategy did not sufficiently discriminate between students: virtually all achieved very high<br />

grades. Concern was further prompted by awareness of recent research into issues of the<br />

fairness, justice and reliability of group work (Maguire and Edmondson 2001, Barnfield<br />

2003, Knight 2004, Skinner et al 2004) and of motivational factors including the effect of<br />

rewarding the group product or the individual contribution (Chapman 2002) and issues of<br />

inter-relationships in groups (Arango 2007).<br />

Reflective statements and evaluation feedback from the 2006/7 cohort identified concerns relating<br />

to some students acting as ‘passengers’, but being awarded the same high grade for the module

as those members who completely engaged with the work. This is a well documented problem<br />

identified by others (Ransom 1997, Parsons 2002, Hand 2001, Cheng and Warren 2000).<br />

In addition the Course Team found tutor guidance was a complicating factor: it was felt that<br />

it was a major contributor to the high grades awarded. There was a concern that this<br />

facilitation encouraged some students’ lack of engagement by allowing them to be led<br />

rather than, as was intended, empowering them to develop their own projects.<br />

These observations prompted a reformulation of the assessment strategy for the 07-08<br />

cohort. The weightings were altered from 80% to 60% for the group assessed project and<br />

from 20% to 40% for the individual elements. In order to assess the effects of this and to<br />

gain insight into issues such as empowerment, especially those involving tutor facilitation,<br />

data was collected on the following:<br />

How the group:<br />

• formed and decided upon the project focus<br />

• sustained motivation and whether this was linked to a perception that individual<br />

contributions supported the group assessment or individual assessment, or both<br />

• managed inter-personal professional working relationships<br />

• managed equitable sharing of the work-load<br />

• perceived and used the guidance available from the module tutor

This involved:<br />

• analysis of 2007-08 students’ diaries and reflective statements<br />

• interviews with 2007-08 students<br />

• analysis of diaries and reflective statements from the 2006-07 cohort<br />

• interviews with students from the 2006-07 cohort asking them to reflect retrospectively<br />

on their experiences.<br />

• a reflective diary written by the facilitating tutor for 2007-08

From this comparative evaluation we will explore firstly whether the amendments to the assessment<br />

weightings made a difference in students’ perceptions of the fairness of the assessment strategy<br />

and secondly the effect the level and nature of tutor facilitation had on group dynamics, especially in<br />

the areas of communication, task sharing, empowerment and ownership.

It is expected that the research will have implications for tutors’ thinking about assessment<br />

weightings and will throw light on the ethical dilemmas surrounding the issues of the guidance and<br />

facilitation of group work.<br />



Symposium: Group work assessment / Paper 3:<br />

Marginalised students in group work assessment:<br />

ethical issues of group formation and the effective support of such individuals<br />

Antony Mellor, Northumbria University, United Kingdom<br />

Jane Entwistle, Northumbria University, United Kingdom<br />

In this project we focus on our experience of group work assessment over a number of<br />

years on a 20 credit, year-long, option module Soil Degradation and Rehabilitation, which<br />

forms part of the final year of our BSc (Hons) Geography degree programme and has a<br />

cohort of around 30 students each year. The group assessment comprises 40% of the<br />

module marks and includes a group oral presentation and written report. We became<br />

concerned about a number of issues adversely affecting the student learning experience,<br />

such as marginalised individuals, adverse group dynamics and unequal contributions by<br />

individuals within groups (Mills 2003, Hand 2001). Of specific concern, however, were the<br />

ethics of group formation (Chang 1999, Knight 2004). Allowing self-selected groups<br />

inevitably leaves some students marginalised and in a position where they may be not only<br />

disadvantaged materially in terms of marks but also could be personally affected in a<br />

negative way. In this paper we explore to what extent it is our duty to address the needs of

these students, as well as ways of maintaining equity and transparency in tutor-led support<br />

across the entire cohort.<br />

Using an action research approach (Carr 2006, Elliott 2007), we implemented four key<br />

interventions:<br />

• To make timetabled sessions available for the groups to meet and also to discuss<br />

progress with the tutor, thus addressing the practical problem of lack of opportunity to<br />

meet and facilitating group interaction early on in the process.<br />

• To allow groups to play to their strengths. We encouraged the students to think about<br />

their strengths in terms of the tasks required as part of this assignment to identify what<br />

their contribution might be and their role within that group.<br />

• To provide formative feedback on drafts of the written report. This enabled us to<br />

encourage and promote the need for a dialogue between group members where a<br />

synthesis of materials was lacking.<br />

• To include an individual critical reflection component as part of the assignment. We<br />

aimed to promote reflection on the learning inherent in the activity regardless of the form<br />

of the experience or the summative mark of the end product.<br />

Data were collected using a written teacher log, the students’ critical reflections, and a<br />

student questionnaire following completion of the project. Of the four interventions noted<br />

above, all had a positive role to play in supporting isolated and marginalised students with<br />

their experience of group-work. The fourth, that of individual critical reflection, was perhaps<br />

the least successful across the cohort as a whole because the students were relatively<br />

inexperienced in this way of thinking and writing, coming largely from a scientific<br />

background. It did however provide a platform for student grievances and issues to be<br />

raised, and facilitated their ability to develop different approaches to solving more abstract<br />

problems. Outcomes from this intervention will also be considered in the planning of this<br />

group assessment in future years.<br />



Symposium: Multidimensional measurement models:<br />

Multidimensional measurement models of students’ competencies<br />

Organiser: Johannes Hartig, German Institute for International Educational Research<br />

(DIPF), Germany<br />

Most measurement models applied in traditional educational assessments implicitly or explicitly<br />

assume that test results can be described in terms of single ability dimensions. That is,<br />

individual performance differences in all assessment tasks are attributed to differences in one<br />

common ability dimension. These unidimensional models are useful in many contexts,<br />

especially if the performance domain of interest is relatively narrow, or if the goal of the<br />

assessment is a mere summative description of student achievement. Large scale<br />

assessments, for instance, deliberately keep the dimensionality of their instruments low, since

their goal is the description of achievement levels of large groups in broad content domains.<br />

However, if performance in a more complex domain of competence is to be assessed, or if the<br />

goal of the assessment is a deeper understanding of the underlying individual differences, the<br />

use of unidimensional measurement models may be unsatisfactory. For instance, performance<br />

in a complex domain of competence may be attributed to multiple, distinguishable abilities and<br />

the goal of an assessment may be to obtain differentiated individual profiles of these abilities.<br />

Or, the assumption that all tasks used in an assessment measure the same single ability<br />

dimension for all students may not be realistic because different students may draw on different<br />

knowledge and strategies to arrive at the same solutions.<br />

If the goal of the assessment is a deeper understanding of observed performance<br />

differences, or if unidimensional models fail to adequately explain test outcomes in complex<br />

tasks, more complex, multidimensional measurement models can be employed as an<br />

alternative to unidimensional models. These models can be used to identify systematic<br />

causes for violations of the unidimensional model, and to test more differentiated theoretical<br />

models of students’ competencies. In the latter case, the analysis of multidimensional<br />

models requires stronger theoretical assumptions than unidimensional models, the<br />

application of more advanced statistical techniques, and typically larger sample sizes. In<br />

exchange, multidimensional measurement models hold considerable promise for the<br />

empirical examination of differentiated models of performance in complex domains and<br />

heterogeneous populations.<br />

The symposium will present different approaches and applications of multidimensional<br />

measurement models. The first paper focuses on methods to systematically identify and<br />

explain violations of the assumption of unidimensional constructs in the domain of<br />

mathematical problem solving. Variables interacting with psychometric properties of single<br />

items or subgroups of items are identified in order to achieve a better understanding of the<br />

assessed competence. The second paper presents an application of cognitive diagnosis<br />

models (CDMs) to a mathematics test for elementary school. These multidimensional latent

class models allow the construction of differentiated models of response processes, taking<br />

into account multiple basic abilities. The third paper focuses on a differentiated diagnosis of<br />

basic abilities in a foreign language assessment. Performance in listening comprehension<br />

items is decomposed into general text comprehension and auditory processing abilities<br />

using a two-dimensional IRT model.<br />
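As a generic illustration of the kind of two-dimensional compensatory IRT model referred to here (our notation; the specific parameterization used in the paper may differ), the probability of solving listening item $i$ can be written as

$$P(X_i = 1 \mid \theta_{\text{text}}, \theta_{\text{aud}}) = \frac{\exp(a_{i1}\theta_{\text{text}} + a_{i2}\theta_{\text{aud}} - \beta_i)}{1 + \exp(a_{i1}\theta_{\text{text}} + a_{i2}\theta_{\text{aud}} - \beta_i)},$$

with $\theta_{\text{text}}$ representing general text comprehension, $\theta_{\text{aud}}$ auditory processing, and $a_{i1}, a_{i2}$ the loadings of item $i$ on the two dimensions.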

The papers will be discussed with respect to the potential benefit of multidimensional<br />

measurement models in different contexts of application, and the theoretical requirements<br />

of different models.<br />



Symposium: Multidimensional measurement models / Paper 1:<br />

Evaluation of non-unidimensional item contents using diagnostic results from Rasch analysis

Markus Wirtz, Freiburg University of Education, Germany
Timo Leuders, Freiburg University of Education, Germany
Marianne Bayrhuber, Freiburg University of Education, Germany

Regina Bruder, Darmstadt Technical University, Germany<br />

Competence scales, which have been developed and evaluated by means of Rasch analysis,<br />

possess optimal properties if diagnostic results are sought to reflect systematic and reliable<br />

differences between students in competence domains. Such scales allow a strictly<br />

unidimensional assessment of competencies: The response probability for all items is determined<br />

by only one latent dimension and thus person characteristics can be interpreted unambiguously.<br />

Hence, a fair and meaningful comparison of subjects or subgroups is admissible.<br />

Unidimensionality can be statistically tested because of the local independence of items within<br />

Rasch homogeneous scales: items and persons are calibrated on a common latent trait, and<br />

the position of items and persons on the latent trait (i.e., item difficulties and individual abilities)<br />

must suffice to predict the observed data structures. For summative large-scale assessments such as PISA and TIMSS it is important that items fulfil these criteria. It has been argued, however,

that competence constructs become restricted if items covering more than one ability are<br />

systematically eliminated. This may especially pose a problem if diagnostic results are to be<br />

interpreted and used in a formative manner in classroom contexts. If scales are supposed to<br />

identify didactically relevant information about students’ competencies and individual potentials<br />

for development, multidimensional item contents may be desirable. Such “unscalable” items<br />

may be particularly important to enable teachers to identify processing problems and failures.<br />

In order to enhance the practical benefits of applying the Rasch model in competence

diagnostics, strategies will be presented and discussed which make it possible to systematically analyse

violations of the assumptions of the Rasch model. Differential Item Functioning (DIF) and

Mixed-Rasch analysis both provide techniques to identify systematic violations of model

assumptions. Furthermore, person-fit measures can be used to identify covariates which predict

the fit of individual students’ answer profiles to the model, and to identify conspicuous profiles.

Variables that affect statistical properties of single items or item groups may be identified at

the item level (e.g. inclusion or exclusion of specific tasks in the classroom by different teachers) or

at the student level (e.g. processing strategies, preference for different mental

representations). Effects of these variables can provide important information concerning<br />

the structure of the competence to be assessed (e.g. different student types or existence of<br />

typical erroneous conceptions).<br />
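
To make the DIF step concrete, the sketch below shows one widely used screening procedure, the Mantel-Haenszel check; it is an illustrative stand-in rather than the Rasch-based DIF and mixed-Rasch analyses applied in the project, and the variable names and the use of a rest score as matching criterion are assumptions for the example.

    import numpy as np

    def mantel_haenszel_dif(item, group, matching):
        """Screen one dichotomous item for DIF between two groups.

        item     : 0/1 responses to the studied item
        group    : 0 = reference group, 1 = focal group
        matching : matching criterion, e.g. rest score on the remaining items
        """
        item, group, matching = map(np.asarray, (item, group, matching))
        num = den = 0.0
        for k in np.unique(matching):                        # one 2x2 table per score stratum
            s = matching == k
            a = np.sum((group[s] == 0) & (item[s] == 1))     # reference group, correct
            b = np.sum((group[s] == 0) & (item[s] == 0))     # reference group, incorrect
            c = np.sum((group[s] == 1) & (item[s] == 1))     # focal group, correct
            d = np.sum((group[s] == 1) & (item[s] == 0))     # focal group, incorrect
            n = a + b + c + d
            if n > 0:
                num += a * d / n
                den += b * c / n
        alpha_mh = num / den                                 # common odds ratio; 1.0 = no DIF
        return alpha_mh, -2.35 * np.log(alpha_mh)            # odds ratio and ETS delta metric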

Data will be presented from a research project on models for heuristic competencies in<br />

mathematical problem solving. The use of problem solving strategies which demand the<br />

application of different representations (numerical, graphical, symbolic and verbal) and the<br />

systematic change between these representations are assessed by psychometric scales.<br />

The item pool is based on a sound didactical framework. Possible causes of item misfits will<br />

be evaluated in order to enhance the knowledge about the corresponding competence domain.

The purpose of the study is twofold: A Rasch-homogeneous assessment of sub-dimensions<br />

of the use of problem solving strategies will be developed, and a sophisticated diagnostic<br />

instrument for the identification of areas for special support needs will be provided. Within<br />

this talk, systematic psychometric strategies are discussed which may make it possible to achieve both

of these goals.



Symposium: Multidimensional measurement models / Paper 2:<br />

Modelling multidimensional structure via cognitive diagnosis models: Theoretical<br />

potentials and methodological limitations for practical applications<br />

Olga Kunina, IQB, Humboldt University Berlin, Germany<br />

Oliver Wilhelm, IQB, Humboldt University Berlin, Germany<br />

André A. Rupp, IQB, Humboldt University Berlin, Germany<br />

In large educational studies like PISA, unidimensional probabilistic models of latent

traits are usually used, assuming that the observed test results can be sufficiently explained by a

single latent ability. However, if a deeper understanding of the underlying basic cognitive

skills is intended, this approach has some limitations in terms of adequately

mapping the complexity of the abilities addressed. Most constructs assessed in

educational studies (e.g. language comprehension, mathematics performance) supposedly

require different cognitive skills to succeed on an item or in the test. Cognitive diagnosis<br />

models (CDMs) can yield individual profiles of relevant basic skills. Based on the profile<br />

information, detailed feedback can be provided and used in classroom teaching or formative

interventions.<br />

In methodological terms CDMs are confirmatory multidimensional latent-variable models<br />

suitable for efficiently modelling within-item multidimensionality. They usually contain discrete<br />

latent variables that allow for a multivariate classification of respondents. Importantly, in<br />

prototypical applications of CDMs the definitions of the latent “attributes” or “skills” are based<br />

on a cognitively grounded theory of response processes at a fine grain size.<br />
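
As a concrete illustration of this model class, the DINA model, one of the simplest non-compensatory CDMs, links a binary attribute profile to item success through a Q-matrix; the attributes, Q-matrix entries and parameter values in the sketch below are purely illustrative and are not taken from the assessment described here.

    import numpy as np

    Q = np.array([[1, 0],      # item 1 requires attribute 1 only (e.g. counting)
                  [0, 1],      # item 2 requires attribute 2 only (e.g. modelling)
                  [1, 1]])     # item 3 requires both attributes
    slip  = np.array([0.10, 0.15, 0.20])   # P(incorrect | all required attributes mastered)
    guess = np.array([0.20, 0.25, 0.10])   # P(correct | at least one required attribute missing)

    def p_correct(alpha):
        """Item success probabilities under the DINA model for a binary attribute profile alpha."""
        eta = np.all(alpha >= Q, axis=1)   # does the profile cover each item's requirements?
        return np.where(eta, 1 - slip, guess)

    print(p_correct(np.array([1, 0])))     # a child mastering attribute 1 only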

In this contribution we will first discuss these key features of CDMs. We will then illustrate<br />

how CDMs can be used in large-scale educational assessment by applying a variety of<br />

models to data from a newly developed diagnostic mathematics assessment for elementary<br />

school children (3rd and 4th grade). The mathematics assessment comprises counting and<br />

modelling tasks requiring addition, subtraction, multiplication, and division skills and aims to<br />

provide a differentiated profile of counting and modelling skills in basic arithmetic<br />

operations.<br />

Specifically, we will compare multidimensional profiles of children from selected<br />

compensatory and non-compensatory CDMs. Competing measurement models are<br />

compared in terms of absolute and relative model fit, attribute difficulty distributions, and<br />

latent class membership probabilities. To provide some evidence for methodological<br />

generalizability of the results, we will then compare the discrete profiles from the different<br />

CDMs with continuous multidimensional profiles from item response theory and<br />

confirmatory factor analysis models. In combination, these analyses will provide empirical<br />

insight into the cost-utility trade-offs of CDMs as well as into the conditions under which<br />

their theoretical potential can be realized in large-scale educational assessment practice.<br />



Symposium: Multidimensional measurement models / Paper 3:<br />

Modelling Specific Abilities for Listening Comprehension in a Foreign Language<br />

with a Multidimensional IRT Model<br />

Jana Höhler, German Institute for International Educational Research (DIPF), Germany<br />

Johannes Hartig, German Institute for International Educational Research (DIPF), Germany<br />

Multidimensional Item Response Theory (MIRT) provides an ideal foundation to model<br />

performance in complex domains, simultaneously taking into account multiple basic<br />

abilities. In MIRT models with a complex loading structure, mixtures of different abilities can<br />

be modeled to be necessary for specific items. These models make it possible to investigate the

relative significance of different ability dimensions for specific items, i.e. what kind of ability

is required to what extent for solving a specific item. Hence, sound theoretical assumptions<br />

about the interaction between the person and the test items and the nature of the relevant<br />

ability dimensions are required. Often these assumptions are hard to test empirically, since<br />

different complex models may be equivalent in terms of model fit. However, assumptions<br />

about the demands of specific test items allow the prediction of which items should be<br />

particularly strongly related to specific ability dimensions. The aim of this paper is to illustrate

how theoretical assumptions about the nature of different ability dimensions represented in<br />

MIRT models can be validated by testing these predictions, i.e. by relating MIRT model<br />

parameters to characteristics of the item content.<br />

The data for our empirical application come from a German large-scale assessment of 9th grade

students’ language competencies. The analyses are based on the data from reading and<br />

listening comprehension tests of English as a foreign language. The listening<br />

comprehension items are very similar to the reading comprehension items, and it is<br />

reasonable to assume that they require similar abilities. Both tests require the decoding and<br />

understanding of English, as well as the processing and integration of the information<br />

retrieved. Both tests require the reading of written text, the multiple-choice items being<br />

presented in written English. Consequently, one latent ability dimension can be assumed to<br />

represent the abilities required for both tests. However, the listening comprehension test<br />

additionally requires the processing and understanding of spoken language. It therefore<br />

appears reasonable to assume a second latent dimension representing the abilities required<br />

exclusively for the listening comprehension items.<br />

A two-dimensional two-parameter (2PL) IRT model is applied to the data. The first

dimension represents the abilities common to the reading and listening comprehension<br />

tests (“general text comprehension”), while the second dimension represents the abilities<br />

specific to listening comprehension (“auditory processing”). The focus of our analysis is the<br />

strength of the loadings of the listening comprehension items on the auditory processing<br />

dimension. In order to identify items that draw particularly on this dimension, a priori defined<br />

task characteristics are used to predict the respective items’ loadings. It can be shown that<br />

the loading on the auditory processing dimension is related to specific item characteristics,<br />

e.g. the complexity of the relevant text passage and the speed of speech. The results<br />

provide support for the presumed nature of the “auditory processing” dimension.<br />

Additionally, groups of students differ in their relative strength on both dimensions,<br />

illustrating the benefit of a differentiated analysis of basic ability dimensions in applied<br />

contexts.<br />
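
For reference, a compensatory two-dimensional 2PL model of the kind applied here can be written as

    P(X_{vi} = 1 \mid \theta_v) = \exp(a_{i1}\theta_{v1} + a_{i2}\theta_{v2} - b_i) / (1 + \exp(a_{i1}\theta_{v1} + a_{i2}\theta_{v2} - b_i))

with \theta_{v1} the general text comprehension dimension and \theta_{v2} the auditory processing dimension; the loadings a_{i2} are fixed to zero for the reading items and freely estimated for the listening items, and it is the size of these freely estimated loadings that the a priori task characteristics are used to predict.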



Symposium: Computer-based assessment:<br />

Recent Developments in Computer-Based Assessment:<br />

Chances for the Measurement of Competence

Organiser: Thomas Martens, German Institute for International Educational Research<br />

(DIPF), Germany<br />

During the last decade Computer-Based Assessment (CBA) has become more and more<br />

popular in the international testing and educational community. The major reason is that<br />

computer-based tests can include elements that cannot be rendered on paper.

For conducting CBAs powerful software systems have become available providing support<br />

to the entire assessment process, i.e., item and test development, test delivery and result<br />

reporting (cf. presentation by Latour, Martin, Plichart, Jadoul, Busana, and Swietlik-Simon).<br />

Moreover, to assess innovative constructs user-friendly tools for authoring complex<br />

interactive stimuli have been developed (cf. presentation by Goldhammer, Martens,<br />

Naumann, Rölke, and Scharaf).<br />

Basically, computer-based tests allow for a greater diversity of test stimuli and test<br />

interaction than Paper-Based Assessments (PBAs). This especially holds true with regard

to the assessment of competencies (cf. presentation by Naumann, Jude, Goldhammer,<br />

Martens, Roelke, and Klieme). With PBAs it is almost impossible to measure competencies<br />

that involve a dynamic situation or settings that are drawn from real life. In contrast, CBA<br />

can integrate multimedia test content and complex interaction modes that simulate real-life<br />

situations, and, thereby, the validity with regard to the measured competencies can be<br />

increased.<br />

Regarding testee-item interaction, CBA enables automatic recording of reactions and

response times, which could not be accomplished with printed material. In combination with<br />

interactive stimuli this set-up allows for the performance-based assessment of, for example,<br />

ICT literacy.<br />

Another test format which can only be administered using computers is computer-adaptive<br />

testing (CAT). Here the difficulty of the items is tailored to the individual competence level of<br />

the test taker, so as to ensure that the subject does not receive items that are clearly too easy

or too difficult for him. This method saves time and allows for a more accurate estimation of<br />

the test taker’s ability.<br />
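
As an illustration of this tailoring step, a CAT engine typically administers, among the items not yet seen, the one that is most informative at the current ability estimate; a minimal sketch with invented 2PL item parameters (real systems add content balancing, exposure control and stopping rules):

    import numpy as np

    a = np.array([1.2, 0.8, 1.5, 1.0])       # item discriminations (invented)
    b = np.array([-1.0, 0.0, 0.5, 1.5])      # item difficulties (invented)

    def next_item(theta_hat, administered):
        """Return the index of the most informative unused item at the current ability estimate."""
        p = 1 / (1 + np.exp(-a * (theta_hat - b)))
        info = a ** 2 * p * (1 - p)           # Fisher information of each item at theta_hat
        info[list(administered)] = -np.inf    # never re-administer an item
        return int(np.argmax(info))

    print(next_item(theta_hat=0.2, administered={0}))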

In an educational context students increasingly use computers to study and to complete<br />

their tasks. The computer may even have become the standard tool for studying and<br />

problem solving, so using PBAs to assess these students might be inappropriate. For<br />

educational monitoring, CBAs and even web-based CBAs have become feasible and the<br />

benefits of applying CBA seem to far outweigh the challenges. Also, for some test takers

CBA might have a positive influence on motivation during the test.

In sum, CBA offers great potential and is most probably the testing mode for

competencies in the future.



Symposium: Computer-based assessment / Paper 1:<br />

Enlarging the range of assessment modalities using CBA:<br />

New challenges for generic (web-based) platforms<br />

Thibaud Latour, Raynald Jadoul, Patrick Plichart, Judith Swietlik-Simon,<br />

Lionel Lecaque, Samuel Renault<br />

CRP Henri Tudor, Luxembourg<br />

It has long been advocated that Computer-Based Assessment (CBA) offers significant

advantages over paper-and-pencil instruments at both the testee level and the

logistic and management levels. However, CBA does not cover all existing assessment

modalities and will hardly replace other delivery modes and contexts, or human-restricted tasks.<br />

Taking full benefit of advanced computer and information technologies when simply shifting from<br />

paper-and-pencil tests to computerized instruments remains challenging in various respects.

Security issues related to both organisational and technological aspects are prevalent in high-stakes

testing, but also in most situations where strict measurement validity is crucial, such as large-scale

assessments and monitoring. Security challenges range from test, item and content protection,

process integrity and secrecy, diffusion and validity control of tests and items, and cheating detection,

to identity management. Depending on the testing context, these issues are more or less easily

tackled. However, when considering networked and loosely constrained testing situations, addressing

these issues becomes more challenging and technologically demanding.

Advanced Result Exploitation techniques and models are necessary to take full benefit of<br />

the various kinds of user tracking capabilities enabled by the use of IT platforms. The<br />

challenge includes discovery, extraction, analysis and exploitation of potential patterns in<br />

the huge number of behavioural and chronometric data recorded during test execution<br />

together with the identification of their psychometric significance.<br />

New Forms of Testing and new forms of instruments to perform social, collective and situational

skill assessments are now becoming more conceivable following the maturation of so-called<br />

ambient intelligence technologies (pervasive computing, ubiquitous computing and advanced<br />

user experience). These kinds of assessments can be achieved through the use of simulations<br />

and games, 3D and immersive technologies, ubiquitous and mobile testing, collaborative testing,<br />

etc. Assessing business-related skills and jobs in an economically viable manner, i.e., reducing

the testing time and test development costs with respect to the number of dimensions that

compose a job description in terms of competencies, requires the design of new multidimensional

instruments, including techniques to rapidly screen a subject’s capabilities

against a series of job reference descriptions.

The Intelligent Management of e-Testing Resources becomes a key element for the<br />

generalisation of CBA in collaborative management settings where contents, models, possibly<br />

data, items, tests, etc. are shared by remotely located stakeholders. Improving the capacity to

qualify, annotate, exchange, and search e-testing resources in a distributed community will<br />

soon become a key element in enhancing item and test production capacity. This

challenge includes collaborative aspects in stakeholder networks, including P2P frameworks,<br />

semantic annotations of multimedia resources and definition of related ontologies, query<br />

propagation and advanced semantic searches, rule-based item creation support, etc.

In this contribution, we shall explore these challenges, which we consider important for the future, and

provide hints and a potential roadmap for addressing them from both a technological and a

psychometric perspective. The TAO (the French acronym for technology-based assessment)

framework provides a general and open architecture for computer-assisted test development<br />

and delivery, with the potential to respond to most of the raised issues.<br />



Symposium: Computer-based assessment / Paper 2:<br />

Developing stimuli for electronic reading assessment: The hypertext-builder<br />

Frank Goldhammer, Thomas Martens, Johannes Naumann,<br />

Heiko Rölke, Alexander Scharaf<br />

German Institute for International Educational Research (DIPF), Germany<br />

Reflecting the increasing prevalence of technology in people’s everyday lives, the

conception of reading literacy has evolved into a more comprehensive concept. More<br />

specifically, reading literacy referring to printed and mostly linear texts has been extended<br />

to also include the ability to successfully navigate and process non-linear electronic<br />

documents (hypertexts). During the last decade reading of electronic texts has become an<br />

activity of increasing importance amongst youths as well as adults.<br />

Against this background, sound research on the cognitive processing of electronic text is<br />

necessary both on fundamental and applied levels. To promote and facilitate research on<br />

reading electronic text, we have developed an authoring system to create electronic reading<br />

stimuli.<br />

The purpose of the present paper is twofold. First, we present a new graphical front-end tool<br />

for the computer-based assessment platform TAO (the French acronym for technology-based

assessment), the “Hypertext Builder”, which was developed to author items for

electronic reading assessment. The Hypertext Builder was designed to facilitate the rapid<br />

development and implementation of complex electronic reading stimuli, covering all major<br />

text-types encountered in electronic reading such as websites, e-mail client environments,<br />

forums, or blogs.<br />

Second, after presenting the Hypertext Builder itself and demonstrating its features, we<br />

report first evidence for the proposition that Hypertext Builder created text stimuli capture<br />

specific features of electronic reading. We used Hypertext Builder created materials in an<br />

experiment designed to test the assumption that a greater degree of executive control is<br />

needed for the processing of hypertext compared to linear text because of the navigation<br />

demands imposed by hypertext. Sixty students read hypertexts and linear texts (within-subject

factor) under three secondary task conditions that either imposed no additional load,

general dual-task load, or executive control load (between-subject factor).<br />



Symposium: Computer-based assessment / Paper 3:<br />

Component skills of electronic reading competence<br />

Johannes Naumann, Nina Jude, Frank Goldhammer, Thomas Martens,<br />

Heiko Rölke, Eckhard Klieme<br />

German Institute for International Educational Research (DIPF), Germany<br />

With the Internet having become a ubiquitous means for dissemination and distribution of<br />

opinions, news, and all other kinds of information around the world, skill in reading<br />

electronic documents may well be regarded as a key competence required for successful<br />

participation in society. Electronic documents are typically represented as non-linearly<br />

structured hypertexts. This means, on the one hand, compared to the reading of traditional<br />

printed text, that successful processing of electronic documents poses a number of<br />

additional demands on readers, such as making decisions whether to follow a certain link or<br />

not, or, if a link is followed, to keep in mind the original reading goal. On the other hand,<br />

electronic text allows for the implementation of signalling devices that may in fact facilitate<br />

processing, provided they are used adequately. Thus, reading of electronic text is a<br />

competence that cannot be easily mapped upon traditional text-processing skills. Rather, to<br />

successfully use electronic text, readers must have ample working memory resources to

simultaneously accommodate text processing and navigation in the first place. For that,

basic reading processes need to be well-routinized as well, so that available working

memory does not have to be devoted to basic operations of text processing such as word

recognition or semantic parsing. In addition, to deal with an electronic text’s non-linearity,<br />

e.g. to efficiently use navigational aids such as overviews or typed links, readers must have<br />

at their disposal adequate metacognitive strategies. Finally, computer skills may affect<br />

electronic reading competence, in that for reading electronic text readers must have at least<br />

some very basic computer knowledge, such as how to access an internet address or how to<br />

use a mouse.<br />

The present paper investigates which of these component skills actually affect individual<br />

competence in reading electronic documents. To assess electronic reading competence<br />

and related component skills, newly developed tests were implemented in a new testing tool.

To assess electronic reading competence, this tool presents interactive stimuli that

mimic real-life web sites, e.g. a medical or a job search site, with corresponding test questions.

Subsequently, students’ basic reading skill (lexical access and sentence<br />

comprehension), working memory capacity, metacognitive strategies and computer skill<br />

were assessed. A total of three hundred students were sampled from 30 German schools.

Test sessions lasted for three hours. Electronic reading competence was regressed on<br />

working memory capacity, basic reading skill, knowledge of metacognitive strategies, and<br />

computer skill. Using hierarchical linear models with students as level-1-units and schools<br />

as level-2-units, a substantial proportion of variance in electronic reading competence was<br />

explained by the proposed set of predictor variables. In addition, both level-1-intercepts and<br />

regression weights were found to vary between level-2-units (schools). As a consequence,<br />

future research should address not only which school-level conditions cause high average<br />

levels of electronic reading competence, but also the conditions under which electronic<br />

reading skill is more or less dependent on stable trait variables that cannot be changed<br />

easily, such as working memory capacity.<br />
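
A minimal sketch of the kind of two-level model described here is given below, using statsmodels; the file and column names and the choice of which regression weight to let vary across schools are assumptions for illustration, not the authors’ exact specification.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical data file: one row per student, 'school' identifies the level-2 unit.
    df = pd.read_csv("electronic_reading.csv")

    model = smf.mixedlm(
        "elec_reading ~ working_memory + basic_reading + metacognition + computer_skill",
        data=df,
        groups="school",                  # schools as level-2 units
        re_formula="~basic_reading",      # random intercept plus a random slope across schools
    )
    print(model.fit().summary())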



Symposium: High-Stakes Performance-Based Assessment:<br />

Issues in High-Stakes Performance-Based Assessment of Clinical Competence<br />

Organiser: Godfrey Pell, University of Leeds, United Kingdom<br />

Chair/Discussant: Trudie Roberts, University of Leeds, United Kingdom<br />

In the current era of audit and accountability in medicine, stakeholders want assurance that<br />

graduates have attained the required level of clinical competence to be awarded their<br />

degree and a licence to practise. Final exit examinations are designed to provide that<br />

assurance.<br />

Within the field of medical education, the Objective Structured Clinical Examination (OSCE)<br />

is currently favoured for assessing clinical skills as it has been shown to be the most<br />

reliable, valid, fair and defensible format for this purpose (Harden and Gleeson;Newble).<br />

Assessing students’ clinical skills just prior to graduation provides the assurance that they<br />

have achieved the minimum level of clinical competence required for a licence to practise

by both the General Medical Council (General Medical Council) and the Medical Council of<br />

Canada (Reznick, Blackmore, Cohen et. al., 1993).<br />

In an OSCE, candidates rotate through a series of time-limited ‘stations’ and perform a<br />

particular clinical task at each one. These tasks include clinical examination skills,<br />

communication skills or practical procedures (Boursicot and Roberts). In this way,<br />

candidates are tested across the range of skills and patient problems required for<br />

graduation and this is equated to ‘clinical competence’.<br />

Each station is observed by an examiner (usually a clinician or a trained observer) who<br />

scores the candidate using a checklist in which the steps of the particular clinical skill being<br />

assessed are listed as individual items. The examiners mark whether a candidate has<br />

performed each step correctly and the overall mark is the summation of the checklist item<br />

scores for any one station.<br />
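
In coding terms, the scoring rule just described reduces to a simple summation per station; the station names and item judgements below are invented for illustration.

    # Each station's checklist is a list of 0/1 judgements recorded by the examiner.
    checklists = {
        "abdominal_examination":  [1, 1, 0, 1, 1, 1, 0, 1],
        "explaining_a_procedure": [1, 1, 1, 1, 0],
        "iv_cannulation":         [1, 0, 1, 1, 1, 1],
    }

    station_marks = {station: sum(items) for station, items in checklists.items()}
    overall_mark = sum(station_marks.values())
    print(station_marks, overall_mark)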

The basic structure of an OSCE may vary; for example, the length of time allowed per<br />

station, whether checklists or rating scales are used for scoring, who scores (a clinician, a<br />

standardised patient, a trained lay observer) and whether real patients or manikins are<br />

used. However, the fundamental principle is that every candidate has to complete the same<br />

assignments in the same amount of time and is marked according to structured scoring<br />

instruments.<br />

Although the symposium is based on the OSCE, the issues being discussed should be of<br />

interest to delegates who use performance-based assessments for assessing professional<br />

competence.<br />



Symposium: High-Stakes Performance-Based Assessment / Paper 1:<br />

Lessons Learned from Administering a National OSCE for Medical Licensure<br />

David Blackmore, Medical Council of Canada, Canada<br />

Rationale<br />

Canada was the first country in the world to use a performance-based objective structured<br />

clinical examination (OSCE) as part of a national medical licensing examination. The<br />

Medical Council of Canada (MCC) awards the Licentiate (LMCC) which, in turn, is used as a<br />

prerequisite for licensure by the Canadian medical regulatory authorities. The pilot work and<br />

early testing of this examination took place in the late 1980s and examination<br />

implementation occurred in 1992. Since then, the Medical Council OSCE has undergone<br />

many changes in both its format and procedures. The experience gained from over<br />

35 administrations is the basis for the lessons learned.<br />

Methodology<br />

To be awarded the LMCC, an examinee must complete a two-part MCC Qualifying<br />

Examination (MCCQE). The MCCQE Part I is a one-day, computer-administered<br />

examination which the examinees usually take upon graduation from medical school. The<br />

examinee must then complete at least 12 months of postgraduate training before attempting<br />

the MCCQE Part II which is a multi-station OSCE administered to over 3000 examinees<br />

annually across 16 examination sites within Canada. Each OSCE consists of multiple<br />

stations where a physician examinee interacts with a standardized patient. A physician<br />

examiner observes the encounter in real-time and scores a checklist and global ratings.<br />

Some stations are followed by a written exercise and some stations contain a structured<br />

oral question administered by the physician examiner at the end of the encounter.<br />

Discussion<br />

In 1992, the MCCQE Part II consisted of 20 stations: 10 ten-minute patient-encounter<br />

stations and 10 five-minute patient-encounter stations followed by five-minute written<br />

exercises known as post encounter probes (PEPs). Two forms of this examination were<br />

constructed where one form was administered on a Saturday and the second form was<br />

administered on the following day. Two forms were required to accommodate the many<br />

examinees who needed to be tested concurrently across Canada. By 2008, the MCCQE

Part II had evolved into an examination consisting of 14 stations: 7 ten-minute stations,

5 five-minute stations followed by PEPs, and 2 pilot or pretest stations.<br />

Many lessons have been learned from 16 years of testing tens of thousands of physicians<br />

at multiple examination sites in a high-stakes licensing examination. Issues related to<br />

examination administration such as standardized patient recruiting and training; examiner<br />

recruitment, training, and retention; examination implementation, scoring, standard setting,<br />

and reporting have arisen over time. In addition, there have been several challenges arising<br />

from administering a multi-site examination across six different time zones. Some<br />

examination techniques have worked out better than others and the MCC examination has<br />

improved as it has matured. The MCC is also addressing new challenges related to<br />

changing examination content such as measuring professionalism and team participation.<br />

This presentation outlines the challenges and solutions to performance testing that have<br />

presented themselves as a result of the MCC employing the OSCE format in a high-stakes<br />

licensing examination.<br />



Symposium: High-Stakes Performance-Based Assessment / Paper 2:<br />

Quality Assurance through the OSCE Life Cycle<br />

Sydney Smee, Medical Council of Canada, Canada<br />

Rationale<br />

While the Objective Structured Clinical Examination (OSCE) is a format that allows for valid,<br />

reliable and fair testing of clinical skills, creating an OSCE that meets these criteria requires<br />

effort. When that effort is made, decisions based on the OSCE scores become defensible.<br />

A range of quality assurance measures can be taken throughout the life cycle of an OSCE<br />

to ensure that it meets testing standards such as those set by the American Educational<br />

Research Association (AERA), American Psychological Association (APA) and the National<br />

Council of Measurement in Education (NCME). Which quality assurance measures should be

implemented, and to what degree, is a judgment call based on the consequences of

the decisions for test takers and test users.<br />

The Medical Council of Canada’s Qualifying Examination Part II is an OSCE scored by<br />

physicians and administered across multiple sites to candidates who have successfully<br />

completed 12 months of post-graduate training. This OSCE is a prerequisite for medical<br />

licensure in Canada and so considerable effort is made to ensure the testing process is fair<br />

and the scores are sufficiently valid and reliable for making high-stakes pass-fail decisions.<br />

The Medical Council’s quality assurance practices and the rationale behind them are<br />

discussed so the value of these practices for other settings can be considered.<br />

Methodology<br />

The processes used by the Medical Council at each of five stages in an OSCE cycle will be<br />

described, with links to the 1999 Standards for Educational and Psychological Testing<br />

(AERA, APA, & NCME):<br />

1. Validity through Case Development<br />

2. Improving Reliability with Standardized Patient Training, Staff Orientation and Examiner<br />

Briefing<br />

3. Steps to Ensure Fair OSCE Administration<br />

4. Validity and Reliability - Psychometrics and Standard Setting<br />

5. More about Fairness - Incidents and Appeals<br />

Discussion<br />

The discussion will look at the importance of these quality assurance processes, the<br />

Medical Council’s rationale for certain approaches and the implementation challenges that<br />

have been encountered over sixteen years’ experience with the Part II OSCE. Quality

assurance minimizes the risk of false positive and false negative pass-fail decisions.<br />

Without quality assurance, pass-fail decisions are not defensible. Therefore the discussion<br />

will consider which of the approaches used for this high stakes OSCE could be adapted to<br />

other settings; for example, medical schools.<br />



Symposium: High-Stakes Performance-Based Assessment / Paper 3:<br />

Investigating OSCE Error Variance when measuring higher level competencies<br />

Godfrey Pell, University of Leeds, United Kingdom<br />

Richard Fuller, University of Leeds, United Kingdom<br />

Rationale<br />

Standardization and reliability are major concerns with Objective Structured Clinical

Examinations (OSCEs), but quality metrics permit deeper analysis of examination<br />

performance. This paper investigates the relationship between OSCE structure and error<br />

variance (i.e., variance due to factors other than student performance), building on previous<br />

research into sources of error variance.<br />

Methodology<br />

Analysis of recent 3rd, 4th & 5th (final) year OSCE results from the University of Leeds is<br />

considered to highlight the important problems that exist with error variance in OSCE<br />

scores. The impact of revisions to examiner instructions and item checklists / mark sheets,<br />

most notably the inclusion of intermediate grade descriptors and a reduction in the number<br />

of checklist items, is then assessed using 2007 and 2008 5th year OSCE data.<br />

Discussion<br />

Although error variance may be simply defined as that variance which is due to factors other<br />

than differences in performance caused by varying student ability, it remains possible to<br />

construct a variety of models of differing complexity to quantify this error. Discussion will<br />

include consideration of which of these models may be the most appropriate.<br />
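
One of the simpler models alluded to here treats the OSCE as a fully crossed student-by-station design and separates student variance from all remaining variance; the sketch below uses invented scores, and richer models would add examiners and other facets as further sources of error.

    import numpy as np

    scores = np.array([[7., 6., 8., 5.],     # students x stations matrix of station marks (invented)
                       [9., 8., 9., 7.],
                       [5., 6., 4., 6.]])
    n_p, n_s = scores.shape
    grand = scores.mean()
    ms_p = n_s * np.sum((scores.mean(axis=1) - grand) ** 2) / (n_p - 1)   # students
    ms_s = n_p * np.sum((scores.mean(axis=0) - grand) ** 2) / (n_s - 1)   # stations
    ss_total = np.sum((scores - grand) ** 2)
    ms_res = (ss_total - (n_p - 1) * ms_p - (n_s - 1) * ms_s) / ((n_p - 1) * (n_s - 1))

    var_error = ms_res                            # variance not attributable to student ability
    var_person = max((ms_p - ms_res) / n_s, 0.)   # estimated 'true' student variance
    g_coefficient = var_person / (var_person + var_error / n_s)
    print(var_person, var_error, g_coefficient)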

Other questions to be addressed include:<br />

• What can we learn about problem OSCE stations from the metrics available to us, and

how might this inform us with respect to development of improved assessments?<br />

• Should we have long or short checklists?<br />

• How can we measure higher level competencies?<br />

• What effects do these and other issues have on reliability?<br />



Symposium: High-Stakes Performance-Based Assessment / Paper 4:<br />

Beyond checklist scoring –<br />

clinicians’ perceptions of inadequate clinical performance<br />

Katharine Boursicot, University of London, United Kingdom<br />

Trudie Roberts, University of Leeds, United Kingdom<br />

Jenny Higham, Imperial College London, United Kingdom<br />

Jane Dacre, University College London, United Kingdom<br />

Rationale: There has been concern among medical educators and practitioners that OSCEs<br />

test only practical technical skills and do not scrutinise the deeper layer of understanding<br />

supporting those skills. The notion of having a checklist of items to measure the performance of<br />

clinical skills has been criticised for being reductionist and failing to capture the higher-order<br />

nature of clinical judgement and diagnosis. To try to understand what elements assessors felt<br />

were not captured by existing checklists, we instituted the use of ‘Cause for Concern’ forms

whereby examiners could report aspects of a candidate’s performance that they found to be unacceptable.

Methodology: Four medical schools introduced the use of the Cause for Concern forms as an<br />

addition to the checklists in each of their respective graduation OSCEs. The OSCEs consisted

of 17 to 26 different stations across the four schools. The examiners were clinicians, some of<br />

whom were academics from medical school faculties. All were medical or other healthcare<br />

practitioners involved in the teaching and assessment of medical students; all were familiar with<br />

the standards expected at graduation and current professional medical practice. All examiners<br />

were offered training sessions on examining at OSCEs. Examiners were briefed to use the<br />

‘Cause for Concern’ forms about students whose performance in any area would mean that<br />

patient care could be compromised but where the issues causing concern were not captured by<br />

the checklists. After the OSCEs, the forms were collected and analysed. Three raters<br />

independently reviewed the comments from all the medical schools and derived recurring<br />

themes. These were refined into seven themes. The raters independently ascribed all the<br />

comments to each of the themes and dominant themes were identified.<br />

Results: The total number of forms completed from all four medical schools was 152 out of

a total of 25,800 student-examiner encounters; this represents a reporting rate of 0.6%. The<br />

seven themes identified by the three raters were: Clinical skills–poor technique; Clinical<br />

skills–failure to elicit or recognise correct signs; Poor diagnostic ability (interpretation of<br />

signs); Poor/inadequate knowledge; Professional behaviour–personal (e.g. anxiety, lack<br />

of/over-confidence, appearance); Professional behaviour–towards patient (e.g. rough,<br />

inappropriate attitude); Poor communication skills (written and oral).<br />

Discussion: There was much commonality in the themes reported across the four medical<br />

schools. The themes reflected the anecdotal evidence which prompted this study, in that<br />

clinician examiners were concerned that professional behaviours and higher level<br />

diagnostic skills were lacking in students who nonetheless managed to pass the OSCE<br />

based on their checklist score. The main areas of concern which were reported related to<br />

fundamental medical skills or professional behaviour. Overall the reporting rate was very<br />

low, indicating that the overwhelming majority of students satisfied the clinician examiners<br />

with their clinical competence and professional behaviour. When making decisions about<br />

graduation, the ‘Cause for Concern’ forms could be used in addition to checklists to gain a<br />

fuller perspective on those students where there are concerns about meeting minimum<br />

clinical competence requirements and unacceptable professional behaviours.<br />



Symposium: Peer Assessment:<br />

Towards (quasi-) experimental research on the design of peer assessment<br />

Organiser: Dominique Sluijsmans, Open University, The Netherlands<br />

Peer assessment is an arrangement where equal-status students judge a peer’s

performance with a rating scheme or qualitative report (Topping, 1998), and stimulates

students to share responsibility, reflect, discuss and collaborate with their peers (Boud,<br />

1990; Orsmond, Merry, & Callaghan, 2004). To date, most studies on peer assessment<br />

treat the quality of peer assessment – i.e., performance improvement and learning benefits<br />

– as a derivative of the accuracy of peer marks compared to teacher marks (Falchikov &<br />

Goldfinch, 2000). In addition, many researchers advocate that peer assessment has a<br />

positive effect on learning, but the empirical evidence is either based on student self-report<br />

ratings or anecdotal evidence from case studies and not on standardised performance<br />

improvement measures. Furthermore, peer assessment is rarely studied in (quasi-)<br />

experimental settings (comparing an experimental group to a control or baseline group),<br />

which considerably limits the claims and evidence regarding specific conditions that are<br />

believed to affect learning. Hence, the empirical support for learning effects, as well as for<br />

specific peer assessment conditions, is scarce.<br />

In this symposium, three contributions are presented that are interlinked in three ways: 1)<br />

they investigate peer assessment from a (quasi) experimental perspective; 2) they address<br />

the value of peer assessment for the individual learner, and 3) they aim at providing clear<br />

guidelines on the design of peer assessment.<br />

In the first contribution by van Zundert et al., it is studied how task complexity and the

structure of the peer assessment formats used by students to conduct the peer

assessment affect students’ domain-specific learning and their peer assessment skill. In

addition, the impact of cognitive load, students’ attitudes towards peer assessment, and<br />

transfer are considered. Finally, generalisability analyses will be performed to determine the

reliability of peer assessments in relation to different formats. In the second contribution by

Sluijsmans et al. peer assessment is investigated from the perspective of group work. The<br />

effects of four peer assessment design variations on individual marks and the reliability of<br />

peer assessment are investigated. The results show that the design of a peer assessment<br />

method strongly influences the transformation of a group mark into individual marks. The<br />

third contribution by Strijbos et al. studies the impact of peer feedback content and<br />

characteristics of the sender. Previous studies show that students express concerns about<br />

the fairness and usefulness of peer assessment, and Strijbos et al. hypothesise that this

finding may be related to sender characteristics and that feedback perception may influence the

effect of peer feedback on subsequent performance.

For peer assessment research to advance, identifying the gap between what we know<br />

about peer assessment and what we claim about peer assessment is crucial. In this respect<br />

we advocate more (quasi) experimental research – enabling the investigation of specific<br />

components and conditions as compared to holistic evaluations via case studies. Although<br />

the thick description of specific peer assessment in such case studies provides a wealth of

evidence for hypothesis generation, these theories or guidelines should subsequently be<br />

tested in controlled experimental settings to warrant generalisations.<br />



Symposium: Peer Assessment / Paper 1:<br />

The effects of peer assessment format and task complexity on learning and<br />

measurements<br />

Marjo van Zundert, Open University, The Netherlands<br />

Dominique Sluijsmans, Open University, The Netherlands<br />

Jeroen van Merriënboer, Open University, The Netherlands<br />

Former research by Van Zundert, Sluijsmans, and Van Merriënboer (in progress)<br />

emphasised that variety in peer assessment practices and holistic reports (i.e., without<br />

specifying all variables) in peer assessment research reveal the necessity to specify what

exactly contributes to learning and measurements (e.g., reliability). Moreover, it was shown<br />

that the share of (quasi-) experimental peer assessment studies is insufficient. The current<br />

study examined the effects of peer assessment formats and task complexity on learning<br />

(domain skill, peer assessment skill, and student attitudes) and measurements (agreement<br />

between peer and expert assessment, between multiple peer assessments, and between<br />

qualitative and quantitative assessments). It was assumed that a highly structured peer<br />

assessment format increases learning and measurements. Highly structured formats differed

from low-structured formats by the integration of first- and higher-order skills, a whole-task

approach, and low cognitive load. It was additionally assumed that a highly structured format

is especially beneficial for complex tasks. Complex tasks induce a higher cognitive load,

which was achieved by editing simple tasks according to three principles of element<br />

interactivity of Cognitive Load Theory. Participants were 110 secondary education students.<br />

They worked in an electronic learning environment through a series of questionnaires and<br />

tasks. The students were randomly assigned to one of the four conditions: low/high<br />

structured formats – simple/complex tasks. After an introduction students logged on to a<br />

computer and completed an attitude questionnaire. Then they studied four study tasks<br />

accompanied by a peer assessment format. In the tasks, which consisted of short<br />

descriptions of biology research, students were supposed to learn to recognise the six steps<br />

of scientific research (i.e., observation, problem statement, hypothesis, experimental stage,<br />

results, and conclusions). The study tasks were read carefully. After each study task<br />

students reported the cognitive load measure of Paas, Van Merriënboer and Adams (1994).<br />

Next, they solved two transfer tasks (attaching the steps of research to the matching<br />

research description), again followed by the cognitive load measure. Subsequently students<br />

received two peer assessment tasks (evaluating the solution of a fictitious peer), and<br />

reported the cognitive load measure. Finally the attitude questionnaire was completed again<br />

and students logged out. Data will be analysed by ANOVA and generalisability analyses. As<br />

opposed to much previous research, this study attempted to clarify peer assessment effects<br />

by applying quasi-experimental research, and by using specific instead of holistic reports.

More (quasi-) experimental research is required in the future, to provide transparency in<br />

peer assessment variety and to account for peer assessment effects.<br />



Symposium: Peer Assessment / Paper 2:<br />

Modelling the impact of individual contributions on peer assessment during<br />

group work in teacher training: In search of flexibility<br />

Dominique Sluijsmans, Open University, The Netherlands<br />

Jan-Willem Strijbos, Leiden University, The Netherlands<br />

Gerard Van de Watering, Eindhoven University of Technology, The Netherlands<br />

During collaborative learning students work together to accomplish a specific group task,<br />

e.g. performing an experiment, writing a collaborative report, carrying out a group project or<br />

a group presentation. These group tasks aim to facilitate peer learning and the development<br />

of collaboration skills. However, since the assessment strongly influences learning in any<br />

course, utilising collaborative learning must have assessment that promotes collaboration<br />

(Frederiksen, 1984). Social loafing (tendency to reduce individual effort when working in<br />

groups compared to individual effort expended when working alone; see Williams & Karau,<br />

1991) and ‘free riding’ (an individual does not bear a proportional amount of the group work<br />

and yet s/he shares the benefits of the group; see Kerr & Bruun, 1983) are two often voiced<br />

complaints by students regarding unsatisfactory group-work experiences (Johnston & Miles,<br />

2004). Positive interdependence and individual accountability (the latter explicitly introduced<br />

to counter free-riding) play a crucial role during group work. In order for a group to be

successful, all group members need to understand that they are each individually<br />

accountable for at least one aspect of the group task. Teachers regard peer assessment

as a valuable and practical tool to reduce social loafing and free-riding effects. Moreover, it<br />

can serve as a tool to increase students’ awareness of individual accountability and to<br />

promote positive interdependence.<br />

Although a fair number of studies acknowledge the significance of individual contributions

in groups via peer assessment (Lejk & Wyvill, 1996), there are two serious weaknesses in<br />

the design of the methods that are used to transform group marks into individual marks<br />

using peer ratings. First, they take a psychometric perspective (calculate) rather than an<br />

edumetric (design) perspective – which fits better with contemporary developments such as<br />

competency-based education. Second, they are weak when it comes to flexibility of peer<br />

assessment in group work: the students are not involved in choosing criteria, weighting of<br />

criteria and their participation in peer assessment is obligatory. In this study, the effects of<br />

four peer assessment design variations on individual marks and the reliability of peer<br />

assessment are investigated. These variations are modelled using the baseline dataset with<br />

self- and peer assessment ratings of 72 teacher training students in their fourth year for the<br />

Bachelor of Education. The results show that 1) the design of a peer assessment method<br />

strongly influences the transformation of a group mark into individual marks, and 2) that the<br />

reliability of a peer assessment depends on the weight of the criteria, the rating scale, the<br />

inclusion of self-assessment, and maximum deviation of an individual mark from the group<br />

mark. A more in-depth discussion of the goal of peer assessment and its implications for the<br />

design of peer assessment with respect to the flexible and adaptive use of peer assessment<br />

in group work is required.<br />
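
To illustrate the kind of transformation being modelled, one common family of methods multiplies the group mark by an individual weighting factor derived from the received peer ratings and then caps the deviation from the group mark; all numbers and design choices below are invented for illustration and do not correspond to any of the four variations studied.

    import numpy as np

    ratings = np.array([[4, 5, 3],     # ratings received by member 1 from the three raters
                        [5, 5, 4],     # member 2
                        [2, 3, 3]])    # member 3
    group_mark = 7.5
    max_deviation = 1.0                # cap on how far an individual mark may deviate from the group mark

    factors = ratings.mean(axis=1) / ratings.mean()          # individual weighting factors
    individual_marks = np.clip(group_mark * factors,
                               group_mark - max_deviation,
                               group_mark + max_deviation)
    print(individual_marks.round(2))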



Symposium: Peer Assessment / Paper 3:<br />

Peer feedback in academic writing: How do feedback content, writing ability-level and<br />

gender of the sender affect feedback perception and performance?<br />

Jan-Willem Strijbos, Susanne Narciss, Mien Segers

Leiden University, The Netherlands<br />

The shift towards student-centered learning places a strong emphasis on students assuming

responsibility for their learning. Peer assessment is well-suited in this respect: equal-status

students judge a peer’s performance with a rating scheme or a qualitative report (Topping,

1998). Many peer assessment researchers stress that feedback is essential for<br />

performance improvement and learning benefits, but the evidence for performance and<br />

learning effects is scarce. Moreover, the impact of feedback content types is hardly studied.<br />

Students also express concerns about the fairness and usefulness of peer assessment<br />

(Cheng & Warren, 1997), which appears related to sender characteristics that may<br />

influence the effect of peer feedback (Leung, Su, & Morris, 2001).<br />

We conducted two studies to investigate the impact of feedback content and sender<br />

characteristics (writing ability-level and gender) using a factorial pre-test treatment post-test<br />

control group design in the context of academic writing in higher education. Study 1<br />

consisted of a two-way factorial design (Nexp = 71, Ncontrol = 18) and Study 2 had a three-way

factorial design (Nexp = 160, Ncontrol = 19). In each study subjects in the experimental<br />

condition received a scenario in which a fictional student received fictional peer feedback.<br />

Subjects’ feedback perception (i.e., fairness, usefulness, acceptance, willingness to improve<br />

and affect) and performance (text revision quality) were investigated.<br />

Study 1: Subjects in experimental conditions received concise evaluative feedback (CEF) or<br />

elaborated informative feedback (EIF) and ability-level of the sender was high or low. A

principal component analysis revealed the latent factor ‘Perceived Adequacy of Feedback’<br />

(PAF, comprising fairness, usefulness and acceptance, 9 items, α = .89). MANOVA<br />

revealed that EIF is perceived as more adequate. A two-way interaction for affect (AF, 6<br />

items, α = .81) revealed that students with EIF by a high-ability peer express more negative<br />

affect compared to CEF by a high-ability peer, and the opposite was observed for feedback<br />

by a low-ability peer. A repeated measures MANOVA showed that performance increases<br />

in all conditions over time, but performance for EIF by a high-ability peer was significantly<br />

lower compared to the control condition.<br />
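
For readers less familiar with the internal-consistency coefficients reported above (e.g. α = .89 for PAF), the sketch below shows how such a coefficient is computed; the simulated item scores are for illustration only.

    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha for a respondents x items matrix of scale scores."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1).sum()
        total_variance = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_variances / total_variance)

    rng = np.random.default_rng(0)
    trait = rng.normal(size=(200, 1))                      # common factor shared by all items
    demo = trait + rng.normal(scale=0.7, size=(200, 9))    # nine correlated items
    print(round(cronbach_alpha(demo), 2))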

Study 2: In addition to the variations in Study 1, the experimental conditions also varied with<br />

respect to gender of the sender (typical male/ female name: Joost versus Astrid). As in<br />

Study 1, a principal component analysis revealed the latent factor PAF (α = .90). MANOVA<br />

revealed that EIF is perceived as more adequate. A three-way interaction effect revealed<br />

that subjects’ willingness to improve (WI, 3 items, α = .70) for the feedback by ability-level<br />

combinations appears to be different for gender of the sender. The performance data are<br />

currently being analysed and will be presented by the time of the conference.<br />

The results of both studies reveal that feedback perception is an important aspect to be<br />

considered with respect to peer assessment. It should be noted that both samples were

female dominated (7 to 1 ratio), which limits strong generalisations. However, a comparative<br />

study with a more even gender distribution is currently being conducted in secondary education.



Symposium: Assessment in kindergarten classes:<br />

Assessment in kindergarten classes:<br />

experiences from assessing competences in three domains<br />

Organiser: Marja van den Heuvel-Panhuizen, FIsme, Utrecht University, The Netherlands/<br />

IQB, Humboldt University Berlin, Germany<br />

Chair/Discussant: Kees de Glopper, University of Groningen, The Netherlands<br />

Assessment of young children is a challenging endeavour. Defining and measuring<br />

development and learning in young children is quite complex, for a variety of reasons (see<br />

e.g. Shepard, 1994). Test performance of 4- and 5-year-olds can be highly variable. With<br />

young children, mismatches between the content of tests and children’s existing knowledge<br />

and experiences lie in wait. Young children may also encounter difficulties in participation,<br />

due to unfamiliarity with rules and conventions for obtaining responses. In response to<br />

these problems, several guiding principles for the assessment of young children have been<br />

proposed: assessments should bring about benefits for children, they should reflect and<br />

model progress toward important learning goals, their methods must be appropriate to the<br />

development and experiences of young children, and they should be tailored to a specific<br />

purpose.<br />

This symposium discusses experiences with assessing mathematical, literary and social-emotional

competences in kindergartners. Children’s learning in these domains is<br />

investigated in three interlinked research projects that are part of the PICO research<br />

programme (PIcture books and COnceptual development). Each project aims to determine<br />

the instructive value of picture books and follows the same approach. First, potential<br />

contributions of picture books to children’s development are identified through analyses of<br />

picture books. Second, we develop ‘keys’ that help teachers to unlock the richness of<br />

picture books and to establish engaging and instructive interaction. We use design-based<br />

research to develop keys for 24 picture books. Third, we do a quasi-experimental study to<br />

assess the developmental yield of the picture books and their corresponding keys.<br />

To evaluate the effect of the intervention program each PICO-project developed procedures<br />

and tasks for assessment, trying to adhere to the abovementioned principles and, at the<br />

same time, doing justice to the nature of the competence domain that is assessed. This<br />

resulted in a set of tools for assessing young children in three quite diverse domains of<br />

children’s development in which different assessment formats are used ranging from<br />

individual to group assessment, from oral to written assessment, and from open questioning<br />

to multiple-choice questioning. What all the assessment tools have in common is that they<br />

are grounded in a picture-book context.<br />

In the symposium we would like to share with the audience our experiences with developing the

assessment tools and analyzing the collected data, and the knowledge we gained regarding<br />

the children’s development and the way to assess this. We hope to draw the audience into<br />

a discussion about the obstacles and opportunities in assessing young children’s<br />

development in different competence domains, and to touch on issues for further research.<br />



Symposium: Assessment in kindergarten classes / Paper 1:<br />

A picture book-based tool for assessing literary competence in 4- to 6-year-olds<br />

Coosje van der Pol, Tilburg University, The Netherlands<br />

Helma van Lierop-Debrauwer, Tilburg University, The Netherlands<br />

Introduction: The paper addresses the challenging topic of assessing literary competence of<br />

young children. The study reported here is part of the PICO-li project, in which we<br />

investigate whether and how picture books contribute to the literary development of<br />

kindergartners. Literary competence starts by looking with children at picture book stories<br />

as aesthetic compositions of text and pictures. The composition of a story is based on<br />

literary and social codes and conventions. The PICO-li project investigates three sub-domains<br />

of literary competence: understanding story characters, suspenseful story<br />

elements and ironic humour.<br />

Development of the assessment tool: The developed tool is partly based on the Narrative<br />

Comprehension (NC) task for assessing children’s comprehension of narrative picture<br />

books (Paris & Paris, 2001). The NC-task has been adapted in order to concentrate on<br />

literary codes and narrative conventions and their aesthetic evaluation by the reader.<br />

Description of the assessment tool: The PICO-li assessment tool uses the picture book<br />

Cottonwool Colin (2007) by Jeanne Willis and Tony Ross to elicit the children’s responses.<br />

This book covers all three sub-domains of literary competence. The assessment has three<br />

parts: first, the child is invited to go through the book and respond spontaneously. In the<br />

second round, the book is taken away and replaced by an electronic version. During the<br />

assessment a child views the scanned pages of the real book on a computer screen whilst<br />

listening to the text being read aloud through the speakers. This ensures that the story is<br />

read to all the children in exactly the same way. Afterwards the child is asked to retell the<br />

story. In the third round, the child answers ten questions related to the three sub-domains.<br />

The final question is a productive question about the story’s main character. His<br />

development from small and weak to tall and strong is the story’s main theme. At the end of<br />

the story Cottonwool Colin no longer seems an appropriate name for the protagonist. The<br />

child is asked to think up a more appropriate name for him.<br />

Data collection and analysis: Almost 100 children from 18 different classrooms have been<br />

tested individually using the PICO-li assessment tool. Their responses have been video-recorded<br />

and transcribed onto answer sheets. Data from the first round are analysed for<br />

spontaneous comments on pictures, story line and attempts at interpretation. The retellings<br />

from the second round are analysed for six story structure elements: setting; characters;<br />

goal/initiating event; problem/episodes; solution; resolution/ending. Scoring rubrics have<br />

been created with higher scores reflecting a more literary stance towards the story than<br />

lower scores.<br />

Discussion: Although we are still in the process of analyzing data, our first impression is that<br />

the tool reveals valuable information on the children’s development of literary competence.<br />

At the conference we will concentrate on what we found out about the children’s concepts of<br />

story characters. In our presentation we will discuss the implications of our analyses for the<br />

practice of assessment and the problems and benefits that may arise when assessing<br />

literary competence in young children.<br />



Symposium: Assessment in kindergarten classes / Paper 2:<br />

Assessing the social-emotional development of young children by means of<br />

storytelling and questions<br />

Aletta Kwant, University of Groningen, The Netherlands<br />

Jan Berenst, University of Groningen, The Netherlands<br />

Kees de Glopper, University of Groningen, The Netherlands<br />

Introduction: Social and emotional competences are important for adequate functioning.<br />

This is true for human beings at almost any age, including kindergartners. It is unclear where<br />

and how young children can learn these competences. The PICO-sem (PIcture books and<br />

COncept development in the social and emotional domain) project is aimed at investigating<br />

whether the use of picture books in kindergarten classes can contribute to the development<br />

of these competences. The children are read a series of picture books that address a<br />

number of events that are identified as highly instructive for understanding and using social<br />

and emotional behavior. To evaluate the effect of the picture book program, a tool is<br />

developed to assess the children's social and emotional and, to some extent, moral<br />

development.<br />

The PICO-sem assessment tool: The tool consists of a series of about 40 tasks which are<br />

all connected to short sketches in which social and emotional components play a role.<br />

The complete administration of the test took about half an hour, split into two parts. The<br />

tasks do not require writing and reading skills. To avoid too strong a dependence on<br />

verbal skills, we used not only production tasks but also recognition tasks. In the latter, it is<br />

assessed with the help of pictures whether the children can recognize aspects of social and<br />

emotional behavior. Because the children might be influenced by these pictures and their<br />

description, these recognition tasks come after the production tasks.<br />

Data collection and analysis: In January and May 2008 about 110 children from 20<br />

kindergarten classes were tested individually by two trained research assistants. The<br />

collected data are scored for the use of social-emotional expressions by the children. In the<br />

analysis we tried to explore whether there is a difference between the production and<br />

recognition tasks.<br />

Results and discussion: In our presentation we will focus on questions about the feelings of<br />

the main character of a short story. The results from the production tasks will be compared<br />

with those from the recognition tasks, in which the children have to react to a set of depicted<br />

emotions. The data are all from the assessment in January. In this test, which was<br />

administered before the program was carried out, we found that merely asking the children to<br />

talk about emotions did not reveal their deeper understanding. When the children had to react<br />

to pictures, their answers were more differentiated than in the production tasks.<br />

Assessing young children is rather challenging. An important issue for us is to know to what<br />

extent our assessment really captures the children's understanding and knowledge about<br />

emotions. In the discussion we will address the question whether our approach to assess<br />

kindergartners’ social and emotional development is a fruitful avenue to follow.<br />



Symposium: Assessment in kindergarten classes / Paper 3:<br />

Assessing mathematical abilities of kindergartners:<br />

possibilities of a group-administered multiple-choice test<br />

Sylvia van den Boogaard, Utrecht University, The Netherlands<br />

Marja van den Heuvel-Panhuizen, FIsme, Utrecht University, The Netherlands/<br />

IQB, Humboldt University Berlin, Germany<br />

Introduction: Knowledge of children’s mathematics development is of crucial importance for<br />

offering them support for further learning. In the case of kindergartners (i.e. 4- and 5-year-olds<br />

who have not yet entered formal education), observations and interviews are mostly<br />

used to collect this information. Group-administered, multiple-choice tests often are not<br />

considered to be an adequate assessment tool for young children (e.g. Fuson, 2004).<br />

Nevertheless, this assessment format does have potential to provide relevant information<br />

about children’s development, as is shown earlier (see Van den Heuvel-Panhuizen, 1996).<br />

The present study builds on these previous experiences and aims at increasing our<br />

knowledge about kindergartners' mathematical understanding and designing a multiple-choice<br />

test to reveal this understanding. The test is developed in the context of the PICO-ma<br />

(PIcture books and COncept development in mathematics) project that investigates<br />

whether and how picture books contribute to the mathematical understanding of<br />

kindergartners.<br />

The PICO-ma Test: The mathematical content included in the test covers three sub-domains<br />

of early mathematics: number (with special attention to “structuring numbers”),<br />

measurement (in particular the theme “growth”), and geometry (with the focus on “taking a<br />

point of view”). The guiding principle for developing the test is offering children a meaningful<br />

and familiar context in which they can show their understanding regarding these content<br />

domains. Therefore, all items are presented in a picture-book-like style; most of the<br />

questions are inspired by picture-book stories.<br />

For most items, four alternative solutions are given. The children have to put a line under<br />

the correct solution. We made sure that there is no need for the children to read text or<br />

numbers. The accompanying questions are read aloud to the children.<br />

A draft version of the test was tried out on a number of individual children. After this, the<br />

final selection of items was made and several items were revised. The final test contains<br />

42 items; 14 items for each sub-domain.<br />

Data collection and analysis: In January 2008, about 400 children from 18 kindergarten<br />

classes took the test. The test was administered in two sessions with an interval of one<br />

week. Half the children of one class were assessed at a time. A trained research assistant<br />

led the children, who marked the right answer in their own test booklets, through the test.<br />

The collected data are scored as correct or incorrect and analyzed in connection with<br />

scores on a standardized mathematics test including classification, seriation, and<br />

comparison, and with information about age, sex, and socio-economic status.<br />

Presentation of results and discussion: In our presentation, we share our experiences with<br />

designing, administering, and analyzing the test, and give details about the psychometric<br />

quality of the test. We present our findings regarding the kindergartners’ mathematical<br />

understanding in the three sub-domains and how the sub-domains are related. In the<br />

discussion we would like to reconsider the potential of group-administered paper-and-pencil tests<br />

for assessing young children’s mathematical development.<br />





Papers<br />





Ethical Dilemmas:<br />

‘Insider’ action research into Higher Education assessment practice<br />

Linda Allin, Northumbria University, United Kingdom<br />

Lesley Fishwick, Northumbria University, United Kingdom<br />

Newman (2000) documents that investigative enterprises with a focus on practice inquiry<br />

have emerged over the last decade, and have been identified as teacher research<br />

(Cochran-Smith and Lytle, 1993), action research (Winter, 1987) and reflective practice<br />

(Schon, 1987). The key aim of such studies is to try to solve the immediate and pressing<br />

day-to-day problems of practitioners. Within university departments, how to provide<br />

authentic, innovative and student centred assessment for learning within the context of<br />

increasing student numbers is one of the most pressing problems for programme managers<br />

and lecturers. Recent national student surveys highlight assessment and student feedback<br />

to be two key areas of dissatisfaction for students. We suggest that insider action research<br />

to understand staff experiences of setting assessments is a valuable starting point in<br />

evaluating and then improving assessment practices. The main aim of this paper is to<br />

examine tensions of teaching in Higher Education in relation to identifying the constraints<br />

and pressures which impact on lecturers' daily work. The focus is on understanding the<br />

various influences on our decision-making as education professionals, and on uncovering the<br />

assumptions which drive our assessment practices. The methodology is in-depth interviews<br />

asking staff for their views on the purpose of assessment, the barriers to setting innovative<br />

assessments and their concerns over assessment processes. The study has led to a series<br />

of unanticipated ethical issues relating to the conduct of qualitative research with colleagues.<br />

Oliver and Fishwick (2003) argue that there is room for discussion and debate about ethical<br />

considerations in qualitative work. Such debates focus on key principles including not doing<br />

harm (nonmaleficence), justice, autonomy and research related benefits for participants.<br />

Several papers discuss practical ethical problems, but they are often more oriented<br />

towards issues with involving students, rather than staff, as participants (Ferguson, Yonge<br />

and Myrick, 2004; Hammack, 1997). In this paper, we aim to stimulate discussion into some<br />

of the ethical dilemmas faced by ‘insider’ research into assessment. We highlight the<br />

experiences of a CETL assessment for learning team in developing and implementing<br />

research into staff views and experiences of assessment within one particular university<br />

department. Such ethical dilemmas include insider/outsider perspectives, role conflicts,<br />

accessing staff and the process of interviewing as well as the more usually identified ethical<br />

concerns relating to informed consent, anonymity, confidentiality and the right to withdraw.<br />

A key concern for participants became the issue of trust and safeguarding privacy as well<br />

as assuring anonymity. In the paper we reflect on discussions within the team following the<br />

pilot study, and identify actions taken to address some of the dilemmas encountered. We<br />

identify the need to take a critically reflective stance on research into assessment practices<br />

and highlight the way in which minimising power relations and creating an atmosphere of<br />

trust are central if such research is to reach its purpose of enhancing assessment practice.<br />



Reciprocal Peer Coaching as a Formative assessment strategy:<br />

Does it assist students to self-regulate their learning?<br />

Mandy Asghar, Leeds Metropolitan University, United Kingdom<br />

Research has shown that cognitive gains are significantly higher in pairs that work together<br />

when compared to students studying independently (Ladyshewsky 2000, Topping 2005).<br />

Higher achievement, more caring and supportive relationships, greater psychological<br />

health, social competence and self esteem are all valuable consequences of introducing<br />

peer assisted learning strategies into the curriculum. In reciprocal peer coaching, students'<br />

goals are interrelated and the most successful outcome is dependent on mutual coaching.<br />

Reciprocal peer coaching is used as an innovative formative assessment strategy to test<br />

the competency of physiotherapy students’ abilities to carry out the practical skills required<br />

to become a successful therapist. Traditionally these skills were assessed exclusively in a<br />

summative format at repeated points throughout level 1 which resulted in many students<br />

trailing failure throughout the year. Module evaluation provided anecdotal evidence of the<br />

benefits of this change in the assessment strategy. A subsequent qualitative research<br />

project has explored “students’ perceptions of reciprocal peer coaching as a strategy to<br />

formatively assess practical skills”. Individual interviews and focus groups were used to<br />

collect data which has been analysed from a phenomenological perspective that considers<br />

the lifeworld as a lens through which to view the students' lived experience (Ashworth 2003).<br />

Initially, four themes have emerged from the data: Motivation and Learning, the<br />

Emotional Experience of Learning, Learning Together and Contextualising the Learning<br />

Experience. Although students valued the feedback about their knowledge and abilities from<br />

the formative assessment process, they frequently expressed a willingness to engage with<br />

reciprocal peer coaching as it provided that “pressure” which made them study. When<br />

considering theoretical models of the self-regulation of learning, many of the participants<br />

described a view that fitted with these models. Key aspects include self-efficacy,<br />

motivation and emotion, the nature of an individual's goals and the ability to engage in<br />

metacognitive processes. Issues identified included time management and students'<br />

tendency to procrastinate, as they find it hard to set themselves short-term goals; the students<br />

felt that this formative assessment strategy helped them to manage. The variance<br />

between student goals, with some seeking mastery goals and others performance goals, and the<br />

emotions of anger, anxiety, frustration and relief associated with this<br />

assessment were all reported by these level 1 students.<br />

Self-regulation in new environments and subject areas can be difficult for students who, as<br />

novices, often fail to employ metacognitive strategies to set goals for themselves and<br />

self-assess their progress, many tending to compare themselves with others in order to judge<br />

the need to learn (Zimmerman 2002). It is suggested that self-regulation is not easy and<br />

requires a scaffolding of strategies to encourage its development (Pintrich 1999).<br />

Although recognised for its valuable role in the provision of feedback (and indeed the<br />

influence this may have on students' self-efficacy), formative assessment also has a role in<br />

assisting students to self-regulate their learning, learning how to learn. It is this<br />

theory-related dimension of formative assessment that I would like to discuss.<br />



Using an adapted rank-ordering method to investigate<br />

January versus June awarding standards<br />

Beth Black, Cambridge Assessment, United Kingdom<br />

Aims: The dual aims of this research were (i) to pilot an adapted rank-ordering method and<br />

(ii) to investigate whether UK examination awarding standards diverge between January<br />

and June sessions.<br />

Background: Standard maintaining is of critical importance in UK qualifications, given the<br />

current ‘high stakes’ environment. At qualification level, standard maintaining procedures<br />

are designed to ensure that a grade A in any particular subject in one year is comparable to<br />

a grade A in another year, through establishing the equivalent cut-scores on later versions<br />

of an examination which carry over the performance standards from the earlier version.<br />

However, the current UK method for standard setting and maintaining - the awarding<br />

meeting as mandated by the QCA Code of Practice (2007) - introduces the potential for the<br />

awarding standards of the January and June sessions to become disconnected from each<br />

other. Additionally, from a regulatory and comparability research point of view, the January<br />

sessions have been largely ignored, despite the increasing popularity of entering candidates<br />

for January units since Curriculum 2000.<br />

Given the difficulties in quantifying relevant features of the respective cohorts (e.g. the<br />

January candidature is more unstable), and the problems in meeting the assumptions<br />

necessary for statistical methods (e.g. Schagen and Hutchison, 2008), arguably the best<br />

way to approach this research question is to use a judgemental method focusing on<br />

performance standards. In this study, the chosen method involves expert judges making<br />

comparisons of actual exemplars of student work (‘scripts’).<br />

Method: A rank-order method adapted from Bramley (2005) was employed. Archive scripts<br />

at the key grade boundaries (A and E) from the previous six sessions (comprising three<br />

January and three June sessions) from two AS level units in different subjects were<br />

obtained. Whilst previous rank order exercises (e.g. Black and Bramley, in press) required<br />

judges to rank order ten scripts per pack spanning a range of performance standard, in this<br />

study each exercise involved scripts which were, at least notionally, of more similar quality<br />

(e.g. all exactly E grade borderline scripts) and therefore an adaptation was required. Rank-ordering<br />

sets of three scripts retains many of the advantages of rank-ordering over traditional<br />

Thurstone paired-comparisons, as well as an additional advantage: it asks of judges a more<br />

natural psychological task, namely to identify, on the basis of a holistic judgement, the best,<br />

middle and worst script.<br />

Analysis: Rasch analysis of the rank order outcomes produced a measure of script quality<br />

for each script and, using an ANOVA, it was possible to examine effects of session and<br />

session type (i.e. January versus June). The research indicated that the two AS units<br />

displayed different patterns of performance standards for January versus June.<br />
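To illustrate the kind of model involved (a minimal sketch with assumed notation, not taken verbatim from Bramley, 2005), rank orderings of script triples are typically decomposed into paired comparisons and fitted with a Rasch-type (Bradley-Terry) formulation,<br />

$$P(\text{script } i \text{ judged better than script } j) = \frac{\exp(\beta_i - \beta_j)}{1 + \exp(\beta_i - \beta_j)},$$<br />

where $\beta_i$ and $\beta_j$ are the latent quality measures estimated for the two scripts; session effects can then be examined by comparing the mean measures of January and June boundary scripts.<br />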

Discussion: The discussion will be research-related and practice-related. The paper will<br />

address the potential for using this method to investigate comparability in a variety of<br />

contexts, and implications for standard-maintaining processes.<br />



Generating dialogue in coursework feedback:<br />

exploring the use of interactive coversheets<br />

Sue Bloxham, University of Cumbria, United Kingdom<br />

Liz Campbell, University of Cumbria, United Kingdom<br />

This two year study examines feedback designed to create a dialogue between tutor and<br />

student without additional work for staff. Research is developing conceptual understanding<br />

regarding how feedback can effectively contribute to student learning (Higgins 2000, Gibbs<br />

& Simpson 2004, Nicol & Macfarlane-Dick 2004, Brown & Glover 2006). Emphasis is<br />

being placed on the notion of feedforward (Hounsell, 2006), designed to reduce the gap<br />

between the standards students are expected to achieve and their current level of<br />

performance (Sadler 1998). Studies have examined the extent to which different types of<br />

tutor feedback better enable students to ‘close the gap’ (Brown & Glover 2006). Evidence<br />

suggests that feedback tends to focus on assignment ‘content’ whereas students find<br />

comments on their ‘skills’ to be more useful for future writing (Walker, 2007). Furthermore,<br />

feedback which elaborates on corrections is rare (Millar, 2007), but is considered more<br />

likely to help students make the link between the feedback and their own work (Brown &<br />

Glover 2006).<br />

Failure to understand feedback is also associated with the tacit discourses of academic<br />

disciplines (Higgins 2000). However, learning tacit knowledge is an active, shared process,<br />

and thus writers such as Ivanic et al (2000) and Northedge (2003a) stress the importance of<br />

feedback which seeks to engage the student in some form of dialogue. This theoretical<br />

approach suggests that tutor-student dialogue could significantly aid feedback for learning,<br />

enabling students to understand feedback so that they can act on it to ‘reduce the gap’.<br />

This study emerged from concerns that staff on an Outdoor Studies Programme were<br />

devoting inordinate amounts of time to written feedback whilst students were reporting that<br />

they did not receive enough, nor was there evidence that feedback was being used to<br />

improve future assignments. Consequently, staff attempted to set up a dialogue with<br />

students by providing written feedback in response to students’ questions about their work,<br />

requested on their assignment coversheets. In the second year of the experiment, training<br />

was given in asking effective questions.<br />

Data were collected in the form of students' feedback questions, interviews with staff,<br />

administration of the Assessment Experience Questionnaire (Dunbar-Goddet & Gibbs, 2006)<br />

and a supplementary questionnaire asking students for their preferences for guidance and<br />

feedback. Coding of students’ questions indicated that they differed markedly in the quality<br />

of their questions and rarely posed queries about the ‘content’ of their assignment, being<br />

much more concerned with their ‘skills’. The quantitative data indicated high mean scores<br />

for ‘quantity and quality of feedback’ and ‘use of feedback’ although students gave mixed<br />

preferences for different types of feedback. Staff reported achieving a sense of dialogue,<br />

finding it easier to write feedback in response to specific questions. Evidence of the impact<br />

of ‘question’ training will also be presented.<br />

Discussion will consider whether the research findings support conceptual models regarding<br />

the place of dialogue in creating learning-oriented assessment. It will also consider the<br />

practical implications of attempting to create a dialogue with students, given resource<br />

constraints, tutor and student expectations and quality assurance.<br />



Reforming practice or modifying Reforms?<br />

The science teacher’s responses to MBE and to assessment teaching in Chile<br />

Saul Alejandro Contreras Palma, Chile<br />

In 1996, the biggest Education Reform in the history of Chile was about to be implemented.<br />

Its aim was to strengthen the teaching profession. Therefore a framework for good teaching<br />

(MBE) was developed, which was focused on the need to know what and how to teach, and<br />

on an assessment system establishing the standards for the teachers’ work. However, little<br />

is known about the impact that the reform and its tools have had on how teachers think<br />

about teaching and learning, particularly science subjects. As Smith and Southerland (2007)<br />

said a "missing link" has been the investigation and understanding of the interaction<br />

between the teachers’ internal structures and the externally imposed ones.<br />

In this context, it makes sense to investigate what teachers think and do, and the relation<br />

between this reality and the one proposed by education reforms. This study examines the<br />

interactions between teachers’ beliefs and their actions concerning what and how to assess,<br />

and what is the degree of coherence between the teachers’ models and the ones proposed<br />

by the reform. This exploratory study analyzed and compared six Chilean high school<br />

science teachers who all participated in the reform and its training programs (PPF). The<br />

data were obtained through the MBE document, a questionnaire, an interview and a non-participant<br />

observation. The information presented here was subjected to a content analysis<br />

centered on the assessment category.<br />

Our results indicated three important issues: First, behind the framework for good teaching<br />

(MBE) exists a constructivist model that indicates what and how teachers should assess or<br />

evaluate their students. Second, unlike the teachers' thinking, their practice, independently<br />

of their subject, is traditional and inconsistent with the proposals of the reform. Third, the<br />

teachers' thinking is organized in several levels, which differ from each other. There is a<br />

difference between what teachers think they do, what they think should be done and what<br />

they say they do. This difference is consistent with the fact that the teachers did not<br />

implement the reform in their classes although they agreed with and participated in the<br />

reform. Therefore, the teachers' thinking influences the interpretation of the sometimes<br />

contradictory messages of the proposals of the reform.<br />

In consequence, what we are discussing is not the reform and its tools, because we<br />

recognize that they have led to an advance by introducing an assessment culture and the<br />

concept of measurement. However, we believe that it would have been much more efficient to<br />

put the phases of the change process in another order: first, to explore what teachers<br />

know and what they are able to do, and then to set standards that determine the<br />

professional knowledge to be obtained. In other words, it is necessary to determine which<br />

aspects of the teachers' models support or impede the implementation of<br />

reforms, their instruments and the professional development of teachers, and to work on<br />

these aspects to achieve a real impact.<br />



Expanding student involvement in Assessment for Learning:<br />

A multimodal approach<br />

Bronwen Cowie, Alister Jones, Judy Moreland, Kathrin Otrel-Cass<br />

University of Waikato, New Zealand<br />

Assessment for learning (AfL) encompasses actions to assess student learning and steps to<br />

move that learning forward (Black and Wiliam, 1998). These actions can be undertaken by<br />

teachers or students but the ultimate goal is that students are actively engaged in<br />

monitoring their learning. AfL was initially theorized within a cognitive frame. In the last<br />

decade research has begun to consider the implications of sociocultural views of learning<br />

(Gipps, 2002). These expand the possibilities for student participation in AfL. In this paper<br />

we use data generated in primary science and technology classrooms to illuminate the<br />

affordances of multimodal assessment practices.<br />

This paper reports on one outcome of the classroom Interaction in Science and Technology<br />

Education (InSiTE) study. The study involved 12 primary teachers and over 800 students<br />

over three years. A key goal was to understand AfL interactions around science and<br />

technology ideas and practices and the factors that afforded these interactions. Student and<br />

teacher reflective interviews, and teacher and researcher joint planning, reflection and data<br />

analysis meetings complemented the classroom work. The classroom data generation<br />

methods were videos of teacher interactions; audio taping of teacher and student talk; field<br />

notes; and the collection of teacher documents and student work. Post lesson discussions<br />

and meeting days provided a forum for data analysis.<br />

The InSiTE teachers and students employed multiple and multimodal means (Kress et al.,<br />

1999) to make and communicate meaning. Their interactions encompassed talk, text, action<br />

and the visual mode. The teachers explicitly developed students’ oral language proficiency<br />

but almost invariably talk was augmented by action, writing and the visual mode. Written<br />

text anchored and augmented talk. Drawing was useful for illustrating and complementing<br />

talk when students could not express tentative ideas by talk alone. Actions, including<br />

gesture, were useful for demonstrating and illustrating skills and practices. A combination of<br />

modes multiplied meaning (Lemke, 2001). In the full paper we will present examples from<br />

Year 1-3 students learning about fossils, Year 1 students designing and making kites, and<br />

Year 7-8 students designing and making musical instruments.<br />

Teacher and student talk plays a pivotal role in AfL interaction but talk is invariably<br />

anchored and augmented by other modes. When multiple modes are used in combination<br />

teacher-student AfL interactions are enriched. The likelihood that diverse groups of students<br />

are able to express what they know and can do is increased when classroom interaction is<br />

deliberately multimodal. Students do benefit from multimodal opportunities to gain feedback<br />

and to consider the ideas of others as part of their active engagement in AfL.<br />

References<br />

Black, P. & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.<br />

Gipps, C. (2002). Sociocultural Perspectives on Assessment. In G. Wells & G. Claxton (Eds.) Learning for<br />

Life in the 21st Century (pp. 73-83). London: Blackwell Publishing Ltd.<br />

Kress, G., Jewitt, C., Ogborn, J. & Tsatsarelius, C. (2001). Multimodal teaching and learning: The rhetorics<br />

of the science classroom. London: Continuum.<br />

Lemke, J. (1990). Talking science: language, learning, and values. Norwood, N.J.: Ablex Pub.<br />



Assessment Center Method to Evaluate Practice-Related<br />

University Courses<br />

Julian Ebert, University of Zurich, Switzerland<br />

Introduction: To provide high-quality education at universities, courses are evaluated. Mostly,<br />

written tests and subjective ratings are used to check the learning effectiveness of the course<br />

and students' satisfaction. This disadvantages courses with didactic concepts that are based<br />

upon activity and interaction (e.g. problem-based learning; Schmidt & Moust, 2000) because<br />

the teaching and testing modes differ (Sternberg, 1994). Declarative and factual<br />

knowledge can be tested by written tests but procedural knowledge and skills require different<br />

assessment methods. Due to changes in curricula as a result of the “Bologna reformation<br />

process” Universities are increasingly challenged to foster their students’ meta-disciplinary<br />

competencies. Therefore, courses that intend not only to teach factual knowledge but rather<br />

train skills have to be given and evaluated concerning their efficiency. To evaluate the<br />

efficiency of such a practice-related university course concerning social and method skills in<br />

project management given at the University of Zurich an assessment center (AC) was<br />

developed, applied and analysed. The aim of the current paper is to present and discuss this<br />

innovative evaluation method for practice-related courses at universities.<br />

Methodology: 80 students and professionals from different subjects participate twice in 1-day<br />

ACs before and after course attendance. The assessment methods applied range from<br />

dyadic role-plays through planning and presentation tasks (for the evaluation of skill<br />

improvement) to written tests on procedural knowledge (to check the concordance of both<br />

applied measures). Additionally, questionnaires on motivation and work-related self-assessments<br />

are being used. The assessors are specifically trained psychologists who<br />

achieved satisfactory inter-rater reliability (ICC = .75 on average). The assessees are, among<br />

others, tested on their abilities to delegate, lead discussions and argue, organize and<br />

present project plans, solve conflicts, mediate, and give feedback.<br />

Results: The results show significant increases in both procedural knowledge and skills. The<br />

baseline-corrected effect sizes for the subtasks of the procedural knowledge tests vary from<br />

d=.45 to d=2.07 (d=1.10 on average), those for the different assessment dimensions of the<br />

skill tests vary from d=.14 to d=1.32 (d=.57 on average). Nevertheless, increased<br />

knowledge did not automatically result in increased skills, which supports the transfer<br />

problem hypothesis. It also re-raises the question of whether written tests (even on procedural<br />

knowledge) sufficiently inform about the actual ability to perform the respective skills. Also,<br />

almost no gender effects have been found, i.e. male and female participants benefit equally<br />

from the course.<br />
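As a point of reference for the reported values, a baseline-corrected effect size for a single-group pre-post design is commonly computed as<br />

$$d = \frac{M_{\text{post}} - M_{\text{pre}}}{SD_{\text{pre}}},$$<br />

a minimal sketch assuming the pre-course assessment serves as the baseline; variants that standardise on a pooled standard deviation or additionally correct for a control group's change are also in use.<br />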

Discussion: The findings indicate that different didactical concepts and teaching methods<br />

require different forms of effectiveness testing, and assessment centers seem appropriate to<br />

demonstrate the efficiency of skill-focused versus factual knowledge-focused courses.<br />

Students highly appreciated the detailed individual feedback that they received afterwards.<br />

We consider the opportunity to provide students with feedback beyond scores and grades<br />

the most important advantage compared to usual assessments at universities.<br />

Nevertheless, assessment centers are complex, time-consuming and expensive<br />

assessment methods and therefore not (yet) established in the field of course evaluation.<br />

We are interested in sharing our experiences with others who also work on innovative<br />

evaluation methods for innovative courses.<br />



Democracy, Assessment and Validity. Discourses and practices concerning<br />

evaluation and assessment in an era of accountability<br />

Astrid Birgitte Eggen, University of Oslo, Norway<br />

The paper is an empirically based discussion of the relationship between multiple<br />

understandings of democracy and the multiple purposes and practices of assessment.<br />

Assessment is seen as an aspect of the overall evaluation processes at school and municipal<br />

levels. The conceptualization is inspired by three broad democratic evaluation orientations:<br />

elitist democratic evaluation, participatory democratic evaluation and discursive democratic<br />

evaluation as well as four dimensions of democracy (agency, voice, audience and<br />

influence). Underpinning the discussions are the various validity concerns of the democratic<br />

orientations, emphasizing in particular consequential, communicative, reflective and<br />

catalytic validity in addition to the traditional validities.<br />

This paper presents the main results of three ethnographic research projects among<br />

teachers and school leaders in secondary education concerning assessment and evaluation<br />

practices and discourses. These projects have been developed in cooperation with three<br />

municipal educational authorities and 20 school communities. The schools in the<br />

surroundings of Oslo have been participating in R&D projects during a phase of<br />

implementing “Kunnskapsløftet” and the “National evaluation program” (National curriculum<br />

(2006) combined with National strategy for assessment and evaluation) with possibilities for<br />

both summative and formative strategies. A consequence of national steering has been a<br />

cry for building assessment and evaluation literacy at both municipal and school level.<br />

A critical ethnographic research methodology based on emancipative, developmental and<br />

progressive ideology signals research as a democratic enterprise. Data gathering has<br />

been twinned with in-service training of teachers and school leaders. Hence methods of<br />

instruction are closely connected to methods of inquiry. Issues grounded in both practices<br />

and theory have been accountability, democracy and ideological aspects like equity, equality,<br />

justice, values and ethics. The paper focuses on the democratic challenges of evaluation<br />

and assessment in an era of market driven accountability, however multiple accountabilities<br />

as well as multiple contents of democracy are identified in the participating communities.<br />

A situated learning perspective has been applied in order to view evaluation and<br />

assessment as joint enterprises depending on shared vocabulary and repertoire of<br />

assessment and evaluative tools in each community of practice. Consequently, the<br />

relevance of the traditional dichotomies of evaluation and assessment (summative and<br />

formative, internal and external etc) is questioned based on the findings, and boundary<br />

objects are introduced as an alternative analytical tool. The school communities find<br />

themselves within an overall ideological and epistemological controversy between a drive<br />

for goal oriented new public management steering combined with “evidence based”<br />

practices on one hand, and on the other hand the emancipative bottom up developmental<br />

strategies. Hence the projects point towards several tensions between the central and<br />

local governmental vocabulary and strategies for outcome measures and the discourses<br />

and practices in these schools. These projects have been providing documentation for the<br />

development of the methodology and content of a program for teacher educators<br />

emphasizing assessment and evaluation literacy.<br />



Using Assessment for Learning:<br />

exploring student learning experiences in a design studio module<br />

Kerry Harman, Northumbria University, United Kingdom<br />

Erik Bohemia, Northumbria University, United Kingdom<br />

This paper explores the relationships between assessment for learning elements and<br />

student learning experiences in a design studio module. Our focus is on the Global Studio<br />

(Bohemia & Harman, 2008 forthcoming), a design module recently conducted at<br />

Northumbria University. Using a case study methodology with the aim of compiling rich,<br />

practice-based knowledges (Denzin & Lincoln, 2005; Gherardi, 2006), we draw on data<br />

gathered throughout the development and delivery of a particular design studio module in<br />

order to undertake our analysis.<br />

In the first part of the paper we briefly describe the Global Studio with a focus on the overall<br />

aims and the structure of the module. One aim of the Global Studio was the development of<br />

distance communication skills, thereby preparing students for work in geographically<br />

distributed workgroups. Thus, an important aspect of the course was the incorporation of<br />

the element of distance between geographically distributed student design teams. We also<br />

outline the assessment for learning elements that we use in our analysis. These include an<br />

emphasis on authentic assessment tasks, the extensive use of ‘low stakes’ confidence<br />

building opportunities, the provision of a learning environment that is rich in both formal and<br />

informal feedback and the development of students’ abilities to evaluate their own progress<br />

(McDowell et al., 2006; Sambell, Gibson, & Montgomery, 2007).<br />

Using the above assessment for learning framework we map various assessment for<br />

learning elements used in the Global Studio. We suggest in the first section of the paper<br />

that a number of assessment for learning elements were implicitly embedded in the<br />

structure and delivery of this particular module.<br />

In the second part of the paper we explore the relationships between assessment for<br />

learning elements (implicitly) used in the module and student learning experiences. Drawing<br />

on student evaluation data, both qualitative and quantitative, collected throughout the<br />

module we examine the learning experiences of students undertaking the module. Our<br />

focus here is on what students considered useful in the module in terms of learning and the<br />

links with assessment for learning elements. This analysis contributes to the collection of<br />

‘rich’ case study material on assessment for learning in Higher Education with a focus on<br />

the subject area of design.<br />

We conclude that the assessment for learning elements used in the analysis provided a<br />

useful frame for examining student learning experiences in this particular design studio<br />

module. Therefore, we suggest that assessment for learning may provide a useful language<br />

for developing ongoing discussion and research in relation to teaching and learning in the<br />

subject area of design. For example, the following research questions might be explored:<br />

does the design studio, in general, incorporate assessment for learning elements? And if<br />

so, how are these contributing to enhanced student learning experiences?<br />



Chasing Validity – The Reality of Teacher Summative Assessments<br />

Christine Harrison, Paul Black, Jeremy Hodgen, Bethan Marshall, Natasha Serret<br />

King's College London, United Kingdom<br />

The King’s-Oxfordshire-Summative-Assessment-Project (KOSAP) was an 18 month project<br />

aimed to investigate how to help teachers enhance the validity and reliability of their<br />

assessments so that these can play a significant and trustworthy part in all summative<br />

assessments of their students. This was a collaborative development between teachers in<br />

three Oxfordshire schools, their schools’ managements, assessment and subject advisers<br />

in the Local Education Authorities, and experts in school assessment from King’s College<br />

London. It involved investigation of the possibilities and practicability of assessment of year<br />

8 (Y8) pupils within the domains of English and of mathematics. Key research foci were the<br />

constraints and affordances that arise as teachers take a more active part in designing,<br />

using and evaluating summative assessment tools and activities and the subsequent effects<br />

in the classroom as the formative-summative interface is brought closer together.<br />

The intention of this mixed method research (through interviews, field notes, teacher writing<br />

and transcripts from teacher meetings) was to work with teachers to discover their working<br />

assessment practices, how they judged and valued the assessment tools that they used<br />

and whether they could be supported in improving their assessment tools and practices.<br />

Our interest lay in teachers’ perceptions, skills and practices and we wanted to do more<br />

than simply evaluate whether teachers could implement assessments that were provided for<br />

them. Rather, we wanted to understand the ways in which they interlaced assessment with<br />

curriculum and pedagogy through allowing them to explore for themselves how they might<br />

develop and evolve better teacher assessment.<br />

The research revealed how the pervasiveness of tests constrained teachers’ own<br />

summative assessments. Teachers felt the pressures of the external tests system, in part<br />

through the priority given to the published test results by school managements and by<br />

parents and pupils, both to achieve in these tests, but also to report on learning in<br />

terms only of a single level or grade. We also found that the teachers’ grasp and application<br />

of the principles that guide quality in assessment, notably the concept of validity, seemed<br />

weak. Linked to this was the generally conservative attitude that the teachers have towards<br />

the task of making summative judgments, and this was recognised by the project teachers.<br />

There is a general acceptance of the tests and tasks that they already do, despite their<br />

concerns that these assessment tools may not be fair, valid nor reliable in measuring the<br />

capabilities of their students. We therefore conclude that the more ambitious aim, of<br />

establishing the quality of teachers’ own summative assessments so that they may claim to<br />

supplement or even replace formal tests externally set and marked (ARG 2006), will be<br />

difficult to achieve without considerable professional development. We suggest it would take<br />

several years of modest steps towards such an aim, before that aim could be approached<br />

or a new system could be designed and implemented to meet the multiple purposes of<br />

public assessment.<br />



There is a bigger story behind.<br />

An analysis of mark average variation across Programmes<br />

Anton Havnes, University of Bergen, Norway<br />

In a UK university some undergraduate programmes have been consistently above the<br />

University average, others consistently below. Preliminary analyses have controlled for level<br />

entry grades, gender, group size, the assessment weighting on modules between<br />

coursework and exams, and assessment forms (coursework vs. exam). None of these<br />

factors explain the variation in average marks. Students who take a combined degree with<br />

one Field in the high mean group (HM) and another in the low mean group (LM) on average<br />

get higher marks in their HM modules than in their LM modules. One possible explanation is<br />

that the variation is due to diverse assessment and marking cultures. This project took<br />

another potential explanation as the starting point: Are there variations between<br />

Programmes in the way coursework, formative assessment and feedback are organised that<br />

make it reasonable to expect that students in the HM Fields will probably reach the<br />

standards of their Field, while it is less likely that students in the LM Fields do? If so, it is<br />

also reasonable to expect that the HM Fields should have higher average marks than the<br />

LM Fields. Also, there should be something to learn from the HM Fields that sheds light on<br />

potentials for improvement of the educational programmes across the whole University.<br />

Because of the sensitivity of this issue I was engaged to do this study as an external and<br />

non-UK researcher. Four categories of data were obtained:<br />

• documents: Study Guides, Module Descriptions, Coursework tasks<br />

• semi-structured interviews with Field Chairs representing two HM and two LM Fields;<br />

the fifth Field Chair represented a Field that had risen from LM to the mean (taped,<br />

transcribed and analysed)<br />

• examples of written feedback on students’ coursework<br />

• mark transcripts for each Field and each module in each Field<br />

The recruitment of students was unfortunately not successful. The main restriction was the<br />

Data Protection Act, which prevents obtaining contact details for those students<br />

who have not already agreed to be contacted by a researcher.<br />

The analysis shows that the teachers in all Programmes comply with the University<br />

assessment regime and the guidelines for marking and feedback. It is hard to identify<br />

essential differences in the assessment cultures; instead, it seems that there is one<br />

assessment culture that dominates across the University. The analysis points to a series of<br />

contextual and conceptual factors that vary systematically between HM and LM Fields.<br />

• the consistency of the conceptual construct that students’ learning is about (the core<br />

around which students’ learning rotate throughout the whole degree programme),<br />

represented by the consistency of what teaching, assessment and feedback relate to<br />

across modules.<br />

• the relationship between learning activities and the students' potential future professional<br />

and/or academic practice<br />

• the way the complexity of the Field (at the Programme level) and its thematic<br />

components (at the module level) are laid out as a Field and as a trajectory of learning<br />

across modules<br />

• inter-modular planning, coordination and communication<br />

• the integration of feedback in lectures, seminars and coursework.<br />



Course design and the Law of Unintended Consequences:<br />

Reflections on an assessment regime in a UK “new” University<br />

Anton Havnes, University of Bergen, Norway<br />

Assessment is known to drive learning. The attempt to improve students’ learning has led to<br />

revisions in how students are assessed, instituting more diverse and more learning-oriented<br />

assessment strategies. In many universities in the UK and elsewhere, coursework has become<br />

the most common assessment form. Coursework assessment offers the opportunity to<br />

ensure diversity and frequent feedback. In a study of three UK universities, Gibbs (2007)<br />

found that the assessment environments differed “very widely” across institutions. The<br />

variation was particularly large in the volume of formative-only assessment. Butler (1987)<br />

has documented that formative-only feedback has a significant influence on learning<br />

in contrast to marks-only feedback. The importance of increasing and improving formative<br />

feedback to support student learning is stressed in policy documents and research (e.g.<br />

Hattie & Timperley, 2007; Nicol & Macfarlane-Dick, 2006).<br />

This paper is based on a study of coursework, marking, formative assessment and<br />

feedback practice in a UK University. 15 teachers (five Field Chairs and 10 lecturers) in five<br />

undergraduate programmes were interviewed; Institutional Guidelines, Study Guides,<br />

Module Descriptions and Coursework Tasks were collected and analysed. Interviews were<br />

transcribed and analysed to identify how assessment and feedback supported students’<br />

learning. Findings show that in spite of the fact that markers invested a vast amount of<br />

resources in writing feedback to the students, only a small number of students actually<br />

collected their feedback. The analysis explains why this neglect of feedback by the students<br />

turns out to be a regrettable but rational response to the assessment system. Firstly,<br />

assessment was predominantly associated with marking and mark justification: "Done that."<br />

Secondly, what was assessed in one module often did not link to what<br />

was assessed in another module. Likewise, coursework would often cover different thematic<br />

fields, be assessed in relation to different criteria, and the modes of assessment<br />

would vary. This triple variation made feedback of minor interest (except for students who<br />

had to re-sit and the very engaged students) and created inconsistency in students' learning<br />

trajectories: "The next assessment task is on something different, some other criteria, and you<br />

have to perform in a different way." The fundamental problem is not any individual teacher's<br />

assessment, the assessment of a specific achievement or any feedback given on an<br />

achievement. Instead, the problem is how the various teachers' assessments interrelate<br />

thematically and rotate around a set of core criteria. Another problem is the marking.<br />

Lecturers argued that they could not give formative assessment on a piece of work that was<br />

subject to summative assessment.<br />

These and other findings will be discussed from the perspective of research on formative<br />

assessment, the use of criteria and the influence of feedback on learning. The findings,<br />

given that the assessment system was expected to increase formative feedback, will also be<br />

discussed from the perspective of "Murphy's law of unintended consequences": goal-oriented<br />

activities will generate unexpected and often counterproductive results that can nullify the<br />

desired outcomes, or, things will go wrong in any given situation, if you give them a chance.<br />



Reliability and validity of the assessment of web-based video portfolios:<br />

Consequences for teacher education<br />

Mark Hoeksma, Judith Janssen, Wilfried Admiraal<br />

ILO Graduate School of Teaching and Learning, University of Amsterdam, The Netherlands<br />

In this paper, we will evaluate the quality of using a web-based video portfolio for the<br />

assessment of competences in teacher training. The web-based video portfolio has been<br />

designed and tested in the DiViDossier-project, which was financed by the National eLearning<br />

Programme of SURF, the Dutch foundation for ICT in higher education. Since 2003, our<br />

institute has been using an electronic portfolio system in order to provide a realistic portrait of<br />

a student's abilities, offer an opportunity for a student's self-reflection and communicate a<br />

student's performance to others. These portfolios contained written documents, for example<br />

reflections about one’s own behaviour and classroom situations. A major advantage of a<br />

video portfolio is that students are able to demonstrate their competences in authentic<br />

professional situations. Therefore, the web-based video portfolio system is expected to<br />

improve the quality of assessment in teacher education: compared to written texts, videos can<br />

give a realistic, or a more valid, view of teaching competences.<br />

Two characteristics of our video portfolio system are:<br />

• The video portfolio includes video-narratives in which teacher trainees can demonstrate<br />

both integrated competences and their growth in these competences.<br />

• The video portfolio includes reflections and narratives demonstrating knowledge of<br />

methodological and pedagogical approaches.<br />

The use of a video portfolio has serious consequences for our procedures of assessment. A<br />

list of fairly ‘open’ criteria is used by two teacher educators, the students’ supervisor and an<br />

uninvolved teacher educator. They assess the professional competences of student<br />

teachers and the congruency between their performance on video and their reflections.<br />

In this study, our main question is how to enhance the quality of our assessment procedure<br />

of web-based video portfolio, in terms of reliability and validity. We will gather the following<br />

data:<br />

1. Individual interviews with 10 teacher educators on their assessment procedures of video<br />

portfolios<br />

2. Individual think-aloud interviews with 10 teacher educators on their assessment of a<br />

particular video portfolio<br />

3. Assessment forms with the assessments of 40 portfolios (80 forms, 2 per portfolio)<br />

The first set of data will result in general information about assessment procedures, evaluation<br />

of the criteria used, and justifications of the assessments. The second and the third set of<br />

data will be used for the analysis of reliability and validity. The reliability will be reported as<br />

Cohen’s kappa; the validity will be analysed by using qualitative research methods. The<br />

validity will be examined by investigating whether the assessors are biased in their<br />

assessment of the performance or reflections (cf. Heller, Sheingold, & Myford, 1998). In<br />

addition, we will examine the consistency between their beliefs and practice of assessment.<br />
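For reference, Cohen's kappa, the agreement statistic mentioned above, corrects the observed agreement between the two assessors of a portfolio for the agreement expected by chance:<br />

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]

where \(p_o\) is the observed proportion of portfolios on which the two assessment forms agree and \(p_e\) is the proportion of agreement expected by chance from each assessor's marginal rating distribution; \(\kappa = 1\) indicates perfect agreement and \(\kappa = 0\) agreement no better than chance.<br />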

References<br />

Heller, J. I., Sheingold, K., & Myford, C. M. (1998). Reasoning about evidence in portfolios: Cognitive<br />

foundations for valid and reliable assessment. Educational Assessment, 5, 5-40.<br />



Diversity in patterns of assessment across a university<br />

Jenny Hounsell, University of Edinburgh, United Kingdom<br />

Dai Hounsell, University of Edinburgh, United Kingdom<br />

Over the last quarter-century, there has been a far-reaching transformation in the practices<br />

and processes of assessment in higher education. What was once a rather limited diet of<br />

essays, reports and exams has undergone a remarkable diversification and today's<br />

university teachers have before them an abundance of possible ways of assessing their<br />

students' progress and performance. Keeping track of these changes has proved far from<br />

easy: assessments need to be tailored not only to subject requirements but also to level of<br />

study, degree programme aims and the learning outcomes for a given course unit or<br />

module. This in turn means that within universities, responsibilities for designing and<br />

conducting assessments are in various respects devolved to departments. While there have<br />

been some surveys of changes and developments in assessment, the mapping that has<br />

been done has been mostly global rather than localised, and has tended to focus on<br />

changes that have been considered worthy of documenting in the literature (Bryan and<br />

Clegg, 2006; Hounsell et al., 2007; James et al., 2002).<br />

This paper reports the findings of a study which was distinctive in its attempt to survey<br />

undergraduate assessment methods and weightings across a large and long-established<br />

university in which subject areas enjoyed a considerable degree of autonomy in devising<br />

patterns of assessment. It draws on data that has recently become much more readily<br />

available following the introduction of a new computerised database on degree programmes<br />

and course units and through departmental websites. It focuses on two aspects of current<br />

practices: methods of assessment, and weighting of examinations and coursework. A total<br />

of 91 methods of assessment (68 types of coursework and 23 kinds of exam) were found to<br />

be in use across 20 subject areas, while the total number of methods deployed within a<br />

subject area ranged from 10 to 48.<br />

The choice of methods was to a significant extent a function of subject area: a small number<br />

of assessment methods was found across the subject range, while a much larger number<br />

were confined to a limited number of departments. There were also striking differences in<br />

how assessments were weighted across subject areas and over successive years of<br />

undergraduate study. Four contrasting models were identified, differing in terms of whether<br />

weightings were uniform or variable from unit to unit and from one year to the next, and the<br />

extent to which coursework or exams predominated.<br />

The paper concludes by exploring the implications of these findings, both for assessment<br />

practice within the university concerned and in higher education more generally.<br />

References<br />

Bryan, C. and Clegg, K. (2006) (eds.) Innovative Assessment in Higher Education. London: Routledge.<br />

Hounsell, D., Blair, S., Falchikov, N., Hounsell, J., Huxham, M., Klampfleitner, M. and Thomson, K. (2007)<br />

Innovative Assessment Across the Disciplines: An Analytical Review of the Literature. York: Higher<br />

Education Academy.<br />

James, R., McInnis, C. and Devlin, M. (2002) Assessing Learning in Australian Universities. Melbourne:<br />

University of Melbourne.<br />



Learning-oriented assessment: A critical review of foundational research<br />

Gordon Joughin, University of Wollongong, Australia<br />

The concept of ‘learning-oriented assessment’ draws attention to a range of conceptual,<br />

research and practice issues concerning the relationship between assessment and the<br />

process of learning in higher education. The conceptual framework for learning-oriented<br />

assessment proposed by Carless, Fun and Joughin (2006) provides a convenient<br />

framework for highlighting these issues. In this paper, one of the authors of that framework<br />

draws attention to, and challenges, two propositions that have become maxims in the<br />

literature of assessment and learning, namely that assessment drives learning and that<br />

feedback through formative assessment is critical to the learning process. A careful review<br />

of repeatedly cited research casts doubt on the first of these propositions. For example, the<br />

treatment of the research reported in frequently cited works such as Making the Grade<br />

(Becker, Geer, & Hughes, 1968) and The Hidden Curriculum (Snyder, 1971) has often<br />

oversimplified, and thus misrepresented, the research findings, leading to singular<br />

interpretations of complex, multi-faceted phenomena. Other research suggesting serious<br />

limitations to the capacity of assessment per se to improve students’ approaches to learning<br />

is often under-emphasized, leading to the risk of exaggerated claims for the capacity of<br />

‘alternative’ forms of assessment to foster effective learning processes in students. Finally,<br />

research on students’ experience of assessment contrasts with the prominence accorded to<br />

feedback in learning and assessment theory, highlighting a worrying gap between theory<br />

and practice.<br />

This paper provides a critical review of the empirical research basis of the above<br />

propositions regarding the roles of assessment and feedback in directing and forming<br />

students’ learning. On the basis of this review, the paper proposes an empirical research<br />

agenda that addresses what seem to be serious gaps in our understanding of fundamental<br />

aspects of the interactions between assessment and learning.<br />



Implementing standards-based assessment in Universities:<br />

Issues, Concerns and Recommendations<br />

Patrick Lai, The Hong Kong Polytechnic University, China<br />

Studies have been conducted to identify how standards-based assessment is implemented.<br />

Tan & Prosser (2004) conducted a phenomenographic study of academics’ conceptions of<br />

grade descriptors. This study illustrates that academic staff understand grade descriptors in<br />

markedly different ways, ranging from conceptualizing the descriptors as having nothing to<br />

do with standards to understanding them as directly related to standards. Sadler<br />

(2005) conducted another study to find out the grading practices of universities. None of the<br />

approaches identified delivered the aspirations of standards-based assessment; hence<br />

there is a need to shift the focus from criteria to standards. Whilst these two representative<br />

studies emphasize only the final grading step, there is a need to have a more thorough<br />

investigation into each of the implementation steps of standards-based assessment.<br />

A series of focus group interviews with 51 academic staff from 21 departments and two<br />

open forums were conducted. Participants were invited to comment on the issues and<br />

problems encountered at the preparation, marking and post-marking stages of standards-based<br />

assessment. This paper summarizes the issues and concerns identified in the<br />

discussions with various stakeholder groups.<br />

In developing criteria and performance standards for assessment tasks, often the<br />

assessment task is selected first and matched to the learning outcomes. It is also difficult to<br />

set clear criteria that can be understood easily by assessors and students and that will<br />

discriminate effectively between the good students and the weaker ones. Setting up a<br />

matrix, with descriptors for each criterion at different performance levels, and grading<br />

students' work against it is tedious and can become unmanageable.<br />

In making assessment criteria and standards explicit to assessors and students, the key<br />

concern here is that colleagues are not clear about the depth of detail required by students.<br />

If colleagues give too many detailed examples of the expected responses for different<br />

award levels it may become too much like a model answer and will not help students<br />

develop independent study habits.<br />

Regarding consistency in marking and grading, the concern expressed by staff is the<br />

need to maintain consistency when there are multiple markers. In<br />

cases where there is a large number of assignments to be marked, the marker's perceptions of<br />

the criteria and standards may “drift” from start to finish.<br />

Finally, there are two issues that can affect consistency in the development of standards.<br />

One issue concerns the different perceptions of minimum passing standards held by<br />

academic colleagues. The other is the commonly held perception that a skewed distribution of<br />

students’ grades in a particular subject is abnormal and should be “normalized”.<br />

Based on a series of feedback-collecting exercises, this paper will also present a number of<br />

strategies for setting assessment tasks, marking and post-marking mechanisms that can be<br />

utilized to address the concerns expressed by university academic staff about their<br />

endeavours to implement standards-based assessment. The recommendations on strategies<br />

and support to help academics implement standards-based assessment made in this<br />

paper add to the literature on assessment in higher education.<br />



Test-based School Reform and the Quality of Performance Feedback:<br />

A comparative study of the relationship between mandatory testing policies and<br />

teacher perspectives in two German states<br />

Uwe Maier, University of Education Schwäbisch Gmünd, Germany<br />

Comparative research on test-based school reform has revealed that the positive impact of<br />

performance feedback information on school improvement depends on the accountability policy<br />

and testing system in the respective jurisdiction (Firestone, Winter & Fitz 2000; Cheng &<br />

Curtis 2004; Herman 2004). Test-based school reform has meanwhile become a prominent instrument<br />

of educational policy in Germany. But state-mandated testing systems vary from jurisdiction<br />

to jurisdiction since the German federal constitution guarantees state autonomy in<br />

educational policy. Particularly the testing systems in the two German states Baden-<br />

Württemberg and Thüringen are rich in contrast. State-mandated tests in Baden-<br />

Württemberg (Vergleichsarbeiten) are not based on competency models, provide little<br />

feedback information (raw data) and teachers are responsible for data analysis. State-mandated<br />

tests in Thüringen (Kompetenztests) are based on competency models,<br />

performance feedback includes value-added data, and external support is high. Both<br />

obligatory tests were given at the end of Grade 6 in core subjects including German<br />

language and mathematics.<br />

The hypothesis was that the elaborated and value-added feedback information in Thüringen<br />

is more accepted by schools and more likely to prompt teachers to reflect upon professional<br />

improvement. Random samples of schools in both states were approached for data<br />

collection. A total of 1136 teachers completed the questionnaire (n = 825 in Baden-Württemberg, n = 311 in Thüringen).<br />

Measurement of the dependent variables is based on a quantitative survey instrument. An<br />

exploratory factor analysis revealed seven scales: general acceptance of mandatory testing<br />

(6 items, alpha = .89), mandatory testing as a burden for schools (4 items, alpha = .79),<br />

curricular alignment of the test (4 items, alpha = .84), performance feedback supports<br />

diagnostic activities (5 items, alpha = .89), performance feedback supports grading (5 items,<br />

alpha = .80), performance feedback indicates further revision (3 items, alpha = .90),<br />

performance feedback indicates curricular changes (4 items, alpha = .77). The hypothesis<br />

proved to be correct. General test acceptance and the use of performance indicators for<br />

diagnostic activities and reflection upon teaching were substantially higher among teachers<br />

in Thüringen. By contrast, teachers in Baden-Württemberg show higher average scores on<br />

the scale "performance feedback supports grading". The results show again that sound<br />

testing policies are a crucial precondition for standards-based school reforms.<br />

References<br />

Cheng, L./Curtis, A. (2004): Washback or Backwash: A Review of the Impact of Testing on Teaching and<br />

Learning. In: Cheng, L./Watanabe, Y./Curtis, A. (Eds.): Washback in Language Testing. Research<br />

Contexts and Methods. Mahwah/London: Lawrence Erlbaum, pp. 3-17.<br />

Firestone, W. A./Winter, J./Fitz, J. (2000): Different assessments, common practice? Mathematics testing and<br />

teaching in the USA and England and Wales. In: Assessment in Education, 7, 2000, 1, pp. 13-37.<br />

Herman, J. L. (2004): The Effects of Testing on Instruction. In: Fuhrman, S.H./Elmore, R.F. (Eds.):<br />

Redesigning Accountability Systems for Education. New York/London: Teachers College Press, pp.<br />

141-166.<br />



Motivational aspects of complex item formats<br />

Thomas Martens, Frank Goldhammer<br />

German Institute for International Educational Research (DIPF), Germany<br />

Some benefits from Computer-Based Assessments (CBAs) are generally accepted, e.g.,<br />

shorter testing time or instant scoring. However, the question of whether CBA is more<br />

enjoyable for students is still a focus of research. For example, Computer-Adaptive<br />

Tests (CATs) certainly allow for shorter testing times, but CATs may also irritate testees by<br />

constantly administering items with a fixed solution probability of about 50%, regardless of the<br />

testees’ perceived test effort (see Frey, 2006). Another line of research (Björnsson, 2007)<br />

shows that 15-year-old students from Denmark, Iceland and Korea enjoyed CBA more than<br />

Paper-Based Assessment (PBA) and, if they could freely choose their personal test mode,<br />

they would select the CBA mode only (53.2%) or a combination of CBA mode and PBA<br />

mode (37.6%). We assume that complex item formats like browser simulations are even<br />

more enjoyable for students than the items used by Björnsson (2007).<br />
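As background to the 50% figure mentioned above: under the Rasch (one-parameter logistic) model commonly used in CATs, an item is most informative when the solution probability is one half,<br />

\[ I_i(\theta) = P_i(\theta)\bigl(1 - P_i(\theta)\bigr), \qquad P_i(\theta) = \frac{\exp(\theta - b_i)}{1 + \exp(\theta - b_i)}, \]

which is maximised at \(P_i(\theta) = 0.5\), i.e. when the item difficulty \(b_i\) matches the testee's ability \(\theta\); this is why adaptive algorithms keep presenting items the testee has roughly even odds of solving.<br />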

In a first study with N = 70 students, the computer-based assessment platform TAO (the<br />

French acronym for technology-based assessment) and the “Hypertext Builder” were used<br />

to develop and deliver complex electronic reading stimuli, covering all major text-types<br />

encountered in electronic reading such as websites, e-mail client environments, forums, or<br />

blogs. Furthermore, motivational state and trait variables (see Rheinberg, 2003) as well as<br />

ICT literacy were assessed. First comparisons with older PBA studies that used the same<br />

motivational items revealed that the students liked the complex stimuli much more and<br />

therefore reported that they tried harder to solve the corresponding test items. This self-reported<br />

effort is partly an effect of general computer motivation and ICT literacy, but<br />

nevertheless a direct positive effect of the electronic reading stimuli used remains.<br />

This first result (that test items which try to mimic real-world settings, like browsing a website,<br />

are more attractive to students) is not very surprising. However, it also has to be<br />

investigated whether influences that might impair the test reliability of PBAs, such as boredom or<br />

refusal, will be replaced by other disturbing random influences that stem from complex CBAs,<br />

such as losing track of time or going into unimportant details. These side effects, which might be<br />

related to the testee's interaction with complex stimulus material, are being investigated in ongoing<br />

research using thinking-aloud and eye tracking techniques. Systematic results from this<br />

research (N=20) will also be presented at the conference.<br />



Remarkable Pedagogical Benefits of<br />

Reusable Assessment Objects for STEM Subjects<br />

Michael McCabe, University of Portsmouth, United Kingdom<br />

Reusable assessment objects (McCabe, 2007) have been used to transform the learning of<br />

STEM (Science, Technology, Engineering and Mathematics) subjects by making e-assessment<br />

more dynamic. The resulting “peer moderation” has improved student<br />

engagement through closer cooperation with the lecturer in the development of learning<br />

resources.<br />

The concept of peer moderation for summative examinations seems absurd. How can<br />

mathematics or science questions be released to students before an exam? If solutions are<br />

known in advance, the incentive to learn is lost. Traditional written exams are prepared in<br />

advance and moderated by academic staff under tight security. Traditional e-assessment<br />

exams require even more care over moderation and security, since they need to be<br />

checked both for their academic content and technical correctness.<br />

One possible approach is to use large e-assessment question banks. If hundreds or<br />

thousands of questions are available, then it may be possible to release them to students in<br />

advance of a formal exam involving a small subset of, say, ten questions. Students are<br />

motivated to try a large number of the formative questions and receive feedback on their<br />

progress. Lecturers can perform item analysis on the questions to derive, for example, facility and<br />

discrimination, based upon the trials. They can also identify pedagogical or technical<br />

mistakes at an early stage and then modify questions accordingly. The end result is an<br />

improvement in the quality of the question bank, but at the expense of revealing the precise<br />

questions to students. Of course, a student willing to attempt the complete question bank<br />

might be regarded as worthy of success, regardless of their ability! I have used large<br />

question banks available from publishers and national projects in this way, e.g. for<br />

mathematics and astronomy. Unfortunately, individuals rarely have the time to develop<br />

sufficient questions for large banks.<br />

Reusable assessment objects are different. They automatically generate computer-based<br />

questions, guidance, hints and feedback for students, through the use of random<br />

parameters and algorithms. The random parameters can include numbers, characters,<br />

words, text, diagrams, graphs, pictures, algebraic expressions, mathematical operators,<br />

equations, variables, functions and symbols. The algorithms specify how these random<br />

parameters interact and can include conditions applied to questions and answers. An<br />

interesting example of their use is in statistical hypothesis testing where several<br />

intermediate questions or steps lead up to a final decision. The final decision changes<br />

according to the data provided in the question. Algorithms are also useful for defining<br />

questions with open-ended answers, such as asking for an example which satisfies a set of<br />

criteria.<br />
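As a rough illustration of this mechanism, the following hypothetical Python sketch (not MapleTA code; all names are invented for illustration) shows a reusable assessment object as a template that draws random parameters and derives the question, answer and feedback algorithmically, along the lines of the hypothesis-testing example above:<br />

import random

def hypothesis_test_item(seed=None):
    """Generate one instance of a z-test question from random parameters."""
    rng = random.Random(seed)
    mu0 = rng.choice([50, 60, 75, 100])              # hypothesised population mean
    sigma = rng.choice([5, 8, 10])                   # known population standard deviation
    n = rng.choice([25, 36, 49])                     # sample size
    xbar = mu0 + rng.choice([-4, -2, -1, 1, 2, 4])   # observed sample mean
    z = (xbar - mu0) / (sigma / n ** 0.5)            # test statistic
    decision = "reject H0" if abs(z) > 1.96 else "do not reject H0"  # final decision depends on the data
    question = (f"A sample of n={n} has mean {xbar}; the population SD is {sigma}. "
                f"Test H0: mu={mu0} at the 5% level.")
    feedback = f"z = ({xbar} - {mu0}) / ({sigma}/sqrt({n})) = {z:.2f}, so {decision}."
    return {"question": question, "answer": decision, "feedback": feedback}

# Every call yields a structurally identical but numerically different instance,
# so releasing the template for formative practice does not reveal the exam instances.
print(hypothesis_test_item()["question"])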

Lecturer benefits include a reduced need for large question banks and the remarkable<br />

opportunity for students to peer moderate their summative assessment questions. Student<br />

benefits include greater motivation to attempt formative test questions, better feedback<br />

(Nicol and Milligan, 2005), greater involvement in the assessment process itself, higher<br />

quality questions and opportunities to assess their own progress more accurately.<br />

Examples of reusable assessment objects generated using MapleTA<br />

(http://perch.mech.port.ac.uk/classes) will be used to illustrate these ideas.<br />



Demystifying the assessment process:<br />

using protocol analysis as a research tool in higher education<br />

Fiona Meddings, Christine Dearnley, Peter Hartley<br />

University of Bradford, United Kingdom<br />

Marking and assessing student submissions is a fundamental part of contemporary<br />

education. To the casual observer, undertaking assessment of student work may appear to<br />

be a simple process; after all, the student has done the hard part: engaging with the<br />

process by completing the required assessment task. On closer examination, however, it<br />

seems that very little is known about the actual process by which lecturers mark. Limited<br />

literature exists to inform us about how lecturers come to the decisions they do, and what<br />

influences them in reaching those decisions. What we do know is that marks are provided<br />

for the student to give an indication of their success or otherwise at the assessment task. In<br />

some cases marks are accompanied by written feedback, in the guise of qualitative<br />

statements sitting alongside the quantitative (possibly alphanumeric) mark. Outcomes<br />

following the marking process depend upon how its purpose is seen: lecturing<br />

staff may feel it reflects the quality of the educational process and student engagement,<br />

whereas the student may see the mark and feedback as giving them an idea of individual<br />

achievement and whether it relates to their own self assessment of their abilities.<br />

Although the problem of assessment does feature in the literature, it is often concerned with<br />

what we do to assess students (Nicol 2007), i.e. the actual assessment approach or the tool<br />

to be used (e.g. portfolio or examination, and coursework or seminar, respectively), with<br />

blurring between the two. What is known is that good assessment choices will consider the<br />

teaching methods as well as the subject matter. Less attention has focused on the impact of<br />

assessment feedback on students (Higgins et al 2001). Others identify how improved<br />

feedback can assist student learning (Nicol and Macfarlane-Dick 2006) or highlight<br />

feedback as an important feature (Gibbs and Simpson 2004). What remains unclear from<br />

the literature is how this feedback, which is of high importance to students (National Student<br />

Survey U.K. 2007), is constructed.<br />

This paper presentation will explore the potential of protocol analysis as a method of data<br />

collection used to uncover the thought processes involved in marking and assessing<br />

undertaken by lecturing staff at a higher education institution. This is a method of gathering<br />

concurrent verbal reports of lecturer judgements during the marking and assessing process,<br />

by recording their verbalised thoughts. Marking is almost always a solitary process, with limited<br />

interactions between markers and little known about the cognitive processes<br />

involved. In a study undertaken at this university, protocol analysis was used to uncover<br />

the thinking processes related to a marking and assessing task, by asking participants to<br />

speak aloud and to verbalise their cognitive processes (Ericsson and Simon 1993). A study<br />

by Orrell (2006) uses a similar approach in higher education, thereby supporting the use<br />

of this data collection method. The paper will examine the concepts of validity and reliability<br />

as well as provide some opportunity for discussion and examination of generalisability of<br />

findings from the utilisation of this data collection tool.<br />



Challenging the formality of assessment: a student view of<br />

‘Assessment for Learning’ in Higher Education<br />

Catherine Montgomery, Northumbria University, United Kingdom<br />

Kay Sambell, Northumbria University, United Kingdom<br />

This paper explores students’ understandings of ‘Assessment for Learning’ (Black et al,<br />

2003) or ‘Learning-oriented assessment’ (Carless, 2006) by outlining some of the findings of<br />

a systematic university-wide, cross-disciplinary study into student perceptions of this<br />

approach. Whilst the concept of ‘Assessment for Learning’ (AfL) has developed related<br />

theoretical bases over the last decade, there is little research that reveals students’<br />

conceptions of the meanings associated with the term. Previous research has focused on<br />

specific elements of AfL such as self and peer assessment (Dochy, Segers and Sluijsmans,<br />

1999), feedback (Higgins, Hartley and Skelton, 2002) or on specific approaches to<br />

assessment (Birenbaum, 1996). However, there has been relatively little research aiming to<br />

illuminate students’ understandings of the concepts of AfL. Early research has indicated that<br />

students construct different concepts about the meanings of assessment tasks and this<br />

‘hidden curriculum of assessment’ can sometimes be at odds with the ‘formal’ curriculum<br />

(Sambell and McDowell, 1998). This paper contributes to the rapidly growing literature that<br />

explores the socio-cultural context of learning through investigating students’ experiences of<br />

assessment for learning as ‘lived’ (Orr, 2007; Montgomery, 2007).<br />

The paper forms part of a wider study that employs a multi-site case study design with each<br />

case site across the disciplines of Engineering, Education and English, representing an<br />

implementation of AfL in a learning context. Multiple methods of data collection are used<br />

with interview, observation and focus groups generating data within an interpretive<br />

approach. The approach is predicated on the claim that the activities of learning and<br />

teaching are best understood if they are investigated as activities in their 'natural', socio-cultural<br />

context, rather than on the basis of ‘experimental’ interventions, or on the basis of<br />

actor-related variables, such as student characteristics or motivations (Haggis, 2007).<br />

The findings suggest that although tutors were likely to draw upon assessment-related<br />

discourse, such as ‘self-assessment’, ‘feedback’ and ‘peer-evaluation’ to refer to AfL, it was<br />

notable how far students did not. Students did, however, construct AfL as markedly different<br />

from more traditional teaching, learning and assessment experiences. Their heightened<br />

engagement with learning and assessment tasks invested their experiences with personal<br />

meanings, enabling them to see the ‘real world’ application of their learning. For the students,<br />

AfL meant they were no longer simply ‘jumping through hoops’. AfL was viewed by the students<br />

as being part of an informal, personal and social context for learning where conversation and<br />

the informal exchange of views were highly prized and student emphasis lay with ‘talk’,<br />

‘listening’ and ‘seeing’ in relation to informal dialogue. Students noted that their learning was<br />

often characterised by 'informal chat', 'light-hearted banter' and 'story-telling'.<br />

This paper may stimulate discussion of Hawe’s (2007) point that in an aligned teaching,<br />

learning and assessment setting staff and students should engage in meaningful dialogue<br />

about elements of teaching, learning and assessment. This may contribute to a shared<br />

language with which to engage in dialogue; without this, AfL may risk constructing<br />

discrete and, to students, ‘foreign’ assessment systems.<br />



Secondary students’ motivation to complete written dance examinations<br />

Patrice O'Brien, The University of Auckland, New Zealand<br />

Mei Kuin Lai, The University of Auckland, New Zealand<br />

There is little research that focuses on students’ experience of national examinations in<br />

non-traditional curriculum areas such as dance. In this study, we examine why New Zealand<br />

students sitting national examinations (NCEA) in dance were not attempting one of two<br />

written standards of their assessment. A non-attempt by a student present at the<br />

examination is referred to as a void in New Zealand.<br />

The research used a case study approach in order to gain rich information directly from<br />

students. Initially, data were obtained from questionnaires completed by students<br />

(n=26) from a national cohort (n=516) in three secondary<br />

schools as they left the examination room. This enabled students to report immediately on what they had done in the<br />

examination and the reasons for their decisions. We conducted in-depth interviews with all<br />

students (n=4) who voided a standard and also interviewed a randomly selected<br />

comparison group representing a range of achievement levels (n=5) who did not void<br />

standards.<br />

Results showed that the greatest difference between these two groups involved the<br />

students’ belief in their ability to succeed. In line with attribution theory (Weiner, 1985),<br />

students who attempted both standards attributed their success to internal factors such as<br />

the effort they put into study. Students who voided a standard attributed their results to<br />

external factors such as the appeal of the topic they had studied, the difficulty of the<br />

questions or the layout of the standard. Interestingly, some students who voided a standard<br />

could provide correct answers to the interviewer.<br />

The research also found that lack of success with school practice examinations, which are<br />

intended to assist students’ preparation for NCEA examinations, had the unintentional effect<br />

of contributing to the beliefs of students who voided standards that they were not capable of<br />

success. Initially, researchers and teachers assumed that factors such as lack of literacy<br />

skills, or having already achieved sufficient credits for a certificate, impacted on students'<br />

motivation to attempt dance standards, but this research found that these factors did not<br />

significantly influence students to void standards.<br />

The implications of this research reinforce the importance of testing assumptions and<br />

obtaining feedback from students as a way of supporting their learning. It also reinforces the<br />

importance of contextualizing research results to each school’s unique situation. In a<br />

national study on student motivation to complete NCEA, Meyer and colleagues (2006)<br />

predicted that students would be influenced by the number of credits they had accumulated<br />

but credits did not affect students’ motivation in this case study. The study also revealed the<br />

negative effects of grade-focused assessment feedback on students with low expectations<br />

of success.<br />

References<br />

Meyer, L., McClure, J., Walkey, F., McKenzie, L., & Weir, K. (2006). The impact of NCEA on student<br />

motivation. Wellington: Victoria University of Wellington.<br />

Weiner, B. (1985). An attributional theory of achievement motivation and emotion. Psychological Review,<br />

92, 548-573.<br />



Mind the gap:<br />

assessment practices in the context of UK widening participation<br />

Michelle O'Doherty, Liverpool Hope University, United Kingdom<br />

This paper reports on findings from research funded by the UK Higher Education Academy;<br />

the study aimed to explore staff and student perceptions of quality feedback within the<br />

context of transition between educational sectors. Whilst seminal research has been<br />

conducted on the assessment experience of students in schools (Black and Wiliam, 1998;<br />

Black et al, 2003) and universities (Hounsell, 2003), there are relatively few studies that<br />

investigate the impact of the former on the latter. This qualitative study makes this cross<br />

sector connection, presenting data collected across nine education institutions (three<br />

schools, sixth forms and universities respectively) on perceptions of assessment. As a<br />

result, our findings address a gap in the current literature, positioning first year<br />

undergraduate expectations of quality feedback within the context of their prior experience<br />

of formative assessment.<br />

Current theory conceptualises assessment as a dialogic process (Higgins et al, 2001) in<br />

which quality feedback is the most powerful single influence on student achievement<br />

(Hattie, 1987); therefore, the provision of quality feedback is perceived as a key requirement<br />

of effective teaching in higher education (Ramsden, 2003). In practice, lecturers often<br />

believe their feedback to be more useful than students do (Carless, 2006; Maclellan, 2001)<br />

and feedback has consistently been identified as the least satisfactory aspect of the student<br />

experience in UK universities (National Student Survey, 2007, 2006, 2005). As a<br />

consequence of this mismatch in staff and student perceptions, assessment in UK higher<br />

education is being challenged.<br />

Frameworks for good practice in assessment have been developed, but attempts to<br />

conceptualise quality feedback within the context of higher education have been positioned<br />

within a formative rather than a summative process (Gibbs & Simpson, 2004-5; Nicol &<br />

Macfarlane-Dick, 2004, 2006). However, resource constraints coupled with a widening<br />

participation agenda of mass expansion in the UK have limited the opportunities for<br />

formative assessment to be practised (Yorke, 2003; Gibbs, 2007). At the same time, within<br />

the school sector a formative Assessment for Learning culture (Assessment Reform Group,<br />

1999) has been developed, which means students experience a significant cultural gap in<br />

feedback practices between educational sectors. In particular, our findings reveal students<br />

perceive quality feedback as part of a dialogic, guidance process rather than a summative<br />

event. Conversely, in higher education concerns relating to the ‘dumbing down’ of<br />

independent learning through spoon-feeding (Haggis, 2006) are leading to increasing<br />

tensions between the theory of good practice and the practice of assessment.<br />

This longitudinal study reports on the consequences of these conflicting expectations of<br />

guidance and independent learning for first year undergraduates and their tutors in three<br />

subject disciplines. These findings have informed recent initiatives to scaffold students’<br />

autonomous learning through formative assessment and the presentation will provide an<br />

opportunity to discuss these interventions. Thus, the presentation of our cross sector<br />

findings aims not only to reframe the context of the debate challenging current assessment<br />

practices in UK higher education, but also to contribute to the re-conceptualisation of<br />

feedback practice for future learning (Hounsell, 2007; Boud & Falchikov, 2007).<br />



Measuring writing skills in large-scale assessment:<br />

Treatment of student non-responses for Multifaceted-Rasch-Modeling<br />

Raphaela Oehler, IQB, Humboldt University Berlin, Germany<br />

Alexander Robitzsch, IQB, Humboldt University Berlin, Germany<br />

Researchers in large-scale assessment need to make decisions about how to handle<br />

student non-responses in their analyses. Particularly when the sample consists of low-achieving<br />

students, a considerable number of responses might be missing by intention or<br />

not interpretable. Reducing costs by not giving raters such texts is a common procedure.<br />

However, if rater effects are to be analysed and item difficulties are to be obtained using<br />

Multifaceted-Rasch-Modeling, coding by the test administrators, and thus the introduction of a<br />

new hypothetical "rater", is problematic for standard IRT programs.<br />

One central project of the IQB is the development of large item pools for assessing<br />

students’ foreign language skills, particularly the reading, writing, and listening<br />

comprehension skills. Item development is based on the Common European Framework for<br />

Languages (CEF) which proposes six proficiency levels (A1 to C2). Item development for<br />

testing writing skills at the IQB follows the uni-level approach, which means that writing tasks<br />

are developed for each CEF level. The rating scales also require the rater to judge texts<br />

within one level, i.e. performing quite well in an A2 task means that students' writing skills<br />

are at least at A2. The rating criteria are either dichotomous (e.g. text organisation) or<br />

polytomous (e.g. global impression).<br />

A sample of N = 2,700 students from five different school types in Germany in grades eight<br />

to ten was tested within a multi-matrix design in 2007. Along with reading and listening<br />

comprehension items, 17 writing tasks were distributed. The study to be presented was<br />

carried out to analyse whether the newly developed rating approach works, particularly<br />

whether the order of the item difficulties obtained for the different rating criteria corresponds to<br />

the CEF levels. In order to cope with the large number of student non-responses, especially<br />

for the A1 and A2 tasks, and to scale the data using Multifaceted-Rasch-Modeling, a first approach<br />

was the distribution of the codings '8' (not interpretable texts) and '9' (empty pages) to the<br />

raters of one task (four ratings by six raters) proportionally, according to the number of texts<br />

they had to rate in total. First results of an ad-hoc procedure of the IRT analyses in<br />

ConQuest, in which the codings of the blank and not-interpretable responses<br />

were distributed among the raters and the polytomous variables were recoded into binary<br />

codes, showed that the order of the item difficulties fit the intended CEF levels. We contrast<br />

this approach with an analysis in WinBUGS in which a blank (i.e. a '9') is not modelled<br />

as a true response contaminated by a rater effect, because rater discrepancies<br />

cannot occur by definition (apart from lapses of rater concentration). In addition, we extend the<br />

classical multifaceted Rasch analyses to include all rating criteria. A confirmatory factor<br />

analysis on these rating scales, including rater effects, will be presented.<br />
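For reference, the multifaceted (many-facet) Rasch model referred to above can be written, in its basic dichotomous form, as<br />

\[ \ln\!\left(\frac{P_{nir}}{1 - P_{nir}}\right) = \theta_n - \beta_i - \tau_r, \]

where \(P_{nir}\) is the probability that student \(n\) receives a positive rating on criterion \(i\) from rater \(r\), \(\theta_n\) is the student's ability, \(\beta_i\) the difficulty of the criterion, and \(\tau_r\) the severity of the rater. A blank or non-interpretable response coded by a test administrator carries no genuine \(\tau_r\) component, which is why introducing the administrator as an additional "rater" is problematic for standard IRT software.<br />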

Along with a presentation of the rating system developed in the French project based on a<br />

uni-level approach, general implications for the treatment of student non-responses when testing<br />

writing skills in large-scale assessment contexts are discussed.<br />



Collaborating or fighting for the marks?<br />

Students’ experiences of group assessment in the creative arts<br />

Susan Orr, York St John University, United Kingdom<br />

This paper reports on a research project on group assessment in creative disciplines in<br />

higher education that is funded by the university's Centre for Excellence in Teaching and<br />

Learning (CETL). The central premise of this CETL is that creativity is enhanced through<br />

participation in collaborative activity.<br />

In the UK the National Student Survey identifies that students who study in arts-based<br />

subjects register lower levels of satisfaction in the areas of assessment and feedback. In<br />

addition, students and lecturers in the arts express concerns about the fairness of group<br />

assessment practices (Bryan 2004).<br />

Group assessment usually aims to measure the product created and the skills and efforts<br />

put in by members of the group (Bloxham and Boyd 2007). The effort and skills element can<br />

also be referred to as the process or the contribution. Cowdray and de Graaf (2005) point<br />

out that in arts education both process and product are valued; however, process is an elusive<br />

concept. For example, as Heathfield (1999) points out, the term ‘contribution’ might refer to<br />

a student’s contribution to the task, or their contribution to group dynamics.<br />

Taking the view that assessment is a socially situated practice informed by, and mediated<br />

through, the socio-political context within which it occurs (Layder 1997), this research takes<br />

the form of an ethnographic study employing semi-structured interviewing and semi-participant<br />

observation (Silverman 2004). I explore the ways that group assessment is<br />

experienced by students and lecturers in the subjects of dance, performance, music, and<br />

film.<br />

Across the disciplines studied I identified variation in the ways that marks were allocated to<br />

students for the process and product elements. These marking approaches represent local<br />

disciplinary and historical norms.<br />

Jacques and Salmon (2007) remind us that the process element of group work can take<br />

place out of the view of the lecturer. As a consequence, lecturers in this study have devised<br />

assessment strategies to help them assess process elements. For example, some students<br />

are asked to write about the process of the group work project in learning journals or<br />

production logs. However, students reported that they sometimes felt disadvantaged when<br />

they were asked to represent process in a text. As one student asked, ‘if it is a film, why<br />

write it?’. The written element is introduced by lecturers to help them disentangle individual<br />

contribution; however, by asking students to represent process in this way lecturers may be<br />

creating unintended barriers to high achievement for some of our most creative visual<br />

students. As Smart and Dixon (2002:192) observe ‘those who are best able to articulate the<br />

collaborative […] process in a written form might gain an advantage even though their<br />

creative contribution may have been poor’.<br />

My analysis suggests that students recognise the importance of group work in terms of its<br />

vocational authenticity but that they are keen for lecturers to recognise and reward<br />

individual contribution fairly. This paper will initiate discussion about the role of process and<br />

how it might be assessed fairly and rigorously.<br />



Constructing a new assessment for learning questionnaire<br />

Ron Pat-El, M. Segers, P. Vedder, H. Tillema<br />

Leiden University, The Netherlands<br />

Aims/goals: In many countries, during the past decade, researchers and educationalists<br />

have put assessment on the agenda. More specifically, since the pivotal review study by<br />

Black and Wiliam (1998), the value of implementing assessment as a tool to support<br />

student learning has been stressed. Based on qualitative studies in secondary education in<br />

the UK, the Assessment Reform Group (2002) has formulated 10 principles of assessment<br />

for learning (AfL), a constructivist assessment strategy, where assessment is made part<br />

of learning, and where emphasis is taken away from grading.<br />

Although reports have been published on the increasing extent to which AfL is implemented<br />

in schools, it is argued by researchers such as Black and Wiliam (1998) and McLellan<br />

(2001) that teachers tend to overestimate how well they use assessment as a tool to<br />

achieve learning gains in students. However, questionnaires used in AfL-research often<br />

suffer from methodological shortcomings such as low internal consistency of scales (e.g.<br />

Gibbs & Simpson, 2003), low factor loadings (e.g. James & Pedder, 2006) or inability to<br />

match student and teacher results (e.g. McLellan, 2001). Researching congruency of<br />

perceptions of AfL practice requires a valid instrument that enables direct comparisons<br />

between teachers and their students, and is based on a widely recognized<br />

operationalization of AfL. Therefore, this study aims to evaluate a new questionnaire, based<br />

on the principles of AfL as put forward by the ARG (2002), in which perceptions of<br />

implemented AfL-practices between teachers and their students can be compared. Based<br />

on pilot results on a prototype questionnaire, a 48-item questionnaire is proposed that<br />

operationalizes four out of ten principles of AfL, namely: assessment for learning should (1)<br />

be central to classroom practice; (2) promote understanding of clear goals and criteria; (3)<br />

help learners know how to improve; and (4) develop the capacity for self-assessment. It is<br />

the aim of this study to evaluate whether a four-factor model based on the four mentioned<br />

principles of AfL can be confirmed.<br />

Method<br />

Procedure: A prototype self-report questionnaire was constructed and piloted in conjunction<br />

with educational experts at Leiden University at the department of Education and Child<br />

Studies. The initial 111-item questionnaire was administered as a semi-structured interview.<br />

Based on pilot results, a prototype 48-item questionnaire was constructed and administered<br />

in conjunction with students participating in a bachelor-thesis project.<br />

Sample: The prototype self-report questionnaire was administered in 88 junior vocational<br />

high schools in the Netherlands to 1422 students (49% girls, 51% boys), who were on<br />

average 14.6 years old (SD = 1.52), and 237 teachers (43% females, 57% males), who<br />

were on average 42.3 years old (SD = 11.89).<br />

Results: Confirmatory factor analysis showed that the hypothesized four-factor model<br />

provided a good fit to the data for the student questionnaire (RMSEA = .05) and the teacher<br />

questionnaire (RMSEA = .07). In the four-factor model all items corresponded to their<br />

intended factor. Cronbach’s alphas for the subscales in both teacher and student<br />

questionnaires were high.<br />



Assessing Professional Learning: the challenge of the<br />

UK Professional Standards Framework<br />

Ruth Pilkington, University of Central Lancashire, United Kingdom<br />

This paper addresses issues of assessment from the perspective of assessing academic<br />

professional learning.<br />

Within the UK since 2006 there has been a Professional Standards Framework with three<br />

standards descriptors which can be used to recognise the performance and professional<br />

standing of academic members of staff in HE. It is proposed that Institutions of Higher<br />

Education in the UK should adopt these standards descriptors when establishing continuing<br />

professional development (CPD) frameworks for academic staff. This initiative extends the<br />

existing range of postgraduate certificates widely used across the HE sector to structure<br />

initial professional development.<br />

Three issues emerge from this relating to the assessment themes in this conference:<br />

1. Most existing PG Certificates for recognising initial professional development for<br />

academics are accredited at M-level, as masters study. What does this mean for a<br />

professional context framed by professional values which embraces both formal and<br />

non-formal learning?<br />

2. How do you assess performance against standards in a way that is meaningful,<br />

developmental, acceptable to the academy and which is NOT competence-based?<br />

3. Professional performance and development is reliant on a notion of professional<br />

reflection and learning that is challenging for certain discipline cultures. Does the<br />

research provide sufficiently rigorous and flexible models that can adapt to more<br />

professionally appropriate tools of assessment than the current reliance on written<br />

reflective documents?<br />

The paper explores assessment at University of Central Lancashire (UCLan), UK, where an<br />

initial professional development award, the PG Certificate in Learning and Teaching in HE,<br />

has been in operation since the 1990s. The PG Certificate was originally graded using<br />

percentages but shifted to a simpler pass/refer system of grading against achievement of<br />

learning outcomes. This system has been refined over the years to provide a rigorous model that<br />

has also been adopted across a Masters in Education (Professional Practice in HE). This<br />

Masters award forms a formal component of the academic CPD framework currently being<br />

developed at UCLan. Assessment of professional development within the wider framework<br />

is based on a professional dialogue using outcomes designed around the UK Professional<br />

Standards Framework descriptor statements.<br />

Recent involvement in a literature review of reflective practice as part of a national project<br />

has prompted a number of questions about the assessment of professional academic<br />

practice (Kahn et al.). Within the literature I identified valuable tools for assessing academic<br />

development which explored professional learning in relation to stages of teacher<br />

development (Bell, 2001; Manouchehri, 2002; Kreber, 2004). This complements models<br />

structuring levels of reflective engagement (Moon, 2004; Van Manen, 1991; Hatton & Smith,<br />

1995).<br />

Practice emphasises a particular level of reflective engagement and engagement with<br />

literature to set assessment parameters appropriate to masters’ study. This is applied even<br />

when marking against learning outcomes. The spoken word shifts the parameters for<br />

measurement and judgement especially where it is part of a developmental process. What<br />

criteria will suit assessment of professional learning against standards and how will the<br />

criteria inform judgement within professional dialogues?<br />



Feedback – all that effort but what is the effect?<br />

Margaret Price, Karen Handley, Berry O’Donovan<br />

Oxford Brookes University, United Kingdom<br />

As resource constraints in higher education impact on the student experience, the<br />

importance of effectiveness of our practices is brought into sharp focus. This is particularly<br />

true for formative feedback which is arguably the most important part of the assessment<br />

process in its potential to affect student learning and achievement and develop deeper<br />

understanding of assessment standards. The process of giving and receiving feedback is<br />

considered limited in its effectiveness (Gibbs & Simpson, 2002; Lea & Street, 1998). This<br />

paper argues that measuring the effectiveness of feedback is fraught with difficulties and<br />

draws on findings from a 3-year project addressing student engagement with assessment<br />

feedback to illustrate staff and student views of effectiveness and engagement.<br />

Effectiveness can only be judged if the feedback’s purpose is clear and the outcomes (e.g.<br />

learning, or student engagement) are measurable. Our study reveals the difficulties of easily<br />

evaluating feedback given the variability in staff views about the purpose of feedback and<br />

student expectations about what feedback really 'is'. Such diversity of views will rarely result<br />

in a perfect match between assessor and assessee which may explain the high levels of<br />

dissatisfaction and ineffectiveness (National Student Survey).<br />

If, as is widely accepted, feedback should primarily support future learning, its effectiveness<br />

should ideally demonstrate impact on learning. However the problem of isolating the effect<br />

of feedback within the multifaceted learning environment means that causal relationships<br />

are difficult if not impossible to prove (Salamon, 1992). Our study revealed that generally<br />

staff had no real expectation of measuring feedback’s effectiveness. There was extensive<br />

use of passive feedback methods which had no mechanisms to monitor engagement with or<br />

the effect of the feedback provided. In addition fragmented course structures limited the<br />

opportunity for the monitoring of future application of feedback.<br />

Our study confirmed the well documented and largely negative student view of feedback<br />

(Holmes & Smith 2003; McLellan 2001; Hounsell 1987) but also revealed student<br />

disillusionment with passive methods which they saw as only justifying the grade and<br />

precluding the opportunity for dialogue. For many students this led to disengagement with all<br />

feedback, engendering an impossible situation for staff seeking to engage them in the future.<br />

Simple performance measures for the effectiveness of feedback are not obvious. We may<br />

have to settle for measures of engagement rather than effects on learning but even<br />

engagement is difficult to evaluate. However, meeting the students' strong desire for more<br />

opportunity for dialogue may offer a way forward. Dialogue offers staff the opportunity to<br />

check effectiveness of feedback provided as well as an indication of student engagement.<br />

Resource constraints will not allow a return to traditional approaches to engendering<br />

dialogue, but innovative ways can and must be found if feedback is to be effective and<br />

demonstrably useful. Discussion will address the pitfalls of some traditional feedback<br />

processes and suggest approaches which provide performance measures of feedback<br />

within the process of increasing engagement.<br />



Student teachers on assessment:<br />

First year conceptions<br />

Ana Remesal, Universidad de Barcelona, Spain<br />

In the last decade there has been a strong call for formative assessment, and important<br />

attempts to change assessment practices have been made both at multiple national levels<br />

and at an international level (Black & Wiliam, 2005; Coll et al., 2000). Nevertheless, in<br />

the author’s opinion, any attempt to change school practices confronts at least two big<br />

challenges. On the one hand, the evaluation practices at an institutional level often do not<br />

really support formative practices in the classroom; on the other hand, the teachers’ own<br />

conceptions of assessment often hinder the implementation of innovative practices<br />

(Remesal, 2007). Some studies have been carried out up to now in order to investigate<br />

teachers’ and secondary students’ conceptions about assessment (Brown, 2005; Remesal, 2006).<br />

These previous studies point to the key importance of the step from being just a student to<br />

starting to become a teacher in one’s professional career. In this paper the author wants to<br />

present results of the application of one scale on teachers’ conceptions of assessment<br />

(Brown, 2006). 450 first-year student teachers from a European country were asked to<br />

respond to a Likert questionnaire. The instrument consists of 27 items with a positively packed<br />

six-option response scale. The questionnaire is based on a four-conceptions model:<br />

assessment as a tool for improving teaching and learning, assessment as a certifying tool,<br />

assessment aimed at accountability functions, and assessment with no use at all in education.<br />

Results show a wide diversity of conceptions among student teachers and some significant<br />

differences related to the students’ previous educational experience (whether they were in<br />

their first or second career). These results pose an important challenge to us as teacher educators if<br />

we aim to change school practice at its root. In this latter sense, as a near-future line<br />

of research, the author proposes a second administration of the questionnaire in three years’ time,<br />

when these 450 students finish their university studies, in order to identify the occurrence<br />

of changes over the course of the teacher education program.<br />
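
[Illustrative aside, not part of the original abstract: the minimal Python sketch below shows how responses to a 27-item, six-option Likert instrument might be scored against a four-conceptions model. The item-to-scale mapping and data layout are hypothetical, not the actual key of Brown’s (2006) instrument.]<br />

```python
# Hypothetical scoring sketch (item-to-scale mapping is invented, not Brown's actual key).
import pandas as pd

SCALES = {
    "improvement":    [1, 5, 9, 13, 17, 21, 25],   # hypothetical item numbers
    "certification":  [2, 6, 10, 14, 18, 22, 26],
    "accountability": [3, 7, 11, 15, 19, 23, 27],
    "irrelevance":    [4, 8, 12, 16, 20, 24],
}

def score_conceptions(responses: pd.DataFrame) -> pd.DataFrame:
    """responses: one row per respondent, columns 'item1'..'item27' with values 1-6."""
    scores = pd.DataFrame(index=responses.index)
    for scale, items in SCALES.items():
        cols = [f"item{i}" for i in items]
        scores[scale] = responses[cols].mean(axis=1)  # mean agreement per conception
    return scores
```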

References<br />

Black, P. & Wiliam, D. (2005). Lessons from around the world: how policies, politics and cultures constrain<br />

and afford assessment practices. The Curriculum Journal, 16(2), 249-261.<br />

Coll, C., Barberà, E., & Onrubia, J. (2000). La atención a la diversidad en las prácticas de evaluación.<br />

Infancia y Aprendizaje, 90, 111-132.<br />

Brown, G. T. L. (2006). Teachers’ conceptions of assessment: Validation of an abridged instrument.<br />

Psychological Reports, 99, 166-170.<br />

Brown, G. T. L. (2005). Teachers' conceptions of assessment: Overview, lessons, & implications. Invited<br />

NQSF Literature Review for the Australian National Quality<br />

Remesal, A. (2006). Los problemas en la evaluación del aprendizaje matemático en la educación<br />

obligatoria: perspectiva de profesores y alumnos. Tesis Doctoral, Universidad de Barcelona.<br />

Remesal, A. (2007). Educational reform and primary and secondary teachers’ conceptions of assessment.<br />

The Spanish instance, building upon Black & Wiliam (2005). The Curriculum Journal, 18(1), 27-38.<br />



Testing our citizens.<br />

How effective are assessments of citizenship in England?<br />

Mary Richardson, Roehampton University, United Kingdom<br />

The idea that citizenship education might provide some kind of solution to social problems is<br />

nothing new (Greenwood and Robins, 2003; Faulks, 2000; 2006). Over a decade ago,<br />

following the publication of the White Paper Excellence in Schools (DfEE, 1997) and the ‘Crick’<br />

Report (QCA, 1998), citizenship became a statutory part of the National Curriculum for<br />

England. There appears to be no opposition to the idea of educating young people about<br />

citizenship, but there are issues that have arisen from the decision to make it a mandatory<br />

subject in maintained secondary schools (Kerr et al, 2003). The most significant of these is<br />

assessment.<br />

There is at present a paucity of literature that focuses on the assessment of citizenship<br />

education and an assessment ‘deficit’ within the subject is becoming apparent. In its 2006<br />

report, Ofsted found sparse evidence of coherent and effective assessment and Kerr et al<br />

(2007) claim that assessment of citizenship continues to be problematic. The challenge for<br />

citizenship educators identified by Tudor (2001) and Jerome (2002) amongst others,<br />

includes the need to construct meaningful assessments that relate to the beliefs and values<br />

under discussion. Teachers are presented with a framework for assessing citizenship, but<br />

citizenship is a new and different subject, and applying modes of assessment to content such<br />

as active participation and voluntary activities is not straightforward.<br />

This research study seeks to develop:<br />

• knowledge and understanding of the assessments of citizenship education in<br />

maintained English secondary schools;<br />

• an understanding of the general perceptions of assessments by their primary user<br />

groups – teachers and students; and<br />

• an evidence base for policy in regard to the citizenship curriculum and its assessment.<br />

This paper describes the current structure of assessment for citizenship in secondary<br />

education in England and discusses the rationale for the assessment of citizenship.<br />

Philosophical and sociological literatures inform the conceptual analysis of definitions of<br />

citizenship; curriculum theory underpins an evaluation of teaching materials, policy and<br />

curriculum development documentation; and the literature of assessment informs the<br />

interrogation and discussions around specifications, examination papers and assessment<br />

documentation from a range of sources.<br />

An empirical evaluation of citizenship assessment from the perspective of the key user<br />

groups, teachers and pupils, was central to this research. Pilot investigations found no<br />

uniform approach to assessment and this has a significant effect upon the status of the<br />

subject (Richardson, 2006). A mixed-method approach combined a questionnaire survey<br />

sent to teachers and pupils in secondary schools across England with interviews with pupils<br />

(Years 9-11) and teachers in 18 schools around England. The findings include a discussion<br />

of pupils’ attitudes towards end-of-key-stage assessments and the current GCSE<br />

specifications offered for citizenship. Results suggest generally positive attitudes towards<br />

citizenship as a subject, but responses from teachers and pupils underline an educational<br />

ethos which values only the things that can be measured and graded. This attitude towards<br />

assessment appears to be affecting the perceived value of citizenship and teachers often<br />

struggle to develop methods of assessment which are appropriate for the subject.<br />



Standards in vocational education<br />

Andreas Saniter, University of Bremen, Germany<br />

Rainer Bremer, University of Bremen, Germany<br />

The main reason for the reliability and success of cross-OECD comparative studies in<br />

general education is not only the transnational agreement about educational standards but<br />

also the comparability of educational systems. This condition is not met in vocational education:<br />

Systemic differences between dual, modularized and school-based vocational education<br />

and training are obvious and generate serious obstacles in finding standards compatible with<br />

all national curricula (cf. Bremer 2005).<br />

The Leonardo pilot-project AERONET has pursued an approach that is independent from<br />

national curricula or systemic preferences. The first step was a survey about the Typical<br />

Professional Tasks (TPT) of skilled work in Aeronautic industries (mechanics and<br />

electricians) in selected Airbus-plants in France, Spain, Germany and the UK. Each TPT<br />

describes a cluster of related work processes, e.g. “Production of metallic components for<br />

aircraft or ground support equipment”. In each plant skilled workers perform between 9 and<br />

12 TPT (for each profession) with surprisingly small differences between the countries<br />

(details can be found at http://www.pilot-aero.net). Being proficient in these tasks is not<br />

only part of skilled work but also the aim of the apprenticeship – with the exception of Spain,<br />

where no apprenticeship in aeronautics exists and new workers are trained for one work<br />

process only. In our approach (Bremer/Saniter 2006) the professional work on each of<br />

these tasks is the vocational education standard and basis for evaluation – not set by<br />

trainers but by the community of practice. Obviously beginners and advanced apprentices<br />

are not yet able to fulfill all requirements of a complex task – we assessed their<br />

performance by analyzing their approaches to a holistic evaluation task in terms of<br />

understandability, practicability and usability. For each profession an evaluation task related<br />

to the assembly of equipment was chosen and was presented in a paper-and-pencil test to<br />

around 150 first-, second- and third-year apprentices in France, Germany and the UK. The<br />

apprentices had 4 hours to work on the task.<br />

Surprisingly, the better solutions were quite similar regardless of the country and the<br />

years already spent in apprenticeship – it seems that different tracks lead to comparable<br />

results. More significant is the analysis and comparison of the performance of the<br />

apprentices who failed (partly): Whereas in our sample the German apprentices with<br />

acceptable solutions tended to ignore some aspects of the task, many participants from the<br />

UK followed the processes they had learnt, regardless of their applicability to the task, and the<br />

French apprentices developed inventive but unrealistic solutions.<br />

We will present detailed results and first hypotheses concerning the relation of competence<br />

development and systemic aspects of the respective national vocational education and training.<br />

References<br />

Bremer, R. 2005: Kernberufe — eine Perspektive für die europäische Berufsentwicklung? in: Grollmann,<br />

Philipp; Kruse, Wilfried; Rauner, Felix (Hrsg.): Europäisierung der Berufsbildung, Reihe Bildung und<br />

Arbeitswelt, Bd. 14, Münster, S. 45–62.<br />

Bremer, R.; Saniter, A. 2006: La recherche en matière développement de compétences chez les jeunes en<br />

milieu professionnel, in: L’École Comparée — Regards croisés franco-allemands, Groux, Dominique;<br />

Helmchen, Jürgen ; Flitner, Elisabeth, Paris.<br />



Why do some students stop showing progress on progress tests?<br />

Lydia Schaap, Erasmus University Rotterdam, The Netherlands<br />

H.G. Schmidt, Erasmus University Rotterdam, The Netherlands<br />

The Institute of Psychology at Erasmus University in Rotterdam has a problem-based (PBL)<br />

curriculum. Some of the goals of PBL are to promote a deeper understanding of the to-be-learned<br />

material and to train students as effective problem solvers and lifelong learners.<br />

Therefore, long-term retention of knowledge is a crucial aspect in this learning environment<br />

(Norman & Schmidt, 1992). The Institute of Psychology wishes to reflect these goals in its<br />

assessment policy. It was decided to implement progress testing as the main assessment<br />

tool in the bachelor programme, because this Progress Test (PT) focuses on long-term<br />

retention of knowledge and measures knowledge growth (Van der Vleuten, Verwijnen, &<br />

Wijnen, 1996). Moreover, the direct association between a specific course and its test is<br />

disconnected and endless resits of exams are prevented. By using the PT as the main<br />

assessment tool, it is hoped that students are challenged to study in a way that promotes<br />

long-term retention of knowledge and that students are motivated to follow (to some extent)<br />

their own interests when studying.<br />

The PT, as used in the psychology program, reflects the objectives of the first two years of<br />

the bachelor programme. In these two years the basic knowledge of several domains of<br />

psychology are studied in sequentially programmed, five-week courses. The PT is<br />

administered four times a year. To promote studying on a regular basis, every course ends<br />

with a ‘course test’. Course tests are not rewarded with credits; however, when students<br />

obtain an average score of 6.5 (on a ten-point scale) on these course tests, it is possible for<br />

them to compensate for insufficient achievement on the PT.<br />

Several analyses have been carried out on the assessment data. For instance, the<br />

relationship between course tests and progress tests has been studied, as well as students’<br />

knowledge growth. Analyses have shown that scores on course tests and progress tests<br />

correlate sufficiently (r = .70). From the analyses on the growth of student knowledge it<br />

appears that not all students show the same growth curves. In fact three groups can be<br />

distinguished (Bouwmeester & Van Onna, under revision), including a group that does not<br />

grow any more after the second year of study. Currently we are trying to explain why some<br />

students show more knowledge growth than other students. We do this by taking into<br />

account student variables (e.g. IQ, professional skills in tutorial groups and study behaviour,<br />

such as invested study time and processing strategies) as well as test variables (e.g. item<br />

characteristics and level of knowledge measured). Results will be presented at the<br />

conference to discuss the different ways of influencing student variables and test variables<br />

to promote long-term retention and knowledge growth.<br />
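
[Illustrative aside, not part of the original abstract: a minimal sketch of the kind of association check summarised above (the r = .70 figure), using invented scores rather than the study’s data.]<br />

```python
# Illustrative only: correlating course-test averages with progress-test scores (invented data).
from scipy.stats import pearsonr

course_test   = [6.8, 7.2, 5.9, 8.1, 6.3, 7.7, 5.5, 6.9]  # hypothetical course-test averages (1-10)
progress_test = [61, 68, 52, 75, 58, 71, 49, 63]           # hypothetical progress-test scores (% correct)

r, p = pearsonr(course_test, progress_test)
print(f"course test vs. progress test: r = {r:.2f} (p = {p:.3f})")
```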

References<br />

Norman, G. R. & Schmidt, H. G. (1992). The psychological basis of problem-based learning: A review of<br />

the evidence. Academic Medicine, 67, 557-565.<br />

Van der Vleuten, C. P. M., Verwijnen, G. M., & Wijnen, W. H. F. W. (1996). Fifteen years of experience with<br />

progress testing in a problem-based learning curriculum. Medical Teacher, 18(2), 103-109.<br />



Contextualising Assessment: The Lecturer's Perspective<br />

Lee Shannon, Liverpool Hope University, United Kingdom<br />

Lin Norton, Liverpool Hope University, United Kingdom<br />

Bill Norton, Liverpool Hope University, United Kingdom<br />

Aim: In the research literature assessment is recognised as a fundamental driver of the<br />

learning process (Boud, 2007; Gibbs and Simpson, 2004-5; Ramsden, 2003; Rust et al.,<br />

2005) yet the relationship between lecturers’ pedagogical beliefs and choice of assessment<br />

is an under-explored area. The aim of this study, which builds on work by Harrington et<br />

al. (2006) and Norton et al. (2005), was to elicit lecturers’ perceptions of assessment within<br />

the broader context of their philosophy of learning and teaching. Further focus is on specific<br />

aspects of the marking process, feedback relationships and the relationship between past<br />

experiences and current practices.<br />

Methodology: Thirty in-depth semi-structured interviews were carried out with lecturers in 18<br />

disciplines at three higher education institutions in the UK. Participants ranged in<br />

assessment experience from 1-22 years and were drawn from a spectrum of academic<br />

backgrounds: some had followed a career purely in academia, while others were experienced<br />

practitioners in their field before entering higher education. Thematic analysis, using the<br />

process developed by Braun and Clarke (2006), was selected as an appropriate interpretive<br />

tool in dealing with individual and shared meanings at the analysis stage. The analysis was<br />

carried out by three experienced researchers who worked on transcripts independently at<br />

first, then jointly in an iterative process, to arrive at an agreed thematic structure. Data found<br />

to be relevant to the overall research question were developed to form thematic strands for<br />

further analysis. At all times themes were checked against the original transcripts to ensure<br />

an accurate representation of the data.<br />

Findings: Themes highlighted include:<br />

• Practical and emotional entailments of assessment regimes.<br />

• Relationships between assessment and philosophies of learning and teaching.<br />

• Assessment for learning.<br />

• Feedback: the students’ response.<br />

• Perceptions of students’ understanding of assessment.<br />

• Features of a ‘good’ University education.<br />

• Lecturing experience and changes in self perception.<br />

• Words of wisdom for the uninitiated.<br />

Implications: This study uncovered a range of features of assessment practice in various<br />

contexts and disciplines and emphasises the participants’ collective view that there is a need<br />

for explicit training in assessment design, marking and the use of feedback, which<br />

resonates with the findings of Rust (2002). It was also found that participants’ approaches<br />

varied from those who held explicit philosophies of learning and teaching inextricably linked<br />

to their assessment practices, to those who made implicit assumptions about pedagogy that<br />

bore little relation to the choice of assessment.<br />

Discussion: The above issues are discussed in relation to current trends in assessment in<br />

higher education and the relationship between individual pedagogies and assessment<br />

practices. Practical guidelines for supporting staff development are suggested, as are key<br />

areas for further research.<br />



Learning to read: Modeling and assessment of<br />

early reading comprehension of the 4-year-olds in Macao kindergartens<br />

Pou Seong Sit, University of Macau, China<br />

Kwok-cheung Cheung, University of Macau, China<br />

Learning to read is no easy task for 4-year-olds, and this is especially so in traditional<br />

kindergartens in Macao. Challenging assessment in the form of an integrated assessment system<br />

(IAS) is fervently needed (Birenbaum et al., 2006). The aim of the present study is to scaffold<br />

children to higher levels of reading comprehension through reciprocal teaching methods so that<br />

children learn to make use of the acquired reading strategies (i.e. questioning, clarifying,<br />

predicting, summarizing) to read predictable storybooks (Palincsar & Brown, 1984). Notions of<br />

assessment, an integration of “assessment of learning” and “assessment for learning”, are<br />

extended and elaborated to form an IAS to comprise: (i) whole-class storybook reading<br />

instruction using reciprocal teaching methods; (ii) individualized storybook assessment followed<br />

by storyboard assessment after whole-class reading instruction, and (iii) assessment-driven<br />

action research at the reading corners that seek to bring up children’s reading comprehension<br />

levels at their zone of proximal development (Sit, 2007).<br />

Central to the research design is a conceptual model of early reading comprehension the<br />

knowledge structure of which transits from the language world progressively to the human<br />

world, and at the same time depicts children’s minds developing through “recognizing words<br />

and grammar” to “situated understanding of the texts read” (Tse et al., 2005). Capitalizing<br />

on the textbase and situation model of the story contexts, children progress from “learn to<br />

read” to “read to learn” via four distinct developmental milestones: (i) extracting the surface<br />

meanings of the texts and recognizing the apparent features of pictures read, (ii) inferring<br />

the underlying meaning of texts and inner structure of pictures for situated understanding;<br />

(iii) making connections of the meanings constructed from texts and pictures for holistic<br />

understanding of the story read; and (iv) acquiring reading strategies through reciprocal<br />

teaching methods (Sit, 2008). Worthy of particular mention is that interpretation of the four<br />

progressive milestones is done in the light of Feldman’s (1994) ideas of “sequentiality” and<br />

“hierarchical integration”, alongside the simultaneous restricted use of “universality” and<br />

“spontaneousness” of cognitive development.<br />

Central to the implementation of IAS is the development of storybook and storyboard<br />

assessment compatible with the proposed conceptual model. In the individualized<br />

storybook assessment, target children are guided in reading the storybook and are<br />

questioned and rated using a 4-point Likert scale in accordance with the objectives of the<br />

whole-class reading instruction (e.g. whether they know the main characters after reading the<br />

front cover of the storybook). In the individualized storyboard assessment immediately<br />

following the storybook assessment, children make use of a storyboard to tell the story just<br />

read. They are rated using a 4-point Likert scale according to the situated understanding<br />

that has already emerged in their minds. Areas assessed include degree of participation, utilization<br />

of materials provided, teacher-student interactions, power of expression, completeness of<br />

the story structure exhibited, consistency of story themes, and signs of creativity.<br />

Using the assessment results as feedback for the design of action research, the present<br />

study was successful in scaffolding children of varying learning ability to progress along the<br />

milestones envisaged in the assessment model.<br />



Assessment in action –<br />

Norwegian secondary-school teachers and their assessment activities<br />

Anne Kristin Sjo, Stord/Haugesund University College, Norway<br />

Knut Steinar Engelsen, Stord/Haugesund University College, Norway<br />

Kari Smith, University of Bergen, Norway<br />

This paper deals with the conference theme “Learning-oriented assessment”, and takes a<br />

closer look at formative assessment amongst teachers in lower-secondary school. The<br />

paper is research-related.<br />

One of the strongest criticisms against Norwegian teachers is their lack of formative<br />

assessment skills. Results from the 2003 PISA study indicate that Norwegian teachers<br />

spend less time on feedback and reinforcement strategies than teachers from other OECD<br />

countries (Grønmo et al., 2004). This is in spite of the fact that international research<br />

considers formative assessment strategies as extremely important determinants of<br />

students’ learning (Black & Wiliam, 1998; Coffield et al., 2004).<br />

The aim of this study is to find out what kind of formative assessment practices are<br />

identified amongst teachers in lower-secondary school, and more specifically, what kind of<br />

feedback processes can be detected and developed between teachers and their students.<br />

The research context is an action research project funded by the Norwegian Research<br />

Council which focuses on developing teachers’ assessment competence. The project is<br />

carried out at two schools involving nine teachers, each developing their own digital<br />

portfolio. During the course of the project the teachers will take note of experiences from<br />

their assessment practice and reflect upon these in relation to literature about assessment.<br />

The paper focuses on teachers’ feedback practices and the analysis is built on a<br />

multileveled ethno-methodological study. The intention behind the study is to gain insight<br />

into both the teachers’ real practices as seen in the classroom and how they themselves<br />

experience and explain their own feedback strategies.<br />

Initially two teachers were observed over a period of five days. The teachers were<br />

videotaped to show their interactions with the students in an attempt to reveal how the<br />

feedback processes were carried out step by step. Three situations from the video data are<br />

used in an interaction analysis (Jordan & Henderson, 1995) to show the different steps<br />

taken in the feedback processes. The observation period is followed by seven open-ended<br />

teacher interviews (Kvale, 2001). The teachers are asked to comment on various findings<br />

from the interaction analysis and elaborate on their own experiences with different ways of<br />

giving feedback to the students. In addition to an analysis of the teachers’ portfolios, this will<br />

draw a multi-levelled picture of existing practice, and also indicate how it is developing.<br />

The preliminary findings indicate that the formative assessment processes performed in<br />

classrooms and the feedback situations are more complex and have even more layers than<br />

first assumed. The student’s perception of a feedback comment is to a large extent<br />

dependent on the context in which it is given, and the same comment can be understood by<br />

the student as either informative or non-informative, as constructive or destructive, as<br />

feedback or feed-forward, all depending on the context. The interaction analysis shows that<br />

the communication between teachers and students has a certain tacit dimension and one<br />

important factor to be considered in the analysis is whether the interaction involves an<br />

implicit shared inter-subjective (Rommetveit, 1974) understanding between the teacher and<br />

the student or not.<br />



How do student teachers and mentors assess the Practicum?<br />

Kari Smith, University of Bergen, Norway<br />

There is recognition of the importance of Practicum in teacher education (Korthagen et al.,<br />

2001; Smith & Lev Ari, 2005). Understanding for the need of student teachers to gain<br />

access to practitioners' tacit knowledge is expanding (Cambell & Kane, 1998). There is<br />

more to teaching than the direct product of theoretical knowledge and practical skills.<br />

Teaching is highly contextualized, which makes assessment of teaching a complex issue.<br />

A central focus for the Practicum is to help students develop independent reflective<br />

competence for future career-long professional development (Dewey, 1933; Schön, 1987;<br />

Korthagen, 2001; Day, 1999, 2004). Student teachers are expected to recognize when<br />

learning takes place and to recognize what is needed for future learning (Brodie & Irving,<br />

2007). These are internal self-assessment activities related to a specific learning context<br />

and are not easily articulated.<br />

Recently the role of mentors has rightfully been receiving increased attention. Smith & Lev Ari (2005)<br />

show that mentors are the most significant contributors to students’ learning during Practicum.<br />

A key function of mentors is assessing students’ teaching competence. This is a difficult<br />

task as assessment serves multiple functions: to present student teachers with feedback<br />

and guidance, and to serve summative and judgmental functions to protect the profession<br />

from incompetence (Smith, 2006).<br />

There is tension between the supporting role and the assessment role, especially in relation<br />

to summative assessment (grading). However, mentor opinion is essential to strengthening<br />

the validity of assessment. Mentors know the context of teaching and are able to assess the<br />

appropriateness of actions in that specific setting. Mentors accumulate practical and non-documented<br />

evidence of assessment dialogue.<br />

The focus of the current study is to examine the extent of agreement between students’ and<br />

mentors’ assessment of the Practicum.<br />

A random sample of 20 students and their 20 mentors will be selected after the spring<br />

Practicum, and asked if they agree to respond electronically to an open-ended structured<br />

questionnaire with the following focus points:<br />

• What is a good Practicum?<br />

• Strong points exhibited by student<br />

• Issues that need to be strengthened<br />

• How to go about implementing alternatives for improvement<br />

• Overall assessment of the practice period (grade).<br />

The responses of the 20 pairs (student/mentor) will be compared to each other internally:<br />

• Comparison of open responses to each question<br />

• Comparison of grades (final question)<br />

Finally, the responses of the students and the teachers will be analysed separately to look<br />

for group commonalities.<br />

To ensure the validity of the findings the author and an additional researcher will analyse<br />

the data separately. The presented results represent the outcome of a moderation process.<br />

Data collection takes place in March 2008.<br />

Significance: The quality of communication and shared understanding of goals between<br />

students and mentors seem to be a major criterion for a successful Practicum and for<br />

quality assessment of students’ achievements. Until today, research on assessment of the<br />

Practicum is meagre (Graham, 2006), and hopefully the current study will deepen our<br />

understanding of the extent of agreement between students and mentors.<br />



Assessment of competencies of apprentices<br />

Margit Stein, Catholic University of Eichstätt-Ingolstadt, Germany<br />

Within the project ‘LAnf – Leistungsstarke Auszubildende nachhaltig fördern’ / ‘Assisting<br />

highly competent apprentices’ of the BiBB (Bundesinstitut für Berufsbildung / Federal<br />

Institute for Vocational Education and Training), instruments and assessments were tested<br />

for diagnosing highly competent apprentices and young professionals.<br />

In reference to the DeSeCo program of the OECD (‘Definition and Selection of Competencies’),<br />

competencies within the project LAnf were not defined merely as cognitive<br />

competencies but also as achievement motivation with a high willingness to learn as<br />

well as social competencies and autonomy. This definition of competencies within LAnf<br />

adheres to the three aspects of competencies in the DeSeCo program: the competence to act<br />

autonomously, the effective and interactive use of symbols such as language or mathematical<br />

symbols, and the effective interaction in various heterogeneous groups. Especially within the<br />

domain of professional education and training, a concept that would define competencies<br />

merely as skills would be rather one-sided.<br />

Up to now, mainly theoretical models regarding the concept of “professional competencies<br />

and skills” have been developed, and these have rarely been validated and assessed in practice.<br />

Within the project LAnf the assessment of professional competencies was developed on<br />

the assumption that, within professional contexts even more than in the context of instruction<br />

within schools, effective and competent action relies on factors besides mere cognitive<br />

competence, such as autonomy and effective interaction in heterogeneous groups.<br />

Based on these theoretical assumptions, within LAnf a psychometric assessment was<br />

developed and tested for assessing highly competent apprentices and young professionals.<br />

In a first step trainers of different enterprises and companies of varying size were asked to<br />

name apprentices and trainees who were outstanding concerning their competencies within<br />

daily work. This group of apprentices rated highly competent by their trainers was then<br />

in a second step compared with a group of vocational school students that was matched<br />

concerning age, sex and apprenticeship training position. Both groups were confronted with<br />

an assessment based on the three dimensions of competencies of the DeSeCo approach.<br />

The data showed that the group of apprentices rated highly competent by their trainers<br />

(n=52) outperformed the group of vocational school students (n=61) within the domains of<br />

cognitive competencies and intelligence. The first group was superior to an even more highly<br />

significant degree regarding achievement motivation and effective interaction in various<br />

heterogeneous groups. The match between professional interests and professional<br />

demands was not significantly different between the two groups. The data show that not only<br />

cognitive aspects but also motivational and social aspects of competencies differ between<br />

groups that display different professional performance.<br />
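
[Illustrative aside, not part of the original abstract: a minimal sketch of a two-group comparison of the kind reported above, using simulated scores and Welch’s t-test; the scale values, group means and variances are hypothetical.]<br />

```python
# Hedged sketch (not the project's actual analysis): comparing the trainer-nominated group
# (n=52) with the matched vocational-school group (n=61) on one simulated competence scale.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
nominated  = rng.normal(loc=3.8, scale=0.5, size=52)  # simulated achievement-motivation scores
comparison = rng.normal(loc=3.4, scale=0.5, size=61)  # simulated matched comparison group

t, p = ttest_ind(nominated, comparison, equal_var=False)  # Welch's t-test
print(f"t = {t:.2f}, p = {p:.4f}")
```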



Academics’ epistemic beliefs about their discipline and implications for their<br />

judgements about student performance in assessments<br />

Janet Strivens, The University of Liverpool, United Kingdom<br />

Cathal O'Siochru, Liverpool Hope University, United Kingdom<br />

In recent years there has been a growing focus within the debates on learning and teaching<br />

in higher education on the importance of the discipline. Academics’ primary professional<br />

allegiance is known to be to their subject and it is increasingly seen as ‘good practice’ to<br />

approach the development of their teaching skills through a disciplinary perspective. This<br />

brings into question what we really know about the nature of disciplines. Do academics<br />

within subjects share a consistent set of beliefs about their subject which do or should<br />

influence the way they teach and assess their students? If so, are these beliefs implicit or<br />

can they be clearly articulated, for the presumed greater benefit of students? Or are there in<br />

fact significant inconsistencies which may lead to different criteria being applied to judgements<br />

about the quality of student performance, with the likely result of leaving students confused<br />

and uncertain?<br />

This paper reports on two studies with very different methodologies but a similar focus on<br />

making explicit the beliefs of academics about their subject and implications for their<br />

students. The first study explores ‘epistemic match’ between students and staff (faculty)<br />

using a pair of measures, both based on Hofer’s questionnaire on epistemological beliefs<br />

(Hofer, 2000); the second uses in-depth interviews to explore lecturers’ perceptions of how<br />

and why they make certain judgements about the quality of their students’ work when<br />

carrying out assessments, and what this means in terms of explicating their beliefs about<br />

‘knowledge’ and ‘learning’ in their subject area.<br />

Findings from both studies will be compared to attempt to establish what has already been<br />

learned about the significance of academics’ beliefs about their subject in relation to the<br />

learning of their students, and to draw out lessons for future research in this area.<br />



Techniques for Trustworthiness as a Way to Describe Teacher Educators’<br />

Assessment Processes<br />

Dineke Tigelaar, Jan van Tartwijk, Fred Janssen, Ietje Veldman, Nico Verloop<br />

ICLON-Leiden University Graduate School of Teaching, Netherlands<br />

Portfolios are increasingly being used in teacher education, both as a learning tool and as a<br />

tool for assessment. Since their introduction, portfolios have been expected to contribute to<br />

the learning and development of prospective teachers (Bird, 1990; Zeichner & Wray, 2001).<br />

Teaching portfolios should make prospective teachers think more carefully about their<br />

teaching and subject matter (Anderson & DeMeulle, 1998; Bartell, Kaye & Morin, 1998;<br />

Darling-Hammond & Snyder, 2000). However, portfolio use is often problematic. First, the<br />

potential benefits for student teacher learning often fail to materialize (Darling, 2001).<br />

Second, unambiguous portfolio rating is difficult to achieve, since information in portfolios is<br />

often non-standardized and derived from various contexts (Schutz & Moss, 2004). This implies<br />

that assessors have to interpret portfolio information and take account of context before they<br />

can derive judgments, which causes reliability problems. Therefore, a portfolio procedure is<br />

needed that promotes both student teachers’ learning processes and responsible interpretation<br />

in context. Applying Guba & Lincoln’s (1989) criteria for ‘trustworthiness’ seems promising in<br />

this respect (Tigelaar et al., 2005). Complying with these criteria means that trust must be built<br />

between assessors and student teachers, with assessors being aware of student teachers'<br />

concerns through extensive involvement in their learning processes (‘prolonged engagement’,<br />

‘persistent observation’). Assessors should discuss hypotheses with a peer and search for<br />

counterexamples (‘peer debriefing’, ‘progressive subjectivity’). Interpretations should be tested,<br />

accounting for all available evidence (‘negative case analysis’), and be ‘member checked’ with<br />

student teachers. Interpretations should be documented and conclusions should be supported<br />

by the original data (‘dependability’, ‘confirmability’). Finally, information about assessment<br />

conditions should be available (‘thick description’).<br />

In this study, eight teacher educators participated. Teacher educators acted both as<br />

supervisor and assessor for prospective teachers. We explored how teacher educators’<br />

formative and summative assessment of student teachers can be described using the<br />

framework that Guba and Lincoln provide. Research question: to what extent do teacher<br />

educators’ assessment activities relate to techniques for trustworthiness?<br />

Teacher educators were interviewed about the application of trustworthiness criteria when<br />

working with the portfolio. Questions focused on: (1) using the portfolio and/or other sources<br />

of information for formative and summative assessment; (2) criteria and procedures for<br />

formative and summative assessment; (3) measures for guaranteeing the quality of the<br />

assessment processes. Data were analysed, testing tentative categories derived from Guba<br />

and Lincoln, summarized in matrices, discussed among the first and second author, and<br />

checked with the original interview transcripts and the participants.<br />

‘Prolonged engagement’ and ‘persistent observation’ were applied most by teacher<br />

educators. ‘Negative case analysis’ and ‘member check’ were evident in most interviews.<br />

Documenting (‘dependability’), ‘peer debriefing’ and ‘progressive subjectivity’ were done, but<br />

received more attention in cases of doubt. Techniques for tracing interpretation processes were applied least<br />

(‘confirmability’ and ‘thick description’). The results suggest that teacher educators need to<br />

make better use of scoring rubrics and artefacts in the portfolio to underpin their<br />

interpretations and conclusions, including their feedback to student teachers. Furthermore,<br />

methods for responsible portfolio interpretation might need to be made less time-consuming<br />

and more practical.<br />



Peer Assessment for Learning:<br />

a State-of-the-art in Research and Future Directions<br />

Marjo van Zundert, Open University, The Netherlands<br />

Despite the popularity and advantages of peer assessment in education, a major problem has<br />

not yet been solved. An enormous variety in peer assessment practices exists, which<br />

makes it difficult to draw inferences in terms of cause and effect. All the more so since,<br />

generally, the literature describes peer assessment in a holistic fashion (i.e., without specifying<br />

all variables present). To date, it is unclear exactly under which circumstances peer<br />

assessment is beneficial for student learning, and it is still inconclusive precisely what<br />

produces satisfactory measurement qualities such as reliability and validity. Hence, this study attempted<br />

to investigate which variables foster optimal peer assessment that is beneficial for student<br />

learning and has satisfactory measurement qualities.<br />

We tackled this problem through an inquiry into 26 experimental studies to map variety in peer<br />

assessment and to identify which strategies contribute to learning and measurements.<br />

Literature was selected on the basis of five criteria: (1) published between 1990 and 2007;<br />

(2) published journal article; (3) journal listed in Social Sciences Citation Index, domain<br />

Education & Educational Research; (4) empirical study; (5) main topic is peer assessment<br />

or related term.<br />

This literature inquiry resulted in a descriptive review, in which four outcome categories<br />

were distinguished. The first category concerned measurements of peer assessment.<br />

Measurement issues included among others agreement between multiple peer<br />

assessments, or agreement between student and staff assessment. For learning from peer<br />

assessment, three categories were distinguished: domain skill, peer assessment skill, and<br />

student attitudes. Learning of domain skill referred to improved quality of students’ work.<br />

Peer assessment skill concerned students’ competence in assessing peers. Student<br />

attitudes comprised their views on peer assessment. Measurements were enhanced by<br />

training and experience. Domain skill was fostered by providing students with the<br />

opportunity to revise their work on the basis of peer assessment. Peer assessment skill was<br />

ameliorated by training and dependent on student characteristics. Student attitudes were<br />

also positively influenced by training and experience.<br />

The multiplicity of peer assessment practices and the holistic way of reporting were<br />

underlined. Future research should strive for more transparency in peer assessment effects<br />

by true or quasi-experimental studies, in which relations between variables are indicated, so<br />

that strong inferences in terms of cause and effect can be drawn. Besides higher education,<br />

research can be broadened to vocational and secondary education, considering current<br />

developments there. Topics that need more scrutiny comprise long term learning effects,<br />

feedback, the role of interpersonal variables, and the distinction between assessing and<br />

being assessed. Also, more clarity in measurement issues and uniformity of measurement<br />

instruments are desired.<br />



Investigating the Pedagogical Push and Technological Pull of<br />

Computer Assisted Formative Assessment<br />

Denise Whitelock, The Open University, United Kingdom<br />

Over the last ten years, learning and teaching in higher education have benefited from<br />

advances in social constructivist and situated learning research (Laurillard, 1993). In<br />

contrast, assessment has remained largely transmission-orientated in both conception and<br />

practice (see Knight & Yorke, 2003). This is especially true in higher education, where the<br />

teachers’ role is usually to judge student work and to deliver feedback (as comments or<br />

marks) rather than to involve students as active participants in assessment processes.<br />

This paper reports on a project which set out to provide further insights into the role of<br />

electronic formative assessment in Higher Education and to point the way forward to new<br />

assessment practices, capitalising on a range of open source tools. The project built upon<br />

the premise that assessment and learning need to be properly linked. It explored the factors<br />

that influence assessment inputs, processes and outcomes by:<br />

a) Developing a suite of technological tools at different levels of support for collaborative<br />

and free text entry e-assessment<br />

b) Evaluating a series of formative assessments across a number of disciplines.<br />

An agile methodological approach was adopted rather than a plan-driven methodology for<br />

the development of the software and the user evaluation, since the former supports<br />

adaptation rather than prediction. Student surveys and a case study methodology were<br />

employed to understand the pedagogical drivers and barriers associated with these types of<br />

assessment.<br />

Findings<br />

One of the more challenging aspects in the current e-assessment milieu is to provide a set<br />

of electronic interactive tasks that will allow students more free text entry and provide<br />

immediate feedback to them. Open Comment was a system that was built to accommodate<br />

free text entry for formative assessment for History and Philosophy students. It forms part of<br />

the pedagogical push from the Arts Faculty to construct systems that help students decode<br />

feedback, internalise it and become more self-regulated learners.<br />

Other tools developed in this project include a BuddySpace, BuddyFinder and SIMLINK<br />

combination which assisted students to work remotely in a collaborative fashion to make<br />

predictions, using a science simulation, which were embedded in a series of formative<br />

assessment tasks.<br />

One of the major findings from this project is the creativity of staff, both academic and<br />

technical, to create formative e-assessments with feedback and collaborative online tasks<br />

that empower students to become more reflective learners. It might appear in the short term<br />

that the technological pull is currently overtaking the pedagogical push in the e-assessment<br />

arena, but this project has shown, with this collection of open source applications, that there<br />

is a way forward to redress the balance. The approach adopted here sits well within a<br />

constructivist paradigm, which has often been less well served in the past by formal<br />

summative assessment that is not an integral part of the knowledge construction process.<br />



Strict Tests of Equivalence for and Experimental Manipulations of<br />

Tests for Student Achievement<br />

Oliver Wilhelm, Ulrich Schroeders, Maren Formazin, Nina Bucholtz<br />

IQB, Humboldt-University Berlin, Germany<br />

Much of the research on equivalence of measurement instruments across test media is<br />

easy to summarize: Unless a test of maximal behavior is strongly speeded, test media will<br />

be of negligible relevance for what the test measures. For many applied and scientific<br />

purposes, this statement is obviously too simplistic. For example, high disattenuated<br />

correlations across test media do not ascertain the irrelevance of test media. Similarly, two<br />

measures with exactly the same score distributions do not necessarily measure the same<br />

ability underlying observed maximal behavior. The issue of equivalence across<br />

manifestations of a measure is a nuisance because, due to the lack of generalizable results<br />

about the absence of determinants of divergence, the equivalence of two forms of a test has<br />

to be determined for each test in each application population and across soft- and hardware<br />

realizations. Apparently, this scientifically not very intriguing problem is a psychometric<br />

Pandora’s box.<br />

However, a lack of equivalence does not necessarily indicate failure of converting a<br />

measure. Lack of equivalence can also indicate meaningful improvements of a measure.<br />

For example conventional listening comprehension tasks have a variety of shortcomings<br />

that can be overcome in order to improve measurement quality in computerized testing: By<br />

using computers, it is easier to ensure the same audio stream in the same quality for all<br />

participants, there are more degrees of freedom in administering a task in a group setting<br />

(rewinding, forwarding, pausing), and the response alternatives can be included into the<br />

audio stream. Currently, the importance of such improvements is underinvestigated.<br />

In study one, we administered reading and listening comprehension tests of English<br />

as a foreign language in traditional and computerized versions to a large sample of<br />

secondary students. Test version and sequence were varied between subjects. An attempt<br />

was made to keep the computerized versions of all measures as close as possible to the<br />

conventional test form even if that implied suboptimal operationalisations of a measure. In<br />

study two, we have used modified versions of listening comprehension tests, implementing<br />

not only stimuli but also responses in audio format with a similar sample. Additionally,<br />

completely newly developed video comprehension tests were administered. In both<br />

studies, standard demographic questionnaires and a questionnaire assessing computer<br />

experiences allow for group comparisons of means and covariances in structural equation<br />

modeling. Fluid and crystallized intelligence measures serve as covariates.<br />

The focus of all analyses is the covariances between latent variables in a multi-group context.<br />

The discussion will consider a) advantages and disadvantages of computerized testing from<br />

the perspective of construct validity and b) opportunities for the assessment of hitherto<br />

unmeasurable aspects of student achievement.<br />
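
[Illustrative aside, not part of the original abstract: the classical correction for attenuation referred to above, shown as a small sketch with invented reliability and correlation values; it illustrates why even a high disattenuated cross-media correlation does not by itself establish equivalence.]<br />

```python
# Sketch with invented values; not the study's analysis.
def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Classical correction for attenuation: r_true = r_xy / sqrt(r_xx * r_yy)."""
    return r_observed / (rel_x * rel_y) ** 0.5

# Even a corrected correlation close to 1.0 says nothing about possible mean or
# covariance differences between paper-based and computerized administration.
print(round(disattenuate(r_observed=0.81, rel_x=0.90, rel_y=0.88), 2))  # -> 0.91
```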



Roundtable Papers<br />





Why the moderate levels of inter-assessor reliability of student essays?<br />

Morten Asmyhr, Østfold University College, Norway<br />

Although some studies indicate that inter-assessor reliability is adequate when student<br />

papers in the essay format are considered (e.g. Jonsson & Svingby, 2007), other studies<br />

reveal serious shortcomings as to assessor reliability, both where examination papers and<br />

portfolios are concerned. A number of studies, some of them ancient, are revisited to search<br />

for regularities that might help identify factors that contribute to low marker reliability.<br />

On a recent occasion, all assessors agreed to submit their tentative mark prior to the final<br />

marking session at two separate examinations. The analysis of the results revealed<br />

differences between the two markers of more than 2 steps on a 7-point scale at one of the<br />

examinations. Results on the other examination were somewhat better as to marker<br />

reliability. A small number of the student papers were selected for the second part of the<br />

study. A number of assessors from the local pool of assessors for the examination in<br />

question were recruited to mark the papers individually and to record their practical procedure<br />

when marking. Their use of defined and specified assessment standards and criteria was<br />

made a significant area of concern in their reports. A sample of students sitting for the same<br />

examination was also recruited to mark the same papers. The results from the whole data<br />

set were compiled, analysed and fed back to the group of students for them to assess the<br />

assessment of the whole group of assessors.<br />

In the paper, a more comprehensive survey of studies on assessment reliability will be<br />

presented and the results from the present study will be given and discussed in relation to<br />

practical as well as theoretical concerns pertinent to examination and assessment<br />

procedures. Is it possible to maintain satisfactory consistency across markers, or do<br />

students have to accept examination results that are equally dependent upon who the<br />

marker(s) are and upon the quality of the students’ papers?<br />
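
[Illustrative aside, not part of the original abstract: a minimal sketch of how inter-marker consistency on a 7-point scale might be quantified, using invented marks; exact agreement, agreement within 2 steps, and a quadratically weighted kappa are shown.]<br />

```python
# Illustrative only: two markers' grades on the same ten papers (marks are invented).
import numpy as np
from sklearn.metrics import cohen_kappa_score

marker_a = np.array([5, 4, 6, 3, 5, 2, 6, 4, 3, 5])  # hypothetical marks, 7-point scale
marker_b = np.array([4, 4, 3, 3, 6, 4, 5, 4, 2, 5])

exact      = np.mean(marker_a == marker_b)              # exact agreement
within_two = np.mean(np.abs(marker_a - marker_b) <= 2)  # agreement within 2 steps
kappa      = cohen_kappa_score(marker_a, marker_b, weights="quadratic")
print(f"exact: {exact:.2f}, within 2 steps: {within_two:.2f}, weighted kappa: {kappa:.2f}")
```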

References<br />

Jonsson, A. & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational<br />

consequences. Educational Research Review, 2(2): 130-144.<br />



Approaches to the assessment of graduate attributes in higher education<br />

Simon Barrie, C. Hughes, C. Smith<br />

The University of Sydney, Australia<br />

This paper draws on a literature review of the various approaches to the assessment of generic<br />

graduate attributes. The literature review was conducted as the first stage of a national research<br />

study exploring the integration of generic attributes in Australian universities’ assessment<br />

practices (Barrie, Hughes & Smith 2007). The issue of graduate attributes (also referred to by<br />

some authors as generic, core or employability skills), has received considerable attention in<br />

recent years as universities seek to renew and articulate their purposes and demonstrate the<br />

efficient achievement of these, particularly in response to calls for accountability (Barrie, 2005).<br />

Graduate attributes have been widely taken up by universities in many parts of the world<br />

including Australia. While university policy claims commonly refer to the “integration” and<br />

“embedding” of graduate attributes, questions have been raised in relation to the alignment<br />

between what is espoused, what is enacted and what students experience and learn (Bath,<br />

Smith, Stein and Swan, 2004) in assuring their development. Australian Universities Quality<br />

Agency (AUQA) audits have revealed the need for more systematic addressing of generic<br />

attributes in curricula and the provision of stronger evidence in support of institutional claims<br />

than policy statements and relatively surface-level mapping activities.<br />

It has been argued that the strongest evidence of graduate attribute policy implementation is their<br />

embedding in course and program assessment activities (Barrie, 2004). However, there are<br />

significant barriers to the achievement of this, including: lack of a conceptual basis and consistent,<br />

coherent operational definitions of the intended outcomes, the difficulty in meaningfully<br />

communicating assessment standards to students, the challenge of articulating the<br />

developmental progression and the temptation to resolve problems by defining skills at an ever-increasing<br />

level of detail which soon becomes unworkable for academics and students alike<br />

(Barrie, 2005; Washer, 2007). Despite these barriers, the literature contains numerous examples<br />

of approaches to the assessment of graduate attributes which demonstrate a wide diversity of<br />

methods, levels of student involvement and disciplinary contexualisation. These include:<br />

• non-traditional assessments such as moral assessments and exit interviews (Dunbar,<br />

Brooks, and Kubicka-Miller, 2006)<br />

• attempts to develop institutional grade descriptors based on generic attributes (Leask, 2002)<br />

• the development of resources such as templates to guide the design of assessment<br />

(Watson, 2002)<br />

• authentic outcomes based approaches using portfolio assessment (Hernon, 2004;<br />

Seybert, 1994)<br />

• standardised tests such as the Graduate Skills Assessment Test and the Collegiate<br />

Skills Assessment<br />

• self-rating scales (e.g. CEQ)<br />

• integrated performance based assessment tasks (Hart, Bowden & Watters, 1999)<br />

• the use of postgraduate assessment strategies, e.g. oral presentation and defence, in<br />

undergraduate contexts.<br />

This paper presents a typology of assessment approaches based on an analysis of the types of<br />

assessment strategy. This typology is considered in relation to the barriers to integration<br />

identified in the literature on generic attributes and in relation to emerging theoretical and<br />

conceptual models of graduate attributes. In doing so the paper identifies the potential for<br />

different assessment approaches to overcome the key barriers to assessment of generic<br />

attributes and will stimulate discussion in relation to both theoretical and practical issues.<br />



Assessment for learning in and beyond courses:<br />

a national project to challenge university assessment practice<br />

David Boud, University of Technology, Sydney, Australia<br />

There has been considerable debate in recent years about how assessment can contribute<br />

to student learning. However, most of this has focused on learning within the framework of<br />

the course of study being undertaken. In a changing world, however, assessment also needs to foster the learning that will occur after course completion, as higher education needs

to provide students with a foundation for a lifetime of professional practice in which they will<br />

be continually required to learn and to engage with new ideas that go beyond the content of<br />

their university course.<br />

As part of this, a critique has been building of the inadequacy of the formative assessment practices intended to help students' learning during their courses (e.g. Sadler, 1998; Yorke, 2003).

There has also been substantial criticism of the role of summative assessment and its<br />

negative effects on student learning (e.g. Ecclestone, 1999; Knight, 2002; Knight & Yorke,

2003). There is also concern that simply increasing feedback to students is not, in itself, a<br />

worthwhile practice unless it also builds students’ capacity to critique and improve their own<br />

work (Hounsell, 2003). There is a flourishing literature exploring assessment practices that<br />

have positive effects on learning and there have been important initiatives that look at the<br />

long-term consequences of university courses, including assessment, on subsequent<br />

learning in professional practice (Mentkowski, 2000).<br />

Boud (2000) discussed the needs of assessment in a learning society and introduced<br />

requirements for a new way of thinking about assessment. He suggested that current<br />

assessment practices in higher education did not equip students well for a lifetime of<br />

learning and the assessment challenges they would face in the future. He argued that<br />

assessment practices should be judged from the point of view of whether they effectively<br />

equip students for a lifetime of assessing their own learning. More recently, Boud (2007) suggested that assessment needed to be reconceptualised as an activity of informing judgement, in particular informing the judgements of learners about their own work.

The Carrick Institute for Learning and Teaching in Higher Education, the main funding body for teaching and learning development for Australian universities, has established a one-year project to draw on international research to examine how assessment practices that focus

on learning in and after courses can be developed, particularly in areas where there are<br />

large cohorts of students.<br />

The Roundtable discussion will take place at the very start of this project and it will seek to<br />

elicit international collaboration. It will focus on the questions: what assessment practices<br />

have shown potential for both meeting summative purposes but also informing student<br />

judgement in ways that carry beyond the end of courses? What evidence is there for the<br />

utility of such practices? How can they be extended beyond their initial sites of<br />

development? How can the uptake of new assessment practices within universities be<br />

influenced? The approach the project has adopted on these matters will be discussed and<br />

the views of participants on these canvassed.<br />



Electronic reading assessment: The PISA approach for the<br />

international comparison of reading comprehension<br />

Kwok-cheung Cheung, University of Macau, China<br />

Pou-seong Sit, University of Macau, China<br />

This paper seeks to document how the Macau-PISA Center is preparing for the electronic assessment of reading literacy of 15-year-old students in secondary schools in Macao. First,

emerging concepts of reading literacy with regard to life-long learning for our next<br />

generation in the digital age will be explicated. Congruence of the proposed concepts of<br />

electronic reading literacy with existing curricular and instructional provisions in Macao is<br />

evaluated. Second, the Reading Literacy Assessment Framework, a response to the OECD<br />

“DeSeCo Project” (i.e. Definition and Selection of key Competences) to include the ICT<br />

(information and communication technology) components as key competences, is<br />

presented to highlight the constructs assessed and nourished in the classrooms. Third, the<br />

paper demonstrates how test items and tasks for electronic assessment of reading literacy<br />

can be designed, and subsequently developed into an individualized computerized testing<br />

platform.<br />

Central to the PISA approach is the definition of reading literacy. According to PISA, reading<br />

literacy is an individual’s capacity to understand, use and reflect on written texts, in order to<br />

achieve one’s goals, to develop one’s knowledge and potential and to participate in society<br />

(OECD, 2006). Reading literacy is assessed in relation to: (1) text format (i.e. continuous<br />

versus non-continuous texts of one of the following five types, i.e. description, narration,<br />

exposition, argumentation and instruction); (2) aspects of the reading processes (i.e.<br />

retrieving information, forming a broad general understanding, developing an interpretation,<br />

reflecting on the contents and formal qualities of a text); and (3) situations (i.e. reading for

work, education, private and public use). This definition goes beyond the basic skills of word<br />

recognition, phonemic awareness, decoding and comprehension, and it requires the reader<br />

to be an active and reflective user of texts so as to expand one's knowledge and potential,

i.e. one has to understand, apply, integrate and synthesize texts to fulfill one’s life-long<br />

learning goals.<br />

In the electronic medium, the reading tasks generally require students to identify

important questions, locate information in line with the access structure of the reading tasks,<br />

analyze the usefulness of the information retrieved, integrate information retrieved from<br />

multiple texts, and then communicate replies through electronic means. The electronic texts that students come across are therefore dynamic, with blurred boundaries. In the print medium, reading tasks involve fixed texts with clearly defined boundaries, and what students do during reading is: (1) retrieve information; (2) interpret texts; and (3) reflect

and evaluate. Delineation of an assessment framework for electronic reading literacy<br />

demands an incorporation and extension of concepts from the print to the electronic<br />

medium. The three distinctive aspects of electronic reading literacy that have implications for the design of assessment rubrics are: (1) accessing and retrieving appropriate

information online via search engines and embedded hyperlinks; (2) constructing and<br />

integrating texts read recursively in accordance with access structures by clicking links, and<br />

searching for usable information until the reader judges synthesis has been done<br />

meaningfully; (3) reflecting and evaluating critically authorship, accuracy, as well as quality<br />

and credibility of information retrieved and conveyed in the electronic texts.<br />



Developing the autonomous lifelong learner:<br />

tools, tasks and taxonomies<br />

Wendy Clark, Northumbria University, United Kingdom<br />

Jackie Adamson, Northumbria University, United Kingdom<br />

This paper describes an action research project undertaken with undergraduate students at<br />

levels 4 and 5. Responding to the recent focus on lifelong learning and portfolio based<br />

personal development planning (PDP), this ongoing project encourages students to adopt a<br />

deep, active approach to learning, and thus take responsibility for their own learning.<br />

Assessment is widely recognised as an important influence on student learning. Recent<br />

conceptual shifts in thinking about assessment have highlighted the importance of<br />

developing students as autonomous learners by viewing assessment as a learning tool<br />

rather than a measurement of knowledge, and portfolios are mentioned as one of the<br />

modes appropriate for the new thinking about assessment (Havnes and McDowell, 2008).<br />

Therefore, the modules forming the basis of the project, in which the PDP concept was<br />

integrated into the curricular content and supported by the use of an ePortfolio, were<br />

designed following the precepts of Biggs’ theory of ‘constructive alignment’ (Entwistle 2003).<br />

This fits well with the PDP/ePortfolio philosophy for encouraging learner autonomy, as well<br />

as fulfilling the assessment for learning (AfL) requirements for formative feedback and low-stakes opportunities for practice before submission for rigorous summative assessment.

Although there is still ongoing debate about the criteria to be used for the assessment of<br />

portfolios (Smith and Tillema, 2008), social scientists such as Baume (2002) and Biggs<br />

(1997) have shown that a qualitative view of validity and reliability can ensure adequate<br />

rigour for summative assessment. However, it is necessary to ensure inter-rater reliability as<br />

well as to make the learning goals and assessment criteria transparent for learners (Havnes<br />

and McDowell, 2008). A taxonomy for portfolio evaluation has therefore been developed<br />

which is easily understood and applied by tutors and students.<br />

In order to study the impact of this learning environment, a variety of data has been<br />

collected and analysed. This includes:<br />

• student achievement of the stated learning outcomes of the modules, assessed in<br />

accordance with our taxonomy for portfolio evaluation;<br />

• “added value” as indicated by a correlation of UCAS entry points with summative<br />

assessment results and a measurement of student engagement;<br />

• the quality of student reflection and self-evaluation demonstrated in the reflective<br />

commentaries.<br />

Results from these analyses show a positive impact.<br />

In order to provide more empirical evidence, students this year have completed the<br />

Effective Lifelong Learning Inventory (ELLI) questionnaire (details available at:<br />

https://secure.vlepower.com/nlst/core/main.htm). This profiling tool serves a double<br />

purpose: it provides students with a vocabulary to describe their own thought processes and<br />

to articulate their ideas, and it provides statistical data to tutors which indicate development<br />

of both the cohort's and individual students' learning characteristics over time. Preliminary

analysis of these data, together with student opinion obtained in written commentaries and<br />

in debriefing interviews, shows that the learning environment created has brought about<br />

positive change.<br />

We welcome discussion of ways of evaluating student progress towards learning autonomy,<br />

in particular of the effectiveness of the ELLI profiling tool as a measurement of learning<br />

power development.<br />



Assessing the Art of Diplomacy? Learners' and Tutors' perceptions

of the use of Assessment for Learning (AfL) in non-vocational education<br />

Gillian Davison, Northumbria University, United Kingdom<br />

Craig McLean, Northumbria University, United Kingdom<br />

This paper will present findings from an authentic assessment (Assessment for Learning) project undertaken with a group of final-year undergraduate students on a (non-vocational) Politics degree who elected to take a module called ‘Diplomacy’ at Northumbria University.

Teaching on the module comprised not only a mixture of traditional lectures and seminars,<br />

but also a board-game exercise. It is this exercise – that is, students playing the Diplomacy<br />

board-game – that will be the focus of our paper.<br />

The research methodology took the form of non-participant observation and semi-structured<br />

interviews with learners. Data was gathered in relation to the tutor’s and students’<br />

experiences throughout the course of the module. Data was also taken from the learners’<br />

formative assessment activities and the final summative assessment which the learners<br />

were required to undertake.<br />

Students organised themselves into one of seven “teams” based upon the imperial map of<br />

Europe. Their objective was simple: to win the game by being the last “power” standing. The<br />

module is taught to a group of 24 students. This is an optimal number for the Diplomacy board-game, as it results in teams that are neither too small (i.e., one or two players) nor too large (i.e., more than five individuals per team).

The Diplomacy board game is a vital learning resource because it allows students to<br />

develop skills in negotiation, bargaining and the agreement of Treaties. The board game<br />

also lets students consider questions such as whether it is ever permissible to lie, cheat or<br />

break promises. This approach to learning requires students to be active learners (anybody<br />

not paying attention is likely to be eliminated from the game!) and involves students<br />

focusing on values, building alliances, cultivating relationships and, most importantly, trust.<br />

Not only is this a vital aspect of standard diplomatic relations, but it also enables students to<br />

meet the module’s learning outcomes (the ability to: critically examine the role of diplomacy<br />

in today’s world order; apply diplomatic thought to real-world situations; and to examine<br />

critically whether current understandings of diplomacy can help to explain the business of<br />

interstate relations).<br />

Over the twelve week period students are required to compile formative assessment<br />

material in the form of seminar logs, detailing their experiences of the Diplomacy board<br />

game. This constitutes some 20% of their overall mark, and serves as a platform for the<br />

extended summative essay that students write at the end of the module.<br />

The paper aims to demonstrate that authentic assessment activities can be used effectively<br />

within non-vocational subject areas and do not necessarily need to be located in areas of<br />

professional practice.<br />



Assessment of oral presentation skills in higher education<br />

Luc De Grez, Martin Valcke, Irene Roozen<br />

University College Brussels, Belgium<br />

Research Problem: Underlying this research is the concept of self-regulated learning from a<br />

social cognitive perspective (Bandura, 1997). A learner acquires standards and must eventually be capable of comparing his or her actual performance with these standards and of trying to

close the gap. This process generates internal feedback and is often supplemented by<br />

external feedback from teachers and peers. Both forms of feedback have to be accurate<br />

because accurate calibration seems a necessary condition for productive self-regulated learning. We can recast this demand for accurate calibration as a reliability problem if we require the same assessment result whether performance is assessed by teachers, by peers, or by the learner. An overview of the literature about self- and

peer assessment in the domain of oral presentation skills generated some questions and<br />

remarks: Can the optimistic view be maintained that only a simple instruction is needed to<br />

generate peer and self-assessments that are in agreement with assessments by<br />

professionals? And what if there’s no such agreement? In that case, the generalizability<br />

analysis (Brennan, 2000) seems a good first step for analysing the error variance. An under-investigated element is the perceptions students hold of peer and self-assessment.

Research Questions: (1) What is the agreement between peer and self-assessments and professional assessments? (2) What are students' perceptions of peer assessment?

Research Design: Research instruments<br />

Assessment instrument for ‘oral presentation performance’: A rubric was constructed containing three content-related items (introduction, structure, and conclusion), five delivery-related items (eye contact, vocal delivery, enthusiasm, contact with the public and body language), and one overall item.

Perception of ‘peer assessment’: An existing questionnaire was used and presented twice.<br />

Procedure: First year students (n=57) delivered three short oral presentations about<br />

prescribed topics and the presentations were videotaped. Participants assessed their own<br />

first (n=24) or second (n=54) presentation. Five professional assessors assessed in total<br />

209 recordings. A total of 29 presentations were assessed by six peers.<br />

Research results: Overall, we have found a positive correlation between professional and<br />

peer assessment scores (significant for four criteria) and between professional and self-assessment scores (significant for five criteria). The total score of professional assessments

is significantly lower than self- and peer assessments. However, scores on eight of the nine<br />

items of the rubric are significantly different between professional and peer assessments.<br />

A two-facet generalizability study was conducted to obtain variance estimates and to<br />

determine the number of peers needed for reliable scores. The analysis of the variance<br />

components showed that the variance in scores related to the oral presentations is low and<br />

the variance component for peers is large. The generalizability coefficient points to good reliability (.81), and the results suggest that four peers are sufficient when nine criteria are

used. The perception of peer assessment is predominantly positive and becomes<br />

significantly more positive in the second questionnaire.<br />
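To make the two-facet generalizability logic above concrete, the sketch below shows how a relative generalizability coefficient can be projected for different numbers of peer raters in a presenters x raters x criteria design. The variance components in it are invented placeholders for illustration only, not the estimates obtained in this study.

```python
# Illustrative D-study for a two-facet crossed design
# (presenters x peer raters x criteria), following generalizability theory.
# The variance components below are invented placeholders, not the values
# estimated in this study; they only show how the relative coefficient
# E(rho^2) responds to the number of peer raters.

def relative_g_coefficient(var_p, var_pr, var_pc, var_prc_e, n_raters, n_criteria):
    """E(rho^2) = person variance / (person variance + relative error variance)."""
    relative_error = (var_pr / n_raters
                      + var_pc / n_criteria
                      + var_prc_e / (n_raters * n_criteria))
    return var_p / (var_p + relative_error)

if __name__ == "__main__":
    components = dict(var_p=0.45,      # presenter (true score) variance
                      var_pr=0.35,     # presenter x rater interaction
                      var_pc=0.05,     # presenter x criterion interaction
                      var_prc_e=0.45)  # residual
    for n_raters in (1, 2, 4, 6):
        g = relative_g_coefficient(n_criteria=9, n_raters=n_raters, **components)
        print(f"{n_raters} peer rater(s), 9 criteria: E(rho^2) = {g:.2f}")
```

Varying the number of raters (and criteria) in a D-study of this kind is how the number of peers needed for an acceptable coefficient is determined.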

References<br />

Bandura, A. (1997). Self efficacy: the exercise of control. New York: Freeman.<br />

Brennan, R. (2000). (Mis)Conceptions about Generalizability theory. Educational Measurement: Issues and<br />

Practice, 5-10.<br />



Mobile Assessment of Practice Learning:<br />

An Evaluation from a Student Perspective<br />

Christine Dearnley, University of Bradford, United Kingdom<br />

Jill Taylor, Leeds Metropolitan University, United Kingdom<br />

Catherine Coates, Leeds Metropolitan University, United Kingdom<br />

The ALPS CETL* aims to develop and improve assessment, and thereby learning, in practice<br />

settings for health and social care students. The centre is working towards an<br />

interprofessional programme of assessment of common competencies such as<br />

communication, team working and ethical practice among health and social care students.<br />

The assessment tools will be delivered in electronic, mobile format. Between July and<br />

December 2007, ALPS issued nearly 900 mobile devices with unlimited data connectivity to<br />

students undertaking practice based learning and assessment across the ALPS partnership.<br />

ALPS is implementing the infrastructure to develop, deliver and manage learning content and<br />

assessments on mobile devices to students on a large scale across the five partner HEIs.**

The study that forms the basis of this paper is being undertaken across all five partner sites. It<br />

incorporates students from sixteen professions and will investigate the impact of the ALPS<br />

mobile assessment processes on learning and assessment within practice settings over an<br />

eighteen month period. Early outcomes of the study will be reported with an emphasis on the<br />

extent to which assessment of core competencies for practice can be facilitated using the ALPS<br />

mobile assessment processes and the relationships between these processes and learning in<br />

practice settings. The ALPS mobile assessment processes have two further innovative<br />

components, which will be explored as part of this study, these are inter-professional<br />

assessment of common competencies and service user involvement in practice assessment.<br />

Whilst there is considerable evidence of mobile devices being used in health and social<br />

care provision, their use for assessment of professional practice is a new and innovative<br />

development that has not been fully evaluated. This study builds on the ALPS IT Pilots,<br />

which explored the feasibility and key issues of using mobile technologies in the<br />

assessment of health and social care students in practice settings and were reported at the<br />

EARLI conference 2006 (Dearnley & Haigh, 2006; Taylor et al., 2006). Key benefits were identified: a reduction in paperwork and in the risks of handling paper copies of assessment data; enhanced communication between peers and tutors, leading to increased professional interactions; and, in some cases, mobile devices seemed to help students overcome barriers to writing and to instil pride in their work. The project team is committed

to further exploring the full pedagogic potential of this initiative.<br />

*Assessment & Learning in Practice Settings is a Centre for Excellence in Learning & Teaching (CETL) funded by the Higher Education Funding Council for England

http://www.alps-cetl.ac.uk/<br />

**Universities of Bradford, Leeds, Huddersfield, Leeds Metropolitan and York St John<br />

University College<br />

References<br />

Dearnley, C.A. & Haigh, J. (2006). Using Mobile Technologies for Assessment and Learning in Practice Settings: A Case Study. Third Biennial Joint Northumbria/EARLI SIG Assessment Conference, 30 August - 1 September, Co. Durham, UK.
Taylor, J.D., Coates, C., Eastburn, S. & Ellis, I. (2006). Using mobile phones for critical incident assessment in health placement practice settings. Third Biennial Northumbria/EARLI SIG Assessment Conference.



How reliable is the assessment of practice, and what is its purpose?<br />

Student perceptions in Health and Social Work<br />

Margaret Fisher, University of Plymouth, United Kingdom<br />

Tracey Proctor-Childs, University of Plymouth, United Kingdom<br />

Introduction: The Centre for Excellence in Professional Placement Learning (Ceppl) is<br />

based at the University of Plymouth in Devon, England. This Centre seeks to share and<br />

develop excellent practice in collaboration with other disciplines which have a placement or<br />

practice component (QAA 2003).<br />

One research strand is evaluating practice assessment methods in Midwifery, Social Work<br />

and Post-registration Health Studies. A multi-disciplinary team representing all three of<br />

these professional groups and comprising students, service-users, practitioners and<br />

academics is currently working on this project.<br />

This paper reports on the findings of Years One and Two of a three-year longitudinal study,<br />

which commenced in June 2006. Staff focus groups are concurrently being undertaken, but<br />

results from these will be reported at a later date. The literature clearly suggests that validity<br />

and reliability are fundamental to the success of an assessment, but are difficult to achieve<br />

(Chambers 1998, Calman et al 2002, Crossley et al 2002, McMullan et al 2003).<br />

Assessment of competence in practice is crucial in determining whether or not a student<br />

meets the criteria required of their profession (Cowan et al 2005, Watkins 2000). Early<br />

findings of the study raise important issues in relation to this existing evidence, as well as<br />

identifying further avenues for investigation. Once the study is complete, generic guidelines<br />

and resources will be developed to inform cross-professional assessment of practice in<br />

placement settings, which should be transferable internationally.

Methodology: The aim of the project is to explore the student experience of the practice<br />

assessment process during a professional programme of study. Perceptions of validity and<br />

reliability of assessment methods as well as the impact of the process on the student<br />

learning experience are being explored. Multi-centre Research Ethics Committee approval<br />

was obtained for the study. An average of five students per professional group are<br />

participating in longitudinal case studies throughout their two to three-year programme.<br />

Semi-structured interviews are tape-recorded after submission of the practice assessment<br />

documents at the end of each year, and students are invited to add any further contributions<br />

during the year as they see fit. Single-case and cross-case analysis and synthesis is being<br />

conducted using the “Framework technique” (Ritchie and Spencer 1994).<br />

Findings: Analysis of transcripts from the first two years has resulted in the identification of key themes: Purpose, Process and Guidance. Practicalities of the methods used and the students'

perception of the purpose of assessment have been discussed. The role of the practice<br />

assessor and the placements themselves have been identified as key areas. An interesting<br />

sub-theme around honesty and integrity – “cheating the system” – has emerged as an issue<br />

of importance. This is being explored further in view of the future professional roles of the<br />

students. Information gained has already informed delivery and structure of some of the<br />

professional programmes and their practice assessment methods. Reports on the findings<br />

may be accessed on the Ceppl website at: www.placementlearningcetl.plymouth.ac.uk.<br />

Journal publication is in progress.<br />



Measuring variance and improving the reliability of<br />

criterion based assessment (CBA): towards the perfect OSCE<br />

Richard Fuller, Matthew Homer, Godfrey Pell<br />

University of Leeds, United Kingdom<br />

Background<br />

Assessment methodologies have increasingly come under the spotlight with respect to both reliability and validity. In healthcare settings, the traditional unstructured ‘long and short

cases’ have given way to the OSCE (Objective Structured Clinical Examination) where<br />

students undertake a series of short clinical assessments which are objectively assessed<br />

against predetermined criteria. The OSCE is a prime example of CBA in health care<br />

programmes, allowing careful blueprinting, spread of domains, clarity of assessment mark<br />

sheets, standard setting and metrics to look thoughtfully at the performance of the<br />

assessment.<br />

CBA has a number of obvious weaknesses, typically in that:<br />

- item based checklists can highly reward a scattergun approach by candidates<br />

- it can be difficult to reward better performers,<br />

- there is strong reliance on assessor behaviour despite item based checklists<br />

- they are labour intensive and costly
- tensions exist between (face) validity on the one hand and standardisation and reliability on the other.
A variety of metrics can be used in the process of defining, exploring and correcting error variance (variance in marks due to factors other than student performance). This paper explores Leeds' experience and research in this area, defining measures of error variance and methods of reducing variance whilst maintaining strong clinical validity.

Summary of work<br />

This paper will provide a brief overview of the OSCE process, and an analysis of final-year results from recent years will be presented. We have found that between-assessor variance (the proportion of checklist mark/grade variance attributable to assessors out of the total mark/grade variance) in many cases exceeded 25% and in some cases exceeded 40%.

Interpretation of the raft of station metrics allowed us to identify causes of both random and<br />

systematic error.<br />

This paper looks at issues such as assessor training, gender interactions and checklist<br />

structure, and shows how these issues were addressed to reduce the mean station<br />

variance to below 20%.<br />
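As a minimal sketch of the kind of metric referred to above, the following code estimates, for a single station, the share of total checklist-mark variance that is attributable to which assessor did the marking, using a simple one-way sums-of-squares breakdown (eta-squared). It is not the Leeds analysis code, and the station data are invented.

```python
# Minimal sketch: proportion of total station mark variance attributable
# to assessors, computed as a one-way sums-of-squares breakdown of marks
# grouped by assessor. Invented data; not the analysis used in the study.

from statistics import mean

def assessor_variance_share(marks_by_assessor):
    """Between-assessor sum of squares as a share of the total sum of squares."""
    all_marks = [m for marks in marks_by_assessor.values() for m in marks]
    grand_mean = mean(all_marks)
    ss_total = sum((m - grand_mean) ** 2 for m in all_marks)
    ss_between = sum(len(marks) * (mean(marks) - grand_mean) ** 2
                     for marks in marks_by_assessor.values())
    return ss_between / ss_total

if __name__ == "__main__":
    # Hypothetical checklist marks from one OSCE station, keyed by assessor.
    station = {
        "assessor_A": [14, 16, 13, 17, 15],
        "assessor_B": [16, 18, 15, 17, 19],
        "assessor_C": [15, 16, 14, 15, 17],
    }
    print(f"Variance attributable to assessors: {assessor_variance_share(station):.0%}")
```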

Conclusions<br />

Tensions between reliability and validity continue to be important in complex CBA<br />

arrangements. This philosophical tension does have demonstrable effects – and we can<br />

use variance to examine this, and the impact of changes.<br />

Despite our best efforts, between-assessor variance persists, perhaps as a result of varying perceptions of appropriate ‘standards’ for students at different stages of their courses, or because of varying levels of assessor maturity and confidence in dealing with checklist

items. Whilst we have made significant improvements by addressing specific issues<br />

detailed in this paper, it is important to recognise that error variance in complex, high stakes<br />

criterion based assessment remains an ongoing challenge.<br />



Learning through assessment and feedback:<br />

implications for adult beginner distance language learners<br />

Concha Furnborough, The Open University, United Kingdom<br />

Feedback on marked assignments is an important element in the learning process,<br />

especially in distance learning, where it can provide students not only with a measure of<br />

their progress but also with individualised tuition (Cole et al., 1986), and may be the sole<br />

channel for student-tutor communication (Ros i Solé & Truman, 2005: 88). Feedback also<br />

makes an important contribution to motivation (Walker & Symons, 1997: 16-17). This paper<br />

reports specifically on distance learners' perceptions of positive tutor feedback, together with the cognitive and affective responses such feedback may generate.

One of the challenges of studying a language at a distance is managing interpersonal and<br />

communicative aspects of language acquisition (Sussex, 1981: 180); in the Open University<br />

(UK) students are offered a supported distance course, which includes tutor feedback on<br />

assignments. In this model feedback has a dual function, being used for both formative and<br />

summative assessment purposes. Anecdotal evidence suggests that students attach far<br />

greater importance to the latter than to the former, although it can be argued that learning<br />

occurs when students perceive feedback not simply as a judgement on their level of<br />

achievement but as enabling learning (Maclellan, 2001, cited in Weaver, 2006: 380-381).

Learning depends not only on the quality of the feedback but also on students’ responses to<br />

it, according to how they interpret it.<br />

The research presented here is part of a larger study on motivation that gathered data<br />

through questionnaires and interviews. These findings draw mainly on data obtained from<br />

56 telephone interviews with students of Spanish, French and German at the midpoint of<br />

their courses. The interviews covered themes associated with motivation, including<br />

approaches to distance language learning, support in language learning, confidence and<br />

progress; so they enabled us to situate learner perspectives on tutor feedback in the context<br />

of their views on other aspects of their learning.<br />

Our results suggest that this concept of feedback as a learning tool is especially important<br />

for beginner language learners in distance learning settings, and one that also acts as a<br />

vehicle for increasing their self-confidence – an important consideration in terms of<br />

motivation maintenance. We would also argue that some learners in this category need little<br />

help in discovering how to use feedback to these ends, whereas others require<br />

considerable support, guidance and encouragement.<br />

Although our target group was beginners in a distance learning context, the findings may<br />

also be applicable to other levels and learning contexts.<br />

Practice-related discussion<br />

Suggested areas for discussion are:<br />

• implications for raising learner awareness of the teaching and learning function of<br />

feedback;<br />

• training of tutors to be aware of students’ needs in terms of feedback<br />

• the potential of feedback to engage students in active learning, and enhance their self-confidence and motivation.



Secret scores: Encouraging student engagement with useful feedback<br />

Stuart Hepplestone, Sheffield Hallam University, United Kingdom<br />

This short paper session discusses the use of technology to provide useful feedback to students. It explores the development of, and presents initial findings from ongoing research into the practical experience and impact on the student assessment experience of, two separate yet complementary tools at Sheffield Hallam University (SHU). These tools enhance the way feedback can be provided to students and encourage students to engage with their feedback through the Institution's virtual learning environment, Blackboard.

Students at SHU are increasingly expecting access to their feedback and marks online,<br />

often remarking on the usefulness of online feedback as a way to track their progress on<br />

different assessment tasks for their modules. To meet these rising expectations, the<br />

University undertook a project to enhance the way feedback can be provided through<br />

Blackboard (Hepplestone & Mather 2007). A key aspect of this project was the development<br />

of customised assignment handler extension which supports effective online feedback<br />

through the Blackboard Gradebook by enabling tutors to batch upload feedback file<br />

attachments along with student marks, providing feedback on group assignments to each<br />

individual in the group, presenting student feedback all in one place and close to their<br />

learning, and encouraging students to engage with their feedback to trigger the release of<br />

their marks (after Black & Wiliam, 1998, who argued that the “effects of feedback was<br />

reduced if students had access to the answers before the feedback was conveyed”). (A<br />

poster presentation, Useful feedback and flexible submission: Designing and implementing<br />

innovative online assignment management, accompanies this short paper to explore the<br />

development process).<br />

Accompanying this development is an electronic feedback wizard. This tool allows tutors to<br />

quickly generate consistent individual feedback documents for an entire student cohort specific<br />

to each assignment created in Blackboard from a generic feedback template containing a<br />

matrix of assessment criteria and feedback comments (Hepplestone & Mather, 2007). This<br />

initiative stems from various systems developed and used by individual colleagues at SHU,<br />

paralleling the work of Denton (2001) who developed a technique using a combination of<br />

Microsoft Excel and Microsoft Word to generate personalised feedback sheets.<br />
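The following is a minimal sketch, in the spirit of Denton's mail-merge style technique, of how individual feedback documents might be batch-generated from a matrix of assessment criteria and pre-written comments. It is not the SHU feedback wizard itself; the criteria, comment bank and roster file name are invented for illustration.

```python
# A minimal sketch of batch-generating individual feedback documents from
# a matrix of assessment criteria and pre-written comments, in the spirit
# of Denton's mail-merge technique. This is NOT the SHU feedback wizard;
# the criteria, comment bank and roster file name are invented.

import csv
from pathlib import Path

COMMENT_BANK = {
    # criterion -> {grade band -> comment}
    "Argument":   {"A": "Clear, well-supported argument.",
                   "B": "Generally sound argument; tighten the evidence.",
                   "C": "The argument needs a clearer line of reasoning."},
    "References": {"A": "Sources are well chosen and correctly cited.",
                   "B": "Citation style is inconsistent in places.",
                   "C": "Engage with a wider range of sources."},
}

def write_feedback(roster_csv, out_dir="feedback"):
    """Create one feedback text file per student from a roster CSV.

    The CSV is assumed to have a 'student_id' column plus one column per
    criterion holding a band letter (A/B/C).
    """
    Path(out_dir).mkdir(exist_ok=True)
    with open(roster_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            lines = [f"Feedback for {row['student_id']}", ""]
            for criterion, bands in COMMENT_BANK.items():
                band = row[criterion]
                lines.append(f"{criterion} ({band}): {bands[band]}")
            Path(out_dir, f"{row['student_id']}.txt").write_text("\n".join(lines))

if __name__ == "__main__":
    write_feedback("marks.csv")  # hypothetical roster file
```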

SHU is a large UK University with over 28,000 students, based across three campuses,<br />

offering a diverse range of undergraduate and postgraduate courses.<br />

References<br />

Black, P. and Wiliam, D. (1998) Assessment and classroom learning. Assessment in Education, 5 (1), pp.7-74.<br />

Denton, P. (2001) Generating Coursework Feedback for Large Groups of Students Using MS Excel and<br />

MS Word, [online]. University Chemistry Education, 5 (1), pp.1-8. Last accessed 12 February 2008<br />

at: http://www.rsc.org/pdf/uchemed/papers/2001/p1_denton.pdf<br />

Denton, P. (2001) Generating and e-Mailing Feedback to Students Using MS Office, [online] In: Proc. 5th<br />

International Computer Assisted Assessment Conference, Loughborough, 2-3 July 2001. Learning<br />

and Teaching Development, Loughborough University. Last accessed 12 February 2008 at:<br />

http://www.caaconference.com/pastConferences/2001/proceedings/j3.pdf<br />

Hepplestone, S. & Mather, R. (2007) Meeting Rising Student Expectations of Online Assignment<br />

Submission and Online Feedback, [online] In: Proc. 11th Computer-Assisted Assessment<br />

International Conference 2007, Loughborough, 10-11 July 2007. Learning and Teaching<br />

Development, Loughborough University. Last accessed 12 February 2007 at:<br />

http://www.caaconference.com/pastConferences/2007/proceedings/Hepplestone%20S%20Mather%<br />

20R%20n1_formatted.pdf<br />



Large-Scale Assessment and Learning-Oriented Assessment:<br />

Like Water and Oil or new Possibilities for Future Research Directions?<br />

Therese Nerheim Hopfenbeck, University of Oslo, Norway<br />

For better or worse, large-scale assessments seem to be here to stay. Surveys such as the<br />

Programme for International Student Assessment (PISA) have had a huge impact on<br />

national educational policy in several countries, and will probably continue to do so.<br />

The aim of the current work is to bridge the gap between the fields of educational<br />

psychology concerned with learning-orientated assessment and the field of large-scale<br />

assessment and the need for policy relevant data.<br />

The present paper consists of two arguments. First, I will argue that, despite critiques, large-scale assessments offer valuable information to the field of educational research. They can play a valuable role in the development of comprehensive assessment systems, which also include learning-oriented assessment.

Secondly, questionnaires in large-scale assessments such as PISA can be used in combination with small-scale research, such as interviews, to investigate in depth some of the main findings from large-scale assessment. Bringing qualitative small-scale research together with large-scale assessment can lead to improvements in the research methods used for improving classroom assessment.

Using a mixed method approach, combining quantitative findings from PISA 2006 with<br />

qualitative data from an interview study in Norway, descriptions of students’ self-beliefs of<br />

learning, achievement and assessment will be presented. I will show how such studies<br />

might be carried out and contribute to a deeper understanding of assessment. The relevance of the current research is based upon a review of large-scale assessment and its policy influence after 1970, together with the research-based principles from the Assessment Reform Group (2002).

In addition to the quantitative material from the PISA 2006 test, the empirical base for the<br />

discussion includes comparisons between the low achieving students and high achieving<br />

students on the following factors:<br />

• How students experienced the PISA test<br />

• Task format<br />

• Schools' preparation for the PISA test

• Students’ test motivation<br />

• Different assessment cultures<br />

Together, this mixed-method approach offers a thick description of students' experience of large-scale assessment, its consequences and its challenges.

Finally, suggestions for combining large-scale assessment with classroom assessment are<br />

made, in an attempt to further empower students in their learning process, so that they develop as self-regulated learners, or what PISA calls “learners for tomorrow's world”, who are able to monitor their own learning.



Online interactive assessment for open learning<br />

Sally Jordan, Philip Butcher, Arlëne Hunter<br />

The Open University, United Kingdom<br />

This paper describes recent developments in the formative, summative and diagnostic use<br />

of e-assessment at the UK Open University, in particular the development of interactive<br />

computer marked assignments (iCMAs). These are being introduced within a coordinated<br />

initiative that is extending the richness of e-assessment tasks within an integrated and<br />

supported pedagogical model.<br />

The iCMAs include many different question types, some of considerable complexity and<br />

involving elements of constructed learning. It is widely recognised that rapidly received<br />

feedback on assessment tasks has an important part to play in underpinning student<br />

learning, encouraging engagement and promoting retention (see for example Rust et al,<br />

2005, op cit; Yorke, 2001). Online assessment provides an opportunity to give virtually<br />

instantaneous feedback. However, providing automatically generated feedback which is<br />

targeted to an individual student’s specific misunderstandings is more of a challenge,<br />

especially in response to answers entered as free-text. Students are allowed three attempts<br />

at each iCMA question, with tailored and increasingly detailed prompts allowing them to act<br />

on the feedback whilst it is still fresh in their minds and so to learn from it (Gibbs and<br />

Simpson, 2004, op cit). Feedback can also be provided on the student’s demonstration of<br />

learning outcomes developed in the preceding period of study.<br />
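As an illustration of the three-attempt pattern with increasingly detailed prompts described above, here is a minimal sketch of the logic; it is not the Open University's iCMA engine, and the question, answer-matching rule and hint texts are invented.

```python
# A minimal sketch of the three-attempt pattern described above, with
# increasingly detailed prompts after each wrong answer. This is NOT the
# Open University's iCMA engine; the question, matching rule and hint
# texts are invented for illustration.

class ICMAQuestion:
    MAX_ATTEMPTS = 3

    def __init__(self, prompt, answer, hints):
        self.prompt = prompt
        self.answer = answer
        self.hints = hints          # one increasingly detailed hint per failed attempt
        self.attempts = 0

    def submit(self, response):
        """Return (correct, feedback) for a single student attempt."""
        if self.attempts >= self.MAX_ATTEMPTS:
            return False, f"No attempts left. The answer was: {self.answer}"
        self.attempts += 1
        if response.strip().lower() == self.answer.lower():
            return True, "Correct - well done."
        if self.attempts < self.MAX_ATTEMPTS:
            return False, self.hints[self.attempts - 1]
        return False, f"The answer was: {self.answer}. {self.hints[-1]}"

if __name__ == "__main__":
    question = ICMAQuestion(
        prompt="Name the process by which plants make glucose from carbon dioxide and water.",
        answer="photosynthesis",
        hints=["Think about what leaves do with sunlight.",
               "The word begins with 'photo-'.",
               "Revise the section on how leaves capture light energy."],
    )
    for guess in ("respiration", "transpiration", "photosynthesis"):
        print(question.submit(guess))
```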

Evaluation methodologies have included student observation, comparisons against human<br />

marking and a ‘success case method’ approach. Preliminary results indicate that the<br />

systems are robust and accurate in marking, that students enjoy the iCMAs (even when<br />

used summatively) and that they usually engage with the feedback provided. The system<br />

automatically collects information about student interactions, enabling the tracking of<br />

individual students’ progress (and if necessary the provision of additional support) as well<br />

as wider-ranging insights into students’ understanding of the course material.<br />

The formative capabilities of computer based assessment tasks such as those described<br />

are of particular importance in distance learning contexts, because of the ability to mimic a<br />

‘tutor at the students’ elbow’, irrespective of the geographical location of the learner and<br />

tutor (Ross, Jordan and Butcher, 2006). They enable dialogue about standards of<br />

achievement and act as a proxy for the immediately accessible learning community enjoyed<br />

by face to face students.<br />

Although the paper emphasises the open and distance learning context, we will encourage<br />

discussion of wider applicability. We share the view that e-assessment has the potential to<br />

‘significantly enhance the learning environment’ (Whitelock and Brasher, 2007), and will<br />

seek to challenge perceptions of e-assessment as being of limited validity and relevance. In<br />

so doing we will explore reasons for its relatively low uptake.<br />

References<br />

Ross, S.M., Jordan, S.E and Butcher, P.G.(2006) Online instantaneous and targeted feedback for remote<br />

learners. In Innovative assessment in Higher Education ed. Bryan, C & Clegg, K.V., pp123-131<br />

London U.K., Routledge.<br />

Whitelock, D and Brasher, A (2006) Roadmap for e-assessment. JISC. At<br />

http://www.jisc.ac.uk/elp_assessment.html [accessed 1st February 2008].<br />

Yorke, M (2001) Formative assessment and its relevance to retention, Higher Education Research &<br />

Development, 20(2), 115-126.<br />



Can inter-assessor reliability be improved by deliberation?<br />

Per Lauvas, Østvold University College, Norway<br />

Gunnar Bjølseth, Østvold University College, Norway<br />

Anton Havnes, University of Bergen, Norway<br />

From previous studies it is evident that inter-assessor reliability varies from nearly zero to an almost complete match. However, reliability often seems to be lower than what is considered acceptable, at least where expressive assignments are concerned. In one health-related study programme, several indications (e.g. from handling appeals) raised serious

concerns as to marker reliability after the number of assessors had been cut back. Standard<br />

procedure is for the assessors to assign a mark and produce a written justification for easy<br />

processing when students use their legal right to receive feedback.<br />

The final summative assessment is an integrative, across-the-modules home examination<br />

where students are assigned a thematic field and required to choose perspectives and<br />

cases based on their own priorities and experience. The teaching is organised in theme-specific modules, while the final assignment is integrative. All teachers are involved in the

final assessment, and an assessor will mark assignments that are close to or further away<br />

from his or her field of expertise.<br />

The Department of nursing education decided to run 6 workshops (full or half day) for all<br />

academic staff involved in the bachelor programme over a one year period. Prior to each<br />

workshop a set of authentic, recent student papers were distributed to all teachers (i.e.<br />

internal assessors) for individual, independent marking and with the requirement to produce<br />

the written justification (feedback) to support the mark. Student papers covered all three<br />

years of the Bachelor programme, as well as the whole 6 step range of marks (A to Fail).<br />

The workshops had three parts: (a) thematic introduction, (b) deliberations in groups to<br />

arrive at a conclusion as to mark assigned to a specific student paper, and (c) recording of<br />

results from all groups and a subsequent plenary summary and discussion. Individual and<br />

group assessments (grades and justifications) were collected, analysed and fed back to the<br />

participants, also serving as background for selecting an assessment approach to be tested<br />

out in the next workshop.<br />

Emphasis was placed on the assessors’ interpretation and application of assessment<br />

criteria. The question to be scrutinised was whether a systematic process of collegial

deliberations over the assessment of authentic student papers in relation to assessment<br />

criteria and feedback/justification would result in improved inter-assessor reliability. The<br />

‘assessment of the assessors’, conducted in the final workshop (Oct. 2007) showed that<br />

assessment reliability had improved, but only marginally; still it was the case that the<br />

variation between individual assessors’ grading and valuing of the quality of students’<br />

assignments is inferior to standards considered appropriate by the faculty. It seems to be<br />

the case, however, that the written justifications (‘feedback’)of the given marks did change.<br />
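One simple way the change in inter-assessor consistency across the workshops could be tracked is sketched below: exact agreement and agreement within one grade step between independent markers on the A to Fail scale, averaged over all assessor pairs. This is an illustrative sketch only, not the analysis used in the project, and the grades are invented.

```python
# A minimal sketch of tracking inter-assessor consistency on the A-to-Fail
# scale before and after the deliberation workshops: exact agreement and
# agreement within one grade step, averaged over all assessor pairs.
# This is NOT the project's analysis; all grades below are invented.

from itertools import combinations

SCALE = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "F": 0}  # F = Fail

def agreement(grades_by_assessor):
    """Return (exact agreement, within-one-grade agreement) over assessor pairs."""
    exact = adjacent = total = 0
    for first, second in combinations(grades_by_assessor, 2):
        for a, b in zip(first, second):
            total += 1
            exact += a == b
            adjacent += abs(SCALE[a] - SCALE[b]) <= 1
    return exact / total, adjacent / total

if __name__ == "__main__":
    # Invented grades from three assessors marking the same five papers.
    before = [["B", "D", "C", "A", "E"],
              ["C", "C", "B", "B", "C"],
              ["B", "E", "C", "A", "D"]]
    after = [["B", "D", "C", "A", "E"],
             ["C", "C", "C", "B", "D"],
             ["B", "D", "C", "A", "D"]]
    for label, data in (("before workshops", before), ("after workshops", after)):
        exact, adjacent = agreement(data)
        print(f"{label}: exact {exact:.0%}, within one grade {adjacent:.0%}")
```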

In the paper, the background for the project will be elaborated, the process and the results<br />

will be analysed and discussed: Is it realistic to improve inter-assessor reliability to an acceptable level by deliberation among colleagues? What challenges does the assessment of integrative, expressive tasks present, and how could they be met?



Sketchbooks and Journals: a tool for challenging assessment?<br />

Paulette Luff, Anglia Ruskin University, United Kingdom<br />

Gillian Robinson, Anglia Ruskin University, United Kingdom<br />

This paper highlights aspects of our experience as exploratory practitioners researching the<br />

use and the value of sketchbooks and learning journals as a form of assessment. We report<br />

our developing understandings of ways in which these can support and extend students’<br />

learning within the context of an Art, Design, Technology module and an Early Childhood<br />

Curriculum module, both for undergraduate students of education. Within our Early<br />

Childhood Studies (ECS) and Primary Education BA courses we emphasise approaches to<br />

young children’s education informed by socio-cultural theories. This promotes a view of<br />

learning which stresses the importance of shared meaning making and the co-construction<br />

of knowledge. Accordingly, we draw upon the Vygotskian concept of pedagogical tools,<br />

mediating and extending knowledge construction, and emphasise a close relationship<br />

between means of assessment and student learning. Sketchbook research journals have been used as part of the assessment for the Art, Design, Technology and Control Technology module for several years. The module is delivered through lectures, practical

workshops, ICT workshops, self and tutor directed learning over a period of 12 weeks.<br />

Students are challenged to make and programme a 3D working model based on a work of<br />

art, to create a teaching aid that makes effective use of cross-curricular approaches. They<br />

use the sketchbook learning journal to maintain a record of thinking and decision making<br />

during the development of this project. The Early Childhood Curriculum module is studied<br />

over 24 weeks with sketchbook learning journals used to capture and explore<br />

understandings of this topic (from lectures, workshops, fieldwork and wider reading). Our<br />

project is, therefore, based upon a multiple case study design, apt for monitoring and<br />

explaining educational practices (Sanders, 1981; Merriam, 1998). Most data gathering is

integrated into the module programmes with the sketchbook journals themselves forming<br />

important sources of qualitative data, together with staff and students’ reflection on the<br />

processes. Evidence, from our initial analysis of findings, indicates that sketchbook learning<br />

journals can provide a means for students to capture, synthesise, reflect upon and critique<br />

their learning. By making learning visible they also offer a rich source for assessing the<br />

processes of student learning and assisting our understandings and development as<br />

teachers. In considering sketchbooks as challenging assessment tools, we address the<br />

ways that using sketchbooks challenges traditional forms of summative assessment by<br />

requiring that students show their developing thinking and learning throughout a module,<br />

with built in opportunities for formative tutor, peer and self assessment. There is also the<br />

challenge of some clash of philosophy as, although we are advocating a constructivist<br />

approach to learning, in our current system all students must achieve pre-set module<br />

learning outcomes. It is also challenging for students, as material has to be synthesised and<br />

documented in ways that communicate their ideas and demonstrate higher order thinking -<br />

and this must be sustained throughout a module. We anticipate that these points may prove<br />

fruitful for discussion.<br />



Evaluating the use of popular science articles for assessing<br />

high schools students<br />

Michal Nachshon, Ministry of Education, Israel<br />

Amira Rom, The Open University, Israel<br />

Alternative assessment is a way of assessing students' achievements whereby teachers assess students through authentic tasks in which the student is required to formulate the problem; such tasks allow different solutions and provide students with the opportunity to reflect on their learning process. The majority of students in Israel graduate from high school after completing at most one year of science studies. For these students, a new program, Science for All, is

now being offered at the high-school level as an alternative to the traditional natural science<br />

courses. This program encourages the teaching of science in a more thematic way,<br />

integrating the different scientific disciplines and aspects of technology. The intention is to<br />

expose all students to scientific principles and, consequently, to extend their understanding<br />

of them.<br />

Popular science articles, published in a variety of newspapers and magazines, can be a<br />

powerful tool to help students connect what they learn in school to current scientific and<br />

technological advancements. Further, popular science articles can provide opportunities for<br />

students to read critically, discuss issues and reach decisions based on their knowledge of<br />

science.<br />

The purpose of the study was to evaluate the use of authentic tasks based on popular science articles for assessing Science for All students. The results presented in this

summary are part of an ongoing longitudinal study of the use of popular science articles in<br />

instruction and assessment.<br />

The sample consists of 57 teachers in 40 schools nationwide. At the end of the school year<br />

Science for All teachers were asked to choose a popular science article and use it for the<br />

development of an assignment including scoring rubrics. They were then asked to send us<br />

the assignment, the scoring rubrics, and a sample of three students’ work corresponding to<br />

excellent, medium and poor grades. In addition, teachers filled in a written questionnaire in

which they were asked to characterize the assignment they developed and reflect on their<br />

experience.<br />

Specifically, teachers were asked to identify the learning goal assessed by the assignment,<br />

including both concepts and skills; to specify the cognitive levels required by each item in<br />

the assignment; and to indicate which abilities of multiple intelligences are represented in<br />

their assignment.<br />

Two independent expert teachers evaluated each assignment and sample student work.<br />

Next, assessment experts reviewed these evaluations and summarized the strengths and<br />

weaknesses of each assignment. At the end of the process, each teacher received written feedback and discussed this feedback with his or her assigned expert teacher.

Our findings show that teachers include both low and high-level cognitive questions in the<br />

assignments. With respect to multiple intelligences, teachers tend to include tasks that<br />

require linguistic and logical abilities but not other abilities. In addition, three main difficulties<br />

were identified: Teachers had trouble identifying valid learning goals; in some cases,<br />

teachers failed to recognize the skills that were assessed; and typically, the scoring rubrics<br />

did not match the learning goals the teachers intended to assess. We believe that this<br />

process can help teachers become more knowledgeable about the desired characteristics<br />

of assessment.<br />



Supporting student intellectual development through assessment design:<br />

debating ‘how’?<br />

Berry O'Donovan, Oxford Brookes University, United Kingdom<br />

Margaret Price, Oxford Brookes University, United Kingdom<br />

Prior research suggests that students move through stages of intellectual development (whilst in higher education) in which their beliefs about the nature of knowledge and learning change and grow in complexity and understanding. The best known account is probably that of Perry (1970), alongside the influential work of Belenky et al. (1986), King and Kitchener (1994) and Baxter Magolda (1992). However, the literature is less clear about

how such intellectual development can be triggered and encouraged through assessment<br />

and learning activities.<br />

Vygotsky’s (1978) seminal work on social constructivism and ‘zones of proximal<br />

development’ conceptualises learning development in incremental terms: students<br />

advance to nearby learning positions that some of their peers already hold and share with<br />

them. Arguably, this suggests that tutors and assessment designs should provide the cognitive<br />

scaffolding that would support collaborative, incremental and seemingly comfortable<br />

development.<br />

However, other perspectives on intellectual development such as Meyer and Land’s (2003)<br />

work on threshold concepts and the narratives within Baxter Magolda’s (1992) work on<br />

intellectual development paint a less comfortable picture. Meyer and Land posit that there<br />

are disciplinary concepts that once understood by students lead to new and previously<br />

inaccessible ways of thinking. Such intellectual movement involves students entering into a<br />

‘liminal space’ where they have moved out of familiar cognitive territory into a zone of<br />

disorientation where existing certainties are rendered problematic before they can cross the<br />

threshold into a new landscape of understandings. Baxter Magolda’s (1992) narratives also<br />

contain student reflections on critical incidents that seemingly thrust them uneasily up the<br />

intellectual development ladder, revealing the development as sometimes both erratic and<br />

disquieting.<br />

So what does this mean for assessment? Taking a Vygotskian approach may involve<br />

adopting an assessment design involving low stakes, scaffolded, collaborative assessment<br />

activity that allows for ‘slow learning’ (Claxton, 1998 cited in Knight and Yorke, 2002). In the<br />

initial stages of an undergraduate degree this may also involve designing assessment tasks<br />

that align with lower-level epistemological beliefs, i.e. content-focused assessment that<br />

reflects factual material verified by an authority.<br />

Alternatively, if we consider intellectual development as uneven and inconsistent, and take<br />

the stance that students need to confront ‘troublesome knowledge’ (Meyer and Land, 2003)<br />

and make disquieting intellectual leaps that cross learning thresholds – then what<br />

assessment designs would we choose? Arguably, such a stance may involve assessment<br />

designs that involve: an ‘unfreezing’ process (Lewin, 1951) to provoke students out of<br />

current comfortable orientations; assessment tasks that ‘problematise’ the subject (Grey et<br />

al, 1996); student discomfort; and tasks that provoke higher order epistemological stances.<br />

The discussion will explore the nature of intellectual development, including diverse<br />

disciplinary epistemologies, and the implications for assessment design. To support<br />

participants unfamiliar with the literature, discussion will be seeded by illustrative quotes<br />

from the literature and practical examples taken from a large scale qualitative study of<br />

students’ epistemological beliefs undertaken at Oxford Brookes.<br />



Assessment contexts that underpin student achievement: demonstrating effect<br />

Berry O'Donovan, Oxford Brookes University, United Kingdom<br />

Margaret Price, Oxford Brookes University, United Kingdom<br />

A large scale study in the US that examined over 25,000 students and over 190<br />

environmental variables found that the key influence on student success is student<br />

involvement fostered by student/student and student/faculty interaction (Astin, 1997). Such<br />

findings have been corroborated by smaller scale unpublished studies in the UK (Holden,<br />

2008). Taking a social constructivist approach to the classroom and the use of interactive<br />

teaching strategies has been well documented in the literature (Vygotsky, 1978). Less well<br />

documented is the effect of intentionally increasing opportunities for student/student and<br />

staff/student interaction outside of the classroom (O’Donovan et al., 2008).<br />

The ASKe (Assessment Standards Knowledge exchange) Centre for Excellence based at<br />

Oxford Brookes University in the UK has for the last two years been attempting to cultivate<br />

a sense of community among students and staff within one School situated on a satellite campus<br />

which attracts significant numbers of undergraduates, often taught in large classes. Within<br />

this learning context, described by many students as ‘impersonal’ (Price et al. 2007), the<br />

Centre has developed initiatives that intentionally involve students with the academic<br />

community outside the formal classroom. Initiatives include: peer-assisted learning in which<br />

more advanced students help others with their learning; modular leader assistantship in<br />

which students help academics with their teaching preparation and organisation; students<br />

as co-researchers; and students giving staff insight into their experience through the medium of<br />

audio diaries.<br />

Whilst these initiatives have been evaluated as very successful from both student and staff<br />

perspectives, evidencing an effect on student learning through their assessed performance<br />

is proving very tricky. As Graham Gibbs (2002) states, there is a real absence in most<br />

pedagogic research of hard evidence of improvement to student learning. Roundtable<br />

discussion will commence with Gibbs’s fundamental question of whether qualitative<br />

evidence demonstrating student and staff appreciation and belief in the effects of such<br />

initiatives is sufficient. After that, possible methodologies that evidence cause and effect<br />

between such individual initiatives and students’ assessed performance within a context of<br />

an ever changing learning landscape will be discussed and debated.<br />

References<br />

Astin, A. (1997) What Matters in College? Four Critical Years Revisited, San Francisco: Jossey-Bass<br />

Gibbs, G. (2002) ‘Ten years of Improving Student Learning’ Improving Student Learning Theory and<br />

Practice 10 years on. Improving Student Learning 10, Berlin, September.<br />

Holden, G. (2008) ‘The Importance of Feedback’, Assessment for Learning: How does that work?’,<br />

HEA/Northumbria Workshop, Newcastle, February.<br />

Price, M., O’Donovan, B. & Rust, C. (2007) Building community: engaging students within a disciplinary<br />

community of practice, ISSOTL ‘Locating Learning’: Sydney, July.<br />

O’Donovan, B., Price, M. & Rust, C. (2008) Developing student understanding of assessment standards,<br />

Teaching in Higher Education, vol. 13, no. 2, pp. 205-217.<br />

Vygotsky, L. S. (1978). Mind in Society: The Development of Higher Psychological Processes. MA: Harvard<br />

University Press.<br />



In-classroom use of mobile technologies to support formative assessment<br />

Ann Ooms, Timothy Linsey, Marion Webb<br />

Kingston University, United Kingdom<br />

The paper presents the findings of a research project on in-classroom use of mobile<br />

technologies to support diagnostic and formative assessment. The research project<br />

addressed the following questions:<br />

1. Under which conditions can each of the technologies be efficiently and effectively used<br />

for diagnostic / formative assessment in classroom settings?<br />

2. What is the impact of the in-classroom use of mobile technologies for<br />

diagnostic/formative assessment on students’ attitudes toward the module?<br />

3. What is the impact of the in-classroom use of mobile technologies for<br />

diagnostic/formative assessment on students’ conceptual understanding?<br />

4. What is the impact of the in-classroom use of mobile technologies for<br />

diagnostic/formative assessment on students’ test results?<br />

5. What is the impact of the project on teaching practices? How likely is it that this impact, if<br />

there is any, will be sustained?<br />

6. What is the impact of the project on assessment practices? How likely is it that this<br />

impact, if there is any, will be sustained?<br />

7. What is the impact of the project on attitudes towards the in-classroom use of mobile<br />

technologies? How likely is it that this impact, if there is any, will be sustained?<br />

8. What indicators are there of institutional commitment to, and subsequent uptake of, in-classroom<br />

use of mobile technologies?<br />

Thirteen academic staff members from 7 different faculties within one university used a<br />

range of mobile technologies such as electronic voting systems, mobile phones, Tablet<br />

PCs, interactive tablets and iPods to support rapid feedback. Two mentors supported and<br />

assisted the academic staff.<br />

A mixed-methods methodology was used to collect data from academic staff<br />

(questionnaires, interviews, reflective journals), students (questionnaires, focus groups),<br />

and mentors (interviews, reflective journals). In addition, attendance records, assessment<br />

strategies, assessment tools and assessment records were compared with those from the<br />

previous year.<br />



The Devil's Triad:<br />

The symbiotic link between Assessment, Study Skills and Key Employability Skills<br />

Jon Robinson, Northumbria University, United Kingdom<br />

David Walker, Northumbria University, United Kingdom<br />

Student reaction to assessment, study skills and the idea of being taught graduate<br />

employability is typically negative within Higher Education Institutions. Yet, all three now<br />

have to be considered and included by those involved in the curriculum design of<br />

programmes and modules within the University of Northumbria. This is particularly<br />

problematic for non-vocational subjects, such as those typical of the Humanities. In the<br />

English Division at Northumbria we have redesigned the core first-year module for English<br />

students in a way that symbiotically links assessment, study skills and employability within a<br />

framework underpinned by the theory and practice of Assessment for Learning (AfL).<br />

This roundtable presentation will outline, and open up for in-depth discussion, the approach<br />

taken by the curriculum team, from both a theoretical and practical perspective, when<br />

designing the module assessment to link with study and employability skills. It will also<br />

present the initial findings of research into the effectiveness of the innovation in curriculum<br />

design. The overarching intention of the presentation will be to create dialogue and explore<br />

avenues for collaboration with conference participants from different countries and<br />

with different perspectives, in order to facilitate further development of our work and<br />

create an opportunity for the exchange of ideas.<br />



Learning-oriented assessment and students’ experiences<br />

Ann Karin Sandal, Margrethe H. Syversen, Ragne Wangensteen<br />

Sogn and Fjordane University College, Norway<br />

Kari Smith, University of Bergen, Norway<br />

This presentation reports part of an ongoing project at the Sogn and Fjordane University<br />

College funded by the Norwegian Research Council. The aim of the project is to examine<br />

students’ experiences with the transition from primary to secondary school. An important<br />

issue is to investigate how portfolio assessment can support students in making choices and<br />

motivate them for lifelong learning. The comprehensive research project focuses on how students<br />

are prepared to choose programmes in secondary schools and prepare for choosing a<br />

future profession through the subject “Elective programme subjects”. This new programme<br />

was introduced in primary schools together with the curricula reform “Knowledge Promotion”<br />

in 2006. The main aim is to prevent mistakes and dropouts, and help the students make a<br />

good choice.<br />

In the current study we examine how formative assessment influences students’ beliefs and<br />

plans for further education, and to what extent assessment, through digital portfolios,<br />

enhances consciousness about further education (Klenowski, 2002; Black & Wiliam, 2006;<br />

Harlen, 2006). We investigate how assessment in digital portfolios can support the learning<br />

process and the processes of decision making. We try to identify some consequences of<br />

teachers’ supportive assessment and the students’ experiences with assessment for<br />

learning. The study will follow students choosing vocational education programmes. We are<br />

using both qualitative and quantitative methods in the study.<br />

A questionnaire was sent to 90 students in 3 different schools in their last term in primary<br />

school (age 15). The preliminary findings show variations in the students’ interest,<br />

motivation and consciousness about the choices they are about to make. When asked what<br />

kind of assessment encourages further work and how this can stimulate the learning<br />

processes, the students value the written comments on their work highly. Together with<br />

oral response in assessment, this direct and personal response on their work gives the<br />

students improved self-esteem and belief in their capability of learning (Harlen, 2006; Gibbs<br />

& Simpson, 2005). It seems that this type of assessment is important to the students in<br />

order to motivate them and create interest in schoolwork (Hidi & Renninger, 2006).<br />

However, even if the students seem to be motivated for vocational education and practical<br />

activities, some students put effort into the more theoretical subjects.<br />

In order to be admitted to vocational education, good marks are required in the theoretical<br />

subjects, in which the students are not particularly interested. The students spend most of<br />

their time on these subjects in their last year in primary school, whereas, for many of them,<br />

their interest is inspired by the subject preparing them for vocational studies.<br />

This indicates some interesting challenges regarding the students’ motivation and teachers’<br />

assessment practices. How can formative assessment help the students to develop self-esteem,<br />

knowledge, visions and intrinsic motivation for further education? And how are all<br />

these challenges dealt with while using portfolios in formative assessment?<br />

These questions will be followed up by action research in schools and a longitudinal study.<br />



Connecting Research Behaviours with Quality Enhancement of Assessment:<br />

Eliciting Developmental Case Studies by Appreciative Enquiry<br />

Mark Schofield, Edge Hill University, United Kingdom<br />

This paper relates the University’s commitment to systematic enhancement of the student<br />

experience of assessment. This extends beyond quality assurance and juxtaposes research<br />

and development behaviours allied to ‘thicker’ description (Geertz) of complex events in<br />

qualitative, interpretive research approaches with the traditionally ‘thinner’ evaluation<br />

tools characteristic of many university quality assurance systems.<br />

The paper describes the process of a developmental audit across the Faculties of<br />

Education, Health and Arts and Sciences. Dialogues were conducted, through focus groups<br />

and scrutiny of practices against the SENLEF Principles of Feedback (Student Enhanced<br />

Learning through Effective Formative Feedback), to explore the experiences of feedback on<br />

assessment among staff, students and those involved in disability and specific learning support.<br />

The process also included the elicitation of case studies from staff and students<br />

about their experience of effective feedback on assessment in the form of short writing<br />

activities. These focused on the context of effective feedback, an individual reflection on<br />

why it worked for them, and importantly ideas and guidance for others embarking on trying<br />

similar approaches. As such, this key element of the audit was conducted in the spirit of<br />

Appreciative Enquiry.<br />

Included are reflections on the similarities and differences in these two sets of staff and<br />

student voices and Tag Cloud representations (word frequency analyses) which reveal<br />

dominant and recessive themes in the sample groups. This offers some stark insights into<br />

affective issues related to assessment and feedback, congruence in attitudes and<br />

approaches and some perhaps unexpectedly astute epistemological insights from students.<br />
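
The word frequency analyses behind such Tag Cloud representations can be reproduced with very simple tooling. The sketch below is a minimal illustration only, not the instrument used in this audit; the sample responses and the stop-word list are invented for the example.<br />

```python
# Minimal word-frequency ("tag cloud") sketch: counts content words in a set of
# free-text case-study responses. Responses and stop words are illustrative only.
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "it", "was", "i", "my", "in", "on", "were"}

responses = [
    "The feedback on my draft helped me see the gap in my argument.",
    "Feedback arrived quickly and the comments were specific.",
]

counts = Counter()
for text in responses:
    words = re.findall(r"[a-z']+", text.lower())   # crude tokenisation
    counts.update(w for w in words if w not in STOP_WORDS)

# The most frequent words would drive the relative font sizes in a tag cloud.
for word, freq in counts.most_common(10):
    print(f"{word}: {freq}")
```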

The case studies will also be offered (in an abridged form), with commentary related to<br />

effective practices and alignment with the SENLEF principles presented using the MS Word<br />

comments function, including key questions and challenges arising from the narrative texts.<br />

The full versions will be available via a URL/hyperlink in the paper.<br />

We argue that such developmental enquiry (research-based activity) has given sightlines<br />

into effective practices, highlights the importance of perceptions of effective feedback, and<br />

emphasises that the processes embodied in this approach add enhancement layers to<br />

extant, historical, quality systems. These approaches are replicable for use in supporting<br />

other lines of enquiry related to assessment and other learning-related aspects of the<br />

student experience. This enrichment of quality processes is achieved by bringing research<br />

behaviours into close juxtaposition with quality assurance systems of intelligence gathering<br />

and by producing data artefacts that are both of developmental significance (for use with students<br />

in academic induction and in staff development) and influential in policy decision-making<br />

related to dissemination of good practice and systematic enhancement of assessment<br />

practices.<br />



Conceptions of assessment in higher education:<br />

A qualitative study of scholars as teachers and researchers<br />

Elias Schwieler, Stockholm University, Sweden<br />

Stefan Ekecrantz, Stockholm University, Sweden<br />

The researcher’s professional life world is based on explicit and well-reflected subject-<br />

specific conceptions. These sophisticated conceptions are the result of an extensive formal<br />

education, followed by lifelong, advanced learning through conducting research. As a teacher,<br />

the same individual’s pedagogic life world is often exclusively a result of socialization and<br />

the reproduction of existing traditions. Consequently, the teacher in higher education is<br />

expected to develop knowledge about pedagogic work more or less intuitively, based on far<br />

less articulated and reflected conceptions. Thus, the academic profession of<br />

research/teaching can be said to be founded on two professional extremes. There is a need<br />

for an increased understanding of how such double roles and life worlds are constituted,<br />

and how they relate to each other. In the area of assessment, we argue, an individual<br />

researcher’s/teacher’s double belief systems are particularly visible, making it an important<br />

field of study.<br />

We will present preliminary results from an ongoing interview-based study about this<br />

phenomenon, from three different assessment-related themes:<br />

1) Assessment and personal theories of learning – Subject-specific and generic beliefs on<br />

how, when and why different aspects of a subject need to be learned and assessed are the<br />

foundation for a teacher’s professional world view. These beliefs are studied, in part, as<br />

implicit theories of threshold concepts (Meyer & Land 2006) and views on backwash effects<br />

of assessment.<br />

2) Assessment and normative values – Summative assessment and grading highlight<br />

underlying perceptions of assessment as a means to discipline, punish and reward (Filer,<br />

2000). Also, both students’ and teachers’ workloads peak during the assessment process,<br />

often leading to stress and tension. In such a climate (Biggs 2007), reflected as well as tacit<br />

professional values are especially important.<br />

3) Methodological and epistemological foundations of assessment – Advanced<br />

epistemological beliefs on knowledge, evidence and scientific method are a vital part of all<br />

academics’ research. The same individuals’ tacit views on assessment epistemology are<br />

often in conflict with those upheld in research. A methodology that would not be considered in<br />

research is frequently used uncritically in the enquiry of student learning.<br />

Our aim is, specifically, to include individual inconsistencies, contextual issues, conceptual<br />

discrepancies and unreflected assumptions by focusing on each teacher’s conceptions of<br />

assessment. In previous research, with its explanatory focus on idealized models, such<br />

complexities are usually seen as residuals that must be excluded in order to maintain a<br />

manageable number of parameters (cf. Prosser et al., 2005). Furthermore, in order to grasp<br />

the intricacies of the interviewees' scientific as well as subject specific conceptions, we have<br />

chosen to study only two epistemic communities, History and English literature.<br />



Innovative Assessment Practice and Teachers’ Professional Development:<br />

Some Results of Austria’s IMST-Project<br />

Thomas Stern, University of Klagenfurt, Austria<br />

IMST (Innovations in Mathematics and Science Teaching) is a long term research and<br />

development project aimed at establishing an effective support system for Austrian schools.<br />

One of seven measures is the IMST-fund for the promotion of innovations in the teaching of<br />

maths, sciences and IT. About 160 teacher teams per year are encouraged to submit<br />

proposals for their classroom innovations, to evaluate both processes and results and to<br />

write reports that are published on the internet. In return they receive intensive individual<br />

counselling and some financial remuneration, and they are invited to several workshops. A<br />

remarkable number of these teachers decide to choose alternative assessment methods as<br />

their classroom innovation and as a field of investigation into their own practice.<br />

A cross-case examination of several school projects focuses on new ways of assessment<br />

that allow the students to some extent to choose their own topics and to keep track of their<br />

learning progress. Two high school teachers e.g. asked their 12 year old students to record<br />

examples of encounters with mathematics in daily life; then they went about assessing the<br />

sophistication and originality of their reports. A physics teacher let her 16 year old students<br />

choose their own fields of interest in astronomy and then draw and present posters, which<br />

she assessed in accordance with criteria she had worked out with her class. The study<br />

shows that self-regulated learning has a strong effect not only on the students’ motivation<br />

and interest but also on their proficiency and learning outcomes. What is even more<br />

impressive is the repercussion of these teaching innovations on the attitudes of the<br />

teachers themselves. In the course of their project about changes in their assessment<br />

practices most of them embarked on a thorough reflection of their teaching priorities, of their<br />

beliefs about learning and of their personal perspectives and ambitions as teachers. Both<br />

their autonomous school innovations and their action research studies can be shown to<br />

have boosted their professional development. Changes in their assessment routines turned<br />

out to have an especially strong impact on many aspects of their professional performance<br />

and were often accompanied by an additional commitment to school development and an<br />

overall increase in reflection about professional standards.<br />



Characteristics of an effective approach for formative assessment of teachers’<br />

competence development<br />

Dineke Tigelaar, Mirjam Bakker, Nico Verloop<br />

ICLON-Leiden University Graduate School of Teaching, The Netherlands<br />

Stimulating teachers’ professional development is an important function in assessment of<br />

teaching (Porter, Youngs & Odden, 2001). However, more research is needed into the effects<br />

of teacher assessments on teachers’ professional learning and development (Lustick & Sykes, 2006).<br />

The research is part of a larger research project ‘Effects of different assessment<br />

approaches on teachers’ professional development’. The goal of this postdoctoral research<br />

project is to evaluate and compare the effects of three formative assessment approaches:<br />

(1) an expertise- and feedback-based approach, (2) an approach for self-assessment, and<br />

(3) a negotiated assessment approach. In this research project, the focus is on teachers’<br />

competences for promoting reflective skills of senior secondary vocational students in<br />

health care, i.e. in nursing. Central question: “What are the effects of different formative<br />

teacher assessment approaches on the development of secondary vocational education<br />

teachers’ competences for promoting and formatively assessing students’ reflection skills,<br />

and which combination of assessment design characteristics optimally promotes the<br />

teachers’ competence development?”<br />

Research questions:<br />

1. Which assessment criteria and standards are developed, formulated and used in the three<br />

PhD-projects and which set of (common) criteria and standards can be used for a<br />

representative overall measurement of the participating teachers’ competence development?<br />

2. How do the teachers perceive and value the characteristics of the assessment approaches<br />

(see the characteristics 1 – 3 above) in the projects they participated in, and what are the<br />

results of the overall measurement of the teachers’ competence development (see question 1)?<br />

3. What is the relation between the measured teachers’ overall competence development<br />

and a) the assessment approach characteristics as documented by the PhD-researchers as<br />

well as b) the participating teachers’ perceptions and evaluations of these characteristics?<br />

Tasks of the postdoc:<br />

1. Distillation of the common elements in the criteria and standards for teaching<br />

competences formulated in the PhD-projects using matrices (Miles & Huberman, 1994).<br />

2. Development of instruments for the repeated overall measurement of teachers’<br />

competences, and of the teachers’ (N=88) perceptions and evaluations of the assessment<br />

design characteristics, and more general conditions in the schools for teacher professional<br />

development. Video vignettes will be developed, and teachers will be asked to select<br />

samples of student work. Furthermore, questionnaires will be developed.<br />

3. Organization of the data gathering (in collaboration with the three PhD-researchers).<br />

4. Analyses of the relations between the measured teachers’ overall competence development<br />

and a) the assessment approach characteristics as documented by the PhD-researchers as well<br />

as b) the participating teachers’ perceptions and evaluations of these characteristics and of the<br />

more general relevant conditions. This will be done using qualitative analyses (matrices) and<br />

quantitative analyses (analysis of variance, multiple regression analysis, and multilevel analysis).<br />

5. Development of an optimal combination of design characteristics.<br />

We aim to stimulate both a research-related and a practice-related discussion.<br />



Posters<br />





Predictive indicators of academic performance at degree level<br />

Andy Bell, Manchester Metropolitan University, United Kingdom<br />

Kevin Rowley, Manchester Metropolitan University, United Kingdom<br />

In the absence of A* grades at A Level, Cambridge University has designed an additional<br />

Admissions selection ‘tool’ – hence UCLES (University of Cambridge Local Examinations<br />

Syndicate) has produced the ‘Thinking Skills Assessment Test’ (TSA Test).<br />

The TSA Test is designed as a ‘knowledge-independent’ measure of the candidate’s ability<br />

to think effectively and critically. This test is composed of two types of questions: ‘problem-solving’<br />

questions, and questions which ‘tap into’ ‘critical thinking’ abilities.<br />

The extent to which the TSA Test is predictive of performance at degree level has yet to be<br />

established. Initial ‘in-house’ research by Cambridge suggests that there is indeed a<br />

significant predictive link between scores on the TSA Test and performance at degree level<br />

for students at Cambridge (Emery, J. L. et al, 2006; Emery, J.L, 2006).<br />

Cambridge University, then, is currently involved in appraising the TSA Test as part of its<br />

Admissions process. If used extensively to select students for a place at Cambridge, such<br />

use would have to be seen as justified – otherwise, it would be unfairly discriminatory. The<br />

present research at the Manchester Metropolitan University (MMU) was designed to add to<br />

the knowledge base concerning the validity of the TSA Test as a predictor of performance<br />

at degree level and, ipso facto, its validity as an Admissions ‘tool’.<br />

Hence, this paper addresses research currently being conducted to examine the extent to<br />

which the TSA Test is predictive of success at degree level at a non-Oxbridge institution<br />

(Manchester Metropolitan University). Four cohorts of first-year Psychology undergraduates<br />

(total N = approx. 350) completed Test L (a research version of the TSA Test). With the<br />

students’ consent, their academic performance was tracked throughout the three years of<br />

their degree-level studies. It was thus possible to examine the extent to which students’<br />

scores on the TSA Test were predictive of degree level performance in examinations and<br />

assessed coursework (ACW).<br />
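
As a rough illustration of the kind of predictive-validity analysis involved, the sketch below computes a Pearson correlation and a simple linear regression between an admissions test score and a later degree mark. The variable names and data are hypothetical and are not taken from the study described above.<br />

```python
# Illustrative predictive-validity check: correlate an admissions-style test
# score with a later degree mark. All data are fabricated for the example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
tsa_score = rng.normal(50, 10, size=120)                     # hypothetical test scores
degree_mark = 0.4 * tsa_score + rng.normal(40, 8, size=120)  # hypothetical degree marks

r, p = stats.pearsonr(tsa_score, degree_mark)
reg = stats.linregress(tsa_score, degree_mark)

print(f"Pearson r = {r:.2f} (p = {p:.3f})")
print(f"degree_mark ~ {reg.slope:.2f} * tsa_score + {reg.intercept:.1f}, R^2 = {reg.rvalue**2:.2f}")
```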

Factors other than the TSA Test – such as personality traits as measured by Quintax<br />

(Stuart Robertson & Associates, 1999); performance at A Level and participants’ scores on<br />

an IQ-type test (Ravens Progressive Matrices - Plus) – were also examined as possible<br />

predictors of students’ success at degree level. Students’ scores on the three sub-scales of<br />

the Approaches & Study Skills Inventory for Students (ASSIST / Entwistle, 2000) were also<br />

established and possible links with academic performance were examined.<br />

In addition to the above, an adaptation of the ‘Big Five’ scale provided on the website of the<br />

International Personality Item Pool (IPIP) is currently being developed (Bell & Rowley,<br />

2008). This is the ‘Big Five for Students’ scale. This will undergo factor analysis and item<br />

analysis. Then students’ scores on this scale will be correlated with their academic<br />

performance at degree level. As this scale is a recent development, this aspect will only<br />

include the 2007-2008 First Year cohort of students (N= 101).<br />

This research will be completed and all data analysis will be conducted in time for<br />

presentation to the EARLI (2008) conference in Berlin.<br />



Online Formative Assessment for Algebra<br />

Christian Bokhove, Utrecht University, Netherlands<br />

Rationale: In the Netherlands – as in many other EU-countries – universities complain about<br />

the algebraic skill level of students coming from secondary school. It is unclear whether<br />

these complaints have to do with basic skills or “actual conceptual understanding”, which<br />

we refer to as symbol sense (Arcavi, 1994). In this research we want to find out how ICT<br />

tools can help with formatively assessing algebraic skills.<br />

Key concepts: Three key topics come together in this poster session, forming the<br />

conceptual framework for our observations: tool use, assessment and algebraic skills.<br />

The first topic concerns acquiring algebraic skills. Here we discern basic skills, for example<br />

solving an equation, but in particular conceptual understanding. Arcavi (1994) calls this<br />

“symbol sense”.<br />

In this case, formative assessment would be appropriate; aimed at assessment for learning.<br />

Assessment contributes to learning and understanding of concepts (Black & Wiliam, 1998).<br />

Feedback plays an important role in formative assessment. On the other hand, it also<br />

remains important to record scores, results and the progress of a student:<br />

summative assessment, or assessment of learning. Using both gives ‘the best of both worlds’.<br />

In assessment for learning, the use of ICT tools can be beneficial. ICT tools can help with giving<br />

users feedback, may focus on process rather than result, track results or scores and<br />

provide several ‘modes’, ranging from practice to exam. Thus, assessment is for learning.<br />

Method: In this poster session experiments with an ICT tool called Digital Mathematical<br />

Environment (Bokhove, Koolstra, Heck, & Boon, 2006) are described. Through expert<br />

reviews, one-to-ones and small group experiments we provide a framework on how<br />

formative assessment can support the learning of mathematics. We mention:<br />

Using several ‘modes’ of assessment during a sequence of lessons: first practice with more<br />

feedback, gradually more exam-like assessment without feedback.<br />

Emphasis on the process: how does a student reach his/her correct or wrong answer. This<br />

information can be used in a subsequent lesson.<br />

Results: Embedding use of an algebra tool in a didactical scenario, where self-assessment<br />

and classroom feedback make up a balanced curriculum for attaining sufficient algebraic<br />

skills, is an important part of formative assessment. We will describe the preliminary results<br />

of possible didactical scenarios with the Digital Mathematical Environment. These<br />

scenarios will be used for further research on the subject.<br />

Discussion: We would like to discuss the implications for classroom practice when using ICT<br />

tools for (formative and summative) assessment, and what didactical scenarios are best<br />

suited for acquiring algebraic skills, by using ICT tools.<br />

References<br />

Arcavi, A. (1994). Symbol Sense: Informal Sense-Making in Formal Mathematics. For the Learning of<br />

Mathematics, 14(3), 24-35.<br />

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles,<br />

Policy & Practice, 5(1), 7-73.<br />

Bokhove, C., Koolstra, G., Heck, A., & Boon, P. (2006). Using SCORM to Monitor Student Performance:<br />

Experiences from Secondary School Practice. Math CAA series.<br />



Investigating the use of short answer free-text e-assessment questions<br />

with instantaneous tailored feedback<br />

Barbara Brockbank, The Open University, United Kingdom<br />

Sally Jordan, The Open University, United Kingdom<br />

Tom Mitchell, Intelligent Assessment Technologies Ltd., United Kingdom<br />

Warburton and Conole (2005) argue that ‘It seems likely that the drive towards emergent<br />

technologies such as simulations and free-text marking will result in increasingly strong<br />

competitive pressures against the more traditional ‘standardised testing’, purely objective<br />

types of CAA system.’ This paper describes the application of such an emergent<br />

technology, grounded in a desire to improve the student learning experience.<br />

The UK Open University’s OpenMark assessment system enables students to be provided with<br />

immediate and tailored feedback on their responses to questions of a range of types, including<br />

those requiring free-text entry of numbers, symbols and single words (Ross, Jordan and<br />

Butcher, 2006). This study is an investigation into the viability and effectiveness of adding<br />

questions which require free-text responses of up to about 20 words in length. Answer matching<br />

is provided by an authoring tool supplied by Intelligent Assessment Technologies Ltd. (IAT)<br />

which is able to perform an intelligent match between free-text answers and predefined<br />

computerised model answers. Thus an answer such as ‘the Earth orbits the Sun’ can be<br />

differentiated from ‘the Sun orbits the Earth’ and an answer of ‘The forces are balanced’ is<br />

marked as correct whereas an answer of ‘The forces are not balanced’ is not. The tool looks for<br />

understanding without unduly penalising errors of spelling, grammar or semantics.<br />
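
The IAT answer-matching engine itself is proprietary; purely to make the idea concrete, the toy sketch below shows one naive way such directional and negation distinctions might be checked. The patterns, function names and sample answers are illustrative assumptions, not the actual matching rules used in the study.<br />

```python
# Toy free-text marker illustrating the kind of distinction described above:
# "the Earth orbits the Sun" is accepted, "the Sun orbits the Earth" is not,
# and negation flips the mark. This is NOT the IAT engine, only an illustration.
import re

def normalise(answer: str) -> str:
    return re.sub(r"[^a-z ]", "", answer.lower()).strip()

def mark_orbit_question(answer: str) -> bool:
    text = normalise(answer)
    negated = " not " in f" {text} "
    # Require "earth ... orbit(s) ... sun" in that order, with no negation.
    correct_pattern = re.compile(r"\bearth\b.*\borbits?\b.*\bsun\b")
    return bool(correct_pattern.search(text)) and not negated

if __name__ == "__main__":
    for ans in ["The Earth orbits the Sun.",
                "The Sun orbits the Earth.",
                "The Earth does not orbit the Sun."]:
        print(f"{ans!r} -> {'correct' if mark_orbit_question(ans) else 'incorrect'}")
```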

The questions are delivered to students online and instantaneous targeted feedback is<br />

provided on both specifically incorrect and incomplete answers. Another novel feature of the<br />

project has been the use of student responses to early developmental versions of the<br />

questions – themselves delivered online – to improve the answer matching.<br />

Students have been observed performing the assessment tasks. Most claim that they wrote<br />

their responses as if for a human marker. However, a few were conscious that they were<br />

being marked by a computer and, anticipating (incorrectly) that only keywords were required,<br />

entered answers either in note form or in very long sentences. Most students enjoyed the<br />

assessment tasks and seemed comfortable with the concept of a computer marking free-text<br />

responses. Where the initial response was incorrect, most students were observed to<br />

use the advice provided by the feedback and many reached the correct answer.<br />

A human-computer marking comparison has indicated that the computer’s marking is typically<br />

indistinguishable from that of six subject-specialist human markers. The computer’s marking<br />

was generally accurate, showing greater than 95% concordance with the question author. A<br />

small number of these questions have been incorporated into regular summative interactive<br />

computer marked assignments on a new distance-learning interdisciplinary science course.<br />

We will encourage discussion of our evaluation findings and of the technological, financial,<br />

cultural and pedagogical issues which appear to limit take-up of assessment of this type.<br />

References<br />

Ross, S.M., Jordan, S.E. and Butcher, P.G. (2006) Online instantaneous and targeted feedback for remote<br />

learners. In Innovative assessment in Higher Education ed. Bryan, C & Clegg, K.V., pp123-131<br />

London U.K., Routledge.<br />

Warburton, W. and Conole, G. (2005) Whither e-assessment? Proceedings of the 2005 CAA Conference at<br />

http://www.caaconference.com/pastConferences/2005/proceedings/index.asp [accessed 1st<br />

February 2008]<br />



Contextualized reasoning with written and audiovisual material:<br />

Same or different?<br />

Nina Bucholtz, Maren Formazin, Oliver Wilhelm<br />

IQB, Humboldt University Berlin, Germany<br />

The ability to arrive at valid conclusions from given information and to comprehend given material<br />

of non-trivial complexity is of importance for many aspects of life, e.g., for learning and acquiring<br />

knowledge. Fluid intelligence (gf) or more specifically reasoning can be regarded as the main<br />

prerequisite for contextualized reasoning. In addition, relevant domain specific knowledge –<br />

supposedly a content-specific component of crystallized intelligence (gc) - can aid in solving<br />

contextualized reasoning tasks. We have developed an innovative measure of contextualized<br />

reasoning in order to further investigate the distinction between decontextualized reasoning tasks<br />

as included in intelligence tests and contextualized reasoning tasks as a relevant aspect of<br />

student achievement. The new measure is expected to tap both abstract reasoning ability (gf) and<br />

the recall of acquired information (gc). Contextualized reasoning measures differ from traditional<br />

comprehension tests – as for example included in the PISA studies – because they focus on<br />

specific content as opposed to rather general and somewhat arbitrary topics.<br />

One aim of our efforts is to bridge the gap between intelligence and student achievement.<br />

Additionally, we want to overcome the shortcoming of focusing on written material in<br />

contextualized measures by also considering audiovisual material. With this design we hope<br />

to encompass contextualized reasoning in a wider sense, including typical learning<br />

situations encountered by students.<br />

In order to reach the above research aims we have designed two studies. The most critical<br />

research questions were:<br />

1. Can short video sequences presented via PDA or notebook be embedded into<br />

contextualized reasoning tasks that meet all requirements of standardized measures of<br />

maximal behaviour?<br />

2. Are such audiovisual contextualized reasoning tasks equivalent to paper pencil based<br />

contextualized reasoning tasks?<br />

3. Is performance in audiovisual and traditional contextualized reasoning tasks a linear<br />

function of decontextualized reasoning and relevant (i.e. natural sciences) domain<br />

specific knowledge?<br />

In study one, a newly developed audiovisual test that comprises video sequences of 3 to 5<br />

min length was piloted with N = 86 high school students. The videos include real life scenes<br />

or animated simulations from biology, chemistry, physics, and geography. Participants<br />

watch every video once and then have to answer several comprehension questions on the<br />

basis of the video. A paper-and-pencil-based contextualized reasoning test was matched in<br />

content and comprises texts, tables and figures. All questions cover the circumscribed<br />

domain of natural sciences. A g-factor-model on the basis of testlets was established for<br />

both contextualized reasoning tests separately. Both models fit the data well. The relation<br />

between these two latent factors in a SEM was r = .94; the fit of this model was not noticeably<br />

better than the fit of a model with a single latent factor across both tests.<br />
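
For readers who want to reconstruct the flavour of this analysis without an SEM package, the sketch below uses a classical approximation: correcting an observed correlation between two testlet-based total scores for unreliability (Spearman's disattenuation formula). The data, reliabilities and score names are hypothetical; the study itself used latent-variable modelling, not this shortcut.<br />

```python
# Rough stand-in for the latent correlation between two contextualized reasoning
# tests: correct the observed score correlation for measurement error using
# Spearman's disattenuation formula r_true ~ r_xy / sqrt(rel_x * rel_y).
# All numbers below are invented for illustration only.
import math
import numpy as np

rng = np.random.default_rng(1)
true_ability = rng.normal(0, 1, size=86)
audiovisual = true_ability + rng.normal(0, 0.6, size=86)    # hypothetical testlet totals
paper_pencil = true_ability + rng.normal(0, 0.6, size=86)

r_observed = np.corrcoef(audiovisual, paper_pencil)[0, 1]
rel_av, rel_pp = 0.80, 0.82          # assumed testlet-score reliabilities
r_disattenuated = r_observed / math.sqrt(rel_av * rel_pp)

print(f"observed r = {r_observed:.2f}, disattenuated r = {r_disattenuated:.2f}")
```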

In a second study that is currently running with about 200 participants, an effort is made to<br />

replicate the results from study one and to address research question number three from<br />

the above list. Results of this study will be presented and discussed with regard to<br />

implications for further test development and educational implications.<br />



Effects of Large Scale Assessments in Schools:<br />

How Standard-Based School Reform Works<br />

Tobias Diemer, Freie Universität Berlin, Germany<br />

Harm Kuper, Freie Universität Berlin, Germany<br />

Comparative large scale assessments form a centerpiece of recent standard-based reforms<br />

of the educational systems of the Federal Republic of Germany. The formulation of standards<br />

in education and the implementation of corresponding large-scale standards-based tests in many States<br />

(Länder) of Germany mark a considerable shift away from an input-oriented towards an<br />

output-oriented account of governance. By providing and feeding back standardized and<br />

comparative data about pupils’ achievement, standard tests intend to make teachers and<br />

schools accountable and to help them to control and improve the outcomes of the pupils’<br />

work by means of evidence-based decision-making processes concerning the profession of<br />

teaching as well as the task of organizing schooling.<br />

The proposed poster deals with the question of whether and how the results from<br />

comparative large scale assessments are utilized by teachers as professionals and schools<br />

as organizations. It therefore will examine the profession- and organization-related processes<br />

and effects that are produced in line with large scale standardized testing and the feeding<br />

back of comparative results to teachers and schools. The paper will present typological<br />

descriptions of effects and process-related patterns of individual as well as collective databased<br />

decision-making that is based on large scale assessment test results in schools.<br />

Special attention will be drawn on noticeable consequences and changes regarding the<br />

conceptualisation and the design of teaching and learning processes by teachers.<br />

Exploration and analysis of the outlined effects and processes are carried out on two levels<br />

of abstraction. On a first, comparatively concrete level, the observable phenomena are<br />

described within the framework of a heuristic model suggested by Helmke (2004).<br />

According to this model the process of utilization of test results conceptually subdivides into<br />

four cyclically iterative stages: (1) reception, (2) reflection, (3) action, and (4) evaluation.<br />

Subsequently, the findings described within these categories are further aggregated as well<br />

as re-aggregated on a more abstract level. On this level theories of professions and<br />

organizational theories, particularly new institutionalism, sensemaking theory and system<br />

theory are analysed in reference to the results found on the more concrete level.<br />

Within these conceptual frameworks, empirical evidence will be presented that provides<br />

systematic as well as exemplary insight into the ways standard-based school reform works<br />

in schools. Furthermore, by reason of the integration of profession- and organization-related<br />

models, the study contributes to the development of a general theory of the functioning of<br />

school development in the context of the present standard-based and outcome-orientated<br />

paradigm of governance within the educational system.<br />

To capture the processes of decision-making, a longitudinal case study approach is used,<br />

based on semi-structured qualitative problem-centered interviews with headmasters and<br />

teachers. The material is analyzed according to procedures of qualitative content analyses<br />

and grounded theory. The data basis consists of about 120 interviews and 8 observations,<br />

which are carried out in 4 schools across 4 data collection phases spanning a<br />

period of 2 years. Due to its longitudinal design, the study gives information on the process-related<br />

conditions and effects of standard-based reforms in schools.<br />



Support in Self-assessment in Secondary Vocational Education<br />

Greet Fastré, Marcel van der Klink, Dominique Sluijsmans, Jeroen van Merriënboer<br />

Open University, The Netherlands<br />

Despite the importance placed on students’ self-assessment in current education, it appears<br />

that students are not always able to assess themselves accurately, because they are<br />

insufficiently able to decide on which criteria they should assess themselves.<br />

In current assessment practices, students are often asked to come up with self-generated<br />

criteria and standards on which they want to assess themselves. However, it appears that<br />

students at the beginning of their study are not able to identify the standards and criteria<br />

themselves because they do not have a clear view on what is expected of them when it<br />

comes to their learning outcomes. The question is thus whether novice students should be asked<br />

to self-generate the assessment criteria.<br />

Even if students are given assessment criteria, most of the time only a few of these<br />

criteria are relevant for a certain task. When students need to become competent self-<br />

assessors, they should not only be able to make an accurate assessment, but they should<br />

also be capable of making a good decision on which criteria are relevant and which criteria<br />

are not relevant for assessing a task (Sadler, 1989). This is certainly true in the case of<br />

assessing real-life whole tasks. In real-life tasks, resembling professional life, a large<br />

database of potential performance criteria could reasonably be considered. The whole set<br />

of criteria can be split up in two parts: relevant and irrelevant criteria. In today’s educational<br />

practices, often, no information on the relevance of the criteria is available for the students<br />

in advance. The question arises if students are capable of selecting the relevant criteria<br />

from the whole set of criteria. Regehr and Eva (2006) state that when students get the<br />

freedom of choosing on which criteria they want to assess themselves, there is a risk that<br />

they will only highlight the criteria on which they perform well or which they like because<br />

people naturally strive at creating a positive feeling. The risk is that students will thus not<br />

recognize exactly those learning needs that are really necessary.<br />

In this study, it is hypothesized that students who receive information on the relevance of<br />

the criteria can produce a more accurate self-assessment than students who do not receive<br />

information on the relevance of the criteria. Furthermore, we expect that students with a<br />

high accuracy of self-assessment are more competent in selecting points of improvement<br />

than students with a low accuracy of self-assessment. In the end, we expect there to be a<br />

positive relation between the accuracy of student’s self-assessment skills and student’s task<br />

performance.<br />

One hundred and six first-year students of a secondary vocational nursing education programme<br />

participated in this study. The experimental design was a 2x2 factorial pre-test - post-test<br />

design in which the effects of ‘information on the relevance of the criteria’ (Relevant criteria<br />

vs. All criteria) and ‘variability in learning trajectory’ (School-based vs. Practice-based) were<br />

studied. Data are currently being collected and results will be available by the time of the<br />

conference.<br />
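
A two-way analysis of variance is a standard way to analyse such a 2x2 design. The sketch below, with fabricated data and hypothetical column names, shows how the main effects and interaction of the two factors could be tested once the data are in; it is not the authors' analysis script, and the group sizes and effect sizes are invented.<br />

```python
# Illustrative 2x2 factorial analysis: factor A = information on criterion
# relevance (relevant vs. all criteria), factor B = learning trajectory
# (school- vs. practice-based). Data are fabricated; the outcome stands in for
# a gain in self-assessment accuracy from pre- to post-test.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(2)
rows = []
for criteria in ("relevant", "all"):
    for trajectory in ("school", "practice"):
        effect = 0.5 if criteria == "relevant" else 0.0      # assumed effect
        gains = rng.normal(effect, 1.0, size=26)
        rows += [{"criteria": criteria, "trajectory": trajectory, "gain": g} for g in gains]

df = pd.DataFrame(rows)
model = smf.ols("gain ~ C(criteria) * C(trajectory)", data=df).fit()
print(anova_lm(model, typ=2))   # main effects and interaction term
```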



The confidence levels of course/subject coordinators in undertaking<br />

aspects of their assessment responsibilities<br />

Merrilyn Goos, Clair Hughes, Ann Webster-Wright<br />

The University of Queensland, Australia<br />

This paper reports the findings of an investigation of the confidence levels of course/subject<br />

coordinators in undertaking aspects of their assessment responsibilities at a large<br />

metropolitan university. Like universities in many other parts of the world, the Australian<br />

institution in which this investigation was undertaken is experiencing “a period of rapid<br />

change and innovation in relation to assessment policies and practice” (Havnes &<br />

McDowell, 2008, p. 3). The pressures for change and innovation range from developing<br />

pedagogical advances that call into question many traditional assessment practices to the<br />

challenges presented by the increasing student diversity, class sizes and casualisation of<br />

teaching, which, along with diminishing resources, characterise contemporary educational<br />

contexts (Anderson et al, 2002).<br />

The investigation was one element of a situational analysis which formed the first phase of<br />

a broader project aimed at supporting the leadership capacities of course/subject<br />

coordinators as assessment innovators. This group was targeted because, though<br />

significant in the implementation of institutional assessment policy, the role is scarcely<br />

researched despite it being highly likely that improved performance would benefit student<br />

learning (Blackmore et al, 2007). Confidence is considered central to the ability to learn<br />

about and master new practices (Gaven, 2004) and was identified as an issue for this group<br />

through an earlier pilot conducted by of one of the project team.<br />

The investigation took the form of an online survey of all course coordinators (response rate<br />

33%). Survey items were developed from the responsibilities and expectations either<br />

explicated or implied in institutional policies and rules. The survey identified areas of<br />

particularly high (e.g. making and defending summative judgements) and low (e.g. dealing<br />

with plagiarism and locating support when needed) levels of confidence. The paper will<br />

report survey findings in relation to individual items as well as the influential factors that<br />

emerged from analysis and the correlation of particular factors with demographic data such<br />

as years of experience and gender. In addition, coordinators provided open-ended<br />

comment, the analysis of which is used to elaborate on or clarify particular findings in<br />

relation to their positive or negative impact on confidence.<br />

The project was funded through the Fellowship scheme of the (Australian) Carrick Institute<br />

for Learning and Teaching in Higher Education.<br />

References<br />

Anderson, D., Johnson, R., & Saha, L. (2002). Changes in Academic Work: Implications for Universities of<br />

the Changing Age Distribution and Work Roles of Academic Staff. Canberra: DEST.<br />

Blackmore, P., Law, S., & Dales, R. (2007). Investigating the capabilities of course and module leaders in<br />

departments. Paper presented at the Higher Education Academy Annual Conference, Harrogate.<br />

Graven, M. (2004). Investigating mathematics teacher learning within an in-service community of practice:<br />

The centrality of confidence. Educational Studies in Mathematics, 57, 177-211.<br />

Havnes, A., & McDowell, L. (2008). Assessment dilemmas in contemporary learning cultures. In A. Havnes<br />

& L. McDowell (Eds.), Balancing Dilemmas in Assessment and Learning in Contemporary Education.<br />

New York: Routledge.<br />



Useful feedback and flexible submission:<br />

Designing and implementing innovative online assignment management<br />

Stuart Hepplestone, Sheffield Hallam University, United Kingdom<br />

Specific functionality has been added to the Blackboard virtual learning environment at<br />

Sheffield Hallam University (SHU) to enhance the way in which feedback can be provided to<br />

students and to improve the way student assignments are processed. This poster will<br />

explore the practical experience of designing and implementing a customised assignment<br />

handler tool in response to rising student expectations of online feedback and online<br />

assignment submission. (This poster presentation accompanies the short paper session,<br />

Secret scores: Encouraging student engagement with useful feedback, which discusses the<br />

use of technology in providing useful feedback to students).<br />

The design of this innovative assignment handler tool was achieved by mapping out the<br />

lifecycle of a student assignment and highlighting key functional areas for development.<br />

These have been developed into a tool which:<br />

1. Supports the online delivery of useful feedback through the Blackboard Gradebook by:<br />

• batch upload of individual file attachments providing detailed feedback along with<br />

student marks (whether the original work is submitted through Blackboard, or in a non-electronic<br />

format such as hard-copy, by portfolio or presentation)<br />

• allowing partial cohort feedback to be uploaded by each member of the marking team<br />

• providing feedback on group assignments to each individual in the group, rather than<br />

one per group<br />

• giving students access to their feedback all in one place and presented as close to their<br />

learning as possible<br />

• encouraging students to engage with and reflect on their feedback in order to activate the<br />

release of their marks (after Black & Wiliam, 1998, who argued that the “effects of feedback<br />

was reduced if students had access to the answers before the feedback was conveyed”).<br />

2. Supports the online submission of student work through Blackboard by providing students with<br />

a detailed electronic receipt of their assignment submission.<br />

The poster will present a visual representation of the lifecycle of a student assignment, clearly<br />

indicating where students have responsibilities in the course of completing and submitting<br />

assignments, and reflecting and acting upon feedback (Hepplestone & Mather, 2007). Information<br />

about an accompanying electronic feedback wizard development will also be displayed.<br />

SHU is a large regional University with over 28,000 students. It is based on three campuses<br />

and offers courses in a diverse range of academic subjects at both undergraduate and<br />

postgraduate levels.<br />

References<br />

Black, P. & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.<br />

Hepplestone, S. & Mather, R. (2007) Meeting Rising Student Expectations of Online Assignment<br />

Submission and Online Feedback, [online] In: Proc. 11th Computer-Assisted Assessment<br />

International Conference 2007, Loughborough, 10-11 July 2007. Learning and Teaching<br />

Development, Loughborough University. Last accessed 12 February 2007 at:<br />

http://www.caaconference.com/pastConferences/2007/proceedings/Hepplestone%20S%20Mather%<br />

20R%20n1_formatted.pdf<br />

136 ENAC 2008


The challenge of engaging students with feedback<br />

Rosario Hernandez, University College Dublin, Ireland<br />

Effective and high quality feedback is often regarded as a key element of excellence in<br />

teaching that supports student learning (Ramsden, 2003; Black and Wiliam, 1998; Sadler,<br />

1989). Despite this, feedback is often regarded by teachers as a labour-intensive activity<br />

that frequently makes little impact on student learning. Similarly, students have stressed<br />

that sometimes they do not understand the feedback they receive, that the feedback is too<br />

vague or that it does not provide them with suggestions on how to improve their work.<br />

These comments are particularly relevant in the teaching of modern languages in higher<br />

education where the feedback provided by teachers often focuses on the correction of<br />

grammatical mistakes and the provision of correct answers. Adding to that pressure is the<br />

fact that classes are large, so that the practice of offering the<br />

“traditional” timely written feedback to students has become a struggle for many teachers.<br />

After an initial study of the issues concerning academics and students in the provision of<br />

effective feedback, an action-research project was undertaken with a group of<br />

undergraduate students of Hispanic Studies at University College Dublin. Throughout a<br />

semester, the duration of the module chosen for the study, students were provided with a<br />

variety of learning tasks whose main aim was to engage students with feedback. This<br />

approach to teaching and assessment required the involvement of students in a variety of<br />

learning activities, among others their participation in dialogue about the assessment criteria<br />

adopted, the use of assessment sheets with comments to act on them, the reading and<br />

critiquing of their work and that of their classmates (self- and peer-assessment) or the<br />

provision of feedback, by the teacher, with no grades. Written and oral data were collected<br />

by the teacher of this module at different moments during the semester in order to explore<br />

the experiences of the students with regard to this approach to feedback. Furthermore, a<br />

focus-group session was conducted in class at the end of the semester. This paper reports<br />

on the outcomes of the data collected throughout the semester, on the focus-group session<br />

and on the challenges that this approach to the provision of feedback to students entailed.<br />

ENAC 2008 137


Towards More Integrative Assessment<br />

Dai Hounsell, University of Edinburgh, United Kingdom<br />

Chun Ming Tai, University of Edinburgh, United Kingdom<br />

Rui Xu, Ningbo University, China<br />

Assessment in higher education is typified by competing tensions between multiple<br />

purposes, functions and stakeholders; wide diversity in practices within and across subject<br />

areas, courses and institutions; and diffuse responsibilities for the oversight and<br />

management of different aspects of assessment. Achieving coherence and integration in<br />

assessment practices, processes and policies is therefore a formidable challenge.<br />

This poster summarises the outcomes of a project which drew extensively upon the<br />

international literature on assessment in higher education to examine how a more<br />

integrative approach might be pursued. The project was undertaken as part of a sector-wide<br />

initiative in Scottish higher education on quality enhancement.<br />

The main outcomes of the project were a workshop programme and four guides, each of<br />

which focused on a key aspect of Integrative Assessment:<br />

• Monitoring Students’ Experiences of Assessment. This guide examines strategies to ascertain<br />

how well assessment in its various manifestations is working, so as to build on strengths and<br />

take prompt remedial action where helpful. It explores why it is important to monitor assessment<br />

practices systematically, what aspects of assessment are currently well-monitored in Scottish<br />

universities, and how the monitoring of assessment could be improved.<br />

• Balancing Assessment of and Assessment for Learning. This guide discusses ways of striking<br />

an optimal balance between the twin central functions of assessment, i.e. to evaluate and certify<br />

students’ performance or achievement, and to assist students in fulfilling their fullest potential as<br />

learners. It highlights some undesirable side-effects of imbalances and explores four strategies<br />

to rebalance assessment: feed-forward assessments, cumulative coursework, better-understood<br />

expectations and standards, and speedier feedback. Each strategy is illustrated<br />

with case-examples from a range of subjects and settings.<br />

• Blending Assignments and Assessments for High-Quality Learning. The starting-point for<br />

this guide is why it might be important not only to assess students' progress and<br />

performance by a variety of means, but also to consider what combination or blend of<br />

assignments and assessments in a course or programme of study might be optimal. The<br />

guide goes on to explore four important considerations that can shape how assignments<br />

and assessments are blended: blending for alignment of assessment and learning; blending<br />

for student inclusivity; blending to support progression in students’ understanding and skills,<br />

and blending for economy and quality. Examples and case reports are outlined from a<br />

cross-section of subject areas and course settings.<br />

• Managing Assessment Practices and Procedures. This guide argues that while most<br />

dimensions of assessment are generally well-managed, there are also aspects which have<br />

often not received the weight of attention they seem to warrant in the contemporary<br />

university. These aspects are: managing assessment for as well as assessment of learning;<br />

enabling evolutionary change in assessment; and wider sharing of responsibilities for<br />

managing assessment practices and processes.<br />

The four guides are freely downloadable from the Scottish Universities’ Enhancement<br />

Themes website (http://www.enhancementthemes.ac.uk/publications/) and a web-based<br />

version of the guides is being launched in spring 2008.<br />

138 ENAC 2008


Using a framework adapted from Systemic Functional Linguistics<br />

to enhance the understanding and design of assessment tasks<br />

Clair Hughes, The University of Queensland, Australia<br />

The plentiful and steadily increasing literature on teaching and learning in higher education<br />

has produced a number of helpful frameworks and guidelines that can be applied to the<br />

development and communication of assessment practice. As an educational developer I<br />

regularly deploy a core group of appropriate resources in the selection and staging<br />

(adaptation of McAlpine, 2004) of assessment tasks that target specific cognitive levels<br />

(Krathwohl, 2002), the planning of feedback (Gibbs and Simpson, 2004; Price and<br />

O’Donovan, 2006) and the making of assessment judgements (Biggs and Collis, 1982).<br />

The literature however, is surprisingly light on material to support the analysis and<br />

purposeful design of individual assessment tasks. This gap initially became an issue for me<br />

when working with academics in adjusting assessment tasks to minimize opportunities for<br />

plagiarism. Our work was limited by several factors including a failure to acknowledge the<br />

wide variations in both task type and level of demand that can distinguish assessments<br />

within and between such categories as ‘orals’, ‘examinations’ or ‘assignments’; the<br />

identification of tasks by reference to subject matter and activity only; and a belief that<br />

assessment tasks are restricted to the traditional or ‘signature’ forms of assessment<br />

associated with particular disciplines (Bond, 2007).<br />

This paper reports the outcome of my efforts to locate a framework that would provide the<br />

shared concepts and terminology required as a basis for productive and meaningful<br />

discussions of assessment tasks with academics. In broadening my search beyond the<br />

assessment literature, I investigated systemic functional linguistics (SFL) (Eggins, 2004;<br />

Knapp and Watkins, 2005). The resulting framework, described in the paper, has proved a useful<br />

resource for explicating the components of assessment tasks including many that were<br />

previously overlooked or inferred – audience, student perspective, mode of presentation<br />

and so on. The paper outlines the application of the framework to the original purpose of<br />

‘designing out’ opportunities for plagiarism and concludes that the framework has significant<br />

further potential to introduce academic teachers to a vast but generally unfamiliar literature<br />

on the systematic development of academic communication skills (see for example Swales<br />

and Feak, 2004) and as a basis for the critique of assessment as cultural practice.<br />

References<br />

Biggs, J., & Collis, K. (1982). Evaluating the Quality of Learning - the SOLO Taxonomy. New York: Academic Press.<br />

Bond, L. (2007). Toward a Signature Assessment for Liberal Education. Retrieved January 23, 2008, from<br />

http://bondessays.carnegiefoundation.org/?p=8<br />

Eggins, S. (2004). An Introduction to Systemic Functional Linguistics. London & New York: Continuum.<br />

Gibbs, G., & Simpson, C. (2004). Conditions under which assessment supports students' learning.<br />

Learning and Teaching in Higher Education. Retrieved 19 April, 2005, from<br />

http://www.glos.ac.uk/shareddata/dms/2B70988BBCD42A03949CB4F3CB78A516.pdf<br />

Knapp, P., & Watkins, M. (2005). Genre, text, grammar: technologies for teaching and assessing writing.<br />

Sydney: UNSW Press.<br />

Krathwohl, D. (2002). A revision of Bloom's taxonomy: An overview. Theory into Practice, 41(4), 212-218.<br />

McAlpine, L. (2004). Designing learning as well as teaching. Active Learning in Higher Education, 5(2), 119-134.<br />

Swales, J., & Feak, C. (2004). Academic Writing for Graduate Students: Essential Tasks and Skills<br />

(Second ed.). Ann Arbor: The University of Michigan Press.<br />

ENAC 2008 139


The use of transparency in the "Interactive examination" for student teachers<br />

Anders Jonsson, Malmö University, Sweden<br />

If the aim of education is for all students to learn and improve, then the expectations must<br />

be transparent to the students. In this study, three aspects of transparency are investigated<br />

in relation to an examination methodology for assessing student teachers' skills in analyzing<br />

classroom situations and in self-assessing their answers: self-assessment criteria, a scoring<br />

rubric, and exemplars. The examinations studied were carried out in 2004, 2005, and 2006,<br />

each with a cohort of first-year student teachers (n = 170, 154, and 138, respectively). There<br />

was a large difference in scores between the 2004 and 2005 cohorts (effect size d = 3.21),<br />

after changes had been made to the examination in order to increase its transparency.<br />

The comparison between 2005 and 2006, when no further changes were made, does not<br />

show a corresponding difference (d = .27). These results suggest that, by making the<br />

assessment more transparent, students’ performances could be greatly improved.<br />

140 ENAC 2008


School monitoring in Luxembourg:<br />

computerized tests and automated results reporting<br />

Ulrich Keller, Monique Reichert, Gilbert Busana, Romain Martin<br />

University of Luxembourg, Luxembourg<br />

This presentation will introduce the Luxembourgish school monitoring project, focusing on<br />

the various tools that were developed and used, especially the more innovative tools<br />

developed for internet-based computer-assisted testing and automatic report generation.<br />

Luxembourg’s school system faces a transition encountered in many countries throughout<br />

the world: a transition towards more autonomy for individual schools. This necessitates the<br />

establishment of a school monitoring program, regularly assessing the progress of students<br />

in a variety of areas including, but not limited to, academic achievement.<br />

Apart from the development of valid, reliable and objective measures, two other<br />

requirements for the success and usefulness of such a project are the economical<br />

administration of tests and comprehensive reporting of relevant results. In this presentation,<br />

we will introduce the internet-based testing platform TAO and tools used for automatic<br />

report generation. Though developed in a country with a small population, these tools scale<br />

well to other contexts and countries with much larger populations.<br />

We will also outline further possible developments of these tools in order to respond more<br />

fully to the demands of evidence based decision making in an educational context.<br />

ENAC 2008 141


Mathematical power of special needs students<br />

Marjolijn Peltenburg, FIsme, Utrecht University, The Netherlands<br />

Marja van den Heuvel-Panhuizen, FIsme, Utrecht University, The Netherlands/<br />

IQB, Humboldt University Berlin, Germany<br />

The poster will inform the conference participants about a small-scale study that forms the<br />

start of the IMPulSE project. This is a large project aimed at revealing the undisclosed<br />

mathematical power of special needs students (Van den Heuvel-Panhuizen & Peltenburg,<br />

2007). The purpose of the small-scale study is to pilot a set of test items which differ from<br />

regular grade-level achievement tests used to determine students’ mathematics<br />

understanding. The items in the pilot have been designed with the intention of offering<br />

children optimal possibilities to show what they are capable of. An important characteristic<br />

of these items is their ‘elasticity’. Elasticity in items allows different levels of strategy use,<br />

which makes it possible for students to pass the limits of their assumed capacities. This<br />

reduces the ‘all-or-nothing’ character of assessment (Van den Heuvel-Panhuizen, 1996).<br />

To reveal the undisclosed mathematical power of weak students we chose a topic that is<br />

recognized as difficult for weak students: subtraction with “borrowing”; that means<br />

subtraction problems in which the 1’s digit of the subtrahend is larger than the 1’s digit of<br />

the minuend (e.g., 52–17 = ...). A frequently made mistake in these problems is reversing<br />

the digits (in this case, subtracting 2 from 7 instead of 7 from 2).<br />
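Purely as an illustrative aside (not part of the study; the function below is hypothetical), the reversal error described above can be sketched in a few lines of Python: subtracting the smaller ones digit from the larger one, column by column, yields 45 for 52–17 instead of the correct 35.

    # Illustrative sketch only: the digit-reversal ("smaller from larger") error.
    def buggy_subtract(minuend, subtrahend):
        """Subtract column by column, always taking the smaller digit from the larger."""
        m_tens, m_ones = divmod(minuend, 10)
        s_tens, s_ones = divmod(subtrahend, 10)
        ones = abs(m_ones - s_ones)   # reversal: 2 - 7 is replaced by 7 - 2
        tens = abs(m_tens - s_tens)
        return 10 * tens + ones

    print(52 - 17)                 # 35, the correct answer
    print(buggy_subtract(52, 17))  # 45, the answer produced by the reversal error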

The set of items that is presented to the children includes fourteen subtraction problems in<br />

the number domain up to 100. The items are taken from the Cito LOVS Test for Mid Grade<br />

6, but re-designed and placed in an ICT environment in which the children are offered a<br />

dynamic visual tool to find the answers. We expect that this tool will help students to<br />

overcome the obstacles, as mentioned above, in solving these subtraction problems which<br />

require “borrowing”.<br />

The data-collection takes place in two schools for primary special education. In total, the set<br />

of items is piloted with 20 children. While working on the computer, the children’s steps<br />

through the program are recorded by the Camtasia Studio software. The analysis of the<br />

data focuses on the correct scores in the two conditions – regular Cito LOVS Test and ICT<br />

version with the dynamic tool – and on tool use in the ICT version.<br />

The poster shows a sample of the problems used in the study and a summary of the<br />

findings. In addition to the results presented on the poster, Camtasia clips will be shown on<br />

a laptop. During the poster presentation, we would like to share with the audience our<br />

experiences with using an ICT-based dynamic assessment format to reveal weak students’<br />

learning potential. In connection with this, we would also like to discuss ways to continue<br />

this research.<br />

References<br />

Van den Heuvel-Panhuizen, M. (1996). Assessment and realistic mathematics education. Utrecht: CD-β<br />

Press/Freudenthal Institute, Utrecht University.<br />

Van den Heuvel-Panhuizen, M., & Peltenburg, M. (2007). Unused learning potential of special-ed students in<br />

mathematics. Research proposal. Utrecht, the Netherlands: Freudenthal Institute for Science and<br />

Mathematics Education.<br />

142 ENAC 2008


Quality Assurance review of clinical assessment:<br />

How does one close the loop?<br />

Glynis Pickworth, M. van Rooyen, T.J. Avenant<br />

University of Pretoria, South Africa<br />

The MBChB Undergraduate Programme Committee (UPC) of the School of Medicine,<br />

University of Pretoria mandated the Assessment sub-committee (AC) to review assessment<br />

practices in the student internship rotations. These rotations take place during the last 18<br />

months of the six-year programme. Students no longer have any class activities and work<br />

the whole day in a clinic or hospital. There are five seven-week rotations and eight three- or<br />

three-and-a-half-week rotations through various departments such as Family Medicine,<br />

Obstetrics, Gynaecology, etc. On the whole the staff supervising and assessing students<br />

are clinicians with little or no training in education and assessment practices. The university<br />

provides such courses for staff but the clinicians’ workload mostly precludes them from<br />

attending such courses. They hold joint appointments with the state and the university and find it<br />

difficult to get time off due to service delivery commitments.<br />

The relevant departments were informed of the review process and criteria, after these had<br />

been approved by the UPC. They were also supplied with a resource guide outlining best<br />

practice in clinical assessment. The AC made an appointment for a group meeting with the<br />

staff responsible for assessment for a particular rotation. The group consisted of rotation<br />

heads and representatives from other departments, members of the AC and members of<br />

the department responsible for the rotation. During the group meeting the assessment<br />

practices would be described and discussed according to the review criteria. The AC would<br />

then compile a report describing the assessment practices. Good practice would be<br />

acknowledged and recommendations for improvement made. The report would be sent to<br />

the person responsible for the rotation to make sure the information on assessment<br />

practices was correct, after which it would be tabled at a meeting of the UPC.<br />

A comparison across rotations revealed that a wide diversity of assessment methods is<br />

used. Compared with the four levels of Miller’s pyramid, too much assessment is still related<br />

to the lower levels of the pyramid model rather than to the apex. The review sensitised a<br />

number of staff to good assessment practice through the resource guide and discussion<br />

about their assessment practice.<br />

The question is ‘How does one close the loop in quality assurance?’ Are the<br />

recommendations for improvement actually implemented? A follow-up study still needs to<br />

be done.<br />

ENAC 2008 143


Feedback: What’s in it for me?<br />

Margaret Price, Karen Handley, Berry O’Donovan<br />

Oxford Brookes University, United Kingdom<br />

Hattie and Timperley (2007) conceptualize feedback broadly as ‘information provided by an<br />

agent…regarding aspects of one’s performance or understanding' and are very clear that it<br />

must be ‘a “consequence” of performance’. This view is not contentious. Students want a<br />

response to their effort and staff need to provide information on the gap between<br />

performance and aim (Sadler, 1989). However we know that the process of providing and<br />

receiving feedback is fraught with difficulty arising from the multiple purposes of feedback,<br />

communication problems and emotional responses to name but a few.<br />

This paper seeks to examine the relational dimension of feedback and argues that it is a central<br />

but often missing dimension of feedback. The role of feedback in creating the relational<br />

footings between student and tutor provides the foundations for a successful learning process<br />

and, in particular, for on-going student engagement (Black and Wiliam 1998).<br />

If we want students to engage with their assessment feedback, we must pay attention to the<br />

relational dimension of feedback. Students are free to accept, partially accept or reject feedback<br />

(Chinn and Brewer, 1993) and we would encourage them to exercise their own judgement in<br />

evaluating feedback as they progress as independent learners. Students make judgements on<br />

the basis not only of the 'content' but also of their perceptions of the credibility and intentions of<br />

the author. In addition, there is a temporal dimension because if students are initially confused<br />

(and negatively evaluate the feedback) but can then engage in dialogue with receptive tutors,<br />

students may come to understand and therefore value the feedback. Therefore our feedback<br />

must be convincing, but not necessarily positive ‘feelgood’ feedback which does not link with<br />

performance (Dweck, 2000). However, to be persuaded of the feedback’s worth, students must<br />

recognise the feedback as valuable through the reciprocity of the assessor. That reciprocity will<br />

be demonstrated through the feedback communication process. Is this process seen as<br />

unidirectional or dialogic, active or passive? Are the participants in this together or separately?<br />

A three-year study on student engagement with assessment feedback involving 35 interviews<br />

with students and staff, 12 case studies, and questionnaire data from 3 institutions will be<br />

presented and the findings used to provide a framework for analysing the factors that impact<br />

on the relational dimension of feedback including:<br />

• effectiveness of communication process<br />

• timeliness of response<br />

• match between staff and student expectations of the process<br />

• trust in the assessor<br />

• media of communication – what sort of knowledge can they carry?<br />

• dialogue opportunity<br />

• context in which it can be acted upon.<br />

The findings confirm that what students are looking for in feedback is not unrealistic, but often<br />

not provided, and this leads to disillusionment and a cycle of disengagement. Therefore the<br />

implications for practice will be considered and discussed, including the need to prepare staff<br />

and students to give and receive feedback and establish a relational footing; the opportunity for<br />

dialogue in resource-constrained environments; and opportunities to use the feedback once<br />

received and understood.<br />

144 ENAC 2008


From students’ to teachers’ collaboration:<br />

a case study of the challenges of e-teaching and assessing as co-responsibility<br />

Ana Remesal, Universidad de Barcelona, Spain<br />

Manuel Juárez, José Luis Ramírez<br />

Centro Nacional de Investigación y Desarrollo Tecnológico, Mexico<br />

The introduction of new technologies into education demands that teachers handle tools that<br />

allow them to use techno-pedagogical environments for e-learning (Mauri et al. 2007). These<br />

environments pose a challenge for teachers when it comes to transforming their practice and<br />

to using the new tools in an efficient way that eventually will transform and optimize the<br />

teaching and learning processes. We report about a case study in Higher Education as the<br />

first part of a two-round project. The “Foundations of Computing Science” preliminary distance<br />

course tackled basic subjects in discrete mathematics and their applications to computing; its<br />

aim was to develop common knowledge among the students accepted for the Master’s in<br />

Computer Science in the Centro Nacional de Investigación y Desarrollo Tecnológico (National<br />

Center of Investigation and Technological Development)(CENIDET), an institution that<br />

belongs to Mexico’s Sistema Nacional de Educación Superior Tecnológica (National System<br />

of Higher Technological Education ) (SNEST). The Claroline (V. 1.1) distance learning<br />

platform was used. The purpose of this poster is to describe some of the experiences we had<br />

as designers and teachers of this first Web-based distance course on discrete mathematics.<br />

Particularly, this poster describes the difficulties encountered and the solutions proposed by<br />

the group of professors that designed and developed the course using Claroline as a tool in<br />

order to prepare the second edition of the course.<br />

The course was developed over five weeks in 2007, with a group of 18 students. The<br />

course was structured around five units, one per week. The students’ work consisted of<br />

learning activities done first individually, then contrasted in pairs and then discussed in the<br />

whole group. These activities could be carried out either in an asynchronous or a<br />

synchronous manner. The platform used, despite important deficiencies, allowed for the<br />

organization, administration and follow-up at the individual level, at the student-pair level and<br />

for the whole group.<br />

This experience showed us how the distance learning tools introduce conditions that are<br />

different from face-to-face courses. In order for the design of contents, materials and<br />

dynamics to be adequate for this new environment, greater reflection by the teacher is<br />

necessary (Coll, 2004). Big challenges arise for the second implementation of the course,<br />

especially concerning the assessment of students’ learning from a co-responsibility<br />

perspective. The second implementation of the course will be carried out by<br />

two teachers simultaneously. This poses particular challenges as to students’ assessment,<br />

since both teachers will need to clarify and share teaching goals and assessment purposes<br />

and instruments. Thus, the teachers’ conceptions about assessment are expected to play<br />

an important role in this new course.<br />

References<br />

Coll, C. (2004). Psicología de la educación y prácticas educativas mediadas por las tecnologías de la<br />

información y la comunicación. Sinéctica, 25, 1-24.<br />

Mauri, T., Colomina, R., & De Gispert, I. (in press). Diseño de propuestas docentes con TIC en la<br />

enseñanza superior: nuevos retos y principios de calidad desde una perspectiva<br />

socioconstructivista. Revista de Educación. MEC. (7-3-2007).<br />

ENAC 2008 145


Symbiotic relationships:<br />

Assessment for Learning (AfL), study skills and key employability skills<br />

Jon Robinson, Northumbria University, United Kingdom<br />

David Walker, Northumbria University, United Kingdom<br />

This poster links directly to a detailed aspect of the area to be covered in a roundtable<br />

presentation proposal that has been submitted, but it can also stand alone as a<br />

representation of the development of a particular teaching practice.<br />

In the English Division at Northumbria University we have redesigned the core first-year<br />

module for English students in a way that symbiotically links assessment, study skills and<br />

employability within a framework underpinned by the theory and practice of Assessment for<br />

Learning (AfL). This poster provides a textual and graphical presentation of the introduction<br />

and evaluation of the first element of summative assessment on the core first-year<br />

undergraduate module in English Studies at Northumbria University.<br />

This assessment covers the topic of plagiarism, a contentious study skills topic not only<br />

within Northumbria but also in the sector as a whole. The assessment practice is based on<br />

the principles of Assessment for Learning and designed in a way that also provides an<br />

opportunity to begin introducing students to practices that relate directly to key employability<br />

skills highlighted by the English Subject Centre as deficient in the typical English Studies<br />

graduate.<br />

146 ENAC 2008


Assessing low achievers’ understanding of place value –<br />

consequences for learning and instruction<br />

Petra Scherer, University of Bielefeld, Germany<br />

Introduction<br />

Understanding place value is necessary for understanding our decimal number system. As<br />

a consequence, this understanding is relevant to different fields of school<br />

mathematics. A sound place-value concept is crucial for developing effective calculation<br />

strategies (e.g. to replace one-by-one finger counting), for understanding the written<br />

algorithms or for moving from integers to fractions. Research shows that especially low<br />

achievers have great difficulties, even in higher grades, with understanding place value.<br />

The paper describes a small case study in which an assessment tool was developed and<br />

piloted. By means of this tool teachers should get a better understanding of (1) low<br />

achievers’ difficulties and (2) consequences for teaching and learning processes.<br />

Assessment tool development<br />

The existing instruments for assessing the understanding of place value mainly focus on<br />

applying the concept in standard calculations. The developed tool, in contrast, includes test<br />

items that cover different levels of representation and the main building blocks for<br />

calculation strategies. Moreover, not only standard items have been chosen but also unfamiliar<br />

formats and challenging items which have not yet been treated in the classroom, which cannot be<br />

solved in a mechanistic way and which address the specific role of zero. The assessment<br />

tool comprises tasks with numbers up to 1000 and covers the following topics: counting in<br />

steps, splitting numbers into place values, composing numbers, interpreting iconic<br />

representations of numbers, identifying the place values of digits in three-digit numbers and solving<br />

simple additions and subtractions. The items can be used in paper-and-pencil tests as well<br />

as in interviews, providing information on both the oral and the written competences of the children.<br />

Results<br />

The assessment tool was piloted with 12 low-achieving students (4 girls, 8 boys) from 5th<br />

and 6th grade who attend a special school for students with learning disabilities. A first analysis of the results<br />

showed a certain understanding of place value for all students but also revealed a variety of<br />

difficulties, especially with the non-standard items (e.g., composing a number out of<br />

70+200+3 led to the incorrect number 723 whereas a more or less standard item like<br />

300+50+4 resulted in a correct solution). Moreover, working out simple addition and<br />

subtraction tasks in many cases was done in a rather mechanistic way by manipulating the<br />

digits and not thinking about the numbers (e.g., students did not consider all place values<br />

when adding 314+314 and came to the result 328). Beyond this, problems with zero<br />

became obvious (e.g., 624–203 led to the result 401).<br />
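Purely as an illustration of why the non-standard composing item is revealing (this sketch is not part of the study, and the helper function is hypothetical): a child who simply strings together the leading digits of the addends in the order given happens to get 300+50+4 right but produces 723 for 70+200+3, which is one plausible model of the error reported above.

    # Illustrative sketch only: the naive "write the digits in the order given" strategy.
    def concatenate_leading_digits(parts):
        """Model of the misconception: take the first digit of each addend, in order."""
        return int("".join(str(p)[0] for p in parts))

    print(sum([300, 50, 4]), concatenate_leading_digits([300, 50, 4]))  # 354 354 -> looks correct
    print(sum([70, 200, 3]), concatenate_leading_digits([70, 200, 3]))  # 273 723 -> misconception exposed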

Discussion<br />

The analysis also shows that test results cannot be seen in an isolated way, but one has to<br />

take into account that children might have individual interpretations of the tasks. Just as<br />

important is the analysis of the whole solving process. After presenting a selection of<br />

results, consequences for teaching and learning will be discussed. Assessment of low<br />

achievers’ competences, as well as classroom practice, requires more than a focus on<br />

correct results; it should also take into account the students’ solution strategies (including<br />

explanations and reasoning).<br />

ENAC 2008 147


Using a course forum to promote learning and assessment<br />

for learning in environmental education<br />

Revital (Tali) Tal, Technion – Israel Institute of Technology, Israel<br />

In continuation of a previous study, in which a complex assessment framework was<br />

implemented in an environmental education course in a science education department (Tal,<br />

2005), this study focused on one component of the assessment – the discussions in the<br />

course forum. The online asynchronous forum served as a sociocultural arena for raising<br />

questions, leading and participating in socio-environmental debates, and uploading the<br />

students’ projects and carrying out peer assessment. The participants were 15 minority pre-service<br />

teachers from various science education disciplines who had very little prior<br />

knowledge about or awareness of the environment. Within the assessment for learning<br />

framework that directed the learning, the students were required to read and critique<br />

newspaper articles, investigate an environmental problem in their home community,<br />

participate in a field trip and discuss a variety of environmental topics in the course forum<br />

that was managed by the author (the course instructor) and a teaching assistant. The main<br />

goal was to use the course forum to improve and assess learning and engagement in<br />

environmental discourse. As the students were pre-service teachers, an additional goal was<br />

to expose the students to multi modal learning and assessment in environmental education,<br />

which is in line with the basic principles of environmental education. The research questions<br />

were: (a) to what extent did the course forum allow participation in an environmental<br />

education learning community? (b) in what ways did students express engagement with and<br />

concern for environmental issues? (c) to what extent did the course forum allow the students<br />

to express diverse learning outcomes? Three levels of participation in the forum were<br />

identified: obligatory – very little participation, limited to the requested course tasks;<br />

occasional – characterized by random activity and limited to responding to others; and<br />

active – expressed by continuous activity either as initiators who brought up new topics for<br />

discussions or respondents who continued to develop the discussion. There was good<br />

alignment between the activity in the forum and the students’ final scores. The students who<br />

actively participated in the forum expanded their learning far beyond class and the course<br />

assignments. In the interviews carried out a year after the course had ended, these students<br />

referred to the forum as a learning as well as an assessment instrument, one that also contributed<br />

to their environmental awareness and commitment. Finally, the course forum enabled<br />

deep discussions that elevated the class-based learning. The students also discussed<br />

local problems, which were typical of their communities, and provided rich evidence of<br />

meaningful learning. In the follow-up interviews, they pointed to the contribution of the<br />

forum to their sense of freedom and their success in overcoming the classroom language barrier.<br />

Drawing on sociocultural theory and the idea of communities of practice, the course forum<br />

enhanced learning through intensive interaction among the students, where three levels of<br />

practice were identified: peripheral, occasional and experienced. This study contributes to<br />

the field of teaching and assessment in higher education, and to the field of environmental<br />

education in multicultural societies.<br />

148 ENAC 2008


Learning-oriented feedback: a challenge to assessment practice<br />

Mirabelle Walker, The Open University, United Kingdom<br />

The paper starts by presenting research into feedback carried out in the Technology Faculty<br />

of the UK’s Open University. A coding tool introduced by Brown and Glover (2006) was<br />

used to analyse over 3000 comments made on 106 assignments in three undergraduate<br />

course modules. One dimension of this code was used to determine the categories of<br />

comments being made: relating to the content of the answer; relating to skills development;<br />

offering motivation; etc. The other dimension was used to determine the ‘depth’ of the<br />

comments: whether they were indicative, corrective or explanatory. Students’ responses to<br />

these comments were obtained through individual interviews with 43 of the students whose<br />

commented assignments had been examined. The students were asked to indicate, if<br />

possible, an example of a comment on their assignment that they had been able to use in<br />

later assignments in the module. They were also asked how they had responded to some of<br />

the specific comments written on their assignment. In the latter case, a thematic analysis of<br />

these responses was carried out, followed by a matching of response themes to categories<br />

and depths of comment.<br />

Two key results emerged from this work: the most effective comments for the students’<br />

future work are those that relate to skills development; the most effective comments for<br />

helping students to understand inadequacies in their work are those that are explanatory.<br />

The paper shows that these findings are consistent with a conceptualisation of effective<br />

feedback on assignments that, drawing on Sadler (1989) and Black & Wiliam (1998), sees it<br />

as offering students a means whereby they can reduce or close the gap between their own<br />

knowledge, skills and understanding and the desired knowledge, skills and understanding.<br />

Feedback of this type is learning-oriented feedback, and assessment which is designed to<br />

offer adequate opportunities for feedback of this type is learning-oriented assessment.<br />

The paper concludes by highlighting the ways in which these findings challenge both<br />

feedback practice, which is often insufficiently learning-oriented, and assessment practice,<br />

where skills development tends to be undervalued by those who set and mark the<br />

questions, and attention is seldom paid to skills development through the sequence of the<br />

assignments in a module or programme of study.<br />

It is intended that discussion will centre around the challenges to feedback and assessment<br />

practice that arise from this research, as outlined in the conclusion to the paper. It is hoped<br />

that participants will be able to share experiences of, or suggestions for, responding to<br />

these challenges.<br />

References<br />

Black, P. & Wiliam, D. (1998) Assessment and classroom learning, Assessment in Education: Principles,<br />

Policy and Practice, 5(1), 7–74.<br />

Brown, E. & Glover, C. (2006) ‘Evaluating written feedback’ in Bryan, C. & Clegg, K. (eds.) Innovative<br />

Assessment in Higher Education, Abingdon: Routledge, 81–91.<br />

Sadler, D. R. (1989) Formative assessment and the design of instructional systems, Instructional Science,<br />

18, 119–144.<br />

ENAC 2008 149


Progressive Formalization as an Interpretive Lens for<br />

Increasing the Learning Potentials of Classroom Assessment<br />

David Webb, University of Colorado at Boulder, The United States of America<br />

Education researchers have repeatedly asserted that to improve student learning, teachers<br />

need to give greater attention to their use of formative assessment. To effectively guide<br />

student learning, teachers must develop greater confidence in their own decision making<br />

and expertise in classroom assessment. To appropriately interpret student responses to<br />

instructional activities, teachers need to understand how the mathematical content<br />

demonstrated in students’ representations relate to the development of student learning and<br />

expectations for mathematical literacy.<br />

The didactical design construct of progressive formalization, and many examples thereof,<br />

draws from decades of developmental research using the principles of Realistic Mathematics<br />

Education. Instructional sequences in RME are conceived as “learning lines” in which problem<br />

contexts serve as starting points to elicit students’ informal representations. When<br />

appropriate, the teacher builds upon students’ representations and either draws upon student<br />

strategies that are progressively more formal or introduces students to new strategies and<br />

models. Students are encouraged to refer back to less formal representations to deepen their<br />

understanding of the abstract-symbolic. Essentially, progressive formalization is a design-oriented<br />

mathematical instantiation of cognitive/constructivist learning theories. Through<br />

careful attention to students' prior knowledge and guided support from the teacher, students’<br />

conceptions are related to other pre-formal mathematical representations. The teacher<br />

facilitates student learning and a sense of ownership by selecting appropriate problems,<br />

interpreting student responses, posing clarifying questions, and using counterexamples to<br />

support the development of students’ mathematical understanding.<br />

This paper reports the underlying design theory and results from a research-based,<br />

professional development program designed to improve teacher confidence, expertise and<br />

use of classroom assessment. Over the past three years, the program has involved 32 middle<br />

grades mathematics teachers working across six middle schools (i.e., schools for 12- to 14-year-old<br />

students) in a moderately sized U.S. public school district.<br />

From prior assessment design studies involving mathematics teachers, we recognized that<br />

limitations in the content knowledge of some teachers had a profound influence on their<br />

ability to select or design tasks accessible to students’ informal and pre-formal<br />

representations. As a way to deepen their understanding of mathematics, teachers<br />

completed mathematical tasks that illustrated progressive formalization in rational number<br />

and algebra. In design and analysis activities, teachers continuously used progressive<br />

formalization as a lens to adapt or create assessment tasks, review their instructional<br />

materials, design scoring guides and rubrics, interpret student responses, and discuss<br />

instructional responses based on examples of student work.<br />

The analysis of teachers’ assessment portfolios (i.e., collections of all paper-and-pencil<br />

assessments) suggests that this PD model provided teachers with a more principled basis for<br />

assessing student understanding and resulted in a conceptualization of assessment that<br />

was generative. That is, teachers applied principles of progressive formalization by<br />

increasing the accessibility of the assessment tasks they used and in the ways they<br />

interpreted and responded to student work. The full paper and presentation will include<br />

examples of the classroom assessments teachers designed, how they used student work to<br />

inform the revision and redesign of assessments, and a preliminary analysis of the impact of<br />

this program on the achievement of participating teachers’ students.<br />

150 ENAC 2008


Author Index<br />

ENAC 2008 151


152 ENAC 2008


Adamson 105<br />

Admiraal 69<br />

Allin 57<br />

Asghar 58<br />

Asmyhr 101<br />

Avenant 143<br />

Bakker 126<br />

Barrie 102<br />

Bayrhuber 34<br />

Bell 129<br />

Berenst 52<br />

Bjølseth 115<br />

Black<br />

Beth 59<br />

Paul 66<br />

Blackmore 42<br />

Bloxham 60<br />

Bohemia 65<br />

Bokhove 130<br />

Boud 103<br />

Boursicot 45<br />

Bremer 87<br />

Brockbank 131<br />

Bruder 34<br />

Bucholtz 98, 132<br />

Busana 141<br />

Butcher 114<br />

Campbell 60<br />

Cheung 90, 104<br />

Clark 105<br />

Coates 108<br />

Contreras Palma 61<br />

Cowie 62<br />

Crossouard 14<br />

Dacre 45<br />

Davison 106<br />

de Glopper 50, 52<br />

De Grez 107<br />

Dearnley 76, 108<br />

Diemer 133<br />

Dysthe 25, 27<br />

Ebert 63<br />

Ecclestone 13<br />

Eckes 16<br />

Eggen 64<br />

Ekecrantz 124<br />

Engelsen 27, 91<br />

Entwistle 32<br />

Fastré 134<br />

Fisher 109<br />

Fishwick 57<br />

Foreman-Peck 29<br />

Formazin 98, 132<br />

Fuller 44, 110<br />

Furnborough 111<br />

Goldhammer 39, 40, 74<br />

Goos 135<br />

Handley 84, 144<br />

Harman 65<br />

Harrison 66<br />

Harsch 17<br />

Hartig 17, 33, 36<br />

Hartley 76<br />

Hartnell-Young 26<br />

Havnes 25, 67, 68, 115<br />

Hepplestone 112, 136<br />

Hernandez 137<br />

Higham 45<br />

Hodgen 66<br />

Hoeksma 69<br />

Höhler 36<br />

Homer 110<br />

Hopfenbeck 113<br />

Hounsell<br />

Dai 70, 138<br />

Jenny 70<br />

Hughes<br />

C. 102<br />

Clair 135, 139<br />

Hunter 114<br />

Jadoul 38<br />

James 12<br />

Janssen<br />

Fred 95<br />

Judith 69<br />

Jones<br />

Alister 62<br />

Julie 31<br />

Jonsson 140<br />

Jordan 22, 114, 131<br />

Joughin 71<br />

Juárez 145<br />

Jude 40<br />

Karius 18<br />

Keller 141<br />

Klieme 5, 40<br />

Köller 15<br />

Kunina 35<br />

ENAC 2008 153


Kuper 133<br />

Kwant 52<br />

Lai<br />

Mei Kuin 78<br />

Patrick 72<br />

Latour 38<br />

Lauvas 115<br />

Lecaque 38<br />

Leitch 6<br />

Leuders 34<br />

Linsey 120<br />

Luff 116<br />

Maier 73<br />

Marshall 66<br />

Martens 37, 39, 40, 74<br />

Martin 141<br />

McCabe 75<br />

McCusker 21<br />

McDowell 11<br />

McLean 106<br />

Meddings 76<br />

Meeus 28<br />

Mellor 32<br />

Mitchell 131<br />

Montgomery 77<br />

Moreland 62<br />

Nachshon 117<br />

Narciss 49<br />

Naumann 39, 40<br />

Neumann 18<br />

Nicholson 21<br />

Norton<br />

Bill 89<br />

Lin 89<br />

O'Brien 78<br />

O'Doherty 79<br />

O'Donovan 84, 118, 119, 144<br />

Oehler 80<br />

Ooms 120<br />

Orr 81<br />

O'Siochru 94<br />

Otrel-Cass 62<br />

Pat-El 82<br />

Pell 41, 44, 110<br />

Peltenburg 142<br />

Pickworth 143<br />

Pilkington 83<br />

Plichart 38<br />

Price 84, 118, 119, 144<br />

Proctor-Childs 109<br />

Pryor 14<br />

Ramírez 145<br />

Reichert 141<br />

Reimann 25<br />

Remesal 85, 145<br />

Renault 38<br />

Richardson 86<br />

Ridgway 21<br />

Roberts 41, 45<br />

Robinson<br />

Gilian 116<br />

Jon 121, 146<br />

Robitzsch 15, 18, 80<br />

Rölke 39, 40<br />

Rom 117<br />

Roozen 107<br />

Rowley 129<br />

Ruedel 20<br />

Rupp 35<br />

Sambell 77<br />

Sandal 122<br />

Saniter 87<br />

Schaap 88<br />

Scharaf 39<br />

Scherer 147<br />

Schmidt 88<br />

Schofield 123<br />

Schroeders 98<br />

Schwieler 124<br />

Segers 49, 82<br />

Serret 66<br />

Shannon 89<br />

Sit 90, 104<br />

Sjo 91<br />

Sluijsmans 46, 47, 48, 134<br />

Smee 43<br />

Smith 19<br />

Andrew 31<br />

C. 102<br />

Kari 91, 92, 122<br />

Stein 93<br />

Stern 125<br />

Strijbos 48, 49<br />

Strivens 94<br />

Swietlik-Simon 38<br />

Syversen 122<br />

Tai 138<br />

Tal 148<br />

Taylor 108<br />

Tigelaar 95, 126<br />

154 ENAC 2008


Tillema 82<br />

Valcke 107<br />

Van de Watering 48<br />

van den Boogaard 53<br />

van den Heuvel-Panhuizen 50, 53, 142<br />

van der Klink 134<br />

van der Pol 51<br />

van Lierop-Debrauwer 51<br />

van Merriënboer 47, 134<br />

Van Petegem 28<br />

van Rooyen 143<br />

van Tartwijk 95<br />

van Zundert 47, 96<br />

Vedder 82<br />

Veldman 95<br />

Verloop 95, 126<br />

Vernon 30<br />

Walker<br />

David 121, 146<br />

Mirabelle 149<br />

Wangensteen 122<br />

Webb<br />

David 150<br />

Marion 120<br />

Webster-Wright 135<br />

Whitelock 19, 97<br />

Wilhelm 35, 98, 132<br />

Wiliam 7<br />

Wirtz 34<br />

Xu 138<br />

ENAC 2008 155


156 ENAC 2008


Address List of<br />

presenters<br />

ENAC 2008 157


158 ENAC 2008


Allin, Linda<br />

Northumbria University<br />

CETL<br />

NE1 8ST Newcastle Upon Tyne<br />

UNITED KINGDOM<br />

linda.allin@unn.ac.uk<br />

Barrie, Simon<br />

The University of Sydney<br />

AUSTRALIA<br />

S.Barrie@itl.usyd.edu.au<br />

Blackmore, David<br />

Medical Council of Canada<br />

2283 St. Laurent Boulevard<br />

K1G 5A2 Ottawa<br />

CANADA<br />

dblackmore@mcc.ca<br />

Boud, David<br />

University of Technology, Sydney<br />

PO Box 123<br />

NSW 2007 Broadway<br />

AUSTRALIA<br />

David.Boud@uts.edu.au<br />

Bucholtz, Nina<br />

Humboldt-Universität zu Berlin<br />

IQB<br />

Unter den Linden 6<br />

10099 Berlin<br />

GERMANY<br />

bucholtn@iqb.hu-berlin.de<br />

Asghar, Mandy<br />

Leeds Metropolitan University<br />

7 Chelwood Ave<br />

LS8 2BA Leeds<br />

UNITED KINGDOM<br />

a.asghar@leedsmet.ac.uk<br />

Bell, Andy<br />

Manchester Metropolitan University<br />

24 Park Row<br />

SK4 3DY Heaton Mersey /<br />

Stockport<br />

UNITED KINGDOM<br />

A.Bell@mmu.ac.uk<br />

Bloxham, Sue<br />

University of Cumbria<br />

Bowerham Rd<br />

LA1 3JD Lancaster<br />

UNITED KINGDOM<br />

susan.bloxham@cumbria.ac.uk<br />

Boursicot, Katharine<br />

St George's, University of London<br />

46-47 Compton Road<br />

N1 2PB London<br />

UNITED KINGDOM<br />

kboursic@sgul.ac.uk<br />

Cheung, Kwok Cheung<br />

Faculty of Education<br />

University of Macau<br />

11A, Block 2<br />

Taipa Macao<br />

CHINA<br />

kccheung@umac.mo<br />

Asmyhr, Morten<br />

Østfold University College<br />

Sagmesterveien 49<br />

1414 Trollåsen<br />

NORWAY<br />

morten.asmyhr@hiof.no<br />

Black, Beth<br />

Cambridge Assessment<br />

8 Marshall Road<br />

CB1 7TY Cambridge<br />

UNITED KINGDOM<br />

Black.B@cambridgeassessment.org.uk<br />

Bokhove, Christian<br />

FIsme<br />

Utrecht University<br />

Aidadreef 12<br />

3561 GE Utrecht<br />

NETHERLANDS<br />

cbokhove@gmail.com<br />

Brockbank, Barbara<br />

The Open University<br />

19 Woodfield Road<br />

TN9 2LG Tonbridge<br />

UNITED KINGDOM<br />

bsb3@tutor.open.ac.uk<br />

Clark, Wendy<br />

Northumbria University<br />

CETL<br />

NE1 8ST Newcastle Upon Tyne<br />

UNITED KINGDOM<br />

wendy.clark@unn.ac.uk<br />

ENAC 2008 159


Contreras Palma, Saul Alejandro<br />

CHILE<br />

saul2674@hotmail.com<br />

de Glopper, Kees<br />

Center for Language and<br />

Cognition, Faculty of Arts,<br />

University of Groningen<br />

PO Box 716<br />

9700 AS Groningen<br />

NETHERLANDS<br />

c.m.de.glopper@rug.nl<br />

Diemer, Tobias<br />

Freie Universität Berlin<br />

Arnimallee 12<br />

14195 Berlin<br />

GERMANY<br />

diemer@zedat.fu-berlin.de<br />

Ecclestone, Kathryn<br />

Oxford Brookes University<br />

Westminster Institute of Education<br />

Harcourt Hill OX Oxford<br />

UNITED KINGDOM<br />

kecclestone@brookes.ac.uk<br />

Fastré, Greet<br />

Open Universiteit Nederland<br />

Valkenburgerweg 177<br />

6419 AT Heerlen<br />

NETHERLANDS<br />

greet.fastre@ou.nl<br />

Cowie, Bronwen<br />

University of Waikato<br />

Hillcrest Rd<br />

2001 Hamilton<br />

NEW ZEALAND<br />

bcowie@waikato.ac.nz<br />

De Grez, Luc<br />

University College Brussels<br />

Koningsstraat 336<br />

1030 Brussels<br />

BELGIUM<br />

luc.degrez@hubrussel.be<br />

Dysthe, Olga<br />

University of Bergen<br />

Beiteveien 9<br />

5019 Bergen<br />

NORWAY<br />

Olga.Dysthe@iuh.uib.no<br />

Eckes, Thomas<br />

TestDaF Institute<br />

Feithstr. 188<br />

58084 Hagen<br />

GERMANY<br />

thomas.eckes@testdaf.de<br />

Fisher, Margaret<br />

University of Plymouth<br />

Drake Circus<br />

PL4 8AA Plymouth<br />

UNITED KINGDOM<br />

m.fisher@plymouth.ac.uk<br />

Davison, Gillian<br />

Northumbria University<br />

CETL AfL<br />

NE1 8ST Newcastle upon Tyne<br />

UNITED KINGDOM<br />

gillian.davison@unn.ac.uk<br />

Dearnley, Christine<br />

University of Bradford<br />

Ashgrove Barn, Broad Lane<br />

HD9 1LS Huddersfield<br />

UNITED KINGDOM<br />

c.a.dearnley1@bradford.ac.uk<br />

Ebert, Julian<br />

University of Zurich<br />

Binzmühlestr. 14<br />

8050 Zürich<br />

SWITZERLAND<br />

ebert@ifi.uzh.ch<br />

Eggen, Astrid Birgitte<br />

University of Oslo<br />

Kapellveien 17c<br />

487 Oslo<br />

NORWAY<br />

astrid.eggen@ils.uio.no<br />

Foreman-Peck, Lorraine<br />

The University of Northampton<br />

53 Portland Road<br />

OX2 7EZ Oxford<br />

UNITED KINGDOM<br />

lorraine.foremanpeck@northampton.ac.uk<br />

160 ENAC 2008


Fuller, Richard<br />

School of Medicine<br />

University of Leeds<br />

LS2 9JT Leeds<br />

UNITED KINGDOM<br />

R.Fuller@leeds.ac.uk<br />

Harman, Kerry<br />

Northumbria University<br />

CETL, Ellison Building<br />

NE1 8ST Newcastle Upon Tyne<br />

UNITED KINGDOM<br />

m.newson@unn.ac.uk<br />

Hartnell-Young, Elizabeth<br />

Learning Science Research Institute<br />

The University of Nottingham<br />

NG8 1BB Nottingham<br />

UNITED KINGDOM<br />

elizabeth.hartnellyoung@nottingham.ac.uk<br />

Hernandez, Rosario<br />

University College Dublin<br />

School of Languages and Literatures<br />

Newman Building, Belfield 4<br />

Dublin<br />

IRELAND<br />

charo.hernandez@ucd.ie<br />

Hopfenbeck, Therese Nerheim<br />

University of Oslo<br />

Faculty of Education<br />

Sem Seland vei 24, P.O. Box 1099<br />

Blindern<br />

NO-0317 Oslo<br />

NORWAY<br />

t.n.hopfenbeck@ils.uio.no<br />

Furnborough, Concha<br />

The Open University<br />

Walton Hall<br />

MK7 6AA Milton Keynes<br />

UNITED KINGDOM<br />

c.furnborough@open.ac.uk<br />

Harrison, Christine<br />

King's College London<br />

Franklin-Wilkins-Building WBW,<br />

150 Stamford Street<br />

SE1 9NN London<br />

UNITED KINGDOM<br />

christine.harrison@kcl.ac.uk<br />

Havnes, Anton<br />

University of Bergen<br />

Øvreveien 36<br />

N-1450 Nesoddtangen<br />

NORWAY<br />

anton.havnes@hio.no<br />

Hoeksma, Mark<br />

University of Amsterdam Graduate<br />

School for Teaching and Learning<br />

P.E. Tegelbergplein 4<br />

1019 TA Amsterdam<br />

NETHERLANDS<br />

m.hoeksma@uva.nl<br />

Hounsell, Dai<br />

University of Edinburgh<br />

Paterson's Land, Holyrood Road<br />

EH8 8AQ Edinburgh<br />

UNITED KINGDOM<br />

Dai.Hounsell@ed.ac.uk<br />

Goldhammer, Frank<br />

German Institute for International<br />

Educational Research (DIPF)<br />

Schlossstr. 29<br />

60486 Frankfurt/Main<br />

GERMANY<br />

goldhammer@dipf.de<br />

Hartig, Johannes<br />

German Institute for International<br />

Educational Research (DIPF)<br />

Schloßstraße 29<br />

60486 Frankfurt am Main<br />

GERMANY<br />

hartig@dipf.de<br />

Hepplestone, Stuart<br />

Sheffield Hallam University<br />

Howard Street<br />

S1 1WB Sheffield<br />

UNITED KINGDOM<br />

s.j.hepplestone@shu.ac.uk<br />

Höhler, Jana<br />

German Institute for International<br />

Educational Research (DIPF)<br />

Schloßstraße 29<br />

60486 Frankfurt am Main<br />

GERMANY<br />

hoehler@dipf.de<br />

Hounsell, Jenny<br />

University of Edinburgh<br />

Paterson's Land, Holyrood Road<br />

EH8 8AQ Edinburgh<br />

UNITED KINGDOM<br />

Jenny.Hounsell@ed.ac.uk<br />

Hughes, Clair<br />

The University of Queensland<br />

147 Swann Rd<br />

4068 Brisbane<br />

AUSTRALIA<br />

clair.hughes@uq.edu.au<br />

Jonsson, Anders<br />

Malmö University<br />

School of Teacher Education<br />

SE-205 06 Malmö<br />

SWEDEN<br />

anders.jonsson@mah.se<br />

Keller, Ulrich<br />

University of Luxembourg<br />

Route de Diekirch<br />

L-7220 Walferdange<br />

LUXEMBOURG<br />

ulrich.keller@uni.lu<br />

Kunina, Olga<br />

Humboldt-Universität zu Berlin<br />

IQB<br />

Unter den Linden 6<br />

10099 Berlin<br />

GERMANY<br />

Olga.Kunina@iqb.hu-berlin.de<br />

Lauvas, Per<br />

Østfold University College<br>

NORWAY<br />

per.lauvas@hiof.no<br />

James, David<br />

University of the West of England<br />

Coldharbour Lane<br />

BS16 1QY Bristol<br />

UNITED KINGDOM<br />

david.james@uwe.ac.uk<br />

Jordan, Sally<br />

The Open University<br />

COLMSCT<br />

95 Sluice Road<br />

PE38 0DZ Downham Market<br />

UNITED KINGDOM<br />

s.e.jordan@open.ac.uk<br />

Klieme, Eckhard<br />

German Institute for International<br />

Educational Research (DIPF)<br />

Schloßstraße 29<br />

60486 Frankfurt<br />

GERMANY<br />

klieme@dipf.de<br />

Kwant, Aletta<br />

Center for Language and<br />

Cognition, Faculty of Arts<br />

University of Groningen<br />

PO Box 716<br />

9700 AS Groningen<br />

NETHERLANDS<br />

l.p.kwant@rug.nl<br />

Leitch, Ruth<br />

Queen's University Belfast<br>

69-71 University Street<br />

BT7 1HL Belfast<br />

UNITED KINGDOM<br />

r.leitch@qub.ac.uk<br />

Jones, Julie<br />

The University of Northampton<br />

11 Cardinal Close<br />

NN4 0RP Northampton<br />

UNITED KINGDOM<br />

julie.jones@northampton.ac.uk<br />

Joughin, Gordon<br />

University of Wollongong<br />

CEDIR, University of Wollongong<br />

2522 Wollongong<br />

AUSTRALIA<br />

gordonj@uow.edu.au<br />

Köller, Olaf<br />

Humboldt-Universität zu Berlin<br />

IQB<br />

Unter den Linden 6<br />

10099 Berlin<br />

GERMANY<br />

iqboffice@iqb.hu-berlin.de<br />

Lai, Patrick<br />

The Hong Kong Polytechnic University<br />

Educational Development Centre<br />

Room TU607<br />

Hung Hom, Kowloon<br />

Hong Kong<br />

CHINA<br />

etktlai@netvigator.com<br />

Luff, Paulette<br />

Anglia Ruskin University<br />

Bishop Hall Lane<br />

CM1 1SQ Chelmsford<br />

UNITED KINGDOM<br />

paulette.luff@anglia.ac.uk<br />

Maier, Uwe<br />

University of Education Schw. Gmünd<br />

Ostalbstrasse 8<br />

73529 Schwäbisch Gmünd<br />

GERMANY<br />

uwe.maier@ph-gmuend.de<br />

McDowell, Liz<br />

University of Northumbria<br />

CETL Hub D121, Ellison Building,<br />

Ellison Place<br />

NE1 8ST Newcastle Upon Tyne<br />

UNITED KINGDOM<br />

liz.mcdowell@unn.ac.uk<br />

Mellor, Antony<br />

Northumbria University<br />

School of Applied Sciences<br />

NE1 8ST Newcastle Upon Tyne<br>

UNITED KINGDOM<br />

antony.mellor@unn.ac.uk<br />

Naumann, Johannes<br />

German Institute for International<br />

Educational Research (DIPF)<br />

Schloßstraße 29<br />

60486 Frankfurt am Main<br />

GERMANY<br />

naumann@dipf.de<br />

O'Donovan, Berry<br />

Oxford Brookes University<br />

150 Marlborough Road<br />

OX1 4LS Oxford<br />

UNITED KINGDOM<br />

bodonovan@brookes.ac.uk<br />

Martens, Thomas<br />

German Institute for International<br />

Educational Research (DIPF)<br />

Postfach 900270<br />

60442 Frankfurt am Main<br />

GERMANY<br />

m@rtens.net<br />

Meddings, Fiona<br />

Division of Midwifery &<br />

Reproductive Health<br />

University of Bradford<br />

55 Crowther Avenue<br />

LS28 5SA Leeds<br />

UNITED KINGDOM<br />

f.s.meddings@bradford.ac.uk<br />

Montgomery, Catherine<br />

Northumbria University<br />

CETL AfL, Ellison Building<br />

Ellison Place, Newcastle<br />

NE1 8ST Newcastle<br />

UNITED KINGDOM<br />

c.montgomery@unn.ac.uk<br />

O'Brien, Patrice<br />

Faculty of Education<br />

University of Auckland<br />

111 Blockhouse Bay Rd, Avondale<br />

1026 Auckland<br />

NEW ZEALAND<br />

pa.obrien@auckland.ac.nz<br />

Oehler, Raphaela<br />

Humboldt-Universität zu Berlin<br />

IQB<br />

Unter den Linden 6<br />

10099 Berlin<br />

GERMANY<br />

raphaela.oehler@iqb.hu-berlin.de<br />

McCabe, Michael<br />

University of Portsmouth<br />

Lion Terrace<br />

PO1 3HF Portsmouth<br />

UNITED KINGDOM<br />

michael.mccabe@port.ac.uk<br />

Meeus, Wil<br />

Universiteit Antwerpen<br />

Venusstraat 35<br />

2000 Antwerpen<br />

BELGIUM<br />

wil.meeus@ua.ac.be<br />

Nachshon, Michal<br />

Ministry of Education<br />

Vardiya st. 24<br />

34657 Haifa<br />

ISRAEL<br />

michaln@tx.technion.ac.il<br />

O'Doherty, Michelle<br />

Liverpool Hope University<br />

Hope Park<br />

L16 9JD Liverpool<br />

UNITED KINGDOM<br />

odoherm@hope.ac.uk<br />

Ooms, Ann<br />

Kingston University<br />

19 Woodlands - 4 South Bank<br />

KT6 6DB Surbiton<br />

UNITED KINGDOM<br />

a.ooms@kingston.ac.uk<br />

Orr, Susan<br />

York St John University<br />

Lord Mayor's Walk<br />

YO31 7EX York<br>

UNITED KINGDOM<br />

s.orr@yorksj.ac.uk<br />

Peltenburg, Marjolijn<br />

FIsme<br />

Utrecht University<br />

Aidadreef 12<br />

3561 GE Utrecht<br />

NETHERLANDS<br />

M.Peltenburg@fi.uu.nl<br />

Plichart, Patrick<br />

CRP Henri Tudor<br />

29, Avenue John F. Kennedy<br>

L-1855 Luxembourg - Kirchberg<br>

LUXEMBOURG<br />

patrick.plichart@tudor.lu<br />

Remesal, Ana<br />

Universidad de Barcelona<br />

Paseo del Valle Hebrón, 171<br />

E-08035 Barcelona<br />

SPAIN<br />

aremesal@ub.edu<br />

Roberts, Trudie<br />

University of Leeds<br />

Level 7, Worsley Building,<br />

Clarendon Way<br />

LS2 9NL Leeds<br />

UNITED KINGDOM<br />

t.e.roberts@leeds.ac.uk<br />

Pat-El, Ron<br />

Leiden University<br />

Catharinaland 69<br />

2591 CG Den Haag<br />

NETHERLANDS<br />

rpatel@fsw.leidenuniv.nl<br />

Pickworth, Glynis<br />

University of Pretoria<br />

90 Wenning Street<br />

181 Pretoria<br />

SOUTH AFRICA<br />

glynis.pickworth@up.ac.za<br />

Price, Margaret<br />

Oxford Brookes University<br />

2 Hearne Road<br />

W4 3NJ London<br />

UNITED KINGDOM<br />

meprice@brookes.ac.uk<br />

Richardson, Mary<br />

Roehampton University<br />

Froebel College<br />

Roehampton Lane<br />

SW15 5PJ London<br />

UNITED KINGDOM<br />

mary.richardson@roehampton.ac.uk<br />

Robinson, Jon<br />

Northumbria University<br />

CETL AfL<br />

NE1 8ST Newcastle Upon Tyne<br />

UNITED KINGDOM<br />

john.robinson@unn.ac.uk<br />

Pell, Godfrey<br />

University of Leeds<br />

CSSME, EC Stoner Building<br />

LS2 9JT Leeds<br />

UNITED KINGDOM<br />

G.Pell@leeds.ac.uk<br />

Pilkington, Ruth<br />

University of Central Lancashire<br />

67 Lower Bank Road<br />

PR2 8NU Preston<br />

UNITED KINGDOM<br />

RMHPilkington@uclan.ac.uk<br />

Pryor, John<br />

University of Sussex<br />

9 Wellington Road<br />

BN2 3AB Brighton<br />

UNITED KINGDOM<br />

j.b.pryor@sussex.ac.uk<br />

Ridgway, Jim<br />

University of Durham<br />

School of Education<br />

Leazes Road<br />

DH1 1TA Durham<br>

UNITED KINGDOM<br />

Jim.Ridgway@durham.ac.uk<br />

Robitzsch, Alexander<br />

Humboldt-Universität zu Berlin<br />

IQB<br />

Unter den Linden 6<br />

10099 Berlin<br />

GERMANY<br />

alexander.robitzsch@iqb.hu-berlin.de<br />

Ruedel, Cornelia<br />

University of Zurich<br />

E-Learning Center<br />

Hirschengraben 84<br />

8001 Zurich<br />

SWITZERLAND<br />

Cornelia.Ruedel@access.uzh.ch<br />

Schaap, Lydia<br />

Erasmus University Rotterdam<br />

Institute of Psychology<br />

Haagdijk 51A<br />

4811 TP Breda<br />

NETHERLANDS<br />

l.schaap@fsw.eur.nl<br />

Schwieler, Elias<br />

Stockholm University<br />

UPC Frescativ. 28<br />

106 91 Stockholm<br />

SWEDEN<br />

elias.schwieler@upc.su.se<br />

Sjo, Anne Kristin<br />

Stord/Haugesund University College<br />

PB 5000<br />

5409 Stord<br />

NORWAY<br />

aks@hsh.no<br />

Smith, Kari<br />

University of Bergen<br />

Post box 7800<br />

5120 Bergen<br />

NORWAY<br />

kari.smith@iuh.uib.no<br />

Sandal, Ann Karin<br />

Sogn and Fjordane University College<br />

Stedjeåsen 24<br />

6856 Sogndal<br />

NORWAY<br />

ann.karin.sandal@hisf.no<br />

Scherer, Petra<br />

University of Bielefeld<br />

Faculty of Mathematics<br />

Athener Weg 9<br />

44269 Dortmund<br />

GERMANY<br />

petra.scherer@uni-bielefeld.de<br />

Shannon, Lee<br />

Liverpool Hope University<br />

6 Leda Grove<br />

L17 8XL Liverpool<br />

UNITED KINGDOM<br />

leeroyshannon@hotmail.co.uk<br />

Sluijsmans, Dominique<br />

Open Universiteit Nederland<br />

PO Box 2960<br />

6401 DL Heerlen<br />

NETHERLANDS<br />

dominique.sluijsmans@ou.nl<br />

Stein, Margit<br />

Lehrstuhl für Sozialpädagogik und<br />

Gesundheitspädagogik<br />

Kath. Universität Eichstätt-Ingolstadt<br />

Schießstättberg 5<br />

85072 Eichstätt<br />

GERMANY<br />

margit.stein@gmx.net<br />

Saniter, Andreas<br />

ITB Uni Bremen<br />

Am Fallturm 1 Pf. 330440<br />

28334 Bremen<br />

GERMANY<br />

asaniter@uni-bremen.de<br />

Schofield, Mark<br />

Edge Hill University<br>

St Helens Road<br />

L39 4QP Lancashire<br />

UNITED KINGDOM<br />

schom@edgehill.ac.uk<br />

Sit, Pou Seong<br />

Faculty of Education<br />

University of Macau<br />

J520<br />

Taipa Macao<br />

CHINA<br />

pssit@umac.mo<br />

Smee, Sydney<br />

Medical Council of Canada<br />

2283 St. Laurent Blvd<br />

K1G 5A2 Ottawa<br />

CANADA<br />

sydney@mcc.ca<br />

Stern, Thomas<br />

University of Klagenfurt<br />

Schottenfeldg. 29<br />

1070 Wien<br />

AUSTRIA<br />

thomas.stern@uni-klu.ac.at<br />

Strijbos, Jan-Willem<br />

Universiteit Leiden<br />

Fac. Sociale Wetenschappen<br />

Postbus 9555<br />

2300 RB Leiden<br />

NETHERLANDS<br />

jwstrijbos@fsw.leidenuniv.nl<br />

Tigelaar, Dineke<br />

ICLON-Leiden University<br />

Graduate School of Teaching<br />

PO Box 9555<br />

2300 RB Leiden<br />

NETHERLANDS<br />

DTigelaar@iclon.leidenuniv.nl<br />

van der Pol, Coosje<br />

Tilburg University<br />

Retiesheike 16<br />

2460 Kasterlee<br />

NETHERLANDS<br>

j.a.vdrpol@uvt.nl<br />

Walker, Mirabelle<br />

The Open University<br />

Communication & Systems Dept.<br />

MCT Faculty<br />

Walton Hall<br />

MK7 6AA Milton Keynes<br />

UNITED KINGDOM<br />

c.m.walker@open.ac.uk<br />

Wilhelm, Oliver<br />

Humboldt-Universität zu Berlin<br />

IQB<br />

Unter den Linden 6<br />

10099 Berlin<br />

GERMANY<br />

oliver.wilhelm@rz.hu-berlin.de<br />

Strivens, Janet<br />

The University of Liverpool<br />

Y Graig, Llandegla<br />

LL11 3BG Wrexham<br />

UNITED KINGDOM<br />

strivens@liv.ac.uk<br />

van den Boogaard, Sylvia<br />

FIsme<br />

Utrecht University<br />

Aidadreef 12<br />

3561 GE Utrecht<br />

NETHERLANDS<br />

s.vandenboogaard@fi.uu.nl<br />

van Zundert, Marjo<br />

Open Universiteit Nederland<br />

Postbus 2960<br />

6401 DL Heerlen<br />

NETHERLANDS<br>

marjo.vanzundert@ou.nl<br />

Webb, David<br />

University of Colorado at Boulder<br />

249 UCB<br />

80309 Boulder<br />

UNITED STATES<br>

dcwebb@colorado.edu<br />

Wiliam, Dylan<br />

University of London<br />

20 Bedford Way<br />

WC1H 0AL London<br />

UNITED KINGDOM<br />

d.wiliam@ioe.ac.uk<br />

Tal, Tali<br />

Technion –<br />

Israel Institute of Technology<br />

30 Ella st<br />

25147 Kefar Veradim<br />

ISRAEL<br />

rtal@technion.ac.il<br />

van den Heuvel-Panhuizen, Marja<br />

FIsme, Utrecht University<br />

Aidadreef 12, 3561 GE Utrecht<br />

NETHERLANDS<br />

m.vandenheuvel@fi.uu.nl<br />

IQB, Humboldt-Universität zu Berlin<br />

Unter den Linden 6, 10099 Berlin<br />

GERMANY<br />

heuvelpm@IQB.hu-berlin.de<br />

Vernon, Julia<br />

The University of Northampton<br />

Park Campus, Boughton Green Rd<br />

NN2 7AL Northampton<br>

UNITED KINGDOM<br />

julia.vernon@northampton.ac.uk<br />

Whitelock, Denise<br />

The Open University<br />

Institute of Educational Technology<br />

Walton Hall<br />

MK7 6AA Milton Keynes<br />

UNITED KINGDOM<br />

d.m.whitelock@open.ac.uk<br />

Wirtz, Markus Antonius<br />

University of Education<br />

Department of Psychology<br />

Kunzenweg 21<br />

79117 Freiburg<br />

GERMANY<br />

markus.wirtz@ph-freiburg.de<br />
