CHALLENGING ASSESSMENT
─
BOOK OF ABSTRACTS OF THE
FOURTH BIENNIAL
EARLI/NORTHUMBRIA
ASSESSMENT CONFERENCE 2008

Edited by
Marja van den Heuvel-Panhuizen
Olaf Köller
CHALLENGING ASSESSMENT – BOOK OF ABSTRACTS OF THE FOURTH BIENNIAL
EARLI/NORTHUMBRIA ASSESSMENT CONFERENCE 2008

Edited by
Marja van den Heuvel-Panhuizen
Olaf Köller

Editorial assistance
Monika Lacher

Humboldt-Universität zu Berlin
Institut zur Qualitätsentwicklung im Bildungswesen (IQB)
Unter den Linden 6
10099 Berlin
Germany

This Book of Abstracts is also available for download as a PDF file at
http://www.iqb.hu-berlin.de/veranst/enac2008?reg=r_11

2008
Printed by
Breitfeld Vervielfältigungsservice, Berlin, Germany

Copyright © 2008 left to the Authors
All rights reserved
ISBN 978-3-00-025471-0

Fourth Biennial EARLI/Northumbria Assessment Conference 2008
August 27 – 29, 2008
Hosted by IQB, Humboldt University Berlin
Conference Venue: Seminaris Seehotel Potsdam/Berlin, Germany
ii ENAC 2008
PREFACE

This Book of Abstracts presents recent research in the field of assessment and evaluation.
In total, the volume contains 124 contributions: the abstracts of 3 plenary lectures,
31 symposium papers, 42 papers, 26 roundtable papers, and 22 posters. All these
contributions were brought together for the Fourth Biennial EARLI/Northumbria
Assessment Conference 2008.

The contributions cover a rich variety of topics that reflect the conference themes:
• Standards-based assessment
• E-assessment
• Measuring and modelling performances
• Consequences and contexts of assessment
• Learning-oriented assessment

The book opens with the abstracts of the three invited plenary lectures. Two of them
highlight the connection between assessment on the one hand and instruction and learning
on the other; the third addresses the rights of children in assessment.
The three invited symposia offer views on the socio-cultural perspective on assessment, on
new psychometric developments in test design, and on advances in e-assessment.
With these invited contributions, the International Conference Committee has left its
mark on the 2008 conference.

The ICC chose Challenging Assessment as the title for this conference to signify
that assessment is a complex and demanding domain of research, one that requires
the full energy and expertise of those involved in it. At the same time, the title
indicates that assessment is a fascinating area to work in: it is where all human
development and learning become visible. It is our job to reveal the traces of
growth and the results of education and other learning environments.
Yet this is only half of our work. Besides generating this knowledge, making it
accessible to all stakeholders in a meaningful and productive way is equally important.

May this Book of Abstracts inspire its readers’ thoughts and actions towards further
progress in the field of assessment.

Marja van den Heuvel-Panhuizen
(Conference President)
Olaf Köller
(Director IQB)
Berlin, August 2008
TABLE OF CONTENTS<br />
Preface iii<br />
Table of contents iv<br />
Introduction xvii<br />
EARLI/Northumbria Assessment Conference 2008 xvii<br />
International Conference Committee ENAC 2008 xvii<br />
The review process of ENAC 2008 xviii<br />
Plenary Lectures 3<br />
Eckhard Klieme<br />
Assessment, grading, and instruction: Understanding the context of<br />
educational measurement<br />
Ruth Leitch<br />
Improving Children’s Rights in Assessment: issues, challenges and<br />
possibilities<br />
Dylan Wiliam<br />
When is assessment learning-oriented?<br />
Invited Symposia 9<br />
Using socio-cultural perspectives to understand and change assessment in<br />
post-compulsory education<br />
Organiser: Liz McDowell<br />
David James<br />
Getting beyond the individual and the technical: How a<br />
cultural approach offers knowledge to transform assessment<br />
Kathryn Ecclestone<br />
Straitjacket or springboard?: the strengths and weaknesses<br />
of using a socio-cultural understanding of the effects of<br />
formative assessment on learning<br />
John Pryor, Barbara Crossouard<br />
Formative assessment: the discursive construction of<br />
identities<br />
Measuring language skills by means of C-tests: Methodological challenges and<br />
psychometric properties<br />
Organisers: Alexander Robitzsch and Olaf Köller<br />
Chair: Olaf Köller<br />
Thomas Eckes<br />
Constructing a calibrated item bank for C-test<br />
Johannes Hartig, Claudia Harsch<br />
Gaining substantive information from local dependencies<br />
between C-test items<br />
Alexander Robitzsch, Ina Karius, Daniela Neumann<br />
C-tests for German Students: Dimensionality, Validity and<br />
Psychometric Perspectives<br />
Moving forward with e-assessment<br />
Organiser / Chair: Denise Whitelock<br />
Discussant: Kari Smith<br />
Cornelia Ruedel<br />
The Future of E-Assessment: E-Assessment as a Dialog<br />
Jim Ridgway, Sean McCusker, James Nicholson<br />
Alcohol and a Mash-up: Understanding Student<br />
Understanding<br />
Sally Jordan<br />
E-assessment for learning? The potential of short free-text<br />
questions with tailored feedback<br />
Symposia 23<br />
Portfolios in Higher Education in three European countries – Variations in<br />
Conceptions, Purposes and Practices<br />
Organiser: Olga Dysthe<br />
Chair: Nicola Reimann<br />
Discussant: Anton Havnes<br />
Elizabeth Hartnell-Young<br />
Learning opportunities through the processes of eportfolio<br />
development<br />
Olga Dysthe, Knut Steinar Engelsen<br />
The Disciplinary Content Portfolio in Norwegian Higher<br />
Education – How and Why?<br />
Wil Meeus, Peter van Petegem<br />
Portfolio diversity in Belgian (Flemish) Higher Education – A<br />
comparative study of eight cases<br />
Aims, values and ethical considerations in group work assessment<br />
Organiser: Lorraine Foreman-Peck<br />
Julia Vernon<br />
Involuntary Free Riding – how status affects performance in<br />
a group project<br />
Julie Jones, Andrew Smith<br />
Facilitating Group work: leading or empowering?<br />
Tony Mellor, Jane Entwistle<br />
Marginalised students in group work assessment: ethical<br />
issues of group formation and the effective support of such<br />
individuals<br />
Multidimensional measurement models of students' competencies<br />
Organiser: Johannes Hartig<br />
Markus Wirtz, Timo Leuders, Marianne Bayrhuber, Regina Bruder<br />
Evaluation of non-unidimensional item contents using<br />
diagnostic results from Rasch-analysis<br />
Olga Kunina, Oliver Wilhelm, André A. Rupp<br />
Modelling multidimensional structure via cognitive diagnosis<br />
models: Theoretical potentials and methodological limitations<br />
for practical applications<br />
Jana Höhler, Johannes Hartig<br />
Modelling Specific Abilities for Listening Comprehension in a<br />
Foreign Language with a Multidimensional IRT Model<br />
Recent Developments in Computer-Based Assessment: Chances for the<br />
measurement of Competence<br />
Organiser: Thomas Martens<br />
Thibaud Latour, Raynald Jadoul, Patrick Plichart, Judith Swietlik-<br />
Simon, Lionel Lecaque, Samuel Renault<br />
Enlarging the range of assessment modalities using CBA:<br />
New challenges for generic (web-based) platforms<br />
Frank Goldhammer, Thomas Martens, Johannes Naumann, Heiko<br />
Rölke, Alexander Scharaf<br />
Developing stimuli for electronic reading assessment: The<br />
hypertext-builder<br />
Johannes Naumann, Nina Jude, Frank Goldhammer, Thomas<br />
Martens, Heiko Roelke, Eckhard Klieme<br />
Component skills of electronic reading competence<br />
Issues in High-Stakes Performance-based Assessment of Clinical Competence<br />
Organiser: Godfrey Pell<br />
Chair/Discussant: Trudie Roberts<br />
David Blackmore<br />
Lessons Learned from Administering a National OSCE for<br />
Medical Licensure<br />
Sydney Smee<br />
Quality Assurance through the OSCE Life Cycle<br />
Godfrey Pell, Richard Fuller<br />
Investigating OSCE Error Variance when measuring higher<br />
level competencies<br />
Katharine Boursicot, Trudie Roberts, Jenny Higham, Jane Dacre<br />
Beyond checklist scoring – clinicians’ perceptions of<br />
inadequate clinical performance<br />
Towards (quasi-) experimental research on the design of peer assessment<br />
Organiser: Dominique Sluijsmans<br />
Marjo van Zundert, Dominique Sluijsmans, Jeroen van<br />
Merriënboer<br />
The effects of peer assessment format and task complexity<br />
on learning and measurements<br />
Dominique Sluijsmans, Jan-Willem Strijbos, Gerard Van de<br />
Watering<br />
Modelling the impact of individual contributions on peer<br />
assessment during group work in teacher training: In search<br />
of flexibility<br />
Jan-Willem Strijbos, Susanne Narciss, Mien Segers<br />
Peer feedback in academic writing: How do feedback<br />
content, writing ability-level and gender of the sender affect<br />
feedback perception and performance?<br />
Assessment in kindergarten classes: experiences from assessing competences<br />
in three domains<br />
Organiser: Marja van den Heuvel-Panhuizen<br />
Chair/Discussant: Kees de Glopper<br />
Coosje van der Pol, Helma van Lierop-Debrauwer<br />
A picture book-based tool for assessing literary competence<br />
in 4 to 6-year olds<br />
Aletta Kwant, Jan Berenst, Kees de Glopper<br />
Assessing the social-emotional development of young<br />
children by means of storytelling and questions<br />
Sylvia van den Boogaard, Marja van den Heuvel-Panhuizen<br />
Assessing mathematical abilities of kindergartners:<br />
possibilities of a group-administered multiple-choice test<br />
Papers 55<br />
Linda Allin, Lesley Fishwick<br />
Ethical Dilemmas: ‘Insider’ action research into Higher Education<br />
assessment practice<br />
Mandy Asghar<br />
Reciprocal Peer Coaching as a Formative assessment strategy: Does it<br />
assist students to self-regulate their learning?<br />
Beth Black<br />
Using an adapted rank-ordering method to investigate January versus<br />
June awarding standards<br />
Sue Bloxham, Liz Campbell<br />
Generating dialogue in coursework feedback: exploring the use of<br />
interactive coversheets<br />
Saul Alejandro Contreras Palma<br />
Reforming practice or modifying Reforms? The science teacher’s<br />
responses to MBE and to assessment teaching in Chile<br />
Bronwen Cowie, Alister Jones, Judy Moreland, Kathrin Otrel-Cass<br />
Expanding student involvement in Assessment for Learning: A multimodal<br />
approach<br />
Julian Ebert<br />
Assessment Center Method to Evaluate Practice-Related University<br />
Courses<br />
Astrid Birgitte Eggen<br />
Democracy, Assessment and Validity. Discourses and practices<br />
concerning evaluation and assessment in an era of accountability<br />
Kerry Harman, Erik Bohemia<br />
Using Assessment for Learning: exploring student learning experiences in<br />
a design studio module<br />
Christine Harrison, Paul Black, Jeremy Hodgen, Bethan Marshall, Natasha<br />
Serret<br />
Chasing Validity – The Reality of Teacher Summative Assessments<br />
Anton Havnes<br />
There is a bigger story behind. An analysis of mark average variation<br />
across Programmes<br />
Anton Havnes<br />
Course design and the Law of Unintended Consequences: Reflections on<br />
an assessment regime in a UK “new” University<br />
Mark Hoeksma, Judith Janssen, Wilfried Admiraal<br />
Reliability and validity of the assessment of web-based video portfolios:<br />
Consequences for teacher education<br />
Jenny Hounsell, Dai Hounsell<br />
Diversity in patterns of assessment across a university<br />
Gordon Joughin<br />
Learning-oriented assessment: A critical review of foundational research<br />
Patrick Lai<br />
Implementing standards-based assessment in Universities: Issues,<br />
Concerns and Recommendations<br />
Uwe Maier<br />
Test-based School Reform and the Quality of Performance Feedback: A<br />
comparative study of the relationship between mandatory testing policies<br />
and teacher perspectives in two German states<br />
Thomas Martens, Frank Goldhammer<br />
Motivational aspects of complex item formats<br />
Michael McCabe<br />
Remarkable Pedagogical Benefits of Reusable Assessment Objects for<br />
STEM Subjects<br />
Fiona Meddings, Christine Dearnley, Peter Hartley<br />
Demystifying the assessment process: using protocol analysis as a<br />
research tool in higher education<br />
Catherine Montgomery, Kay Sambell<br />
Challenging the formality of assessment: a student view of ‘Assessment<br />
for Learning’ in Higher Education<br />
Patrice O'Brien, Mei Kuin Lai<br />
Secondary students’ motivation to complete written dance examinations<br />
Michelle O'Doherty<br />
Mind the gap: assessment practices in the context of UK widening<br />
participation<br />
Raphaela Oehler, Alexander Robitzsch<br />
Measuring writing skills in large-scale assessment: Treatment of student<br />
non-responses for Multifaceted-Rasch-Modeling<br />
Susan Orr<br />
Collaborating or fighting for the marks? Students’ experiences of group<br />
assessment in the creative arts<br />
Ron Pat-El, M. Segers, P. Vedder, H. Tillema<br />
Constructing a new assessment for learning questionnaire<br />
Ruth Pilkington<br />
Assessing Professional Learning: the challenge of the UK Professional<br />
Standards Framework<br />
Margaret Price, Karen Handley, Berry O'Donovan<br />
Feedback – all that effort but what is the effect?<br />
Ana Remesal<br />
Student teachers on assessment: First year conceptions<br />
Mary Richardson<br />
Testing our citizens. How effective are assessments of citizenship in<br />
England?<br />
Andreas Saniter, Rainer Bremer<br />
Standards in vocational education<br />
Lydia Schaap, H.G. Schmidt<br />
Why do some students stop showing progress on progress tests?<br />
Lee Shannon, Lin Norton, Bill Norton<br />
Contextualising Assessment: The Lecturer's Perspective<br />
Pou-seong Sit, Kwok-cheung Cheung<br />
Learning to read: Modeling and assessment of early reading<br />
comprehension of the 4-year-olds in Macao kindergartens<br />
Anne Kristin Sjo, Knut Steinar Engelsen, Kari Smith<br />
Assessment in action – Norwegian secondary-school teachers and their<br />
assessment activities<br />
Kari Smith<br />
How do student teachers and mentors assess the Practicum?<br />
Margit Stein<br />
Assessment of competencies of apprentices<br />
Janet Strivens, Cathal O'Siochru<br />
Academics’ epistemic beliefs about their discipline and implications for<br />
their judgements about student performance in assessments<br />
Dineke Tigelaar, Jan van Tartwijk, Fred Janssen, Ietje Veldman, Nico Verloop<br />
Techniques for trustworthiness as a way to describe teacher educators'<br />
Assessment processes<br />
Marjo van Zundert<br />
Peer Assessment for Learning: a State-of-the-art in Research and Future<br />
Directions<br />
Denise Whitelock<br />
Investigating the Pedagogical Push and Technological Pull of Computer<br />
Assisted Formative Assessment<br />
Oliver Wilhelm, Ulrich Schroeders, Maren Formazin, Nina Bucholtz<br />
Strict Tests of Equivalence for, and Experimental Manipulations of, Tests<br />
for Student Achievement<br />
Roundtable papers 99<br />
Morten Asmyhr<br />
Why the moderate levels of inter-assessor reliability of student essays?<br />
Simon Barrie, C. Hughes, C. Smith<br />
Approaches to the assessment of graduate attributes in higher education<br />
David Boud<br />
Assessment for learning in and beyond courses: a national project to<br />
challenge university assessment practice<br />
Kwok-cheung Cheung, Pou-seong Sit<br />
Electronic reading assessment: The PISA approach for the international<br />
comparison of reading comprehension<br />
Wendy Clark, Jackie Adamson<br />
Developing the autonomous lifelong learner: tools, tasks and taxonomies<br />
Gilian Davison, Craig McLean<br />
Assessing the Art of Diplomacy? Learners and Tutors perceptions of the<br />
use of Assessment for Learning (AfL) in non-vocational education<br />
Luc De Grez, Martin Valcke, Irene Roozen<br />
Assessment of oral presentation skills in higher education<br />
Christine Dearnley, Jill Taylor, Catherine Coates<br />
Mobile Assessment of Practice Learning: An Evaluation from a Student<br />
Perspective<br />
Margaret Fisher, Tracey Proctor-Childs<br />
How reliable is the assessment of practice, and what is its purpose?<br />
Student perceptions in Health and Social Work<br />
Richard Fuller, Matthew Homer, Godfrey Pell<br />
Measuring variance and improving the reliability of criterion based<br />
assessment (CBA): towards the perfect OSCE<br />
Concha Furnborough<br />
Learning through assessment and feedback: implications for adult<br />
beginner distance language learners<br />
Stuart Hepplestone<br />
Secret scores: Encouraging student engagement with useful feedback<br />
Therese Nerheim Hopfenbeck<br />
Large-Scale Assessment and Learning-Oriented Assessment: Like Water<br />
and Oil or new Possibilities for Future Research Directions?<br />
Sally Jordan, Philip Butcher, Arlëne Hunter<br />
Online interactive assessment for open learning<br />
Per Lauvas, Gunnars Bjølseth, Anton Havnes<br />
Can inter-assessor reliability be improved by deliberation?<br />
Paulette Luff, Gilian Robinson<br />
Sketchbooks and Journals: a tool for challenging assessment?<br />
Michal Nachshon, Amira Rom<br />
Evaluating the use of popular science articles for assessing high schools<br />
students<br />
Berry O'Donovan, Margaret Price<br />
Supporting student intellectual development through assessment design:<br />
debating ‘how’?<br />
Berry O'Donovan, Margaret Price<br />
Assessment contexts that underpin student achievement: demonstrating<br />
effect<br />
Ann Ooms, Timothy Linsey, Marion Webb<br />
In-classroom use of mobile technologies to support formative assessment<br />
Jon Robinson, David Walker<br />
The Devil's Triad: the symbiotic link between Assessment, Study Skills<br />
and Key Employability Skills<br />
Ann Karin Sandal, Margrethe H. Syversen, Ragne Wangensteen, Kari Smith<br />
Learning-oriented assessment and students experiences<br />
Mark Schofield<br />
Connecting Research Behaviours with Quality Enhancement of<br />
Assessment: Eliciting Developmental Case Studies by Appreciative<br />
Enquiry<br />
Elias Schwieler, Stefan Ekecrantz<br />
Conceptions of assessment in higher education: A qualitative study of<br />
scholars as teachers and researchers<br />
Thomas Stern<br />
Innovative Assessment Practice and Teachers’ Professional Development:<br />
Some Results of Austria’s IMST-Project<br />
Dineke Tigelaar, Mirjam Bakker, Nico Verloop<br />
Characteristics of an effective approach for formative assessment of<br />
teachers’ competence development<br />
Posters 127<br />
Andy Bell, Kevin Rowley<br />
Predictive indicators of academic performance at degree level<br />
Christian Bokhove<br />
Online Formative Assessment for Algebra<br />
Barbara Brockbank, Sally Jordan, Tom Mitchell<br />
Investigating the use of short answer free-text e-assessment questions<br />
with instantaneous tailored feedback<br />
Nina Bucholtz, Maren Formazin, Oliver Wilhelm<br />
Contextualized reasoning with written and audiovisual material: Same or<br />
different?<br />
Tobias Diemer, Harm Kuper<br />
Effects of Large Scale Assessments in Schools: How Standard-Based<br />
School Reform Works<br />
Greet Fastré, Marcel van der Klink, Dominique Sluijsmans, Jeroen van<br />
Merriënboer<br />
Support in Self-assessment in Secondary Vocational Education<br />
Merrilyn Goos, Clair Hughes, Ann Webster-Wright<br />
The confidence levels of course/subject coordinators in undertaking<br />
aspects of their assessment responsibilities<br />
Stuart Hepplestone<br />
Useful feedback and flexible submission: Designing and implementing<br />
innovative online assignment management<br />
Rosario Hernandez<br />
The challenge of engaging students with feedback<br />
Dai Hounsell, Chun Ming Tai, Rui Xu<br />
Towards More Integrative Assessment<br />
Clair Hughes<br />
Using a framework adapted from Systemic Functional Linguistics to<br />
enhance the understanding and design of assessment tasks<br />
Anders Jonsson<br />
The use of transparency in the "Interactive examination" for student<br />
teachers<br />
Ulrich Keller, Monique Reichert, Gilbert Busana, Romain Martin<br />
School monitoring in Luxembourg: computerized tests and automated<br />
results reporting<br />
Marjolijn Peltenburg, Marja van den Heuvel-Panhuizen<br />
Mathematical power of special needs students<br />
Glynis Pickworth, M. van Rooyen, T.J. Avenant<br />
Quality Assurance review of clinical assessment: How does one close the<br />
loop?<br />
Margaret Price, Karen Handley, Berry O'Donovan<br />
Feedback: What’s in it for me?<br />
Ana Remesal, Manuel Juárez, José Luis Ramírez<br />
From students’ to teachers’ collaboration: a case study of the challenges<br />
of e-teaching and assessing as co-responsibility<br />
Jon Robinson, David Walker<br />
Symbiotic relationships: Assessment for Learning (AfL), study skills and<br />
key employability skills<br />
Petra Scherer<br />
Assessing low achievers’ understanding of place value – consequences<br />
for learning and instruction<br />
Revital Tal<br />
Using a course forum to promote learning and assessment for learning in<br />
environmental education<br />
Mirabelle Walker<br />
Learning-oriented feedback: a challenge to assessment practice<br />
David Webb<br />
Progressive Formalization as an Interpretive Lens for Increasing the<br />
Learning Potentials of Classroom Assessment<br />
Author Index 151<br />
Address list of presenters 157<br />
INTRODUCTION

EARLI/Northumbria Assessment Conference 2008

The EARLI/Northumbria Assessment Conference (ENAC) is now being held for the fourth time.
It is a conference series established jointly by the EARLI Special Interest Group on
Assessment and Evaluation and Northumbria University in Newcastle, United Kingdom.
ENAC conferences are held biennially, in the years between the – equally biennial – full-scale
EARLI conferences.

The EARLI/Northumbria Assessment Conference started at Northumbria University in
2002. In 2004, the second conference took place in Norway, organised by the University of
Bergen. The third conference returned to Northumbria University.

The present conference continues the tradition of the EARLI/Northumbria Assessment
Conferences by providing a forum for participants to exchange ideas in an inspiring
professional environment and a pleasant and comfortable venue.

The EARLI/Northumbria Assessment Conference 2008 is hosted by IQB (Institut zur
Qualitätsentwicklung im Bildungswesen) of Humboldt University and organised in
collaboration with the International Conference Committee.
International Conference Committee ENAC 2008

Marja van den Heuvel-Panhuizen (Conference President)
IQB, Humboldt University Berlin, Germany
Freudenthal Institute, Utrecht University, the Netherlands

Olaf Köller (Director of IQB)
IQB, Humboldt University Berlin, Germany

Dietlinde Granzer
IQB, Humboldt University Berlin, Germany

Liz McDowell
Northumbria University, UK

Kay Sambell
Northumbria University, UK

Nicola Reimann
Northumbria University, UK

Jim Ridgway (EARLI SIG)
University of Durham, UK

Denise Whitelock (EARLI SIG)
Open University, UK

Anton Havnes
Oslo University College, Norway

Kari Smith (EARLI SIG)
University of Bergen, Norway
The review process of ENAC 2008

The review process was organised by Dietlinde Granzer. In total, we received
196 submissions for the Fourth Biennial EARLI/Northumbria Assessment Conference 2008:
the 500-word abstracts of 170 papers (40 of them as part of a symposium),
11 roundtable papers, and 15 posters. Every submission was anonymously
peer-reviewed by two reviewers drawn from a group of experts selected by the International
Conference Committee. When the two reviewers held entirely different opinions about a
submission, a third reviewer was consulted.

The main criterion in the review was whether the quality of a submission was high enough,
both in general and with respect to the proposed presentation format in particular.

All in all, the quality of the submissions was very high. Because of the large number of
high-quality proposals for paper presentations and symposia, the ICC re-allocated some of
those proposals to roundtables and posters.

The final decisions about acceptance, rejection, or re-allocation to another presentation
format were in the hands of the ICC. The total acceptance rate was almost 65%.
The ICC thanks the following people for their help in the review process:

Bremerich-Vos, Albert, Universität Duisburg-Essen (GERMANY)
Brna, Paul, Educational Consultancy in Technology Enhanced Learning (UK)
Clegg, Karen, University of York (UK)
Granzer, Dietlinde, Humboldt University Berlin (GERMANY)
Havnes, Anton, University of Bergen (NORWAY)
Higgins, Steve, Durham University (UK)
Köller, Olaf, Humboldt University Berlin (GERMANY)
Lauvas, Per, Fellesadministrasjonen (NORWAY)
McCusker, Sean, Durham University (UK)
McDowell, Liz, Northumbria University (UK)
Montgomery, Catherine, Northumbria University (UK)
Reimann, Nicola, Northumbria University (UK)
Reiss, Kristina, Ludwig-Maximilians-Universität München (GERMANY)
Ridgway, Jim, Durham University (UK)
Ruedel, Cornelia, University of Zürich (SWITZERLAND)
Rust, Christopher, Oxford Brookes University (UK)
Sambell, Kay, Northumbria University (UK)
Smith, Kari, University of Bergen (NORWAY)
Van den Heuvel-Panhuizen, Marja, Utrecht University/Humboldt University Berlin (NL/GER)
Webb, David, University of Colorado at Boulder (USA)
Whitelock, Denise, The Open University (UK)
Wilhelm, Oliver, Humboldt University Berlin (GERMANY)
Winkley, John, Becta (UK)
ABSTRACTS

Plenary Lectures
Assessment, grading, and instruction:
Understanding the context of educational measurement

Eckhard Klieme, German Institute for International Educational Research (DIPF), Germany

Both institutional effectiveness and individualized, adaptive education depend on the
availability of sophisticated instruments to measure and model student learning. However,
“one size of assessment does not fit all” (Pellegrino et al., 2001, p. 222). These authors
called for multidisciplinary research activities focusing on three facets: “(1) development of
cognitive models of learning that can serve as the basis for assessment design, (2)
research on new statistical measurement models and their applicability, (3) research on
assessment design” (p. 284). Similarly, a recently started research program in Germany
covers four key areas: “the development of theoretical models of competence (1), the
construction of psychometric models (2), the construction of measurement instruments for
the empirical assessment of competencies (3), and research on the use of diagnostic
information (4)” (Koeppen, Hartig, Klieme, & Leutner, 2008).

Given that outstanding improvements have been made in recent years with regard to cognitive
modelling, psychometrics, and test design, the fourth area seems to be the one that is least
understood in educational research. While much – mostly critical – research has been done on
the effects of high-stakes, standards-based assessment systems, few researchers have studied
the many varieties of educational assessment that take place in the context of everyday
classroom teaching and the huge impact these practices have on student learning.

Teachers observe students’ understanding and performance in a variety of ways: in
classroom dialogue, homework assignments, and formal tests. These procedures
should permit diagnosis on an individual level, in terms of understanding students’ individual
solution paths, misconceptions, etc. Appropriate individual feedback is crucial to support the
subsequent learning process. A number of research questions arise in this context: What kind
of diagnostic information is best understood by students, and what kind by teachers? How
well can teachers evaluate individual learning processes? What factors influence teachers’
grading decisions? What models of competence do teachers rely on – implicitly or explicitly?
How well founded and how helpful is the individual student feedback provided by the teacher?
And how do all these processes interact with newly implemented assessment systems?

After an overview of attempts at “instructionally sensitive assessment”, the paper will present
two studies investigating everyday classroom practices in detail, based on a sample of math
lessons from Germany and Switzerland. The first study examines teacher judgments about
student achievement in terms of the grades awarded. It examines the degree to which the
grades awarded reflect different dimensions of students’ achievement and learning behavior.
It also explores whether assessment and instruction are indeed aligned in the classroom, that
is, whether teachers’ grading is aligned with their instruction. In the second study, we analyze
how teacher evaluation affects students’ subsequent learning processes. This study uses
feedback given to students by the teacher within classroom interaction as an indicator of the
communication of student evaluation, and investigates the impact of two types of feedback,
evaluative and informational, on student learning and motivation.
Improving Children’s Rights in Assessment:<br />
issues, challenges and possibilities<br />
Ruth Leitch, Queen's University Belfast, United Kingdom<br />
Much valuable work has been achieved in recent years concerning the educational benefits<br />
of consulting with children and young people on teaching and learning (Flutter & Rudduck,<br />
2004) and as a means of assuring children their rights in education. There has been<br />
significantly less research on improving children’s rights in assessment despite a growing<br />
interest in the relationship between assessment and social justice. Rights in relation to the<br />
assessment of students’ learning or performance do not expressly exist in legislation in<br />
most jurisdictions although they are enshrined in international treaties such as the United<br />
Nations Convention on the Rights of the Child (UNCRC, 1989).<br />
This presentation will contribute a children’s rights perspective to issues of assessment<br />
focusing specifically on the legal implications and imperatives of Article 12 of the UNCRC.<br />
Data illuminating various issues will be derived from a recent ESRC/TLRP qualitative<br />
research project that consulted pupils on aspects of assessment policy and practice,<br />
including the introduction of annual pupil profiles and assessment for learning (AfL) in the<br />
Northern Ireland context (Leitch et al., 2008). A conceptual model based on a critical legal<br />
interpretation of Article 12 (Lundy, 2007) will be unpacked to illustrate some of the<br />
opportunities and obstacles afforded by students being involved more fully in the<br />
assessment of their learning. The presentation will conclude by arguing that if we are truly<br />
committed to improving children’s rights in relation to assessment, there must be a<br />
concerted approach to awareness raising on the obligations of children’s rights at all levels<br />
within the education system as part of a democratic culture shift – and Article 12 (UNCRC)<br />
is a valuable place to start.<br />
References<br />
Flutter, J. & Rudduck, J. (2004) Consulting Pupils: What's in it for Schools? London: RoutledgeFalmer.<br />
Leitch, R., Gardner, J., Mitchell, S., Lundy, L., Galanouli, D. & Odena, O. (2008) Consulting Pupils on the<br />
Assessment of their Learning. ESRC/TLRP Research Briefing Number 36, March 2008,<br />
http://www.tlrp.org/pub/research/html<br />
Lundy, L. (2007) ‘Voice is not enough’: The implications of Article 12 of the UNCRC for Education. British<br />
Educational Research Journal, Vol 33, No 6, 927-942.<br />
UNCRC (1989) United Nations Convention on the Rights of the Child. UN General Assembly Resolution<br />
44/25. New York: United Nations.<br />
When is assessment learning-oriented?<br />
Dylan Wiliam, University of London, United Kingdom<br />
Educational assessments are conducted in a variety of ways and their outcomes can be<br />
used for a range of purposes. There are differences in who decides what is to be assessed,<br />
who carries out the assessment, where the assessment takes place, how the resulting<br />
responses made by students are scored and interpreted, and what happens as a result<br />
(Black and Wiliam, 2004). In particular, each of these can be the responsibility of the<br />
learners themselves, those who teach the students, or, at the other extreme, all the<br />
processes can be carried out by an external agency. Cutting across these differences, there<br />
are also differences in the functions that assessments serve.<br />
Assessments can be used to support judgments about the quality of educational programs<br />
or institutions (what might be termed the evaluative function). They can be used to describe<br />
the achievements of individuals, either for the purpose of certifying that they have reached a<br />
particular level of performance or competence, or for making predictions about their future<br />
capabilities (what might be termed the summative function). And assessment can be used<br />
to support learning (what might be termed the formative function).<br />
In this talk, I will suggest that an assessment functions formatively only when evidence<br />
about student achievement elicited by the assessment is interpreted and used to make<br />
decisions about the next steps in learning that are likely to be better, or better founded, than<br />
the decisions that would have been made in the absence of that evidence.<br />
I will further suggest that learning-oriented assessment involves five key strategies, which<br />
serve to connect assessment to other important educational processes:<br />
• Clarifying, understanding, and sharing learning intentions<br />
• Engineering effective classroom discussions, tasks and activities that elicit evidence of<br />
learning<br />
• Providing feedback that moves learners forward<br />
• Activating students as learning resources for one another<br />
• Activating students as owners of their own learning<br />
Examples of each of these strategies will be given, and the presentation will conclude by<br />
offering a set of priorities for the design of learning-oriented assessments.<br />
Invited Symposia<br />
Invited Symposium: Socio-cultural perspectives<br />
Using socio-cultural perspectives to understand and change<br />
assessment in post-compulsory education<br />
Organiser: Liz McDowell, University of Northumbria, United Kingdom<br />
This symposium focuses on the application of socio-cultural perspectives to the day-to-day<br />
practices of assessment in post-compulsory education. The importance of the research is<br />
that despite significant changes in theories of learning and teaching and in perspectives on<br />
the societal goals for educational systems, shifts in assessment thinking and practices have<br />
lagged behind these changes (Shepard, 2000). This remains true despite a number of<br />
powerful arguments making the case for a change of vision from a ‘testing culture’ to an<br />
‘assessment culture’ (Wolf et al., 1991).<br />
Assessment as testing, based in the scientific measurement approach (Hager & Butler,<br />
1996), has retained its behaviourist roots much more strongly than contemporary teaching<br />
practices. Constructivist approaches should give students a more active role as participants<br />
in assessment rather than victims of the assessor (Dochy & McDowell, 1997) and there has<br />
been considerable growth in interest in formative assessment (Black & Wiliam, 1998).<br />
Nevertheless, in practice, assessment tends to be seen in a technicist way as a decontextualised<br />
and narrow type of activity, a system designed to direct students,<br />
formatively, towards performances that are summatively validated and represented by<br />
grades awarded. Hence there is considerable emphasis on effective techniques and the<br />
design of constructively aligned systems (Biggs, 2003) that channel students into the<br />
desired assessment performances.<br />
Socio-cultural approaches take a much broader view of assessment, recognising that it is a<br />
socially and contextually located set of practices. It can be seen as a structure with a<br />
complex of activities, influences and outcomes experienced by the actors within it, chiefly<br />
teachers and students, situated within a broader social, historical and cultural context.<br />
The symposium presenters draw upon research studies which have produced new insights<br />
into assessment practices. Their collective work represents a cumulative and integrated<br />
body of evidence pointing to the value of socio-cultural understandings of assessment and<br />
their utility in improving practice in classrooms, lecture rooms and examination halls. Each<br />
paper draws on a wide range of evidence but presents data from recent research studies in<br />
post-compulsory education. Assessment is viewed in terms of: its meaning for individuals<br />
with their personal histories and developing identities; its meaning at the collective level,<br />
that is, the ways that assessment is constructed in classrooms, courses, and institutions;<br />
and its longer term consequences (Boud & Falchikov).<br />
The research teams have worked with teachers to make links between new understandings<br />
of assessment and local assessment practices. Some have used action research<br />
approaches to engage teachers and involve them in the development of theory and<br />
practice. Each paper will challenge symposium participants to problematise aspects of<br />
assessment thinking and practice that may have been taken for granted and will offer ways<br />
of accommodating new understandings in assessment practice.<br />
Invited Symposium: Socio-cultural perspectives / Paper 1:<br />
Getting beyond the individual and the technical:<br />
How a cultural approach offers knowledge to transform assessment<br />
David James, University of the West of England, United Kingdom<br />
This paper argues that a socio-cultural perspective on assessment is an urgent and<br />
practical necessity in higher education. It is important to understand how and to what extent<br />
assessment practices (a) attempt to serve contradictory purposes, and (b) determine<br />
conceptions and practices of learning beyond those desired by tutors and students. It is also<br />
crucial to appreciate how much scope tutors have for beneficial interventions, and why.<br />
The paper begins by setting out some tools by which this might be achieved. It draws upon<br />
the methods and outcomes of the Transforming Learning Cultures in Further Education<br />
project which formed part of the national UK Teaching and Learning Research Programme.<br />
The project was the largest ever independent study of practices in further education. Its<br />
aims were to deepen understanding of the complexities of learning, to weigh up strategies<br />
for improvement, and to set in place a lasting capacity for productive enquiry amongst FE<br />
professionals. To this end, the study involved over 1000 questionnaires, 600 interviews,<br />
extensive shadowing and tutor diaries. It was informed by a range of theoretical sources, of<br />
which Dewey (Biesta and Burbules, 2003) and Bourdieu (e.g. Bourdieu, 1998; Grenfell and<br />
James, 1998) were prominent. The outcomes of the study included a tool for understanding<br />
tutor interventions, some ‘principles of procedure’ for improvement, and a new ‘cultural’<br />
theory (Hodkinson, Biesta and James, 2007). One recurrent theme in the analysis is how<br />
assessment regimes and events embody – sometimes whilst concealing – strong notions of<br />
learning and teaching. Related to this, the study demonstrated how and why individual<br />
tutors often had limited capacity to make significant improvements on their own, sometimes<br />
despite sterling efforts, and how the same intervention could have positive or negative<br />
effects depending on the specific setting.<br />
A method for interrogating learning cultures whilst keeping assessment as a core focus is<br />
presented and applied to HE practice. This research approach raises questions such as,<br />
what conception of learning is inherent in particular ways of writing learning outcomes, or in<br />
the use of academic credit, or in certain assessment events, and marking regimes? Are<br />
there conceptions of learning that are rhetorically important but then marginalized in<br />
assessment practices? The approach avoids the pretence that assessment is fundamentally<br />
a technical matter (James, 2000) and argues that the idea of constructive alignment (e.g.<br />
Biggs, 2003) is ‘too good to be true’. Instead, the paper offers a cultural view of assessment<br />
practices that takes account of power, interests, relationships, and interactions. The view<br />
advocated is compatible with the humanistic concerns in the earlier seminal work of Heron<br />
(1988) and Boud (e.g. 1990), but combines their insights with a fresh ‘take’ on the capacity<br />
of (and scope for) tutors to act. The paper argues that understanding a learning culture<br />
provides a route to realism about worthwhile and possible change to assessment events<br />
and regimes.<br />
Invited Symposium: Socio-cultural perspectives / Paper 2:<br />
Straitjacket or springboard?: the strengths and weaknesses of using a<br />
socio-cultural understanding of the effects of formative assessment on learning<br />
Kathryn Ecclestone, Oxford Brookes University, United Kingdom<br />
Research into formative assessment in schools and higher education has pointed to a<br />
variety of techniques that supporters claim will raise achievement, engage students with<br />
learning and promote more democratic, transparent assessment practices. Yet, a major<br />
research project exploring formative assessment in further and adult education shows that<br />
techniques, in themselves, are neither progressive nor unprogressive (see Ecclestone 2002,<br />
2008; Davies and Ecclestone, 2007; Marshall and Drummond, 2006). Instead, work by<br />
James and Biesta and colleagues shows the usefulness of a socio-cultural understanding of<br />
learning (see, for example, James and Biesta 2007). The paper explores how a socio-cultural<br />
approach illuminates the subtle ways in which different learning cultures within the<br />
same institution or course can produce formative assessment that leads either to<br />
instrumental compliance or deep, sustainable engagement. Sometimes learning cultures<br />
encourage instrumental assessment as a springboard to deeper forms of learning;<br />
sometimes instrumental assessment acts as a straitjacket on learning.<br />
This paper draws on recent empirical studies that have explored the links between policy for<br />
formative assessment, espoused theoretical principles and the reality of day to day<br />
practices in different contexts. It examines how teachers’ and students’ ideas about<br />
formative assessment practices cannot be divorced from the learning cultures which both<br />
shape those ideas and practices and which, in turn, are shaped by them. This illuminates<br />
tensions between instrumental and sustainable formative practice and shows possibilities<br />
for affecting practice.<br />
However, there is also a danger that a socio-cultural understanding over-emphasises<br />
the discursive effects of formative assessment on identities and the navigation<br />
of power and relationships within assessment practices. Whilst these effects are important,<br />
attending to them can divert attention from the quality of educational outcomes for<br />
students.<br />
The paper makes proposals about how a socio-cultural understanding of formative<br />
assessment can help teachers improve their practice in positive ways, with specific examples<br />
from recent activities and discussions with teachers in further and adult education.<br />
Invited Symposium: Socio-cultural perspectives / Paper 3:<br />
Formative assessment: the discursive construction of identities<br />
John Pryor, University of Sussex, United Kingdom<br />
Barbara Crossouard, University of Sussex, United Kingdom<br />
This paper relates to recent work in higher education with professional doctorate students. It<br />
builds on empirical research conducted in a number of educational contexts over the past<br />
14 years (e.g. Torrance and Pryor 1998, 2001; Pryor and Crossouard 2008; Crossouard<br />
2008). In these studies the crucial importance of issues of student and teacher identity to<br />
learning cultures and therefore to the nature and consequences of formative assessment<br />
has emerged as an increasingly important theme.<br />
The analysis draws on social theory whereby identity is not seen in terms of the<br />
individualized psychological self but more in terms of identity embedded in social processes<br />
and practices (see Hey, 2006). This is related to a sociocultural perspective on learning as<br />
happening through dialogic processes of identity construction and performance, so that it<br />
involves ‘becoming a different person [where] identity, knowing and social membership<br />
entail one another’ (Lave & Wenger, 1991, p.53). The data on doctoral students were<br />
derived from observation of formative assessment, discourse analysis of online texts,<br />
including peer discussion forum interactions and tutor email feedback, and exploration of<br />
student perceptions through in-depth interviews. This yielded a close focus on the<br />
processes of formative assessment, supplemented by insider perspectives arising from the<br />
fact that the researchers were, respectively, the main tutor on the part of the<br />
doctoral programme under study and a doctoral candidate. Thus the project was able to<br />
include an element of action research to develop and evaluate different aspects in more<br />
detail and explicitly ground the findings in practice.<br />
Our conclusions are that issues of identity, power and culture are part of the complexity of<br />
learning. These issues may act as barriers, but formative assessment offers opportunities<br />
for what might be described as an explicit meta-discourse which may also enhance<br />
learning. Thus power differentials emerge as potentially productive when different identity<br />
positions – assessor, teacher, practitioner, learner, disciplinary expert, critic – are<br />
deliberately invoked by the tutor. Similarly student identities, both as students and in relation<br />
to their past and future lives, can be deliberately invoked. Within this play of identities the<br />
disciplinary norms against which students’ performances are judged (the rules of the game)<br />
may be highlighted. Engagement with subject matter alongside identity thus has special<br />
potential for formative assessment as a means of promoting equity in education.<br />
This work establishes formative assessment at the heart of higher education practice; a<br />
key implication is that its potency should not be underestimated. Despite its complexity we do<br />
suggest ways in which the play of identities can be incorporated into the practice of teaching<br />
and learning.<br />
Invited Symposium: C-tests<br />
Measuring language skills by means of C-tests:<br />
Methodological challenges and psychometric properties<br />
Organisers: Alexander Robitzsch, Olaf Köller, IQB, Humboldt University Berlin, Germany<br />
Chair: Olaf Köller, IQB, Humboldt University Berlin, Germany<br />
C-tests are widely used to assess overall language skills, both in foreign languages and in<br />
the mother tongue. These tests usually consist of texts interrupted by gaps that have to be<br />
filled in by examinees. The individual gaps within each text are typically the smallest unit of<br />
analysis, which can be treated as individual test items. The analytical focus, however, of the<br />
C-test assessment is usually not the individual item, but the overall level of achievement<br />
(i.e., the number of closed gaps). While there is broad consensus that these measures are<br />
quite reliable and valid, there are several methodological challenges associated with these<br />
measures. In particular, when IRT models are applied to these tests, different strategies can<br />
be used. Some authors (e.g., Eckes, 2007) recommend the application of polytomous IRT<br />
models, in which all gaps of one text form one rating scale ranging from zero up to<br />
the number of gaps. Other authors, however, prefer analyzing each gap as a single item<br />
and then applying Rasch testlet models, in which dependencies among items are explicitly<br />
modelled. All three papers in the proposed symposium focus on this issue.<br />
The first paper, provided by Eckes, clearly argues for polytomous IRT models when analyzing C-tests.<br />
The appropriateness of this approach is demonstrated with C-test results from approximately<br />
5,000 examinees from 116 countries, all of whom worked on C-tests measuring<br />
German as a foreign language.<br />
The authors of the second paper (Hartig & Harsch) argue that a more adequate strategy<br />
might be to apply testlet models to C-tests, an approach that reveals more about specific<br />
dependencies among gaps. This approach is illustrated by means of data from the German<br />
DESI large-scale study, in which the foreign language skills of about 10,000 9th graders were<br />
assessed with C-tests.<br />
The third paper, by Robitzsch, Karius, and Neumann, offers an extension of the<br />
Hartig and Harsch approach. In their study, which was part of the German national<br />
assessment program, the authors propose a more detailed testlet model which models<br />
dependency of the gaps hierarchically, i.e., items are nested within sentences and<br />
sentences are nested within C-tests. Furthermore, the authors analyze relationships of C-tests<br />
with tests of other language skills. In summary, the papers widen our understanding<br />
both of how to model responses in C-tests and of what C-tests typically measure.<br />
Invited Symposium: C-tests / Paper 1:<br />
Constructing a calibrated item bank for C-tests<br />
Thomas Eckes, TestDaF Institute, Germany<br />
C-tests are gap-filling tests that measure general language proficiency. In terms of efficient<br />
construction of C-tests, high-quality test development, and flexible test administration,<br />
including web-based testing, it is imperative to make use of a calibrated item bank, that is,<br />
an item bank in which parameter estimates for all items in the bank have been placed on<br />
the same difficulty scale. When constructing a calibrated item bank for C-tests, two major<br />
issues arise: (a) choosing an IRT model for item calibration and linking, and (b) choosing a<br />
design for collection of item-banking data.<br />
Regarding the first issue, it is important to realize that gaps within a given text are locally<br />
dependent to a significant degree. As a consequence, texts should not be analyzed on the<br />
level of individual gaps, but should rather be construed as super-items (item bundles,<br />
testlets), with item values corresponding to the number of gaps within a given text; that is,<br />
each text should be viewed as a polytomous item. Accordingly, Rasch models such as<br />
Andrich’s rating scale model or Masters’ partial credit model would seem appropriate.<br />
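The polytomous treatment described above can be made concrete with a short sketch. The following illustration (not the onDaF implementation) computes the score probabilities for one text under Masters’ partial credit model; Andrich’s rating scale model is the special case in which all texts share a common step structure.

```python
import math

def pcm_probs(theta, deltas):
    """Category probabilities for one polytomous 'super-item' (text)
    under Masters' partial credit model.

    theta  -- examinee ability
    deltas -- step difficulties delta_1..delta_m for a text with m gaps;
              scores 0..m correspond to the number of gaps solved.
    """
    # Cumulative sums of (theta - delta_j) give the log-numerators.
    exponents = [0.0]  # score 0
    s = 0.0
    for d in deltas:
        s += theta - d
        exponents.append(s)
    m = max(exponents)                      # guard against overflow
    unnorm = [math.exp(e - m) for e in exponents]
    total = sum(unnorm)
    return [u / total for u in unnorm]

probs = pcm_probs(theta=0.5, deltas=[-1.0, 0.0, 1.2])
print([round(p, 3) for p in probs])  # probabilities for scores 0..3
```

Under this parameterisation, raising `theta` shifts probability mass toward higher scores, i.e., toward more gaps solved within the text.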
With respect to the data collection issue, one widely used design is the common-item<br />
nonequivalent groups (CING) design. In this design, various test forms are linked through a<br />
set of common items. The groups are not considered to be equivalent. Alternatively, the<br />
randomly-equivalent groups design could be employed. Examinees are randomly assigned<br />
the form to be administered; linkage between the forms is achieved by assuming that the<br />
different groups of examinees taking different forms are equivalent in ability.<br />
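As a toy illustration of how common items tie two calibrations together under a CING design, a mean/mean Rasch linking can be sketched as follows (the numbers and function names are illustrative; the study itself uses a concurrent estimation procedure rather than separate linking):

```python
def mean_mean_link(b_common_new, b_common_ref):
    """Mean/mean linking constant for Rasch difficulties under a
    common-item nonequivalent groups (CING) design: the new form's
    scale is shifted so that the mean difficulty of the common items
    matches the reference calibration."""
    assert len(b_common_new) == len(b_common_ref)
    return (sum(b_common_ref) - sum(b_common_new)) / len(b_common_new)

def rescale(b_new_form, shift):
    """Place all difficulties of the new form on the reference scale."""
    return [b + shift for b in b_new_form]

shift = mean_mean_link([0.2, -0.5, 1.1], [0.5, -0.2, 1.4])
print(shift)  # shift is approximately 0.3
```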
In the present paper, I report on an ongoing study aiming at the construction of a large<br />
calibrated item bank for use with an Internet-delivered C-test, the “Online Placement Test of<br />
German as a Foreign Language” (onDaF; www.ondaf.de). Building on research into the<br />
suitability of various polytomous Rasch models to the analysis of C-tests (Eckes, 2007), the<br />
rating scale model was employed for item calibration. Adopting a CING design, item-banking<br />
data were collected in a series of 23 different test sessions, covering a total of<br />
4,842 participants from 116 countries. In each session a set of 10 texts was administered,<br />
two of which were common to all sets. Reliability indices per set ranged from .94 to .98.<br />
Texts showing unsatisfactory model fit or DIF were eliminated. The remaining 174 texts<br />
were put on the same difficulty scale through a concurrent estimation procedure.<br />
Combined with a carefully designed client-server architecture, the Rasch-measurement<br />
approach to item banking currently provides the basis for a highly flexible administration of<br />
the onDaF at licensed test centers throughout the world. When taking the onDaF, each<br />
examinee is presented with a unique set of eight texts; that is, texts are drawn from the item<br />
bank according to a linear-on-the-fly test delivery model. In each instance, test assembly is<br />
subject to the constraints of increasing text difficulty and variation in text topic. Responses<br />
are automatically scored and test results are reported to examinees immediately after<br />
completing the test.<br />
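The assembly constraints mentioned above (no repeated topic, texts presented in order of increasing difficulty) might be sketched, purely illustratively and under assumed field names, as:

```python
import random

def assemble_on_the_fly(bank, n_texts=8, rng=None):
    """Linear-on-the-fly assembly sketch: draw texts from a calibrated
    bank subject to (a) no repeated topic and (b) presentation in
    order of increasing difficulty. `bank` is a list of dicts with
    'id', 'topic', and 'difficulty' keys (illustrative schema)."""
    rng = rng or random.Random()
    pool = bank[:]
    rng.shuffle(pool)
    chosen, topics = [], set()
    for text in pool:
        if text["topic"] not in topics:
            chosen.append(text)
            topics.add(text["topic"])
        if len(chosen) == n_texts:
            break
    return sorted(chosen, key=lambda t: t["difficulty"])
```

Each examinee thus receives a unique, randomly drawn set of texts while the ordering constraint is enforced deterministically at the end.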
Invited Symposium: C-tests / Paper 2:<br />
Gaining substantive information from local dependencies between C-test items<br />
Johannes Hartig, German Institute for International Educational Research (DIPF), Germany<br />
Claudia Harsch, IQB, Humboldt University Berlin, Germany<br />
The C-test is used as a screening tool to assess global performance levels in written<br />
language competence. The individual gaps within each text of the C-test are the smallest<br />
unit of analysis, which can be treated as individual test items. Typically, however, the focus<br />
of the C-test assessment is not the individual item, but the overall level of achievement (i.e.,<br />
the number of closed gaps). A technical argument against an analysis on the item level is<br />
that the solutions of individual gaps partially depend on the solutions of the remaining gaps<br />
within the same text. This means if individual gaps are treated as items, local independence<br />
of these items, as presumed in most measurement models, is a rather unlikely assumption.<br />
For this reason, many authors prefer to analyze performance in C-tests on the text level and<br />
not on the level of individual gaps, e.g. by treating each text as a separate “super-item”. In<br />
contrast to this approach, this paper will focus on the substantive information that can be<br />
gained about the C-test and the underlying language competencies if performance is<br />
analyzed on the item level. We use the dependencies on text level and between individual<br />
gaps to derive information about characteristics of texts and gaps that determine students’<br />
solution processes. The aim of the study is to predict these dependencies by using a priori<br />
defined text and item characteristics.<br />
Statistical and graphical methods to examine item dependencies on text and item level are<br />
presented. Dependencies on text level can be estimated within a Rasch testlet model,<br />
assuming additional latent dimensions for each text, over and above the common<br />
underlying ability dimension. Dependencies between individual gaps can be graphically<br />
analyzed based on the correlations between residuals from Rasch analyses within each<br />
text. These methods are applied to data from a large scale assessment of English language<br />
competencies of German 9th graders. Statistics for local dependencies are estimated on<br />
text and on item level. Results of the Rasch testlet model show substantial amounts of text-specific<br />
variance, indicating general dependencies between gaps within the same text. The<br />
analysis of residuals yields strong dependencies between a few specific item pairs, while in<br />
some texts almost no marked dependencies are found on item level. Dependencies on text<br />
level as well as on item level can partly be explained by text and item characteristics. For<br />
instance, the deletion frequency of gaps seems to affect dependencies on text level, and<br />
dependencies between items can be found for gaps within the same phrases. The results<br />
widen our understanding of the C-test construct; we discuss whether this knowledge can be<br />
used to systematically construct C-tests with specific properties.<br />
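The residual-based part of the analysis described above can be sketched minimally as follows (an illustration of the idea, not the study’s actual estimation code): given previously estimated abilities and gap difficulties, standardized Rasch residuals are computed and their pairwise correlations flag locally dependent gaps.

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def residual_correlations(X, thetas, bs):
    """Pairwise correlations of standardized Rasch residuals for gaps
    within one text. X: 0/1 responses (persons x gaps); thetas, bs:
    previously estimated abilities and gap difficulties. Large
    positive correlations flag locally dependent gap pairs."""
    n, k = len(X), len(bs)
    # standardized residuals z = (x - p) / sqrt(p * (1 - p))
    Z = []
    for i in range(n):
        row = []
        for j in range(k):
            p = rasch_p(thetas[i], bs[j])
            row.append((X[i][j] - p) / math.sqrt(p * (1 - p)))
        Z.append(row)
    def corr(a, b):
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
        va = sum((x - ma) ** 2 for x in a) / n
        vb = sum((y - mb) ** 2 for y in b) / n
        return cov / math.sqrt(va * vb)
    cols = list(zip(*Z))
    return {(j, l): corr(cols[j], cols[l])
            for j in range(k) for l in range(j + 1, k)}
```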
Invited Symposium: C-tests / Paper 3:<br />
C-tests for German Students:<br />
Dimensionality, Validity and Psychometric Perspectives<br />
Alexander Robitzsch, IQB, Humboldt University Berlin, Germany<br />
Ina Karius, IQB, Humboldt University Berlin, Germany<br />
Daniela Neumann, IQB, Humboldt University Berlin, Germany<br />
C-tests are integrative tests which are designed to test a person’s command of language by<br />
making use of the principle of reduced redundancy. It is assumed that language is<br />
redundant in a way that allows successful communication even though flaws in the<br />
transmission of language (lack of clarity, ambiguity, noise) may impede understanding. The<br />
addressee of a message is able to reconstruct the form and meaning of morphologically<br />
incomplete words supported by the local and global context of the message, provided that<br />
he or she is familiar with the vocabulary, the grammatical rules and the cultural background<br />
of the language used.<br />
In Germany, the educational standards for the subject “German” (mother tongue) are<br />
supposed to ensure that every student is able to fully participate in written and spoken<br />
interaction. In the course of measuring the educational standards, items were<br />
developed to assess the students’ competences in this area. A total of 1700 students of all<br />
secondary school types ranging from grade level eight to ten (14-17 years old) were tested<br />
in reading and listening comprehension, writing, orthography and language use. The<br />
assessment part on language use contained, among other items, C-tests. Altogether ten<br />
different C-tests were used in a sample of 560 students. Every student filled in four C-tests<br />
which resulted in a complete balanced multi-matrix sampling design.<br />
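One simple way to build such a balanced multi-matrix design is a cyclic booklet scheme, sketched below purely for illustration (the study’s actual booklet design is not specified in the abstract): each booklet contains a consecutive block of tests, so every test appears in the same number of booklets.

```python
def cyclic_booklets(n_tests=10, per_booklet=4):
    """Cyclic multi-matrix booklet design sketch: booklet b contains
    tests b, b+1, ..., b+per_booklet-1 (mod n_tests), so every test
    appears in exactly per_booklet booklets. Note this balances how
    often each test occurs, not all pairwise co-occurrences."""
    return [[(b + k) % n_tests for k in range(per_booklet)]
            for b in range(n_tests)]

booklets = cyclic_booklets()  # ten booklets of four C-tests each
```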
In many cases the dimensionality of C-tests is assessed by regarding each C-test as one<br />
super-item comprising all of its blanks. We focus on a dimensional analysis on<br />
the item level and use NOHARM and DIMTEST to assess the essential dimensionality of the C-test<br />
construct. Because the Rasch model’s assumption of local stochastic independence is violated, we propose<br />
a more detailed testlet model which models dependency hierarchically, i.e., items are nested<br />
within sentences and sentences are nested within C-tests.<br />
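The hierarchical dependency structure just described can be made concrete with a small generative sketch (an illustration of the model’s logic, not the authors’ estimation code): the logit of a correct gap response combines the person’s ability with person-specific text and sentence effects.

```python
import math
import random

def simulate_hierarchical_testlet(n_persons, texts,
                                  sd_text=0.5, sd_sentence=0.3, seed=1):
    """Simulate 0/1 gap responses under a hierarchical Rasch testlet
    model: logit P = theta_i + u_text + v_sentence - b_gap, with gaps
    nested in sentences and sentences nested in texts.
    `texts` maps text id -> {sentence id -> [gap difficulties]}."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0, 1)
        row = {}
        for t, sentences in texts.items():
            u = rng.gauss(0, sd_text)          # person-by-text effect
            for s, gaps in sentences.items():
                v = rng.gauss(0, sd_sentence)  # person-by-sentence effect
                for g, b in enumerate(gaps):
                    p = 1 / (1 + math.exp(-(theta + u + v - b)))
                    row[(t, s, g)] = int(rng.random() < p)
        data.append(row)
    return data
```

Setting `sd_text` and `sd_sentence` to zero recovers the ordinary Rasch model, which is why the extra variance components capture exactly the within-text and within-sentence dependencies.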
This paper estimates this multilevel item response model on the item level. In addition,<br />
variance due to local stochastic dependency is explained by linguistic characteristics of<br />
the material and by student properties such as grade level and school track. To provide a<br />
deeper understanding of validity, we study relationships of C-test results with subdomains of<br />
language use, orthography and listening comprehension by using a confirmatory factor<br />
analysis.<br />
This contribution gives insight into the C-test construct for native speakers with regard to<br />
dimensionality; it provides detailed validity evidence and finally proposes an alternative<br />
psychometric scaling model.<br />
Invited Symposium: E-Assessment<br />
Moving forward with e-assessment<br />
Organiser: Denise Whitelock, The Open University, United Kingdom<br />
Discussant: Kari Smith, University of Bergen, Norway<br />
Technology is increasingly being used to support assessment, but its effectiveness in this<br />
area of learning is still open to question. In part, this is because there is an awkward tension<br />
between assessment and constructivist approaches to learning. Even when an entire<br />
learning experience has been designed to be constructivist and learner-centred, formal,<br />
summative assessment sits uneasily with this constructivist pedagogy; it is ‘out there’ and is<br />
not part of the process of constructing knowledge. Constructivism is used as an approach<br />
for getting better performance on conventional measures, rather than as a radical<br />
philosophy about the nature of knowledge and its acquisition. When assessment is<br />
embedded within constructivist pedagogy, learners quickly adopt strategies that optimise<br />
their cognitive load, typically guessing what is expected of them rather than constructing<br />
their own conceptual frameworks.<br />
This symposium scrutinises the laudable aims of harnessing technology enhanced<br />
assessment to help shape learners as independent thinkers, making their own judgments<br />
and decisions about their learning process in partnership with their tutors. Assessment and<br />
learning need to be properly linked. As Elton and Johnston said, “if one changes the method<br />
of teaching, but keeps the assessment unchanged, one is very likely to fail.” And Rowntree:<br />
“if we wish to discover the truth about an educational system, we must look into its<br />
assessment procedures.” Despite decades of innovation in learning theory and technology<br />
and many different approaches to the problem of building conversational relationships in<br />
education, assessment is still the core of the problem.<br />
Assessment systems define the nature of subjects, and what is worth knowing, and act as<br />
gatekeepers to progress in education and careers, synchronising understanding between<br />
an individual and the world they live in. This symposium will discuss how e-assessment can<br />
overcome the barriers imposed upon both students and tutors in developing ICT-based<br />
systems that are tuned to the current net generation of learners.<br />
Invited Symposium: E-Assessment / Paper 1:<br />
The Future of E-Assessment: E-Assessment as a Dialog<br />
Cornelia Ruedel, University of Zurich, Switzerland<br />
E-Assessment has become more and more popular over the last decade. Although its impact<br />
in the German-speaking countries has been limited compared to the advances in the USA<br />
and the UK, universities in Switzerland, Germany and Austria have now realised<br />
e-Assessment’s potential. The exam system in German-speaking countries is<br />
traditionally dominated by a combination of exams, assignments and oral<br />
presentations. The Bologna Reform, with its modularisation of university courses, has<br />
put a burden upon the exam system and has prompted a rethinking of assessment<br />
methods. These new assessment forms should take all the criticism of the traditional<br />
system into account in terms of validity, reliability and fairness. The usual practice of<br />
marking in the German-speaking countries is not really transparent. There is no tradition of<br />
having an external examiner for written exams, only the lecturer is responsible for the<br />
marking process. Therefore students sometimes have an uneasy feeling about the<br />
whole system.<br />
Generally, the students’ view of assessment is quite different from the lecturers’ view.<br />
Assessment is at the heart of student learning, but it is not at the heart of teaching.<br />
Assessment should become a classroom element which motivates, encourages and<br />
stimulates student learning. E-Assessment offers all of this in a variety of maintainable<br />
solutions, such as self-tests, e-portfolios and peer assessment.<br />
This paper will discuss future possibilities for E-Assessment, which should take a more<br />
holistic approach, mixing learning, teaching and assessing. Assessment should shift away<br />
from the usual ‘snapshot assessing’ towards assessment over a period of time, to avoid<br />
students’ exam anxiety and the occasional blackout or blip. This approach of continuous<br />
assessment is only feasible with electronic delivery and the use of a Virtual Learning<br />
Space. This space should enable the students to be in control of their own learning and<br />
even their private notes and reports. The Virtual Learning Environments are too static at the<br />
moment and do not offer the freedom students require to network with their peers.<br />
Furthermore, the learning space should allow the flexibility that the students can decide<br />
when and where they are ready to take the test, which would lead to a self-organised<br />
assessment.<br />
Students could take the formative assessments as many times as they want, their progress<br />
would be recorded and would contribute to the final mark. In-depth and targeted feedback<br />
could guide the students so that they learn from their own mistakes and misconceptions,<br />
which would help them to develop their reflective thinking skills. Here, E-Assessment would<br />
play the major role because it is possible to assess softer skills too, since researching,<br />
validating data from different sources and working in a team are becoming more important<br />
in an increasingly open job market. New forms of collaborative assessment techniques will be<br />
established where wikis and blogs are only the beginning. E-Assessment will be the new<br />
approach where the students’ expectations meet the teachers’ requirements.<br />
Invited Symposium: E-Assessment / Paper 2:<br />
Alcohol and a Mash-up: Understanding Student Understanding<br />
Jim Ridgway, University of Durham, United Kingdom<br />
Sean McCusker, University of Durham, United Kingdom<br />
James Nicholson, University of Durham, United Kingdom<br />
Informed citizenship depends on the ability of citizens to understand and reason from evidence. In<br />
the UK at least, school statistics focuses on the mastery of technique, rather than on interpretation<br />
of results (Ridgway, McCusker & Nicholson, 2007). The techniques themselves focus on the<br />
analysis of univariate and bivariate data. As a consequence, school statistics is largely useless in<br />
dealing with any data sets students might encounter in their lives outside school.<br />
There is a large literature on the problems that students and adults have with simple concepts,<br />
such as interpreting static 2D graphs, and tabular information (e.g. Batanero, et al., 1994). One<br />
might predict that working with multivariate data would be impossible for people with no statistical<br />
training. However, empirical explorations (e.g. Ridgway, McCusker & Nicholson, 2006) show that<br />
computer-based three-variable tasks are no more difficult for 12-14 year olds than are 2D<br />
paper-based tasks. Du Feu (2005) has shown that much younger children can work meaningfully with<br />
multivariate data displays that they have created in the form of tactile graphs built from LEGO®.<br />
The SMART Centre has designed a number of software ‘shells’ in Macromedia Flash® that run<br />
on web browsers, and that facilitate the display of MV data (http://www.dur.ac.uk/smart.centre/). A<br />
variety of displays is available, allowing up to six variables to be displayed under user control. An<br />
earlier study (Ridgway, Nicholson, and McCusker, 2008) reported a study based in 13 classes of<br />
pupils aged 12-14 years, covering the range of abilities typical in their school. Resources were<br />
created on topics that included alcohol use, drug use, and sexually transmitted infections, using<br />
data from large scale surveys, together with curriculum materials designed to provoke<br />
understanding of MV data. Classroom observations showed that young pupils across the<br />
attainment range can engage with and understand complex messages in MV data.<br />
The study to be reported here presents students with a mashup comprising recent survey data on<br />
alcohol use presented in an interactive display, and links to recent newspaper articles on alcohol<br />
consumption by young people (e.g. “Young girls drink nearly twice as much alcohol as they did<br />
7 years ago” Daily Mail). Students are asked to critique the articles in the light of the data. We<br />
believe that the ability to read critically in the light of evidence is a core literacy, and a fundamental<br />
requirement for informed citizenship. We will report the findings from this study in detail, along<br />
with a list of core heuristics that are essential when exploring MV data. We will also present<br />
examples from student work that illustrate key aspects of statistical literacy, and questions that<br />
are useful to diagnose student conceptions and misconceptions.<br />
References<br />
Batanero, C., Godino, J. D., Vallecillos, A., Green, D., & Holmes, P. (1994). Errors and difficulties in<br />
understanding elementary statistical concepts. International Journal of Mathematical Education in<br />
Science and Technology, 25(4), 527-547.<br />
du Feu, C. (2005). Bluebells and bias, stitchwort and statistics. Teaching Statistics, 27(2), 34-36.<br />
Ridgway, J., Nicholson, J. R., & McCusker, S. (2007). Teaching statistics despite its applications. Teaching<br />
Statistics, 29(2), 44-48.<br />
Ridgway, J., Nicholson, J., & McCusker, S. (2008, in press). Reconceptualising ‘Statistics’ and<br />
‘Education’. In C. Batanero (Ed.), Statistics Education in School Mathematics: Challenges for<br />
Teaching and Teacher Education. Springer.<br />
Invited Symposium: E-Assessment / Paper 3:<br />
E-assessment for learning?<br />
The potential of short free-text questions with tailored feedback<br />
Sally Jordan, The Open University, United Kingdom<br />
A number of literature reviews have identified conditions under which assessment supports student<br />
learning (e.g. Gibbs and Simpson, 2004). Two common themes are assessment’s ability to<br />
motivate and engage students, and the role of feedback. However, if feedback is to be effective, it<br />
must be more than a transmission of information from teacher to learner. The student must<br />
understand the feedback sufficiently well to be able to learn from it i.e. to ‘close the gap’ between<br />
their current level of understanding and the level expected by the teacher (Ramaprasad, 1983).<br />
The work described is one of a number of projects in an ‘E-assessment for learning’ initiative at<br />
the Centre for the Open Learning of Mathematics, Science, Computing and Technology<br />
(COLMSCT) at the UK Open University. Most of the projects make use of the OpenMark<br />
e-assessment system, which offers students multiple attempts at each question, with the amount of<br />
feedback provided increasing at each attempt. The provision of multiple attempts with increasing<br />
feedback is designed to give the student an opportunity to act on the feedback to correct his or<br />
her work immediately, and the tailored feedback is designed to simulate a ‘tutor at the student’s<br />
elbow’ (Ross et al., 2006).<br />
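The multiple-attempt scheme with increasing feedback can be sketched as follows. This is our illustration only, not the actual OpenMark implementation, and the toy keyword check merely stands in for the real natural-language answer matching:

```python
# Hypothetical sketch of multiple attempts with increasing feedback
# (our illustration; not the actual OpenMark system).

def attempt_question(check_answer, responses, feedback_levels):
    """Mark successive responses; after each wrong attempt, reveal the next,
    more detailed feedback message. Stops when correct or attempts run out."""
    feedback_given = []
    for attempt, response in enumerate(responses[:len(feedback_levels)], start=1):
        if check_answer(response):
            return attempt, feedback_given   # solved on this attempt
        feedback_given.append(feedback_levels[attempt - 1])
    return None, feedback_given              # attempts exhausted

# usage: a toy free-text check standing in for real answer matching
is_correct = lambda r: "gravity" in r.lower()
result = attempt_question(
    is_correct,
    ["because of magnetism", "the force of gravity"],
    ["Think about forces.", "Which force acts downwards?", "It involves gravity."])
```

Here the student answers correctly on the second attempt after seeing only the first, least detailed hint, mirroring the design goal of letting learners act on feedback immediately.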
The current project has extended the range of e-assessment questions offered to students via<br />
OpenMark to include those requiring free-text answers of up to around a sentence in length. The<br />
answer matching is written with an authoring tool provided by Intelligent Assessment<br />
Technologies Ltd. (Mitchell et al., 2002) which uses the natural language processing technique of<br />
information extraction and incorporates a number of processing modules aimed at providing<br />
accurate marking without undue penalty for poor spelling and grammar. A significant feature of<br />
the project has been the use of student responses to developmental versions of the questions,<br />
themselves delivered online, to improve the answer matching.<br />
Evaluation has included an investigation into student reaction to questions of this type and their use<br />
of the feedback provided. A human-computer marking comparison has shown the computer’s<br />
marking to be indistinguishable from, or more accurate than, that of six course tutors. Reasons for this will<br />
be discussed. The two facets of the evaluation are linked; if students are to engage with the<br />
questions and to learn from the feedback provided, the marking must be accurate. Also, although<br />
most students like the questions and are impressed by the sophisticated answer matching, others<br />
appear to find multiple-choice questions less demanding and to be more trusting of human markers.<br />
The purpose of these interactive computer marked assignments (iCMAs) is to provide students<br />
with instantaneous feedback, pacing, and an opportunity to monitor their own progress and to<br />
discuss this with their tutor if appropriate. The iCMAs are complemented by tutor marked<br />
assignments.<br />
References<br />
Gibbs, G. and Simpson, C. (2004) Conditions under which assessment supports students’ learning.<br />
Learning and Teaching in Higher Education, 1:3-31.<br />
Mitchell, T., Russell, T., Broomhead, P. and Aldridge, N. (2002) Towards robust computerised marking of<br />
free-text responses. 6th International CAA Conference, Loughborough, UK.<br />
http://www.caaconference.com/pastconferences/2002/proceedings/Mitchell_t1.pdf<br />
Ramaprasad, A. (1983) On the definition of feedback, Behavioral Science, 28:4-13.<br />
Ross, S.M., Jordan, S.E. and Butcher, P.G. (2006) Online instantaneous and targeted feedback for remote<br />
learners. In C. Bryan and K.V. Clegg (eds) Innovative Assessment in Higher Education. London:<br />
Routledge: 123-131.<br />
Symposia<br />
Symposium: Portfolios in Higher Education<br />
Portfolios in Higher Education in three European countries –<br />
Variations in Conceptions, Purposes and Practices<br />
Organiser: Olga Dysthe, University of Bergen, Norway<br />
Chair: Nicola Reimann, Northumbria University, United Kingdom<br />
Discussant: Anton Havnes, University of Bergen, Norway<br />
Portfolio assessment has been introduced in most countries both as an alternative<br />
assessment tool and as a tool for learning. The term ‘portfolio’, however, is used in very<br />
many different ways. This variation in portfolio conception is often presented as an<br />
advantage, in the sense that the portfolio is a very versatile tool that can be adapted to fit a wide<br />
array of purposes and contexts. But it does create confusion for students who may<br />
encounter very different practices under the same name. In a Europe with extended<br />
educational mobility, we think it is timely to investigate and discuss whether there are any<br />
patterns of use that can be distinguished between countries (or: as characteristics in each<br />
country), and/or whether differences follow disciplinary and professional lines and thus cut<br />
across borders. Specific questions that will be raised in the discussion: Is there a need for a<br />
more unified understanding or definition of portfolio, and if so, is it possible? Is there a need<br />
for a clarificatory framework? The ‘collection-selection-reflection’ framework has been<br />
widely used, but is it useful in all contexts? What is useful for students, and what is needed<br />
in European or wider international fora where portfolios are discussed and researched?<br />
In this symposium we will present research from three countries that give some indication of<br />
how portfolios are used in higher education in Belgium, Norway and England, even though<br />
we fully realise that the picture in each country is even more varied than we are able to<br />
show.<br />
In England portfolios have developed from several directions, broadly summarised as<br />
learner-specific and subject-specific. This presentation, based on a study in 2007, will focus<br />
on e-portfolios growing out of the introduction of Personal Development Planning, which<br />
can be seen as a formative assessment approach.<br />
In Norway portfolios were introduced as an alternative form of assessment in connection<br />
with a major reform of higher education implemented from 2002. This led to a proliferation<br />
of what can be called “disciplinary content portfolios”. A national survey of portfolio practices<br />
is presented, and differences in understanding, content, use and assessment of portfolios<br />
between disciplines are discussed.<br />
A research study from the Flemish part of Belgium highlights a lot of differences between<br />
portfolio applications in different higher education courses. A framework was developed to<br />
compare portfolios from eight different courses in colleges and at universities.<br />
Interpretations of the portfolio concept are divergent and a portfolio standard is absent.<br />
All the contributions are based on research studies but the implications and the questions<br />
raised are practice-related. The presenters represent three European countries, UK,<br />
Norway and Belgium.<br />
Symposium: Portfolios in Higher Education / Paper 1:<br />
Learning opportunities through the processes of e-portfolio development<br />
Elizabeth Hartnell-Young, The University of Nottingham, United Kingdom<br />
In Higher Education in the UK, the use of e-portfolios has developed from several directions,<br />
which can be broadly summarised as learner-specific and subject-specific. The first is often<br />
known as Personal Development Planning, which can be seen as a formative assessment<br />
approach, while the other is in high stakes summative assessment, in discrete subject areas,<br />
and increasingly in competency-based contexts in fields such as medicine. Consequently the<br />
form and content of the e-portfolios, and the issues that arise in their use, differ.<br />
England’s e-Strategy intends that learners will have ‘a digital space that is personalised,<br />
that remembers what the learner is interested in and suggests relevant web sites, or alerts<br />
them to courses and learning opportunities that fit their needs’. As well as using such<br />
spaces in schools, colleges and universities, the intention is to enable the development of<br />
‘electronic portfolios that learners can carry on using throughout life.’<br />
This paper is based on a study conducted by the author in the UK in 2007 which considered<br />
the uses of e-portfolios in school, further education colleges, universities and the National<br />
Health Service. It concluded that e-portfolio systems include repositories and a range of<br />
tools for storing and organising material for planning, reflecting, and giving and receiving<br />
feedback. The processes undertaken are opportunities for learning, while the collections in<br />
repositories build up over time, allowing selections to be offered to various audiences and<br />
assessors. Thus they can support both formative assessment (ongoing assessment for<br />
learning), and allow relevant selections to be presented for summative assessment<br />
(assessment of learning). At present, however, users have little sense of the concept of a<br />
lifelong e-portfolio.<br />
E-portfolio development can commence from one of many starting points such as online<br />
reflective practice, or planning or capturing evidence. These processes form part of an<br />
‘e-portfolio culture’: a way of thinking about personal and collaborative learning over a longer<br />
period of time than a specific course. Fragments of life experience from individual subjects<br />
and artefacts must be made more coherent as expressions of identity. However, because<br />
almost all e-portfolio products are licensed by institutions rather than individuals, they are<br />
neither portable nor interoperable, thus creating potential roadblocks on the lifelong journey.<br />
Assessment is the formal means by which e-portfolios or their disaggregated contents are<br />
judged, whether by self assessment, peer assessment, tutor assessment, or university<br />
admissions officers. There are also other contexts, such as employment applications, in<br />
which audiences must recognise, acknowledge and value material in new forms. Rowntree<br />
(1977) suggested five dimensions of assessment: Why assess? What to assess? How to<br />
assess? How to interpret? and How to respond? The last two questions are particularly<br />
apposite in light of increasing use of digital images, animations and so on, and with<br />
increasing attention being paid to e-assessment as a means of judging the outcomes of an<br />
individual’s learning experiences.<br />
References<br />
Rowntree, D. (1977). Assessing Students: How shall we know them? London: Harper.<br />
Symposium: Portfolios in Higher Education / Paper 2:<br />
The Disciplinary Content Portfolio in Norwegian Higher Education – How and Why?<br />
Olga Dysthe, University of Bergen, Norway<br />
Knut Steinar Engelsen, Stord/Haugesund University College, Norway<br />
Considerable changes in assessment have taken place in Norway after 2002, in the wake of<br />
a major reform of higher education inspired by the Bologna Declaration. While ‘portfolio’ was<br />
an unknown concept for most teachers and students in higher education in Norway five years<br />
ago, an evaluation report about the reform documented that considerable changes had taken<br />
place and that portfolio assessment was now used in all types of educational institutions and<br />
across disciplines (Dysthe et al., 2006). The empirical basis for this paper is a nationwide<br />
survey study of portfolio practices conducted in 2006, supplemented by case studies.<br />
The aim of the research study was to get an overview of portfolio practices in Norway. We will<br />
mention some of the findings and describe characteristic aspects of ‘the disciplinary content<br />
portfolio’ and systematic differences between different types of educational institutions as well<br />
as between disciplines within each institution. The main question we raise is what<br />
conceptions of portfolios and what practices are considered useful and under what conditions.<br />
Our survey, based on a randomized selection from all public universities and university<br />
colleges in Norway, was conducted in the spring of 2006. The purpose was to map<br />
assessment practices across different institutions and disciplines, focusing on issues like<br />
types and number of portfolio assignments, the use of feedback, final assessment formats<br />
and the use of evaluation criteria. Teacher attitudes towards the usefulness of portfolios in<br />
relation to student and teacher workload were also investigated. The informants were<br />
professors and lecturers responsible for portfolio-assessed courses. The<br />
survey data were analysed using standard statistical methods.<br />
We found that portfolio systems varied from advanced reflection-based models which<br />
included multiple text types and flexible feedback-practices to portfolios that consisted of<br />
factual texts with rudimentary feedback procedures and no reflective texts. We found<br />
systematic variations between professional educational institutions and universities, but also<br />
between ‘soft’ and ‘hard’ disciplines within the same institutions. Portfolio practices were<br />
diverse and a common understanding seemed lacking, a finding that may be due to the<br />
early stage of implementation and the complex motivation for initiating change. Feedback<br />
was considered very important, but even when peer feedback was being used, training in<br />
how to give feedback or discussion of quality criteria were not common.<br />
The Norwegian disciplinary content portfolio falls under the category that Hartnell-Young<br />
calls “subject-specific” but tries to combine formative and summative assessment. It is often<br />
digital but differs from the learner-specific e-portfolio described by Hartnell-Young by not<br />
aiming at building up repositories over time or presentation for an out-of-class audience.<br />
We base our discussion on socio-cultural perspectives on learning and focus particularly on<br />
how macro level policy decisions in Norway have affected the use of portfolios for<br />
assessment, and how disciplinary cultures at department level have shaped both the<br />
conceptual understanding and practical use of portfolios at meso level. A question arising<br />
from this is how portfolios in different European countries are influenced by different<br />
sociocultural contexts.<br />
Symposium: Portfolios in Higher Education / Paper 3:<br />
Portfolio diversity in Belgian (Flemish) Higher Education –<br />
A comparative study of eight cases<br />
Wil Meeus, University of Antwerp, Belgium<br />
Peter Van Petegem, University of Antwerp, Belgium<br />
An international literature study on portfolio in higher education led to a timeline<br />
distinguishing between four modes of implementation (Meeus et al., 2006). These range<br />
from the use of portfolio in admissions to higher education, during the higher education<br />
course, on entry into the profession and for ongoing professional development. In this study<br />
we focus on portfolios used during higher education courses.<br />
There are a large number of portfolio applications in use in higher education courses in the<br />
Flemish part of Belgium. Although practitioners and scholars talk about portfolios as a<br />
standard concept, most of the portfolios they refer to seem to be very different. This study<br />
investigates the diversity of portfolio applications used within higher education in Flanders.<br />
The research questions are: In what way do portfolio applications differ within the large area<br />
of higher education courses? Can enough commonality be detected to claim the existence<br />
of a standard portfolio concept?<br />
Eight portfolio applications were randomly selected, all in different higher education courses<br />
in Flanders: elementary teacher education, secondary teacher education, graphic and<br />
digital media, speech therapy, podiatry, nursing, academic teacher education, and physical<br />
education. The first six courses are organized at colleges, the last two at universities. A<br />
comparative framework was developed while gathering information on the portfolio<br />
applications. Source triangulation was used combining document analyses, interviews with<br />
portfolio supervisors, and focus groups with students.<br />
The comparative framework defines sixteen different characteristics within five categories:<br />
phase of implementation, function, ingredients, ICT-format and mode of supervision. All<br />
eight portfolios differ remarkably. No two portfolios have identical characteristics. This leads<br />
to the conclusion that general pronouncements on portfolio in higher education are<br />
problematic given the divergent interpretations of the concept. A clear description of the<br />
portfolio characteristics should be part of all scholarly papers on portfolio if conclusions are<br />
meant to be meaningful.<br />
References<br />
Meeus, W., Van Petegem, P., Van Looy, L. (2006). Portfolio in Higher Education: Time for a Clarificatory<br />
Framework. International Journal of Teaching and Learning in Higher Education, 17(2), 127-135.<br />
Symposium: Group work assessment<br />
Aims, values and ethical considerations in group work assessment<br />
Organiser: Lorraine Foreman-Peck, The University of Northampton, United Kingdom<br />
In this symposium ‘group work’ is understood as assignments carried out by students,<br />
largely independently of the tutor and usually outside normal class contact time. Group work<br />
is used extensively in higher education in the UK, in a variety of ways, from assignments<br />
that are relatively short to those forming the major part of the course. Such assignments are pervasive,<br />
not only because they are seen as providing an educationally valuable learning experience,<br />
but also because they are generally believed to develop skills useful to employers.<br />
The requirement for group work assessment to be fair and transparent is problematic (e.g.<br />
Race, 2001). This is most evident with dysfunctional group dynamics: students may fail<br />
to work optimally together and then undergo a negative and damaging experience. The<br />
literature suggests that these cases occur regularly, but may affect only a minority of<br />
students in any one cohort (e.g. Parsons, 2004), and are dealt with in an ad hoc and opaque<br />
manner.<br />
Proposed solutions to group dysfunction are usually technical, focussing for example on the<br />
validity and reliability of different methods of allocating marks (e.g. Magin, 2001). However,<br />
this approach does not appear to address a host of value questions that commonly arise,<br />
such as: ‘Is it right to allow students to evict underperforming students from their groups?’;<br />
‘Ought equal marks for unequal contributions be given?’; ‘Should a whole group fail as a<br />
consequence of plagiarism committed by one student?’<br />
These and other questions point to the need to conceptualise and contextualise in more<br />
depth the practice of fair group work assessment: the principles and rules by which groups<br />
should abide, the extent and nature of tutor facilitation, and the prevention of dysfunctional<br />
group dynamics. These issues are explored by the symposium participants in the context of<br />
their own practices.<br />
Participants in the symposium belong to a team of tutors from the Universities of<br />
Northampton and Northumberland who have been working together on group work practice<br />
since October 2007. Their approach is action research as a form of ‘practical philosophy’<br />
(Elliott 2007). This involves tutors in identifying and clarifying ethical challenges in their own<br />
teaching and evaluating possible solutions based on defensible educational values. From<br />
these coordinated case studies, grounded insights into group work practice across a range<br />
of disciplines are derived, along with suggested institutional policy recommendations.<br />
Symposium: Group work assessment / Paper 1:<br />
Involuntary Free Riding – how status affects performance in a group project<br />
Julia Vernon, The University of Northampton, United Kingdom<br />
Many studies (Maguire and Edmondson, 2001; Mills, 2003; Gupta, 2004; Greenan et al.,<br />
1997) note the positive effects on learning which groupwork may engender, and others<br />
(Knight, 2004; Hand, 2001) discuss the existence of phenomena such as social loafing<br />
(Latané et al., 1979), free-riding (Albanese & Van Fleet, 1985) and the inequity of workload.<br />
The findings of Whyte (1943), Cottrell (1972), Webb (1992) and Ingleton (1995) show how<br />
the effects of group dynamics can have a positive or negative effect on the performance of<br />
group members.<br />
In this action research case study, we attempt to counter the negative effect that working in<br />
a group may have on the self-esteem and confidence of individuals, when they are teamed<br />
with students considerably more skilled than themselves. It focuses on a prolonged<br />
groupwork project, part of a Level 5 undergraduate course in Business Computing, which<br />
simulates a web development consultancy company, and involves team members taking on<br />
a variety of roles, in which they demonstrate different skills.<br />
As the project proceeds, it is noted that the status of individuals within the groups becomes<br />
polarised. The high status individuals have a strong sense of ownership, and a decreasing<br />
degree of trust in the work of others. Low-status members are inclined to defer to their<br />
team-mates and to draw back from expressing opinion in decision-making situations, or<br />
giving explanations of their own work. It is suggested in this paper that students who have<br />
every intention of contributing fully to the project, nevertheless, through these effects, find<br />
themselves in a position of being involuntary ‘free-riders’.<br />
In this research measures were introduced to support the groups and individuals, and to<br />
counter the negative effects noted. These actions took place around the middle of the<br />
period of the project, when in previous years there has been a lull in group activity, and<br />
problems have arisen. A facilitated session was arranged for the students, to bring issues of<br />
groupwork into the open, and develop strategies to improve group cohesion. Discussions<br />
followed from this and students were counselled individually to trace areas of difficulty. A<br />
formative assessment was introduced in the form of an individual presentation to the group,<br />
where each student explained their role and how they were carrying it out.<br />
Revisiting issues, after students have been able to experience them first-hand, resulted in a<br />
much more thoughtful response than when discussed early in the project. In addition the<br />
requirement to prove individual contribution brought about some task re-negotiation.<br />
Dominant members were seen to rein back in some aspects of the control they had<br />
exercised, while submissive members pushed themselves to take a lead on an important<br />
part of the project. Crucially, awareness of group dynamics had increased and was seen in<br />
less simplistic terms. The measures taken had the effect of alleviating the effects noted,<br />
supporting positive help-giving and knowledge transfer within the group, and allowing<br />
members to contribute more fully to the group task.<br />
Symposium: Group work assessment / Paper 2:<br />
Facilitating Group work: Leading or empowering?<br />
Julie Jones, The University of Northampton, United Kingdom<br />
Andrew Smith, The University of Northampton, United Kingdom<br />
Year two students of the Foundation Degree in Learning and Teaching study a module<br />
relating to special educational needs/inclusion. Assessment is through a collaborative group<br />
project and a personal project diary with a reflective statement. We felt that the assessment<br />
strategy did not sufficiently discriminate between students: virtually all achieved very high<br />
grades. Concern was further prompted by awareness of recent research into issues of the<br />
fairness, justice and reliability of group work (Maguire and Edmondson 2001, Barnfield<br />
2003, Knight 2004, Skinner et al. 2004) and of motivational factors, including the effect of<br />
rewarding the group product or the individual contribution (Chapman 2002) and issues of<br />
inter-relationships in groups (Arango 2007).<br />
Reflective statements and evaluation feedback from the 2006/7 cohort identified concerns relating<br />
to some students acting as ‘passengers’, but being awarded the same high grade for the module<br />
as those members who engaged fully with the work. This is a well-documented problem<br />
identified by others (Ransom 1997, Parsons 2002, Hand 2001, Cheng and Warren 2000).<br />
In addition, the Course Team found that tutor guidance was a complicating factor: it was felt that<br />
it was a major contributor to the high grades awarded. There was a concern that this<br />
facilitation encouraged some students’ lack of engagement by allowing them to be led<br />
rather than, as was intended, empowering them to develop their own projects.<br />
These observations prompted a reformulation of the assessment strategy for the 07-08<br />
cohort. The weightings were altered from 80% to 60% for the group assessed project and<br />
from 20% to 40% for the individual elements. In order to assess the effects of this and to<br />
gain insight into issues such as empowerment, especially those involving tutor facilitation,<br />
data was collected on the following:<br />
How the group:<br />
• formed and decided upon the project focus<br />
• sustained motivation and whether this was linked to a perception that individual<br />
contributions supported the group assessment or individual assessment, or both<br />
• managed inter-personal professional working relationships<br />
• managed equitable sharing of the work-load<br />
• perceived and used the guidance available from the module tutor<br />
This involved:<br />
• analysis of 2007-08 students’ diaries and reflective statements<br />
• interviews with 2007-08 students<br />
• analysis of diaries and reflective statements from the 2006-07 cohort<br />
• interviews with students from the 2006-07 cohort asking them to reflect retrospectively<br />
on their experiences.<br />
• a reflective diary written by the facilitating tutor for 2007-08<br />
From this comparative evaluation we will explore firstly whether the amendments to the assessment<br />
weightings made a difference in students’ perceptions of the fairness of the assessment strategy<br />
and secondly the effect the level and nature of tutor facilitation had on group dynamics, especially in<br />
the areas of communication, task sharing, empowerment and ownership.<br />
It is expected that the research will have implications for tutors’ thinking about assessment<br />
weightings and will throw light on the ethical dilemmas surrounding the issues of the guidance and<br />
facilitation of group work.<br />
Symposium: Group work assessment / Paper 3:<br />
Marginalised students in group work assessment:<br />
ethical issues of group formation and the effective support of such individuals<br />
Antony Mellor, Northumbria University, United Kingdom<br />
Jane Entwistle, Northumbria University, United Kingdom<br />
In this project we focus on our experience of group work assessment over a number of<br />
years on a 20 credit, year-long, option module Soil Degradation and Rehabilitation, which<br />
forms part of the final year of our BSc (Hons) Geography degree programme and has a<br />
cohort of around 30 students each year. The group assessment comprises 40% of the<br />
module marks and includes a group oral presentation and written report. We became<br />
concerned about a number of issues adversely affecting the student learning experience,<br />
such as marginalised individuals, adverse group dynamics and unequal contributions by<br />
individuals within groups (Mills 2003, Hand 2001). Of specific concern, however, were the<br />
ethics of group formation (Chang 1999, Knight 2004). Allowing self-selected groups<br />
inevitably leaves some students marginalised and in a position where they may not only be<br />
disadvantaged materially in terms of marks but also be personally affected in a<br />
negative way. In this paper we explore to what extent it is our duty to address the needs of<br />
these students, as well as ways of maintaining equity and transparency in tutor-led support<br />
across the entire cohort.<br />
Using an action research approach (Carr 2006, Elliott 2007), we implemented four key<br />
interventions:<br />
• To make timetabled sessions available for the groups to meet and also to discuss<br />
progress with the tutor, thus addressing the practical problem of lack of opportunity to<br />
meet and facilitating group interaction early on in the process.<br />
• To allow groups to play to their strengths. We encouraged the students to think about<br />
their strengths in terms of the tasks required as part of this assignment to identify what<br />
their contribution might be and their role within that group.<br />
• To provide formative feedback on drafts of the written report. This enabled us to<br />
encourage and promote the need for a dialogue between group members where a<br />
synthesis of materials was lacking.<br />
• To include an individual critical reflection component as part of the assignment. We<br />
aimed to promote reflection on the learning inherent in the activity regardless of the form<br />
of the experience or the summative mark of the end product.<br />
Data were collected using a written teacher log, the students’ critical reflections, and a<br />
student questionnaire following completion of the project. Of the four interventions noted<br />
above, all had a positive role to play in supporting isolated and marginalised students with<br />
their experience of group-work. The fourth, that of individual critical reflection, was perhaps<br />
the least successful across the cohort as a whole because the students were relatively<br />
inexperienced in this way of thinking and writing, coming largely from a scientific<br />
background. It did however provide a platform for student grievances and issues to be<br />
raised, and facilitated their ability to develop different approaches to solving more abstract<br />
problems. Outcomes from this intervention will also be considered in the planning of this<br />
group assessment in future years.<br />
Symposium: Multidimensional measurement models:<br />
Multidimensional measurement models of students’ competencies<br />
Organiser: Johannes Hartig, German Institute for International Educational Research<br />
(DIPF), Germany<br />
Most measurement models applied in traditional educational assessments implicitly or explicitly<br />
assume that test results can be described in terms of single ability dimensions. That is,<br />
individual performance differences in all assessment tasks are attributed to differences in one<br />
common ability dimension. These unidimensional models are useful in many contexts,<br />
especially if the performance domain of interest is relatively narrow, or if the goal of the<br />
assessment is a mere summative description of student achievement. Large scale<br />
assessments, for instance, deliberately keep the dimensionality of their instruments low, since<br />
their goal is the description of achievement levels of large groups in broad content domains.<br />
However, if performance in a more complex domain of competence is to be assessed, or if the<br />
goal of the assessment is a deeper understanding of the underlying individual differences, the<br />
use of unidimensional measurement models may be unsatisfactory. For instance, performance<br />
in a complex domain of competence may be attributed to multiple, distinguishable abilities and<br />
the goal of an assessment may be to obtain differentiated individual profiles of these abilities.<br />
Or, the assumption that all tasks used in an assessment measure the same single ability<br />
dimension for all students may not be realistic because different students may draw on different<br />
knowledge and strategies to arrive at the same solutions.<br />
If the goal of the assessment is a deeper understanding of observed performance<br />
differences, or if unidimensional models fail to adequately explain test outcomes in complex<br />
tasks, more complex, multidimensional measurement models can be employed as an<br />
alternative to unidimensional models. These models can be used to identify systematic<br />
causes for violations of the unidimensional model, and to test more differentiated theoretical<br />
models of students’ competencies. In the latter case, the analysis of multidimensional<br />
models requires stronger theoretical assumptions than unidimensional models, the<br />
application of more advanced statistical techniques, and typically larger sample sizes. In<br />
exchange, multidimensional measurement models hold considerable promise for the<br />
empirical examination of differentiated models of performance in complex domains and<br />
heterogeneous populations.<br />
The symposium will present different approaches and applications of multidimensional<br />
measurement models. The first paper focuses on methods to systematically identify and<br />
explain violations of the assumption of unidimensional constructs in the domain of<br />
mathematical problem solving. Variables interacting with psychometric properties of single<br />
items or subgroups of items are identified in order to achieve a better understanding of the<br />
assessed competence. The second paper presents an application of cognitive diagnosis<br />
models (CDMs) to a mathematics test for elementary school. These multidimensional latent<br />
class models allow the construction of differentiated models of response processes, taking<br />
into account multiple basic abilities. The third paper focuses on a differentiated diagnosis of<br />
basic abilities in a foreign language assessment. Performance in listening comprehension<br />
items is decomposed into general text comprehension and auditory processing abilities<br />
using a two-dimensional IRT model.<br />
The papers will be discussed with respect to the potential benefit of multidimensional<br />
measurement models in different contexts of application, and the theoretical requirements<br />
of different models.<br />
Symposium: Multidimensional measurement models / Paper 1:<br />
Evaluation of non-unidimensional item contents using diagnostic results from Rasch analysis<br />
Markus Wirtz, Freiburg University of Education, Germany<br />
Timo Leuders, Freiburg University of Education, Germany<br />
Marianne Bayrhuber, Freiburg University of Education, Germany<br />
Regina Bruder, Darmstadt Technical University, Germany<br />
Competence scales, which have been developed and evaluated by means of Rasch analysis,<br />
possess optimal properties if diagnostic results are sought to reflect systematic and reliable<br />
differences between students in competence domains. Such scales allow a strictly<br />
unidimensional assessment of competencies: The response probability for all items is determined<br />
by only one latent dimension and thus person characteristics can be interpreted unambiguously.<br />
Hence, a fair and meaningful comparison of subjects or subgroups is admissible.<br />
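The strict unidimensionality of a Rasch-homogeneous scale can be made concrete: the response<br />
probability depends only on the difference between one person parameter and one item parameter.<br />
A minimal sketch (illustrative only, not taken from the paper):<br />

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model:
    P(X=1 | theta, b) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is the person's ability and b the item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A person whose ability equals the item difficulty succeeds with probability 0.5.
print(rasch_prob(0.5, 0.5))  # 0.5
```

Because only this difference matters, two persons with equal ability have identical success<br />
probabilities on every item, which is what licenses the fair comparisons mentioned above.<br />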
Unidimensionality can be statistically tested because of the local independence of items within<br />
Rasch homogeneous scales: items and persons are calibrated on a common latent trait, and<br />
the position of items and persons on the latent trait (i.e., item difficulties and individual abilities)<br />
must suffice to predict the observed data structures. For summative large scale assessments<br />
such as PISA and TIMSS it is important that items fulfil these criteria. It has been argued, however,<br />
that competence constructs become restricted if items covering more than one ability are<br />
systematically eliminated. This may especially pose a problem if diagnostic results are to be<br />
interpreted and used in a formative manner in classroom contexts. If scales are supposed to<br />
identify didactically relevant information about students’ competencies and individual potentials<br />
for development, multidimensional item contents may be desirable. Such “unscalable” items<br />
may be particularly important to enable teachers to identify processing problems and failures.<br />
In order to enhance the practical benefits of applying the Rasch model in competence<br />
diagnostics, strategies will be presented and discussed that allow violations of the<br />
assumptions of the Rasch model to be analysed systematically. Differential Item Functioning (DIF) and<br />
Mixed-Rasch analysis both provide techniques to identify systematic violations of model<br />
assumptions. Furthermore, person-fit measures can be used to identify covariates that predict<br />
the fit of individual students’ answer profiles to the model, and to identify conspicuous profiles.<br />
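As an illustration of how a person-fit measure flags conspicuous answer profiles, the outfit-style<br />
statistic below averages squared standardized residuals under the Rasch model; the function and<br />
the example values are our own illustrative choices, not the authors’ instrument:<br />

```python
import math

def rasch_prob(theta, b):
    # Rasch model response probability for ability theta and difficulty b.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def outfit(theta, difficulties, responses):
    """Outfit-style person-fit statistic: the mean squared standardized
    residual of one person's responses under the Rasch model. Values far
    above 1 flag answer profiles that fit the model poorly."""
    total = 0.0
    for b, x in zip(difficulties, responses):
        p = rasch_prob(theta, b)
        total += (x - p) ** 2 / (p * (1 - p))
    return total / len(responses)

# A profile that solves the hard items but misses the easy ones misfits:
print(outfit(0.0, [-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1]))
```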
Variables that affect statistical properties of single items or item groups may be identified on<br />
the item (e.g. including or excluding specific tasks in the classroom by different teachers) or<br />
the student level (e.g. processing strategies, preference for different mental<br />
representations). Effects of these variables can provide important information concerning<br />
the structure of the competence to be assessed (e.g. different student types or existence of<br />
typical erroneous conceptions).<br />
Data will be presented from a research project on models for heuristic competencies in<br />
mathematical problem solving. The use of problem solving strategies which demand the<br />
application of different representations (numerical, graphical, symbolic and verbal) and the<br />
systematic change between these representations are assessed by psychometric scales.<br />
The item pool is based on a sound didactical framework. Possible causes of item misfits will<br />
be evaluated in order to enhance knowledge about the corresponding competence domain.<br />
The purpose of the study is twofold: A Rasch-homogeneous assessment of sub-dimensions<br />
of the use of problem solving strategies will be developed, and a sophisticated diagnostic<br />
instrument for the identification of areas for special support needs will be provided. Within<br />
this talk, systematic psychometric strategies are discussed that may allow both of these<br />
goals to be achieved.<br />
Symposium: Multidimensional measurement models / Paper 2:<br />
Modelling multidimensional structure via cognitive diagnosis models: Theoretical<br />
potentials and methodological limitations for practical applications<br />
Olga Kunina, IQB, Humboldt University Berlin, Germany<br />
Oliver Wilhelm, IQB, Humboldt University Berlin, Germany<br />
André A. Rupp, IQB, Humboldt University Berlin, Germany<br />
In large educational studies like PISA, unidimensional probabilistic models of latent<br />
traits are usually used, assuming that the observed test results can be sufficiently explained by a<br />
single latent ability. However, if a deeper understanding of the underlying basic cognitive<br />
skills is intended, this approach has limitations in terms of adequately<br />
mapping the complexity of the abilities addressed. Most constructs assessed in<br />
educational studies (e.g. language comprehension, mathematical performance) supposedly<br />
require different cognitive skills to succeed on an item or in the test. Cognitive diagnosis<br />
models (CDMs) can yield individual profiles of relevant basic skills. Based on the profile<br />
information detailed feedback can be provided and used in teaching classes or formative<br />
interventions.<br />
In methodological terms CDMs are confirmatory multidimensional latent-variable models<br />
suitable for efficiently modelling within-item multidimensionality. They usually contain discrete<br />
latent variables that allow for a multivariate classification of respondents. Importantly, in<br />
prototypical applications of CDMs the definitions of the latent “attributes” or “skills” are based<br />
on a cognitively grounded theory of response processes at a fine grain size.<br />
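To make this concrete, the DINA model is one widely used non-compensatory CDM: an item is<br />
mastered only if every skill it requires is present in the examinee’s profile. The sketch below is<br />
purely illustrative (the skill names and the slip and guessing values are invented, and the paper<br />
compares several compensatory and non-compensatory variants):<br />

```python
def dina_prob(skills, required, slip, guess):
    """DINA model: a non-compensatory cognitive diagnosis model.
    The examinee masters the item only if all required skills are present
    (eta = 1); mastery yields a correct answer with probability 1 - slip,
    non-mastery only with the guessing probability."""
    eta = all(skills[k] for k in required)
    return (1 - slip) if eta else guess

# Hypothetical item requiring addition and multiplication (slip .1, guess .2):
profile = {"addition": 1, "subtraction": 1, "multiplication": 0}
print(dina_prob(profile, ["addition", "multiplication"], 0.1, 0.2))  # 0.2
```

The discrete skill profile is what yields the multivariate classification of respondents described above.<br />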
In this contribution we will first discuss these key features of CDMs. We will then illustrate<br />
how CDMs can be used in large-scale educational assessment by applying a variety of<br />
models to data from a newly developed diagnostic mathematics assessment for elementary<br />
school children (3rd and 4th grade). The mathematics assessment comprises counting and<br />
modelling tasks requiring addition, subtraction, multiplication, and division skills and aims to<br />
provide a differentiated profile of counting and modelling skills in basic arithmetic<br />
operations.<br />
Specifically, we will compare multidimensional profiles of children from selected<br />
compensatory and non-compensatory CDMs. Competing measurement models are<br />
compared in terms of absolute and relative model fit, attribute difficulty distributions, and<br />
latent class membership probabilities. To provide some evidence for methodological<br />
generalizability of the results, we will then compare the discrete profiles from the different<br />
CDMs with continuous multidimensional profiles from item response theory and<br />
confirmatory factor analysis models. In combination, these analyses will provide empirical<br />
insight into the cost-utility trade-offs of CDMs as well as into the conditions under which<br />
their theoretical potential can be realized in large-scale educational assessment practice.<br />
Symposium: Multidimensional measurement models / Paper 3:<br />
Modelling Specific Abilities for Listening Comprehension in a Foreign Language<br />
with a Multidimensional IRT Model<br />
Jana Höhler, German Institute for International Educational Research (DIPF), Germany<br />
Johannes Hartig, German Institute for International Educational Research (DIPF), Germany<br />
Multidimensional Item Response Theory (MIRT) provides an ideal foundation to model<br />
performance in complex domains, simultaneously taking into account multiple basic<br />
abilities. In MIRT models with a complex loading structure, mixtures of different abilities can<br />
be modeled to be necessary for specific items. These models allow investigation of the<br />
relative significance of different ability dimensions for specific items, i.e. what kind of ability<br />
is required to what extent for solving a specific item. Hence, sound theoretical assumptions<br />
about the interaction between the person and the test items and the nature of the relevant<br />
ability dimensions are required. Often these assumptions are hard to test empirically, since<br />
different complex models may be equivalent in terms of model fit. However, assumptions<br />
about the demands of specific test items allow the prediction of which items should be<br />
particularly strongly related to specific ability dimensions. The aim of this paper is to illustrate<br />
how theoretical assumptions about the nature of different ability dimensions represented in<br />
MIRT models can be validated by testing these predictions, i.e. by relating MIRT model<br />
parameters to characteristics of the item content.<br />
The data for our empirical application come from a German large-scale assessment of 9th grade<br />
students’ language competencies. The analyses are based on the data from reading and<br />
listening comprehension tests of English as a foreign language. The listening<br />
comprehension items are very similar to the reading comprehension items, and it is<br />
reasonable to assume that they require similar abilities. Both tests require the decoding and<br />
understanding of English, as well as the processing and integration of the information<br />
retrieved. Both tests require the reading of written text, the multiple-choice items being<br />
presented in written English. Consequently, one latent ability dimension can be assumed to<br />
represent the abilities required for both tests. However, the listening comprehension test<br />
additionally requires the processing and understanding of spoken language. It therefore<br />
appears reasonable to assume a second latent dimension representing the abilities required<br />
exclusively for the listening comprehension items.<br />
A two-dimensional two-parameter (2PL) IRT-Model is applied to the data. The first<br />
dimension represents the abilities common to the reading and listening comprehension<br />
tests (“general text comprehension”), while the second dimension represents the abilities<br />
specific to listening comprehension (“auditory processing”). The focus of our analysis is the<br />
strength of the loadings of the listening comprehension items on the auditory processing<br />
dimension. In order to identify items that draw particularly on this dimension, a priori defined<br />
task characteristics are used to predict the respective items’ loadings. It can be shown that<br />
the loading on the auditory processing dimension is related to specific item characteristics,<br />
e.g. the complexity of the relevant text passage and the speed of speech. The results<br />
provide support for the presumed nature of the “auditory processing” dimension.<br />
Additionally, groups of students differ in their relative strength on both dimensions,<br />
illustrating the benefit of a differentiated analysis of basic ability dimensions in applied<br />
contexts.<br />
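A common way to parameterize such a two-dimensional 2PL model puts a loading-weighted sum of<br />
the two abilities into the logit; the loading and ability values below are invented for illustration and<br />
are not the study’s estimates:<br />

```python
import math

def mirt_2pl_prob(theta, loadings, b):
    """Two-dimensional 2PL IRT model: the log-odds of a correct response
    are a loading-weighted sum of the two abilities minus a difficulty term."""
    logit = sum(a * t for a, t in zip(loadings, theta)) - b
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical listening item loading on general text comprehension (1.0)
# and, more weakly, on auditory processing (0.6):
p = mirt_2pl_prob(theta=[0.5, -0.5], loadings=[1.0, 0.6], b=0.0)
print(round(p, 3))
```

In this parameterization, the size of an item’s second loading quantifies how strongly it draws on<br />
the auditory processing dimension, which is exactly what the reported analysis predicts from item<br />
characteristics.<br />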
Symposium: Computer-based assessment:<br />
Recent Developments in Computer-Based Assessment:<br />
Chances for the measurement of Competence<br />
Organiser: Thomas Martens, German Institute for International Educational Research<br />
(DIPF), Germany<br />
During the last decade Computer-Based Assessment (CBA) has become more and more<br />
popular in the international testing and educational community. The major reason is that<br />
computer-based tests can include elements that cannot be rendered on paper.<br />
For conducting CBAs powerful software systems have become available providing support<br />
to the entire assessment process, i.e., item and test development, test delivery and result<br />
reporting (cf. presentation by Latour, Martin, Plichart, Jadoul, Busana, and Swietlik-Simon).<br />
Moreover, to assess innovative constructs user-friendly tools for authoring complex<br />
interactive stimuli have been developed (cf. presentation by Goldhammer, Martens,<br />
Naumann, Rölke, and Scharaf).<br />
Basically, computer-based tests allow for a greater diversity of test stimuli and test<br />
interaction than Paper-Based Assessments (PBAs). This especially holds true with regard<br />
to the assessment of competencies (cf. presentation by Naumann, Jude, Goldhammer,<br />
Martens, Roelke, and Klieme). With PBAs it is almost impossible to measure competencies<br />
that involve a dynamic situation or settings that are drawn from real life. In contrast, CBA<br />
can integrate multimedia test content and complex interaction modes that simulate real-life<br />
situations, and, thereby, the validity with regard to the measured competencies can be<br />
increased.<br />
Regarding testee-item interaction, CBA enables automatic recording of reactions and<br />
response times, which could not be accomplished with printed material. In combination with<br />
interactive stimuli this set-up allows for the performance-based assessment of, for example,<br />
ICT literacy.<br />
Another test format which can only be administered using computers is computer-adaptive<br />
testing (CAT). Here the difficulty of the items is tailored to the individual competence level of<br />
the test taker, so as to ensure that test takers do not receive items that are clearly too easy<br />
or too difficult for them. This method saves time and allows for a more accurate estimation of<br />
the test taker’s ability.<br />
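The selection step at the heart of CAT can be sketched under the Rasch model, where an item is<br />
most informative when its difficulty matches the current ability estimate (a simplified illustration;<br />
operational CAT systems also re-estimate ability after each response):<br />

```python
import math

def rasch_prob(theta, b):
    # Rasch model response probability for ability theta and difficulty b.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta_hat, item_bank, administered):
    """Adaptive item selection: under the Rasch model an item is most
    informative when its difficulty matches the current ability estimate,
    so pick the unused item whose difficulty is closest to theta_hat."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return min(candidates, key=lambda i: abs(item_bank[i] - theta_hat))

bank = [-2.0, -1.0, 0.0, 1.0, 2.0]   # item difficulties
print(next_item(0.3, bank, {2}))      # selects item 3 (difficulty 1.0)
```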
In an educational context students increasingly use computers to study and to complete<br />
their tasks. The computer may even have become the standard tool for studying and<br />
problem solving, so using PBAs to assess these students might be inappropriate. For<br />
educational monitoring, CBAs and even web-based CBAs have become feasible and the<br />
benefits of applying CBA seem to far outweigh the challenges. Also, for some test persons<br />
CBAs might have positive influences on motivation during the test.<br />
In sum, CBA offers a great potential and is most probably the testing mode for<br />
competencies in the future.<br />
Symposium: Computer-based assessment / Paper 1:<br />
Enlarging the range of assessment modalities using CBA:<br />
New challenges for generic (web-based) platforms<br />
Thibaud Latour, Raynald Jadoul, Patrick Plichart, Judith Swietlik-Simon,<br />
Lionel Lecaque, Samuel Renault<br />
CRP Henri Tudor, Luxembourg<br />
It has long been advocated that Computer-Based Assessment (CBA) bears significant<br />
advantages over paper-and-pencil instruments at both the testee level and the<br />
logistic and management levels. However, CBA does not cover all existing assessment<br />
modalities and will hardly replace other delivery modes and contexts, or human-restricted tasks.<br />
Taking full advantage of advanced computer and information technologies, rather than simply shifting from<br />
paper-and-pencil tests to computerized instruments, remains challenging in various respects.<br />
Security issues related to both organisational and technological aspects are prevalent in high-stakes<br />
testing, but also in most situations where strict measurement validity is crucial, such as large-scale<br />
assessments and monitoring. Security challenges include test, item and content protection,<br />
process integrity and secrecy, diffusion and validity control of tests and items, cheating detection,<br />
identity management, etc. Depending on the testing context, these issues are more or less easily<br />
tackled. However, when considering networked and loosely constrained testing situations, addressing<br />
these issues becomes more challenging and technologically demanding.<br />
Advanced Result Exploitation techniques and models are necessary to take full advantage of<br />
the various kinds of user tracking capabilities enabled by the use of IT platforms. The<br />
challenge includes discovery, extraction, analysis and exploitation of potential patterns in<br />
the huge number of behavioural and chronometric data recorded during test execution<br />
together with the identification of their psychometric significance.<br />
New Forms of Testing and new forms of instruments to perform social, collective and situational<br />
skill assessments are now becoming more conceivable following the maturation of so-called<br />
ambient intelligence technologies (pervasive computing, ubiquitous computing and advanced<br />
user experience). These kinds of assessments can be achieved through the use of simulations<br />
and games, 3D and immersive technologies, ubiquitous and mobile testing, collaborative testing,<br />
etc. Assessing business-related skills and jobs in an economically viable manner, i.e., reducing<br />
the testing time and test development costs with respect to the number of dimensions that<br />
compose a job description in terms of competencies, requires the design of new multidimensional<br />
instruments, including techniques to rapidly screen subjects’ capabilities<br />
against a series of job reference descriptions.<br />
The Intelligent Management of e-Testing Resources becomes a key element for the<br />
generalisation of CBA in collaborative management settings where contents, models, possibly<br />
data, items, tests, etc. are shared by remotely located stakeholders. Improving the capacity to<br />
qualify, annotate, exchange, and search e-testing resources in a distributed community will<br />
soon become a key element in enhancing item and test production capacity. This<br />
challenge includes collaborative aspects in stakeholder networks, including P2P frameworks,<br />
semantic annotations of multimedia resources and definition of related ontologies, query<br />
propagation and advanced semantic searches, rule-based item creation support, etc.<br />
In this contribution, we shall explore the challenges we consider important for the future and<br />
provide hints and a potential roadmap for addressing them from both a technological and<br />
psychometric perspective. The TAO (the French acronym for technology-based assessment)<br />
framework provides a general and open architecture for computer-assisted test development<br />
and delivery, with the potential to respond to most of the raised issues.<br />
38 ENAC 2008
Symposium: Computer-based assessment / Paper 2:<br />
Developing stimuli for electronic reading assessment: The hypertext-builder<br />
Frank Goldhammer, Thomas Martens, Johannes Naumann,<br />
Heiko Rölke, Alexander Scharaf<br />
German Institute for International Educational Research (DIPF), Germany<br />
Reflecting the increasing prevalence of technology in people’s everyday lives, the<br />
conception of reading literacy has evolved into a more comprehensive concept. More<br />
specifically, reading literacy referring to printed and mostly linear texts has been extended<br />
to also include the ability to successfully navigate and process non-linear electronic<br />
documents (hypertexts). During the last decade reading of electronic texts has become an<br />
activity of increasing importance amongst youths as well as adults.<br />
Against this background, sound research on the cognitive processing of electronic text is<br />
necessary both on fundamental and applied levels. To promote and facilitate research on<br />
reading electronic text, we have developed an authoring system to create electronic reading<br />
stimuli.<br />
The purpose of the present paper is twofold. First, we present a new graphical front-end tool<br />
for the computer-based assessment platform TAO (the French acronym for technology-based<br />
assessment), the “Hypertext Builder”, which was developed to author items for<br />
electronic reading assessment. The Hypertext Builder was designed to facilitate the rapid<br />
development and implementation of complex electronic reading stimuli, covering all major<br />
text-types encountered in electronic reading such as websites, e-mail client environments,<br />
forums, or blogs.<br />
Second, after presenting the Hypertext Builder itself and demonstrating its features, we<br />
report first evidence for the proposition that text stimuli created with the Hypertext Builder<br />
capture specific features of electronic reading. We used Hypertext-Builder-created materials in an<br />
experiment designed to test the assumption that a greater degree of executive control is<br />
needed for the processing of hypertext compared to linear text because of the navigation<br />
demands imposed by hypertext. Sixty students read hypertexts and linear texts (within-subject<br />
factor) under three secondary task conditions that imposed no additional load, general<br />
dual-task load, or executive control load (between-subject factor).<br />
Symposium: Computer-based assessment / Paper 3:<br />
Component skills of electronic reading competence<br />
Johannes Naumann, Nina Jude, Frank Goldhammer, Thomas Martens,<br />
Heiko Rölke, Eckhard Klieme<br />
German Institute for International Educational Research (DIPF), Germany<br />
With the Internet having become a ubiquitous means for dissemination and distribution of<br />
opinions, news, and all other kinds of information around the world, skill in reading<br />
electronic documents may well be regarded as a key competence required for successful<br />
participation in society. Electronic documents are typically represented as non-linearly<br />
structured hypertexts. On the one hand, this means that, compared to the reading of traditional<br />
printed text, successful processing of electronic documents poses a number of<br />
additional demands on readers, such as deciding whether or not to follow a certain link<br />
or, if a link is followed, keeping in mind the original reading goal. On the other hand,<br />
electronic text allows for the implementation of signalling devices that may in fact facilitate<br />
processing, provided they are used adequately. Thus, reading of electronic text is a<br />
competence that cannot be easily mapped upon traditional text-processing skills. Rather, to<br />
successfully use electronic text, readers must have ample working memory resources to<br />
simultaneously accommodate text processing and navigation in the first place. For this,<br />
basic reading processes need to be well-routinized as well, so that available working<br />
memory does not have to be devoted to basic operations of text processing such as word<br />
recognition or semantic parsing. In addition, to deal with an electronic text’s non-linearity,<br />
e.g. to efficiently use navigational aids such as overviews or typed links, readers must have<br />
at their disposal adequate metacognitive strategies. Finally, computer skills may affect<br />
electronic reading competence, in that for reading electronic text readers must have at least<br />
some very basic computer knowledge, such as how to access an internet address or how to<br />
use a mouse.<br />
The present paper investigates which of these component skills actually affect individual<br />
competence in reading electronic documents. To assess electronic reading competence<br />
and related component skills, newly developed tests were implemented in a new testing tool.<br />
To assess electronic reading competence, this tool presents interactive stimuli that<br />
mimic real-life websites, e.g. a medical or a job search site, with corresponding test questions.<br />
Subsequently, students’ basic reading skill (lexical access and sentence<br />
comprehension), working memory capacity, metacognitive strategies and computer skill<br />
were assessed. A total of three-hundred students were sampled from 30 German schools.<br />
Test sessions lasted for three hours. Electronic reading competence was regressed on<br />
working memory capacity, basic reading skill, knowledge of metacognitive strategies, and<br />
computer skill. Using hierarchical linear models with students as level-1-units and schools<br />
as level-2-units, a substantial proportion of variance in electronic reading competence was<br />
explained by the proposed set of predictor variables. In addition, both level-1-intercepts and<br />
regression weights were found to vary between level-2-units (schools). As a consequence,<br />
future research should address not only which school-level conditions cause high average<br />
levels of electronic reading competence, but also the conditions under which electronic<br />
reading skill is more or less dependent on stable trait variables that cannot be changed<br />
easily, such as working memory capacity.<br />
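The two-level analysis described above can be illustrated with a minimal numerical sketch. The simulation below is hypothetical (it is not the study’s data or model, and all numbers and variable names are invented): per-school regressions on simulated scores show how intercepts and working-memory slopes vary between schools, which is the level-2 variation a full hierarchical linear model would estimate jointly as variance components.<br />

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 30 schools x 10 students, mirroring the 300-student sample.
# Each school gets its own intercept and working-memory slope (level-2 variation).
n_schools, n_per_school = 30, 10
school_intercepts = rng.normal(50.0, 5.0, n_schools)   # level-2 intercept variation
school_slopes = rng.normal(2.0, 0.5, n_schools)        # level-2 slope variation

slopes, intercepts = [], []
for s in range(n_schools):
    wm = rng.normal(0.0, 1.0, n_per_school)            # working memory (z-scores)
    erc = (school_intercepts[s] + school_slopes[s] * wm
           + rng.normal(0.0, 3.0, n_per_school))       # electronic reading competence
    b, a = np.polyfit(wm, erc, 1)                      # per-school OLS: slope, intercept
    slopes.append(b)
    intercepts.append(a)

# The between-school spread of intercepts and slopes is the level-2 variation the
# abstract reports; an HLM would separate true variation from estimation noise.
print(f"slope SD across schools:     {np.std(slopes):.2f}")
print(f"intercept SD across schools: {np.std(intercepts):.2f}")
```

In a proper analysis, both levels would be fitted simultaneously with maximum likelihood rather than via separate per-school regressions.<br />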
Symposium: High-Stakes Performance-Based Assessment:<br />
Issues in High-Stakes Performance-Based Assessment of Clinical Competence<br />
Organiser: Godfrey Pell, University of Leeds, United Kingdom<br />
Chair/Discussant: Trudie Roberts, University of Leeds, United Kingdom<br />
In the current era of audit and accountability in medicine, stakeholders want assurance that<br />
graduates have attained the required level of clinical competence to be awarded their<br />
degree and a licence to practise. Final exit examinations are designed to provide that<br />
assurance.<br />
Within the field of medical education, the Objective Structured Clinical Examination (OSCE)<br />
is currently favoured for assessing clinical skills as it has been shown to be the most<br />
reliable, valid, fair and defensible format for this purpose (Harden and Gleeson; Newble).<br />
Assessing students’ clinical skills just prior to graduation provides the assurance that they<br />
have achieved the minimum level of clinical competence required for a licence to practise<br />
by both the General Medical Council (General Medical Council) and the Medical Council of<br />
Canada (Reznick, Blackmore, Cohen et al., 1993).<br />
In an OSCE, candidates rotate through a series of time-limited ‘stations’ and perform a<br />
particular clinical task at each one. These tasks include clinical examination skills,<br />
communication skills or practical procedures (Boursicot and Roberts). In this way,<br />
candidates are tested across the range of skills and patient problems required for<br />
graduation and this is equated to ‘clinical competence’.<br />
Each station is observed by an examiner (usually a clinician or a trained observer) who<br />
scores the candidate using a checklist in which the steps of the particular clinical skill being<br />
assessed are listed as individual items. The examiners mark whether a candidate has<br />
performed each step correctly and the overall mark is the summation of the checklist item<br />
scores for any one station.<br />
The basic structure of an OSCE may vary; for example, the length of time allowed per<br />
station, whether checklists or rating scales are used for scoring, who scores (a clinician, a<br />
standardised patient, a trained lay observer) and whether real patients or manikins are<br />
used. However, the fundamental principle is that every candidate has to complete the same<br />
assignments in the same amount of time and is marked according to structured scoring<br />
instruments.<br />
Although the symposium is based on the OSCE, the issues being discussed should be of<br />
interest to delegates who use performance-based assessments for assessing professional<br />
competence.<br />
Symposium: High-Stakes Performance-Based Assessment / Paper 1:<br />
Lessons Learned from Administering a National OSCE for Medical Licensure<br />
David Blackmore, Medical Council of Canada, Canada<br />
Rationale<br />
Canada was the first country in the world to use a performance-based objective structured<br />
clinical examination (OSCE) as part of a national medical licensing examination. The<br />
Medical Council of Canada (MCC) awards the Licentiate (LMCC) which, in turn, is used as a<br />
prerequisite for licensure by the Canadian medical regulatory authorities. The pilot work and<br />
early testing of this examination took place in the late 1980s and examination<br />
implementation occurred in 1992. Since then, the Medical Council OSCE has undergone<br />
many changes in both its format and procedures. The experience gained from over<br />
35 administrations is the basis for the lessons learned.<br />
Methodology<br />
To be awarded the LMCC, an examinee must complete a two-part MCC Qualifying<br />
Examination (MCCQE). The MCCQE Part I is a one-day, computer-administered<br />
examination which the examinees usually take upon graduation from medical school. The<br />
examinee must then complete at least 12 months of postgraduate training before attempting<br />
the MCCQE Part II which is a multi-station OSCE administered to over 3000 examinees<br />
annually across 16 examination sites within Canada. Each OSCE consists of multiple<br />
stations where a physician examinee interacts with a standardized patient. A physician<br />
examiner observes the encounter in real-time and scores a checklist and global ratings.<br />
Some stations are followed by a written exercise and some stations contain a structured<br />
oral question administered by the physician examiner at the end of the encounter.<br />
Discussion<br />
In 1992, the MCCQE Part II consisted of 20 stations: 10 ten-minute patient-encounter<br />
stations and 10 five-minute patient-encounter stations followed by five-minute written<br />
exercises known as post encounter probes (PEPs). Two forms of this examination were<br />
constructed where one form was administered on a Saturday and the second form was<br />
administered on the following day. Two forms were required to accommodate the many<br />
examinees who needed to be tested concurrently across Canada. By 2008, the MCCQE<br />
Part II has evolved into an examination consisting of 14 stations: 7 ten-minute stations,<br />
5 five-minute stations followed by PEPs, and 2 pilot or pretest stations.<br />
Many lessons have been learned from 16 years of testing tens of thousands of physicians<br />
at multiple examination sites in a high-stakes licensing examination. Issues related to<br />
examination administration such as standardized patient recruiting and training; examiner<br />
recruitment, training, and retention; examination implementation, scoring, standard setting,<br />
and reporting have arisen over time. In addition, there have been several challenges arising<br />
from administering a multi-site examination across six different time zones. Some<br />
examination techniques have worked out better than others and the MCC examination has<br />
improved as it has matured. The MCC is also addressing new challenges related to<br />
changing examination content such as measuring professionalism and team participation.<br />
This presentation outlines the challenges and solutions to performance testing that have<br />
presented themselves as a result of the MCC employing the OSCE format in a high-stakes<br />
licensing examination.<br />
Symposium: High-Stakes Performance-Based Assessment / Paper 2:<br />
Quality Assurance through the OSCE Life Cycle<br />
Sydney Smee, Medical Council of Canada, Canada<br />
Rationale<br />
While the Objective Structured Clinical Examination (OSCE) is a format that allows for valid,<br />
reliable and fair testing of clinical skills, creating an OSCE that meets these criteria requires<br />
effort. When that effort is made, decisions based on the OSCE scores become defensible.<br />
A range of quality assurance measures can be taken throughout the life cycle of an OSCE<br />
to ensure that it meets testing standards such as those set by the American Educational<br />
Research Association (AERA), American Psychological Association (APA) and the National<br />
Council on Measurement in Education (NCME). Which quality assurance measures should be<br />
implemented, and to what degree, is a judgment call based on the consequences of the<br />
decisions for test takers and test users.<br />
The Medical Council of Canada’s Qualifying Examination Part II is an OSCE scored by<br />
physicians and administered across multiple sites to candidates who have successfully<br />
completed 12 months of post-graduate training. This OSCE is a prerequisite for medical<br />
licensure in Canada and so considerable effort is made to ensure the testing process is fair<br />
and the scores are sufficiently valid and reliable for making high-stakes pass-fail decisions.<br />
The Medical Council’s quality assurance practices and the rationale behind them are<br />
discussed so the value of these practices for other settings can be considered.<br />
Methodology<br />
The processes used by the Medical Council at each of five stages in an OSCE cycle will be<br />
described, with links to the 1999 Standards for Educational and Psychological Testing<br />
(AERA, APA, & NCME):<br />
1. Validity through Case Development<br />
2. Improving Reliability with Standardized Patient Training, Staff Orientation and Examiner<br />
Briefing<br />
3. Steps to Ensure Fair OSCE Administration<br />
4. Validity and Reliability - Psychometrics and Standard Setting<br />
5. More about Fairness - Incidents and Appeals<br />
Discussion<br />
The discussion will look at the importance of these quality assurance processes, the<br />
Medical Council’s rationale for certain approaches and the implementation challenges that<br />
have been encountered over sixteen years’ experience with the Part II OSCE. Quality<br />
assurance minimizes the risk of false positive and false negative pass-fail decisions.<br />
Without quality assurance, pass-fail decisions are not defensible. Therefore the discussion<br />
will consider which of the approaches used for this high stakes OSCE could be adapted to<br />
other settings; for example, medical schools.<br />
Symposium: High-Stakes Performance-Based Assessment / Paper 3:<br />
Investigating OSCE Error Variance when measuring higher level competencies<br />
Godfrey Pell, University of Leeds, United Kingdom<br />
Richard Fuller, University of Leeds, United Kingdom<br />
Rationale<br />
Standardization and reliability are major concerns with Objective Structured Clinical<br />
Examinations (OSCEs), but quality metrics permit deeper analysis of examination<br />
performance. This paper investigates the relationship between OSCE structure and error<br />
variance (i.e., variance due to factors other than student performance), building on previous<br />
research into sources of error variance.<br />
Methodology<br />
Analysis of recent 3rd, 4th & 5th (final) year OSCE results from the University of Leeds is<br />
considered to highlight the important problems that exist with error variance in OSCE<br />
scores. The impact of revisions to examiner instructions and item checklists / mark sheets,<br />
most notably the inclusion of intermediate grade descriptors and a reduction in the number<br />
of checklist items, is then assessed using 2007 and 2008 5th year OSCE data.<br />
Discussion<br />
Although error variance may be simply defined as that variance which is due to factors other<br />
than differences in performance caused by varying student ability, it remains possible to<br />
construct a variety of models of differing complexity to quantify this error. Discussion will<br />
include consideration of which of these models may be the most appropriate.<br />
Other questions to be addressed include:<br />
• What can we learn about problem OSCE stations from the metrics available to us, and<br />
how might this inform us with respect to development of improved assessments?<br />
• Should we have long or short checklists?<br />
• How can we measure higher level competencies?<br />
• What effects do these and other issues have on reliability?<br />
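One of the simpler models referred to above is a fully crossed students × stations variance decomposition, in which all score variance not attributable to student ability counts as error. The sketch below is a hedged illustration on simulated scores (not Leeds data; all figures are hypothetical), using the standard ANOVA mean-square estimators for a two-way crossed design with one observation per cell.<br />

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fully crossed OSCE: 200 students x 14 stations, one score per cell.
# True ability drives part of the variance; station difficulty and residual error
# (examiner stringency, occasion, etc.) make up the error variance.
n_students, n_stations = 200, 14
ability = rng.normal(60.0, 6.0, n_students)           # true student ability
difficulty = rng.normal(0.0, 3.0, n_stations)         # station difficulty effects
scores = (ability[:, None] + difficulty[None, :]
          + rng.normal(0.0, 5.0, (n_students, n_stations)))

grand = scores.mean()
row_means = scores.mean(axis=1)                        # per-student means
col_means = scores.mean(axis=0)                        # per-station means
resid = scores - row_means[:, None] - col_means[None, :] + grand

# ANOVA mean squares for a two-way crossed design, one observation per cell
ms_resid = (resid ** 2).sum() / ((n_students - 1) * (n_stations - 1))
ms_student = n_stations * ((row_means - grand) ** 2).sum() / (n_students - 1)
ms_station = n_students * ((col_means - grand) ** 2).sum() / (n_stations - 1)

var_student = (ms_student - ms_resid) / n_stations     # ability variance
var_station = (ms_station - ms_resid) / n_students     # station-difficulty variance
var_error = var_station + ms_resid                     # everything not due to ability

# Generalisability-style coefficient for relative decisions over n_stations stations
g_coef = var_student / (var_student + ms_resid / n_stations)
print(f"student variance: {var_student:.1f}")
print(f"error variance:   {var_error:.1f}")
print(f"G coefficient:    {g_coef:.2f}")
```

More complex models would add examiner, site and occasion facets, which is where the modelling choices discussed in the paper come in.<br />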
Symposium: High-Stakes Performance-Based Assessment / Paper 4:<br />
Beyond checklist scoring –<br />
clinicians’ perceptions of inadequate clinical performance<br />
Katharine Boursicot, University of London, United Kingdom<br />
Trudie Roberts, University of Leeds, United Kingdom<br />
Jenny Higham, Imperial College London, United Kingdom<br />
Jane Dacre, University College London, United Kingdom<br />
Rationale: There has been concern among medical educators and practitioners that OSCEs<br />
test only practical technical skills and do not scrutinise the deeper layer of understanding<br />
supporting those skills. The notion of having a checklist of items to measure the performance of<br />
clinical skills has been criticised for being reductionist and failing to capture the higher-order<br />
nature of clinical judgement and diagnosis. To try to understand what elements assessors felt<br />
were not captured by existing checklists we instituted the use of ‘Cause for Concern’ forms<br />
whereby examiners could report a candidate’s performance they found to be unacceptable.<br />
Methodology: Four medical schools introduced the use of the Cause for Concern forms as an<br />
addition to the checklists in each of their respective graduation OSCEs. The OSCEs consisted<br />
of 17 to 26 different stations across the four schools. The examiners were clinicians, some of<br />
whom were academics from medical school faculties. All were medical or other healthcare<br />
practitioners involved in the teaching and assessment of medical students; all were familiar with<br />
the standards expected at graduation and current professional medical practice. All examiners<br />
were offered training sessions on examining at OSCEs. Examiners were briefed to use the<br />
‘Cause for Concern’ forms about students whose performance in any area would mean that<br />
patient care could be compromised but where the issues causing concern were not captured by<br />
the checklists. After the OSCEs, the forms were collected and analysed. Three raters<br />
independently reviewed the comments from all the medical schools and derived recurring<br />
themes. These were refined into seven themes. The raters independently ascribed all the<br />
comments to each of the themes and dominant themes were identified.<br />
Results: The total number of forms completed from all four medical schools was 152 out of<br />
a total of 25,800 student-examiner encounters; this represents a reporting rate of 0.6%. The<br />
seven themes identified by the three raters were: Clinical skills–poor technique; Clinical<br />
skills–failure to elicit or recognise correct signs; Poor diagnostic ability (interpretation of<br />
signs); Poor/inadequate knowledge; Professional behaviour–personal (e.g. anxiety, lack<br />
of/over-confidence, appearance); Professional behaviour–towards patient (e.g. rough,<br />
inappropriate attitude); Poor communication skills (written and oral).<br />
Discussion: There was much commonality in the themes reported across the four medical<br />
schools. The themes reflected the anecdotal evidence which prompted this study, in that<br />
clinician examiners were concerned that professional behaviours and higher level<br />
diagnostic skills were lacking in students who nonetheless managed to pass the OSCE<br />
based on their checklist score. The main areas of concern which were reported related to<br />
fundamental medical skills or professional behaviour. Overall the reporting rate was very<br />
low, indicating that the overwhelming majority of students satisfied the clinician examiners<br />
with their clinical competence and professional behaviour. When making decisions about<br />
graduation, the ‘Cause for Concern’ forms could be used in addition to checklists to gain a<br />
fuller perspective on those students about whom there are concerns regarding minimum<br />
clinical competence or unacceptable professional behaviour.<br />
Symposium: Peer Assessment:<br />
Towards (quasi-) experimental research on the design of peer assessment<br />
Organiser: Dominique Sluijsmans, Open University, The Netherlands<br />
Peer assessment is an arrangement in which equal-status students judge a peer’s<br />
performance with a rating scheme or qualitative report (Topping, 1998); it stimulates<br />
students to share responsibility, reflect, discuss and collaborate with their peers (Boud,<br />
1990; Orsmond, Merry, & Callaghan, 2004). To date, most studies on peer assessment<br />
treat the quality of peer assessment – i.e., performance improvement and learning benefits<br />
– as a derivative of the accuracy of peer marks compared to teacher marks (Falchikov &<br />
Goldfinch, 2000). In addition, many researchers advocate that peer assessment has a<br />
positive effect on learning, but the empirical evidence is either based on student self-report<br />
ratings or anecdotal evidence from case studies and not on standardised performance<br />
improvement measures. Furthermore, peer assessment is rarely studied in (quasi-)<br />
experimental settings (comparing an experimental group to a control or baseline group),<br />
which considerably limits the claims and evidence regarding specific conditions that are<br />
believed to affect learning. Hence, the empirical support for learning effects, as well as for<br />
specific peer assessment conditions, is scarce.<br />
In this symposium, three contributions are presented that are interlinked in three ways: 1)<br />
they investigate peer assessment from a (quasi) experimental perspective; 2) they address<br />
the value of peer assessment for the individual learner, and 3) they aim at providing clear<br />
guidelines on the design of peer assessment.<br />
In the first contribution by van Zundert et al., it is studied how task complexity and the<br />
structure of the peer assessment formats used by students to conduct the peer<br />
assessment, affects students’ domain-specific learning and their peer assessment skill. In<br />
addition, the impact of cognitive load, students’ attitudes towards peer assessment, and<br />
transfer are considered. Finally, generalisability analyses will be performed to determine the<br />
reliability of peer assessments in relation to different formats. In the second contribution by<br />
Sluijsmans et al. peer assessment is investigated from the perspective of group work. The<br />
effects of four peer assessment design variations on individual marks and the reliability of<br />
peer assessment are investigated. The results show that the design of a peer assessment<br />
method strongly influences the transformation of a group mark into individual marks. The<br />
third contribution by Strijbos et al. studies the impact of peer feedback content and<br />
characteristics of the sender. Previous studies show that students express concerns about<br />
the fairness and usefulness of peer assessment, and Strijbos et al. hypothesise that this<br />
finding may be related to sender characteristics, and that feedback perception may influence<br />
the effect of peer feedback and subsequent performance.<br />
For peer assessment research to advance, identifying the gap between what we know<br />
about peer assessment and what we claim about peer assessment is crucial. In this respect<br />
we advocate more (quasi) experimental research – enabling the investigation of specific<br />
components and conditions as compared to holistic evaluations via case studies. Although<br />
the thick descriptions of specific peer assessment practices in such case studies provide a<br />
wealth of evidence for hypothesis generation, the resulting theories or guidelines should<br />
subsequently be tested in controlled experimental settings to warrant generalisation.<br />
Symposium: Peer Assessment / Paper 1:<br />
The effects of peer assessment format and task complexity on learning and<br />
measurements<br />
Marjo van Zundert, Open University, The Netherlands<br />
Dominique Sluijsmans, Open University, The Netherlands<br />
Jeroen van Merriënboer, Open University, The Netherlands<br />
Former research by Van Zundert, Sluijsmans, and Van Merriënboer (in progress)<br />
emphasised that the variety in peer assessment practices and the holistic reports (i.e., reports<br />
that do not specify all variables) in peer assessment research reveal the necessity to specify<br />
what exactly contributes to learning and measurements (e.g., reliability). Moreover, it was shown<br />
that the share of (quasi-) experimental peer assessment studies is insufficient. The current<br />
study examined the effects of peer assessment formats and task complexity on learning<br />
(domain skill, peer assessment skill, and student attitudes) and measurements (agreement<br />
between peer and expert assessment, between multiple peer assessments, and between<br />
qualitative and quantitative assessments). It was assumed that a highly structured peer<br />
assessment format improves both learning and measurement. Highly structured formats differed<br />
from low structured formats by the integration of first- and higher-order skills, a whole-task<br />
approach, and low cognitive load. It was additionally assumed that a highly structured format<br />
is especially beneficial for complex tasks. Complex tasks induce a higher cognitive load;<br />
this complexity was achieved by editing simple tasks according to three element-interactivity<br />
principles of Cognitive Load Theory. Participants were 110 secondary education students.<br />
They worked in an electronic learning environment through a series of questionnaires and<br />
tasks. The students were randomly assigned to one of the four conditions: low/high<br />
structured formats – simple/complex tasks. After an introduction students logged on to a<br />
computer and completed an attitude questionnaire. Then they studied four study tasks<br />
accompanied by a peer assessment format. In the tasks, which consisted of short<br />
descriptions of biology research, students were supposed to learn to recognise the six steps<br />
of scientific research (i.e., observation, problem statement, hypothesis, experimental stage,<br />
results, and conclusions). Students read the study tasks carefully. After each study task<br />
students reported the cognitive load measure of Paas, Van Merriënboer and Adams (1994).<br />
Next, they solved two transfer tasks (attaching the steps of research to the matching<br />
research description), again followed by the cognitive load measure. Subsequently students<br />
received two peer assessment tasks (evaluating the solution of a fictitious peer), and<br />
reported the cognitive load measure. Finally the attitude questionnaire was completed again<br />
and students logged out. Data will be analysed by ANOVA and generalisability analyses. As<br />
opposed to much previous research, this study attempted to clarify peer assessment effects<br />
by applying quasi-experimental research and by using specific instead of holistic reports.<br />
More (quasi-) experimental research is required in the future, to provide transparency in<br />
peer assessment variety and to account for peer assessment effects.<br />
Symposium: Peer Assessment / Paper 2:<br />
Modelling the impact of individual contributions on peer assessment during<br />
group work in teacher training: In search of flexibility<br />
Dominique Sluijsmans, Open University, The Netherlands<br />
Jan-Willem Strijbos, Leiden University, The Netherlands<br />
Gerard Van de Watering, Eindhoven University of Technology, The Netherlands<br />
During collaborative learning students work together to accomplish a specific group task,<br />
e.g. performing an experiment, writing a collaborative report, carrying out a group project or<br />
a group presentation. These group tasks aim to facilitate peer learning and the development<br />
of collaboration skills. However, since the assessment strongly influences learning in any<br />
course, courses utilising collaborative learning must employ assessment that promotes collaboration<br />
(Frederiksen, 1984). Social loafing (the tendency to reduce individual effort when working in<br />
groups compared to individual effort expended when working alone; see Williams & Karau,<br />
1991) and ‘free riding’ (an individual does not bear a proportional amount of the group work<br />
and yet s/he shares the benefits of the group; see Kerr & Bruun, 1983) are two often voiced<br />
complaints by students regarding unsatisfactory group-work experiences (Johnston & Miles,<br />
2004). Positive interdependence and individual accountability (the latter explicitly introduced<br />
to counter free-riding) play a crucial role during group work. In order for a group to be<br />
successful, all group members need to understand that they are each individually<br />
accountable for at least one aspect of the group task. Teachers regard peer assessment<br />
as a valuable and practical tool to reduce social loafing and free-riding effects. Moreover, it<br />
can serve as a tool to increase students’ awareness of individual accountability and to<br />
promote positive interdependence.<br />
Although a fair number of studies acknowledge the significance of individual contributions<br />
in groups via peer assessment (Lejk & Wyvill, 1996), there are two serious weaknesses in<br />
the design of the methods that are used to transform group marks into individual marks<br />
using peer ratings. First, they take a psychometric perspective (calculate) rather than an<br />
edumetric (design) perspective – which fits better with contemporary developments such as<br />
competency-based education. Second, they are weak when it comes to flexibility of peer<br />
assessment in group work: the students are not involved in choosing criteria, weighting of<br />
criteria and their participation in peer assessment is obligatory. In this study, the effects of<br />
four peer assessment design variations on individual marks and the reliability of peer<br />
assessment are investigated. These variations are modelled using the baseline dataset with<br />
self- and peer assessment ratings of 72 teacher training students in their fourth year for the<br />
Bachelor of Education. The results show that 1) the design of a peer assessment method<br />
strongly influences the transformation of a group mark into individual marks, and 2) that the<br />
reliability of a peer assessment depends on the weight of the criteria, the rating scale, the<br />
inclusion of self-assessment, and the maximum deviation of an individual mark from the group<br />
mark. A more in-depth discussion is required of the goal of peer assessment and its<br />
implications for the design of flexible and adaptive peer assessment in group work.<br />
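The transformation the abstract criticises can be made concrete with a small sketch. The abstract does not say which method was modelled, so the following illustrates one widely cited approach, the individual weighting factor associated with Lejk and Wyvill; all names and figures are invented for illustration.<br />

```python
# Hedged sketch of one common transformation of a group mark into individual
# marks via peer ratings: the individual weighting factor (IWF) approach
# associated with Lejk and Wyvill. Names and figures are illustrative only;
# the study's actual design variations may differ.

def individual_marks(group_mark, rating_totals):
    """rating_totals maps each student to the sum of peer ratings received."""
    mean_total = sum(rating_totals.values()) / len(rating_totals)
    # IWF = own rating total / group mean; the individual mark scales the
    # group mark by that factor, so strong contributors move above it.
    return {s: group_mark * (t / mean_total) for s, t in rating_totals.items()}

marks = individual_marks(70, {"Ann": 24, "Ben": 20, "Cas": 16})
```

Design variations mentioned above, such as capping the deviation of an individual mark from the group mark or including self-ratings in the totals, would enter this sketch as additional parameters.<br />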
48 ENAC 2008
Symposium: Peer Assessment / Paper 3:<br />
Peer feedback in academic writing: How do feedback content, writing ability-level and<br />
gender of the sender affect feedback perception and performance?<br />
Jan-Willem Strijbos, Susanne Narciss, Mien Segers<br />
Leiden University, The Netherlands<br />
The shift towards student-centered learning places a high emphasis on students assuming<br />
responsibility for their learning. Peer assessment is well-suited in this respect: equal-status<br />
students judge a peer’s performance with a rating scheme or a qualitative report (Topping,<br />
1998). Many peer assessment researchers stress that feedback is essential for<br />
performance improvement and learning benefits, but the evidence for performance and<br />
learning effects is scarce. Moreover, the impact of feedback content types has hardly been studied.<br />
Students also express concerns about the fairness and usefulness of peer assessment<br />
(Cheng & Warren, 1997), which appears related to sender characteristics that may<br />
influence the effect of peer feedback (Leung, Su, & Morris, 2001).<br />
We conducted two studies to investigate the impact of feedback content and sender<br />
characteristics (writing ability-level and gender) using a factorial pre-test treatment post-test<br />
control group design in the context of academic writing in higher education. Study 1<br />
consisted of a two-way factorial design (Nexp = 71, Ncontrol = 18) and Study 2 had a three-way<br />
factorial design (Nexp = 160, Ncontrol = 19). In each study subjects in the experimental<br />
condition received a scenario in which a fictional student received fictional peer feedback.<br />
Subjects’ feedback perception (i.e., fairness, usefulness, acceptance, willingness to improve<br />
and affect) and performance (text revision quality) were investigated.<br />
Study 1: Subjects in experimental conditions received concise evaluative feedback (CEF) or<br />
elaborated informative feedback (EIF), and ability-level of the sender was high or low. A<br />
principal component analysis revealed the latent factor ‘Perceived Adequacy of Feedback’<br />
(PAF, comprising fairness, usefulness and acceptance, 9 items, α = .89). MANOVA<br />
revealed that EIF is perceived as more adequate. A two-way interaction for affect (AF, 6<br />
items, α = .81) revealed that students with EIF by a high-ability peer expressed more negative<br />
affect compared to CEF by a high-ability peer, and the opposite was observed for feedback<br />
by a low-ability peer. A repeated measures MANOVA showed that performance increases<br />
in all conditions over time, but performance for EIF by a high-ability peer was significantly<br />
lower compared to the control condition.<br />
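The internal-consistency coefficients reported above (Cronbach’s α) summarise how strongly the items of a scale hang together. As a minimal illustration of how such a coefficient is computed, and not the study’s own analysis, the item scores below are fabricated.<br />

```python
import statistics

# Minimal sketch of Cronbach's alpha, the internal-consistency coefficient
# reported for the PAF and AF scales above. The item scores here are
# fabricated for illustration; they are not data from the study.

def cronbach_alpha(items):
    """items: one list of scores per item, aligned across respondents."""
    k = len(items)
    sum_item_var = sum(statistics.pvariance(scores) for scores in items)
    totals = [sum(resp) for resp in zip(*items)]   # per-respondent totals
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return k / (k - 1) * (1 - sum_item_var / statistics.pvariance(totals))

alpha = cronbach_alpha([[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]])
```

With real questionnaire data, each inner list would hold one item’s scores across all subjects in the sample.<br />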
Study 2: In addition to the variations in Study 1, the experimental conditions also varied with<br />
respect to gender of the sender (typical male/ female name: Joost versus Astrid). As in<br />
Study 1, a principal component analysis revealed the latent factor PAF (α = .90). MANOVA<br />
revealed that EIF is perceived as more adequate. A three-way interaction effect revealed<br />
that subjects’ willingness to improve (WI, 3 items, α = .70) for the feedback by ability-level<br />
combinations appears to be different for gender of the sender. The performance data are<br />
currently being analysed and will be presented by the time of the conference.<br />
The results of both studies reveal that feedback perception is an important aspect to be<br />
considered with respect to peer assessment. It should be noted that both samples were<br />
female-dominated (a 7-to-1 ratio), which limits strong generalisations. However, a comparative<br />
study with a more even gender distribution is currently being conducted in secondary education.<br />
Symposium: Assessment in kindergarten classes:<br />
experiences from assessing competences in three domains<br />
Organiser: Marja van den Heuvel-Panhuizen, FIsme, Utrecht University, The Netherlands/<br />
IQB, Humboldt University Berlin, Germany<br />
Chair/Discussant: Kees de Glopper, University of Groningen, The Netherlands<br />
Assessment of young children is a challenging endeavour. Defining and measuring<br />
development and learning in young children is quite complex, for a variety of reasons (see<br />
e.g. Shepard, 1994). Test performance of 4- and 5-year-olds can be highly variable. With<br />
young children, mismatches between the content of tests and children’s existing knowledge<br />
and experiences lie in wait. Young children may also encounter difficulties in participation,<br />
due to unfamiliarity with rules and conventions for obtaining responses. In response to<br />
these problems, several guiding principles for the assessment of young children have been<br />
proposed: assessments should bring about benefits for children, they should reflect and<br />
model progress toward important learning goals, their methods must be appropriate to the<br />
development and experiences of young children, and they should be tailored to a specific<br />
purpose.<br />
This symposium discusses experiences with assessing mathematical, literary and social-emotional<br />
competences in kindergartners. Children’s learning in these domains is<br />
investigated in three interlinked research projects that are part of the PICO research<br />
programme (PIcture books and COnceptual development). Each project aims to determine<br />
the instructive value of picture books and follows the same approach. First, potential<br />
contributions of picture books to children’s development are identified through analyses of<br />
picture books. Second, we develop ‘keys’ that help teachers to unlock the richness of<br />
picture books and to establish engaging and instructive interaction. We use design-based<br />
research to develop keys for 24 picture books. Third, we do a quasi-experimental study to<br />
assess the developmental yield of the picture books and their corresponding keys.<br />
To evaluate the effect of the intervention program each PICO-project developed procedures<br />
and tasks for assessment, trying to adhere to the abovementioned principles and, at the<br />
same time, doing justice to the nature of the competence domain that is assessed. This<br />
resulted in a set of tools for assessing young children in three quite diverse domains of<br />
children’s development in which different assessment formats are used ranging from<br />
individual to group assessment, from oral to written assessment, and from open questioning<br />
to multiple-choice questioning. What all the assessment tools have in common is that they<br />
are grounded in a picture-book context.<br />
In the symposium we would like to share with the audience our experiences with developing the<br />
assessment tools and analyzing the collected data, and the knowledge we gained regarding<br />
the children’s development and the way to assess this. We hope to draw the audience into<br />
a discussion about the obstacles and opportunities in assessing young children’s<br />
development in different competence domains and to touch on issues of further research.<br />
Symposium: Assessment in kindergarten classes / Paper 1:<br />
A picture-book-based tool for assessing literary competence in 4- to 6-year-olds<br />
Coosje van der Pol, Tilburg University, The Netherlands<br />
Helma van Lierop-Debrauwer, Tilburg University, The Netherlands<br />
Introduction: The paper addresses the challenging topic of assessing literary competence of<br />
young children. The study reported here is part of the PICO-li project, in which we<br />
investigate whether and how picture books contribute to the literary development of<br />
kindergartners. Literary competence starts with looking, together with children, at picture-book stories<br />
as aesthetic compositions of text and pictures. The composition of a story is based on<br />
literary and social codes and conventions. The PICO-li project investigates three sub-domains<br />
of literary competence: understanding story characters, suspenseful story<br />
elements and ironic humour.<br />
Development of the assessment tool: The developed tool is partly based on the Narrative<br />
Comprehension (NC) task for assessing children’s comprehension of narrative picture<br />
books (Paris & Paris, 2001). The NC-task has been adapted in order to concentrate on<br />
literary codes and narrative conventions and their aesthetic evaluation by the reader.<br />
Description of the assessment tool: The PICO-li assessment tool uses the picture book<br />
Cottonwool Colin (2007) by Jeanne Willis and Tony Ross to elicit the children’s responses.<br />
This book covers all three sub-domains of literary competence. The assessment has three<br />
parts: first, the child is invited to go through the book and respond spontaneously. In the<br />
second round, the book is taken away and replaced by an electronic version. During the<br />
assessment a child views the scanned pages of the real book on a computer screen whilst<br />
listening to the text being read aloud through the speakers. This ensures that the story is<br />
read to all the children in exactly the same way. Afterwards the child is asked to retell the<br />
story. In the third round, the child answers ten questions related to the three sub-domains.<br />
The final question is a productive question about the story’s main character. His<br />
development from small and weak to tall and strong is the story’s main theme. At the end of<br />
the story Cottonwool Colin no longer seems an appropriate name for the protagonist. The<br />
child is asked to think up a more appropriate name for him.<br />
Data collection and analysis: Almost 100 children from 18 different classrooms have been<br />
tested individually using the PICO-li assessment tool. Their responses have been video-recorded<br />
and transcribed onto answer sheets. Data from the first round are analysed for<br />
spontaneous comments on pictures, story line and attempts at interpretation. The retellings<br />
from the second round are analysed for six story structure elements: setting; characters;<br />
goal/initiating event; problem/episodes; solution; resolution/ending. Scoring rubrics have<br />
been created with higher scores reflecting a more literary stance towards the story than<br />
lower scores.<br />
Discussion: Although we are still in the process of analyzing data, our first impression is that<br />
the tool reveals valuable information on the children’s development of literary competence.<br />
At the conference we will concentrate on what we found out about the children’s concepts of<br />
story characters. In our presentation we will discuss the implications of our analyses for the<br />
practice of assessment and the problems and benefits that may arise when assessing<br />
literary competence in young children.<br />
Symposium: Assessment in kindergarten classes / Paper 2:<br />
Assessing the social-emotional development of young children by means of<br />
storytelling and questions<br />
Aletta Kwant, University of Groningen, The Netherlands<br />
Jan Berenst, University of Groningen, The Netherlands<br />
Kees de Glopper, University of Groningen, The Netherlands<br />
Introduction: Social and emotional competences are important for adequate functioning.<br />
This is true for human beings at almost any age, including kindergartners. It is unclear where<br />
and how young children can learn these competences. The PICO-sem (PIcture books and<br />
COncept development in the social and emotional domain) project is aimed at investigating<br />
whether the use of picture books in kindergarten classes can contribute to the development<br />
of these competences. The children are read a series of picture books that address a<br />
number of events that are identified as highly instructive for understanding and using social<br />
and emotional behavior. To evaluate the effect of the picture book program, a tool was<br />
developed to assess the children’s social and emotional, and to some extent moral,<br />
development.<br />
The PICO-sem assessment tool: The tool consists of a series of about 40 tasks which are<br />
all connected to short sketches in which social and emotional components play a role.<br />
The complete administration of the test takes about half an hour, split into two parts. The<br />
tasks do not require writing and reading skills. To avoid too strong a dependence on<br />
verbal skills, we used not only production tasks but also recognition tasks. In the latter, it is<br />
assessed with the help of pictures whether the children can recognize aspects of social and<br />
emotional behavior. Because the children might be influenced by these pictures and their<br />
description, these recognition tasks come after the production tasks.<br />
Data collection and analysis: In January and May 2008 about 110 children from 20<br />
kindergarten classes were tested individually by two trained research assistants. The<br />
collected data are scored for the use of social-emotional expressions by the children. In the<br />
analysis we tried to explore whether there is a difference between the production and<br />
recognition tasks.<br />
Results and discussion: In our presentation we will focus on questions about the feelings of<br />
the main character of a short story. The results from the production tasks will be compared<br />
with those from the recognition tasks in which the children have to react to a set of depicted<br />
emotions. The data are all from the assessment in January. In this test, which was<br />
administered before the program was carried out, we found that merely asking the children to<br />
tell about emotions did not reveal their deep understanding. When the children had to react<br />
to pictures, their answers were more differentiated than in the production tasks.<br />
Assessing young children is rather challenging. An important issue for us is to know to what<br />
extent our assessment really catches the children’s understanding and knowledge about<br />
emotions. In the discussion we will address the question whether our approach to assess<br />
kindergartners’ social and emotional development is a fruitful avenue to follow.<br />
Symposium: Assessment in kindergarten classes / Paper 3:<br />
Assessing mathematical abilities of kindergartners:<br />
possibilities of a group-administered multiple-choice test<br />
Sylvia van den Boogaard, Utrecht University, The Netherlands<br />
Marja van den Heuvel-Panhuizen, FIsme, Utrecht University, The Netherlands/<br />
IQB, Humboldt University Berlin, Germany<br />
Introduction: Knowledge of children’s mathematics development is of crucial importance for<br />
offering them support for further learning. In the case of kindergartners (i.e. 4- and 5-year-olds<br />
who have not yet entered formal education), observations and interviews are mostly<br />
used to collect this information. Group-administered multiple-choice tests are often not<br />
considered to be an adequate assessment tool for young children (e.g. Fuson, 2004).<br />
Nevertheless, this assessment format does have potential to provide relevant information<br />
about children’s development, as has been shown earlier (see Van den Heuvel-Panhuizen, 1996).<br />
The present study builds on these previous experiences and aims at increasing our<br />
knowledge about kindergartners’ mathematical understanding and designing a multiple-choice<br />
test to reveal this understanding. The test is developed in the context of the PICO-ma<br />
(PIcture books and COncept development in mathematics) project that investigates<br />
whether and how picture books contribute to the mathematical understanding of<br />
kindergartners.<br />
The PICO-ma Test: The mathematical content included in the test covers three sub-domains<br />
of early mathematics: number (with special attention to “structuring numbers”),<br />
measurement (in particular the theme “growth”), and geometry (with the focus on “taking a<br />
point of view”). The guiding principle for developing the test is offering children a meaningful<br />
and familiar context in which they can show their understanding regarding these content<br />
domains. Therefore, all items are presented in a picture-book-like style; most of the<br />
questions are inspired by picture-book stories.<br />
For most items, four alternative solutions are given. The children have to put a line under<br />
the correct solution. We made sure that there is no need for the children to read text or<br />
numbers. The accompanying questions are read aloud to the children.<br />
A draft version of the test was tried out on a number of individual children. After this, the<br />
final selection of items was made and several items were revised. The final test contains<br />
42 items; 14 items for each sub-domain.<br />
Data collection and analysis: In January 2008, about 400 children from 18 kindergarten<br />
classes took the test. The test was administered in two sessions with an interval of one<br />
week. Half the children of one class were assessed at a time. A trained research assistant<br />
led the children, who marked the right answer in their own test booklets, through the test.<br />
The collected data are scored as correct or incorrect and analyzed in connection with<br />
scores on a standardized mathematics test including classification, seriation, and<br />
comparison, and with information about age, sex, and socio-economic status.<br />
Presentation of results and discussion: In our presentation, we share our experiences with<br />
designing, administering, and analyzing the test, and give details about the psychometric<br />
quality of the test. We present our findings regarding the kindergartners’ mathematical<br />
understanding in the three sub-domains and how the sub-domains are related. In the<br />
discussion we would like to reconsider the potential of group-administered paper-and-pencil tests<br />
for assessing young children’s mathematical development.<br />
Papers<br />
Ethical Dilemmas:<br />
‘Insider’ action research into Higher Education assessment practice<br />
Linda Allin, Northumbria University, United Kingdom<br />
Lesley Fishwick, Northumbria University, United Kingdom<br />
Newman (2000) documents that investigative enterprises with a focus on practice inquiry<br />
have emerged over the last decade, and have been identified as teacher research<br />
(Cochran-Smith and Lytle, 1993), action research (Winter, 1987) and reflective practice<br />
(Schon, 1987). The key aim of such studies is to try to solve the immediate and pressing<br />
day-to-day problems of practitioners. Within university departments, how to provide<br />
authentic, innovative and student centred assessment for learning within the context of<br />
increasing student numbers is one of the most pressing problems for programme managers<br />
and lecturers. Recent national student surveys highlight assessment and student feedback<br />
to be two key areas of dissatisfaction for students. We suggest that insider action research<br />
to understand staff experiences of setting assessments is a valuable starting point in<br />
evaluating and then improving assessment practices. The main aim of this paper is to<br />
examine tensions of teaching in Higher Education in relation to identifying the constraints<br />
and pressures which impact on lecturers’ daily work. The focus is on understanding the<br />
various influences on, and the decision-making of, education professionals, in terms of uncovering<br />
the assumptions which drive our assessment practices. The methodology is in-depth interviews<br />
asking staff their views on the purpose of assessment, the barriers to setting innovative<br />
assessments and their concerns over assessment processes. The study has led to a series<br />
of unanticipated ethical issues relating to the conduct of qualitative research with colleagues.<br />
Oliver and Fishwick (2003) argue that there is room for discussion and debate about ethical<br />
considerations in qualitative work. Such debates focus on key principles including not doing<br />
harm (nonmaleficence), justice, autonomy and research related benefits for participants.<br />
Several papers discuss practical ethical problems, but they are often more oriented<br />
towards issues with involving students, rather than staff, as participants (Ferguson, Yonge<br />
and Myrick, 2004; Hammack, 1997). In this paper, we aim to stimulate discussion into some<br />
of the ethical dilemmas faced by ‘insider’ research into assessment. We highlight the<br />
experiences of a CETL assessment for learning team in developing and implementing<br />
research into staff views and experiences of assessment within one particular university<br />
department. Such ethical dilemmas include insider/outsider perspectives, role conflicts,<br />
accessing staff and the process of interviewing as well as the more usually identified ethical<br />
concerns relating to informed consent, anonymity, confidentiality and the right to withdraw.<br />
A key concern for participants became the issue of trust and safeguarding privacy as well<br />
as assuring anonymity. In the paper we reflect on discussions within the team following the<br />
pilot study, and identify actions taken to address some of the dilemmas encountered. We<br />
identify the need to take a critically reflective stance on research into assessment practices<br />
and highlight the way in which minimising power relations and creating an atmosphere of<br />
trust are central if such research is to reach its purpose of enhancing assessment practice.<br />
Reciprocal Peer Coaching as a Formative assessment strategy:<br />
Does it assist students to self-regulate their learning?<br />
Mandy Asghar, Leeds Metropolitan University, United Kingdom<br />
Research has shown that cognitive gains are significantly higher in pairs that work together<br />
when compared to students studying independently (Ladyshewsky 2000, Topping 2005).<br />
Higher achievement, more caring and supportive relationships, greater psychological<br />
health, social competence and self esteem are all valuable consequences of introducing<br />
peer assisted learning strategies into the curriculum. In reciprocal peer coaching, students’<br />
goals are inter-related and the most successful outcome is dependent on mutual coaching.<br />
Reciprocal peer coaching is used as an innovative formative assessment strategy to assess<br />
physiotherapy students’ abilities to carry out the practical skills required<br />
to become a successful therapist. Traditionally these skills were assessed exclusively in a<br />
summative format at repeated points throughout level 1 which resulted in many students<br />
trailing failure throughout the year. Module evaluation provided anecdotal evidence of the<br />
benefits of this change in the assessment strategy. A subsequent qualitative research<br />
project has explored “students’ perceptions of reciprocal peer coaching as a strategy to<br />
formatively assess practical skills”. Individual interviews and focus groups were used to<br />
collect data, which have been analysed from a phenomenological perspective that considers<br />
the lifeworld as a lens through which to view the students’ lived experience (Ashworth 2003).<br />
Initially, four themes emerged from the data: Motivation and Learning, the<br />
Emotional Experience of Learning, Learning Together and Contextualising the Learning<br />
Experience. Although students valued the feedback about their knowledge and abilities from<br />
the formative assessment process, they frequently expressed a willingness to engage with<br />
reciprocal peer coaching, as it provided that “pressure” which made them study. When<br />
considering the theoretical models of self regulation of learning many of the participants<br />
described a view that fitted with this model. Key aspects of this include self-efficacy,<br />
motivation and emotion, the nature of an individual’s goals and the ability to engage in<br />
metacognitive processes. Issues identified included time management and students’<br />
tendency to procrastinate, as they find it hard to set themselves short-term goals, but<br />
they felt that this formative assessment strategy helped them to manage. The variance<br />
between student goals (some seeking mastery goals, others performance goals) and the<br />
emotions of anger, anxiety, frustration, and relief associated with this<br />
assessment were all reported by these level 1 students.<br />
Self regulation in new environments and subject areas can be difficult for students who as<br />
novices often fail to employ metacognitive strategies to set goals for themselves and self<br />
assess their progress, many tending to compare themselves with others in order to judge<br />
the need to learn (Zimmerman 2002). It is suggested that self-regulation is not easy and<br />
requires a scaffolding of strategies to encourage its development (Pintrich 1999).<br />
Although recognised for its valuable role in the provision of feedback (and indeed the<br />
influence this may have on students’ self-efficacy), formative assessment has a role in<br />
assisting students to self-regulate their learning: learning how to learn. It is this<br />
theory related dimension to formative assessment that I would like to discuss.<br />
Using an adapted rank-ordering method to investigate<br />
January versus June awarding standards<br />
Beth Black, Cambridge Assessment, United Kingdom<br />
Aims: The dual aims of this research were (i) to pilot an adapted rank-ordering method and<br />
(ii) to investigate whether UK examination awarding standards diverge between January<br />
and June sessions.<br />
Background: Standard maintaining is of critical importance in UK qualifications, given the<br />
current ‘high stakes’ environment. At qualification level, standard maintaining procedures<br />
are designed to ensure that a grade A in any particular subject in one year is comparable to<br />
a grade A in another year, through establishing the equivalent cut-scores on later versions<br />
of an examination which carry over the performance standards from the earlier version.<br />
However, the current UK method for standard setting and maintaining - the awarding<br />
meeting as mandated by the QCA Code of Practice (2007) - introduces the potential for the<br />
awarding standards of the January and June sessions to become disconnected from each<br />
other. Additionally, from a regulatory and comparability research point of view, the January<br />
sessions have been largely ignored, despite the increasing popularity of entering candidates<br />
for January units since Curriculum 2000.<br />
Given the difficulties in quantifying relevant features of the respective cohorts (e.g. the<br />
January candidature is more unstable), and the problems in meeting the assumptions<br />
necessary for statistical methods (e.g. Schagen and Hutchison, 2008), arguably the best<br />
way to approach this research question is to use a judgemental method focusing on<br />
performance standards. In this study, the chosen method involves expert judges making<br />
comparisons of actual exemplars of student work (‘scripts’).<br />
Method: A rank-order method adapted from Bramley (2005) was employed. Archive scripts<br />
at the key grade boundaries (A and E) from the previous six sessions (comprising three<br />
January and three June sessions) from two AS level units in different subjects were<br />
obtained. Whilst previous rank order exercises (e.g. Black and Bramley, in press) required<br />
judges to rank order ten scripts per pack spanning a range of performance standards, in this<br />
study each exercise involved scripts which were, at least notionally, of more similar quality<br />
(e.g. all exactly E grade borderline scripts) and therefore an adaptation was required. Rank-ordering<br />
sets of three scripts retains many of the advantages of rank-order over traditional<br />
Thurstone paired-comparisons, as well as an additional advantage: it asks of judges a more<br />
natural psychological task, namely to identify, on the basis of a holistic judgement, the best,<br />
middle and worst script.<br />
Analysis: Rasch analysis of the rank order outcomes produced a measure of script quality<br />
for each script and, using an ANOVA, it was possible to examine effects of session and<br />
session type (i.e. January versus June). The research indicated that the two AS units<br />
displayed different patterns of performance standards for January versus June.<br />
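The abstract does not detail the Rasch model fitted to the rank-order outcomes. As a hedged sketch, each judged triple can be expanded into paired comparisons and a Bradley-Terry model, a close relative of the Rasch formulation for paired-comparison data, fitted to yield a quality measure per script; the script identifiers and judgements below are invented.<br />

```python
import math
from collections import defaultdict

# Hedged sketch: each ranked triple (best, middle, worst) is expanded into
# three pairwise "wins", then a Bradley-Terry model is fitted by
# minorise-maximise updates. Script ids and judgements are invented; this is
# not the study's actual analysis code.

def fit_quality(triples, iters=500):
    wins = defaultdict(float)
    n = defaultdict(float)                 # comparisons per unordered pair
    for best, mid, worst in triples:
        for a, b in ((best, mid), (best, worst), (mid, worst)):
            wins[a] += 1.0
            n[frozenset((a, b))] += 1.0
    scripts = sorted({s for t in triples for s in t})
    p = {s: 1.0 for s in scripts}
    for _ in range(iters):
        for s in scripts:
            denom = sum(n[frozenset((s, o))] / (p[s] + p[o])
                        for o in scripts if o != s)
            if denom:
                p[s] = wins[s] / denom
        total = sum(p.values())            # renormalise each sweep
        p = {s: v / total for s, v in p.items()}
    return {s: math.log(v) for s, v in p.items()}   # log-strength as measure

quality = fit_quality([("s1", "s2", "s3"),
                       ("s1", "s3", "s2"),
                       ("s2", "s1", "s3")])
```

The resulting measures per script could then feed an ANOVA on session and session type, as described above.<br />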
Discussion: The discussion will be research-related and practice-related. The paper will<br />
address the potential for using this method to investigate comparability in a variety of<br />
contexts, and implications for standard-maintaining processes.<br />
Generating dialogue in coursework feedback:<br />
exploring the use of interactive coversheets<br />
Sue Bloxham, University of Cumbria, United Kingdom<br />
Liz Campbell, University of Cumbria, United Kingdom<br />
This two-year study examines feedback designed to create a dialogue between tutor and<br />
student without additional work for staff. Research is developing conceptual understanding<br />
regarding how feedback can effectively contribute to student learning (Higgins 2000, Gibbs<br />
& Simpson 2004, Nicol & Macfarlane-Dick 2004, Brown & Glover 2006). Emphasis is<br />
being placed on the notion of feedforward (Hounsell, 2006), designed to reduce the gap<br />
between the standards students are expected to achieve and their current level of<br />
performance (Sadler 1998). Studies have examined the extent to which different types of<br />
tutor feedback better enable students to ‘close the gap’ (Brown & Glover 2006). Evidence<br />
suggests that feedback tends to focus on assignment ‘content’ whereas students find<br />
comments on their ‘skills’ to be more useful for future writing (Walker, 2007). Furthermore,<br />
feedback which elaborates on corrections is rare (Millar, 2007), but is considered more<br />
likely to help students make the link between the feedback and their own work (Brown &<br />
Glover, 2006).<br />
Failure to understand feedback is also associated with the tacit discourses of academic<br />
disciplines (Higgins 2000). However, learning tacit knowledge is an active, shared process,<br />
and thus writers such as Ivanic et al (2000) and Northedge (2003a) stress the importance of<br />
feedback which seeks to engage the student in some form of dialogue. This theoretical<br />
approach suggests that tutor-student dialogue could significantly aid feedback for learning,<br />
enabling students to understand feedback so that they can act on it to ‘reduce the gap’.<br />
This study emerged from concerns that staff on an Outdoor Studies Programme were<br />
devoting inordinate amounts of time to written feedback whilst students were reporting that<br />
they did not receive enough, nor was there evidence that feedback was being used to<br />
improve future assignments. Consequently, staff attempted to set up a dialogue with<br />
students by providing written feedback in response to students’ questions about their work,<br />
requested on their assignment coversheets. In the second year of the experiment, training<br />
was given in asking effective questions.<br />
Data were collected in the form of students’ feedback questions, interviews with staff,<br />
administration of the Assessment Experience Questionnaire (Dunbar-Goddet & Gibbs, 2006)<br />
and a supplementary questionnaire asking students for their preferences for guidance and<br />
feedback. Coding of students’ questions indicated that they differed markedly in the quality<br />
of their questions and rarely posed queries about the ‘content’ of their assignment, being<br />
much more concerned with their ‘skills’. The quantitative data indicated high mean scores<br />
for ‘quantity and quality of feedback’ and ‘use of feedback’ although students gave mixed<br />
preferences for different types of feedback. Staff reported achieving a sense of dialogue,<br />
finding it easier to write feedback in response to specific questions. Evidence of the impact<br />
of ‘question’ training will also be presented.<br />
Discussion will consider whether the research findings support conceptual models regarding<br />
the place of dialogue in creating learning-oriented assessment. It will also consider the<br />
practical implications of attempting to create a dialogue with students, given resource<br />
constraints, tutor and student expectations and quality assurance.<br />
Reforming practice or modifying reforms?<br />
Science teachers’ responses to the MBE and to assessment teaching in Chile<br />
Saul Alejandro Contreras Palma, Chile<br />
In 1996, the biggest education reform in the history of Chile was about to be implemented.<br />
Its aim was to strengthen the teaching profession. Therefore a framework for good teaching<br />
(MBE) was developed, which focused on the need to know what and how to teach, and on<br />
an assessment system establishing the standards for teachers’ work. However, little<br />
is known about the impact that the reform and its tools have had on how teachers think<br />
about teaching and learning, particularly in science subjects. As Smith and Southerland (2007)<br />
put it, a "missing link" has been the investigation and understanding of the interaction<br />
between teachers’ internal structures and the externally imposed ones.<br />
In this context, it makes sense to investigate what teachers think and do, and the relation<br />
between this reality and the one proposed by education reforms. This study examines the<br />
interactions between teachers’ beliefs and their actions concerning what and how to assess,<br />
and the degree of coherence between the teachers’ models and the ones proposed<br />
by the reform. This exploratory study analyzed and compared six Chilean high school<br />
science teachers who all participated in the reform and its training programs (PPF). The<br />
data were obtained through the MBE document, a questionnaire, an interview and non-participant<br />
observation. The information presented here was subjected to a content analysis<br />
centered on the assessment category.<br />
Our results indicate three important issues. First, behind the framework for good teaching<br />
(MBE) lies a constructivist model that indicates what and how teachers should assess or<br />
evaluate their students. Second, unlike the teachers’ thinking, their practice – independently<br />
of their subject – is traditional and inconsistent with the proposals of the reform. Third, the<br />
teachers’ thinking is organized in several levels, which differ from each other. There is a<br />
difference between what teachers "think they do", what they "think should be done" and what<br />
they "say they do". This difference is consistent with the fact that the teachers did not<br />
implement the reform in their classes despite having agreed to and participated in the<br />
reform. Therefore, teachers’ thinking influences the interpretation of the – sometimes<br />
contradictory – messages of the proposals of the reform.<br />
In consequence, what we are discussing is not the reform and its tools, because we<br />
recognize that they have led to an advance by introducing an assessment culture and the<br />
concept of measurement. However, we believe it would have been much more efficient to put<br />
the phases of the change process in another order: first, to explore what teachers<br />
know and what they are able to do, and then to set standards that determine the<br />
professional knowledge to be obtained. In other words, it is necessary to determine what<br />
and how certain aspects of the teachers’ model support or impede the implementation of<br />
reforms, their instruments and the professional development of teachers, and to work on<br />
these aspects to achieve a real impact.<br />
Expanding student involvement in Assessment for Learning:<br />
A multimodal approach<br />
Bronwen Cowie, Alister Jones, Judy Moreland, Kathrin Otrel-Cass<br />
University of Waikato, New Zealand<br />
Assessment for learning (AfL) encompasses actions to assess student learning and steps to<br />
move that learning forward (Black and Wiliam, 1998). These actions can be undertaken by<br />
teachers or students, but the ultimate goal is that students are actively engaged in<br />
monitoring their learning. AfL was initially theorized within a cognitive frame. In the last<br />
decade research has begun to consider the implications of sociocultural views of learning<br />
(Gipps, 2002). These expand the possibilities for student participation in AfL. In this paper<br />
we use data generated in primary science and technology classrooms to illuminate the<br />
affordances of multimodal assessment practices.<br />
This paper reports on one outcome of the Classroom Interaction in Science and Technology<br />
Education (InSiTE) study. The study involved 12 primary teachers and over 800 students<br />
over three years. A key goal was to understand AfL interactions around science and<br />
technology ideas and practices and the factors that afforded these interactions. Student and<br />
teacher reflective interviews, and teacher and researcher joint planning, reflection and data<br />
analysis meetings complemented the classroom work. The classroom data generation<br />
methods were videos of teacher interactions; audio taping of teacher and student talk; field<br />
notes; and the collection of teacher documents and student work. Post lesson discussions<br />
and meeting days provided a forum for data analysis.<br />
The InSiTE teachers and students employed multiple and multimodal means (Kress et al.,<br />
2001) to make and communicate meaning. Their interactions encompassed talk, text, action<br />
and the visual mode. The teachers explicitly developed students’ oral language proficiency<br />
but almost invariably talk was augmented by action, writing and the visual mode. Written<br />
text anchored and augmented talk. Drawing was useful for illustrating and complementing<br />
talk when students could not express tentative ideas by talk alone. Actions, including<br />
gesture, were useful for demonstrating and illustrating skills and practices. A combination of<br />
modes multiplied meaning (Lemke, 1990). In the full paper we will present examples from<br />
Year 1-3 students learning about fossils, Year 1 students designing and making kites, and<br />
Year 7-8 students designing and making musical instruments.<br />
Teacher and student talk plays a pivotal role in AfL interaction but talk is invariably<br />
anchored and augmented by other modes. When multiple modes are used in combination<br />
teacher-student AfL interactions are enriched. The likelihood that diverse groups of students<br />
are able to express what they know and can do is increased when classroom interaction is<br />
deliberately multimodal. Students do benefit from multimodal opportunities to gain feedback<br />
and to consider the ideas of others as part of their active engagement in AfL.<br />
References<br />
Black, P. & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.<br />
Gipps, C. (2002). Sociocultural Perspectives on Assessment. In G. Wells & G. Claxton (Eds.) Learning for<br />
Life in the 21st Century (pp. 73-83). London: Blackwell Publishing Ltd.<br />
Kress, G., Jewitt, C., Ogborn, J. & Tsatsarelius, C. (2001). Multimodal teaching and learning: The rhetorics<br />
of the science classroom. London: Continuum.<br />
Lemke, J. (1990). Talking science: language, learning, and values. Norwood, N.J.: Ablex Pub.<br />
Assessment Center Method to Evaluate Practice-Related<br />
University Courses<br />
Julian Ebert, University of Zurich, Switzerland<br />
Introduction: To provide high-quality education at universities, courses are evaluated. Mostly,<br />
written tests and subjective ratings are used to check the learning effectiveness of the course<br />
and students’ satisfaction. This disadvantages courses with didactic concepts that are based<br />
upon activity and interaction (e.g. problem-based learning; Schmidt & Moust, 2000) because<br />
teaching and testing modes differ (Sternberg, 1994). Declarative and factual<br />
knowledge can be tested by written tests, but procedural knowledge and skills require different<br />
assessment methods. Due to changes in curricula as a result of the “Bologna reform<br />
process”, universities are increasingly challenged to foster their students’ meta-disciplinary<br />
competencies. Therefore, courses that aim not only to teach factual knowledge but also to<br />
train skills have to be offered and evaluated for their effectiveness. To evaluate one<br />
such practice-related university course on social and methodological skills in<br />
project management given at the University of Zurich, an assessment center (AC) was<br />
developed, applied and analysed. The aim of the current paper is to present and discuss this<br />
innovative evaluation method for practice-related courses at universities.<br />
Methodology: 80 students and professionals from different subjects participated twice in one-day<br />
ACs, before and after course attendance. The assessment methods range from<br />
dyadic role-plays and planning and presentation tasks (for the evaluation of skill<br />
improvement) to written tests on procedural knowledge (to check the concordance of both<br />
measures). Additionally, questionnaires on motivation and work-related self-assessments<br />
are used. The assessors are specifically trained psychologists who<br />
achieved satisfactory inter-rater reliability (ICC = .75 on average). The assessees are – among<br />
others – tested on their abilities to delegate, lead discussions and argue, organize and<br />
present project plans, solve conflicts, mediate, and give feedback.<br />
Results: Results show significant increases in both procedural knowledge and skills. The<br />
baseline-corrected effect sizes for the subtasks of the procedural knowledge tests vary from<br />
d=.45 to d=2.07 (d=1.10 on average), those for the different assessment dimensions of the<br />
skill tests vary from d=.14 to d=1.32 (d=.57 on average). Nevertheless, increased<br />
knowledge did not automatically result in increased skills, which supports the transfer<br />
problem hypothesis. It also raises anew the question of whether written tests (even on procedural<br />
knowledge) sufficiently inform about the actual ability to perform the respective skills. Also,<br />
almost no gender effects were found, i.e. male and female participants benefit equally<br />
from the course.<br />
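The reported baseline-corrected effect sizes can be illustrated with a short sketch. This is not the authors' analysis code; the scores and the choice of standardizer (the pre-test standard deviation, one common convention for pre/post designs) are illustrative assumptions only.

```python
# Illustrative sketch: a baseline-corrected effect size d for a pre/post
# design, computed as the mean pre-to-post gain divided by the pre-test
# (baseline) sample standard deviation. The scores below are hypothetical.
from statistics import mean, stdev

def baseline_corrected_d(pre, post):
    """Mean gain from pre to post, standardized by the pre-test SD."""
    return (mean(post) - mean(pre)) / stdev(pre)

# Hypothetical scores for one procedural-knowledge subtask
pre = [12, 15, 11, 14, 13, 10, 16, 12]
post = [18, 19, 15, 20, 17, 14, 21, 16]
print(round(baseline_corrected_d(pre, post), 2))
```

Other standardizers (e.g. the pooled pre/post SD) are also in use and would shift the magnitudes somewhat.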
Discussion: The findings indicate that different didactical concepts and teaching methods<br />
require different effectiveness testing, and assessment centers seem appropriate to<br />
demonstrate the effectiveness of skill-focused versus factual knowledge-focused courses.<br />
Students highly appreciated the detailed individual feedback that they received afterwards.<br />
We consider the opportunity to provide students with feedback beyond scores and grades<br />
the most important advantage compared to usual assessments at universities.<br />
Nevertheless, assessment centers are complex, time-consuming and expensive<br />
assessment methods and therefore not (yet) established in the field of course evaluation.<br />
We are interested in sharing our experiences with others who also work on innovative<br />
evaluation methods for innovative courses.<br />
Democracy, Assessment and Validity. Discourses and practices concerning<br />
evaluation and assessment in an era of accountability<br />
Astrid Birgitte Eggen, University of Oslo, Norway<br />
The paper is an empirically based discussion of the relationship between multiple<br />
understandings of democracy and the multiple purposes and practices of assessment.<br />
Assessment is seen as an aspect of the overall evaluation processes at school and municipal<br />
levels. The conceptualization is inspired by three broad democratic evaluation orientations –<br />
elitist democratic evaluation, participatory democratic evaluation and discursive democratic<br />
evaluation – as well as four dimensions of democracy (agency, voice, audience and<br />
influence). Underpinning the discussions are the various validity concerns of the democratic<br />
orientations, emphasizing in particular consequential, communicative, reflective and<br />
catalytic validity in addition to the traditional validities.<br />
This paper presents the main results of three ethnographic research projects among<br />
teachers and school leaders in secondary education concerning assessment and evaluation<br />
practices and discourses. These projects have been developed in cooperation with three<br />
municipal educational authorities and 20 school communities. The schools in the<br />
surroundings of Oslo have been participating in R&D projects during a phase of<br />
implementing “Kunnskapsløftet” and the “National evaluation program” (National curriculum<br />
(2006) combined with National strategy for assessment and evaluation) with possibilities for<br />
both summative and formative strategies. A consequence of national steering has been a<br />
call for building assessment and evaluation literacy at both municipal and school level.<br />
A critical ethnographic research methodology, based on an emancipative, developmental and<br />
progressive ideology, signals research as a democratic enterprise. Data gathering has<br />
been twinned with in-service training of teachers and school leaders. Hence methods of<br />
instruction are closely connected to methods of inquiry. Issues grounded in both practice<br />
and theory have been accountability, democracy and ideological aspects like equity, equality,<br />
justice, values and ethics. The paper focuses on the democratic challenges of evaluation<br />
and assessment in an era of market-driven accountability; however, multiple accountabilities<br />
as well as multiple contents of democracy are identified in the participating communities.<br />
A situated learning perspective has been applied in order to view evaluation and<br />
assessment as joint enterprises depending on shared vocabulary and repertoire of<br />
assessment and evaluative tools in each community of practice. Consequently, the<br />
relevance of the traditional dichotomies of evaluation and assessment (summative and<br />
formative, internal and external etc) is questioned based on the findings, and boundary<br />
objects are introduced as an alternative analytical tool. The school communities find<br />
themselves within an overall ideological and epistemological controversy between a drive<br />
for goal oriented new public management steering combined with “evidence based”<br />
practices on one hand, and on the other hand the emancipative bottom up developmental<br />
strategies. Hence the projects point towards several tensions between the central and<br />
local governmental vocabulary and strategies for outcome measures and the discourses<br />
and practices in these schools. These projects have provided documentation for the<br />
development of the methodology and content of a program for teacher educators<br />
emphasizing assessment and evaluation literacy.<br />
Using Assessment for Learning:<br />
exploring student learning experiences in a design studio module<br />
Kerry Harman, Northumbria University, United Kingdom<br />
Erik Bohemia, Northumbria University, United Kingdom<br />
This paper explores the relationships between assessment for learning elements and<br />
student learning experiences in a design studio module. Our focus is on the Global Studio<br />
(Bohemia & Harman, 2008 forthcoming), a design module recently conducted at<br />
Northumbria University. Using a case study methodology with the aim of compiling rich,<br />
practice-based knowledges (Denzin & Lincoln, 2005; Gherardi, 2006), we draw on data<br />
gathered throughout the development and delivery of a particular design studio module in<br />
order to undertake our analysis.<br />
In the first part of the paper we briefly describe the Global Studio with a focus on the overall<br />
aims and the structure of the module. One aim of the Global Studio was the development of<br />
distance communication skills, thereby preparing students for work in geographically<br />
distributed workgroups. Thus, an important aspect of the course was the incorporation of<br />
the element of distance between geographically distributed student design teams. We also<br />
outline the assessment for learning elements that we use in our analysis. These include an<br />
emphasis on authentic assessment tasks, the extensive use of ‘low stakes’ confidence<br />
building opportunities, the provision of a learning environment that is rich in both formal and<br />
informal feedback and the development of students’ abilities to evaluate their own progress<br />
(McDowell et al., 2006; Sambell, Gibson, & Montgomery, 2007).<br />
Using the above assessment for learning framework we map various assessment for<br />
learning elements used in the Global Studio. We suggest in the first section of the paper<br />
that a number of assessment for learning elements were implicitly embedded in the<br />
structure and delivery of this particular module.<br />
In the second part of the paper we explore the relationships between assessment for<br />
learning elements (implicitly) used in the module and student learning experiences. Drawing<br />
on student evaluation data, both qualitative and quantitative, collected throughout the<br />
module, we examine the learning experiences of students undertaking the module. Our<br />
focus here is on what students considered useful in the module in terms of learning and the<br />
links with assessment for learning elements. This analysis contributes to the collection of<br />
‘rich’ case study material on assessment for learning in Higher Education with a focus on<br />
the subject area of design.<br />
We conclude that the assessment for learning elements used in the analysis provided a<br />
useful frame for examining student learning experiences in this particular design studio<br />
module. Therefore, we suggest that assessment for learning may provide a useful language<br />
for developing ongoing discussion and research in relation to teaching and learning in the<br />
subject area of design. For example, the following research questions might be explored:<br />
does the design studio, in general, incorporate assessment for learning elements? And if<br />
so, how are these contributing to enhanced student learning experiences?<br />
Chasing Validity – The Reality of Teacher Summative Assessments<br />
Christine Harrison, Paul Black, Jeremy Hodgen, Bethan Marshall, Natasha Serret<br />
King's College London, United Kingdom<br />
The King’s-Oxfordshire-Summative-Assessment-Project (KOSAP) was an 18-month project<br />
that aimed to investigate how to help teachers enhance the validity and reliability of their<br />
assessments so that these can play a significant and trustworthy part in all summative<br />
assessments of their students. This was a collaborative development between teachers in<br />
three Oxfordshire schools, their schools’ managements, assessment and subject advisers<br />
in the Local Education Authorities, and experts in school assessment from King’s College<br />
London. It involved investigation of the possibilities and practicability of assessment of year<br />
8 (Y8) pupils within the domains of English and of mathematics. Key research foci were the<br />
constraints and affordances that arise as teachers take a more active part in designing,<br />
using and evaluating summative assessment tools and activities and the subsequent effects<br />
in the classroom as the formative-summative interface is brought closer together.<br />
The intention of this mixed method research (through interviews, field notes, teacher writing<br />
and transcripts from teacher meetings) was to work with teachers to discover their working<br />
assessment practices, how they judged and valued the assessment tools that they used<br />
and whether they could be supported in improving their assessment tools and practices.<br />
Our interest lay in teachers’ perceptions, skills and practices and we wanted to do more<br />
than simply evaluate whether teachers could implement assessments that were provided for<br />
them. Rather, we wanted to understand the ways in which they interlaced assessment with<br />
curriculum and pedagogy by allowing them to explore for themselves how they might<br />
develop and evolve better teacher assessment.<br />
The research revealed how the pervasiveness of tests constrained teachers’ own<br />
summative assessments. Teachers felt the pressures of the external tests system, in part<br />
through the priority given to the published test results by school managements and by<br />
parents and pupils, both to achieve in these tests and to report on learning in<br />
terms only of a single level or grade. We also found that the teachers’ grasp and application<br />
of the principles that guide quality in assessment, notably the concept of validity, seemed<br />
weak. Linked to this was the generally conservative attitude that the teachers have towards<br />
the task of making summative judgments, and this was recognised by the project teachers.<br />
There is a general acceptance of the tests and tasks that they already do, despite their<br />
concerns that these assessment tools may not be fair, valid or reliable in measuring the<br />
capabilities of their students. We therefore conclude that the more ambitious aim, of<br />
establishing the quality of teachers’ own summative assessments so that they may claim to<br />
supplement or even replace formal tests externally set and marked (ARG 2006), will be<br />
difficult to achieve without considerable professional development. We suggest it would take<br />
several years of modest steps towards such an aim, before that aim could be approached<br />
or a new system could be designed and implemented to meet the multiple purposes of<br />
public assessment.<br />
There is a bigger story behind.<br />
An analysis of mark average variation across Programmes<br />
Anton Havnes, University of Bergen, Norway<br />
In a UK university some undergraduate programmes have been consistently above the<br />
University average, others consistently below. Preliminary analyses have controlled for level<br />
entry grades, gender, group size, the assessment weighting on modules between<br />
coursework and exams, and assessment forms (coursework vs. exam). None of these<br />
factors explain the variation in average marks. Students who take a combined degree with<br />
one Field in the high mean group (HM) and another in the low mean group (LM) on average<br />
get higher marks in their HM modules than in their LM modules. One possible explanation is<br />
that the variation is due to diverse assessment and marking cultures. This project took<br />
another potential explanation as the starting point: Are there variations between<br />
Programmes in the way coursework, formative assessment and feedback is organised that<br />
make it reasonable to expect that students in the HM Fields probably will reach the<br />
standards of their Field, while it is less likely that students in the LM Fields do? If so, it is<br />
also reasonable to expect that the HM Fields should have higher average marks than the<br />
LM Fields. Also, there should be something to learn from the HM Fields that shed light on<br />
potentials for improvement of the educational programmes across the whole University.<br />
Because of the sensitivity of this issue I was engaged to do this study as an external,<br />
non-UK researcher. Four categories of data were obtained:<br />
• documents: Study Guides, Module Descriptions, Coursework tasks<br />
• semi-structured interviews with Field Chairs, representing two HM and two LM Fields,<br />
the fifth Field chair represented a Field that has risen from LM to mean (taped,<br />
transcribed and analysed)<br />
• examples of written feedback on students’ coursework<br />
• mark transcripts for each Field and each module in each Field<br />
The recruitment of students was unfortunately not successful. The main restriction was the<br />
Data Protection Act, which prevented obtaining contact details for those students<br />
who had not already agreed to be contacted by a researcher.<br />
The analysis shows that the teachers in all Programmes comply with the University<br />
assessment regime and the guidelines for marking and feedback. It is hard to identify<br />
essential differences in the assessment cultures; instead, it seems that one<br />
assessment culture dominates across the University. The analysis points to a series of<br />
contextual and conceptual factors that vary systematically between HM and LM Fields:<br />
• the consistency of the conceptual construct that students’ learning is about (the core<br />
around which students’ learning rotates throughout the whole degree programme),<br />
represented by the consistency of what teaching, assessment and feedback relate to<br />
across modules.<br />
• the relationship between learning activities and the students’ potential future professional<br />
and/or academic practice<br />
• the way the complexity of the Field (at the Programme level) and its thematic<br />
components (at the module level) are laid out as a Field and as a trajectory of learning<br />
across modules<br />
• inter-modular planning, coordination and communication<br />
• the integration of feedback in lectures, seminars and coursework.<br />
Course design and the Law of Unintended Consequences:<br />
Reflections on an assessment regime in a UK “new” University<br />
Anton Havnes, University of Bergen, Norway<br />
Assessment is known to drive learning. The attempt to improve students’ learning has led to<br />
revisions of student assessment to institute more diverse and more learning-oriented<br />
assessment strategies. In many universities in the UK and elsewhere coursework has become<br />
the most common assessment form. Coursework assessment offers the opportunity to<br />
ensure diversity and frequent feedback. In a study of three UK universities, Gibbs (2007)<br />
found that the assessment environments differed “very widely” across institutions. The<br />
variation was particularly large in the volume of formative-only assessment. Butler (1987)<br />
has documented that formative-only feedback has a significant influence on learning, in<br />
contrast to marks-only feedback. The importance of increasing and improving formative<br />
feedback to support student learning is stressed in policy documents and research (e.g.<br />
Hattie & Timperley, 2007; Nicol & Macfarlane-Dick, 2006).<br />
This paper is based on a study of coursework, marking, formative assessment and<br />
feedback practice in a UK University. 15 teachers (five Field Chairs and 10 lecturers) in five<br />
undergraduate programmes were interviewed, Institutional Guidelines, Study Guides,<br />
Module Descriptions and Coursework Tasks were collected and analysed. Interviews were<br />
transcribed and analysed to identify how assessment and feedback supported students’<br />
learning. Findings show that in spite of the fact that markers invested a vast amount of<br />
resources in writing feedback to the students, only a small number of students actually<br />
collected their feedback. The analysis explains why this neglect of feedback by the students<br />
turns out to be a regrettable but rational response to the assessment system. Firstly,<br />
assessment was predominantly associated with marking and mark justification: “Done that.”<br />
Secondly, what was assessed in one module often did not link to what was assessed in<br />
another module. Likewise, coursework would often cover different thematic fields, be<br />
assessed in relation to different criteria, and vary in mode of assessment. This triple<br />
variation made feedback of minor interest (except for students who had to re-sit and the<br />
most engaged students) and created inconsistency in students’ learning trajectory: “The<br />
next assessment task is on something different, some other criteria and you have to<br />
perform in a different way.” The fundamental problem is not any individual teacher’s<br />
assessment, the assessment of a specific achievement or any feedback given to an<br />
achievement. Instead the problem is how the various teachers’ assessments interrelate<br />
thematically and rotate around a set of core criteria. Another problem is the marking:<br />
lecturers argued that they could not give formative assessment on a piece of work that was<br />
subject to summative assessment.<br />
These and other findings will be discussed in the perspective of research on formative<br />
assessment, the use of criteria and the influence of feedback on learning. The findings –<br />
though the assessment system was expected to increase formative feedback – will also be<br />
discussed in the perspective of “Murphy’s law of unintended consequences”: goal-oriented<br />
activities will generate unexpected and often counterproductive results that can nullify the<br />
desired outcomes, or, things will go wrong in any given situation, if you give them a chance.<br />
Reliability and validity of the assessment of web-based video portfolios:<br />
Consequences for teacher education<br />
Mark Hoeksma, Judith Janssen, Wilfried Admiraal<br />
ILO Graduate School of Teaching and Learning, University of Amsterdam, The Netherlands<br />
In this paper, we will evaluate the quality of using web-based video portfolio for the<br />
assessment of competences in teacher training. The web-based video portfolio has been<br />
designed and tested in the DiViDossier-project, which was financed by the National eLearning<br />
Programme of SURF, the Dutch foundation for ICT in higher education. Since 2003, our<br />
institute has been using an electronic portfolio system in order to provide a realistic portrait of<br />
a student's abilities, offer an opportunity for a student’s self-reflection, and communicate a<br />
student's performance to others. These portfolios contained written documents, for example<br />
reflections about one’s own behaviour and classroom situations. A major advantage of a<br />
video portfolio is that students are able to demonstrate their competences in authentic<br />
professional situations. Therefore, the system of web-based video portfolio is expected to<br />
improve the quality of assessment in teacher education: compared to written texts, videos can<br />
give a realistic, or a more valid, view of teaching competences.<br />
Two characteristics of our video portfolio system:<br />
• The video portfolio includes video-narratives in which teacher trainees can demonstrate<br />
both integrated competences and their growth in these competences.<br />
• The video portfolio includes reflections and narratives demonstrating knowledge of<br />
methodological and pedagogical approaches.<br />
The use of a video portfolio has serious consequences for our procedures of assessment. A<br />
list of fairly ‘open’ criteria is used by two teacher educators, the students’ supervisor and an<br />
uninvolved teacher educator. They assess the professional competences of student<br />
teachers and the congruency between their performance on video and their reflections.<br />
In this study, our main question is how to enhance the quality of our assessment procedure<br />
for web-based video portfolios, in terms of reliability and validity. We will gather the following<br />
data:<br />
1. Individual interviews with 10 teacher educators on their assessment procedures of video<br />
portfolios<br />
2. Individual think-aloud interviews with 10 teacher educators on their assessment of a<br />
particular video portfolio<br />
3. Assessment forms with the assessments of 40 portfolios (80 forms, 2 per portfolio)<br />
The first set of data will result in general information on assessment procedures, evaluation<br />
of the criteria used, and justifications of the assessments. The second and the third set of<br />
data will be used for the analysis of reliability and validity. The reliability will be reported as<br />
Cohen’s kappa; the validity will be analysed by using qualitative research methods. The<br />
validity will be examined by investigating whether the assessors are biased in their<br />
assessment of the performance or reflections (cf. Heller, Sheingold, & Myford, 1998). In<br />
addition, we will examine the consistency between their beliefs and practice of assessment.<br />
References<br />
Heller, J. I., Sheingold, K., & Myford, C. M. (1998). Reasoning about evidence in portfolios: cognitive<br />
foundations for valid and reliable assessment. Educational Assessment, 5, 5-40.<br />
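The inter-rater agreement analysis proposed above, with two assessors per portfolio, can be sketched as follows. This is a minimal illustration of how Cohen's kappa is computed; the two rating vectors are hypothetical examples, not data from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    # Observed agreement: proportion of portfolios rated identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under chance, from each rater's marginal distribution.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical judgements of 10 portfolios by a supervisor and a second,
# uninvolved teacher educator.
supervisor = ["pass", "pass", "fail", "pass", "fail",
              "pass", "pass", "fail", "pass", "pass"]
second     = ["pass", "pass", "fail", "fail", "fail",
              "pass", "pass", "pass", "pass", "pass"]
print(round(cohens_kappa(supervisor, second), 3))  # → 0.524
```

A kappa near 0.5 would indicate moderate agreement beyond chance; the study's 80 assessment forms (2 per portfolio) would supply the real rating pairs.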
Diversity in patterns of assessment across a university<br />
Jenny Hounsell, University of Edinburgh, United Kingdom<br />
Dai Hounsell, University of Edinburgh, United Kingdom<br />
Over the last quarter-century, there has been a far-reaching transformation in the practices<br />
and processes of assessment in higher education. What was once a rather limited diet of<br />
essays, reports and exams has undergone a remarkable diversification and today's<br />
university teachers have before them an abundance of possible ways of assessing their<br />
students' progress and performance. Keeping track of these changes has proved far from<br />
easy: assessments need to be tailored not only to subject requirements but also to level of<br />
study, degree programme aims and the learning outcomes for a given course unit or<br />
module. This in turn means that within universities, responsibilities for designing and<br />
conducting assessments are in various respects devolved to departments. While there have<br />
been some surveys of changes and developments in assessment, the mapping that has<br />
been done has been mostly global rather than localised, and has tended to focus on<br />
changes that have been considered worthy of documenting in the literature (Bryan and<br />
Clegg, 2006; Hounsell et al., 2007; James et al., 2002).<br />
This paper reports the findings of a study which was distinctive in its attempt to survey<br />
undergraduate assessment methods and weightings across a large and long-established<br />
university in which subject areas enjoyed a considerable degree of autonomy in devising<br />
patterns of assessment. It draws on data that has recently become much more readily<br />
available following the introduction of a new computerised database on degree programmes<br />
and course units and through departmental websites. It focuses on two aspects of current<br />
practices: methods of assessment, and weighting of examinations and coursework. A total<br />
of 91 methods of assessment (68 types of coursework and 23 kinds of exam) were found to<br />
be in use across 20 subject areas, while the total number of methods deployed within a<br />
subject area ranged from 10 to 48.<br />
The choice of methods was to a significant extent a function of subject area: a small number<br />
of assessment methods was found across the subject range, while a much larger number<br />
were confined to a limited number of departments. There were also striking differences in<br />
how assessments were weighted across subject areas and over successive years of<br />
undergraduate study. Four contrasting models were identified, differing in terms of whether<br />
weightings were uniform or variable from unit to unit and from one year to the next, and in<br />
the extent to which coursework or exams predominated.<br />
The paper concludes by exploring the implications of these findings, both for assessment<br />
practice within the university concerned and in higher education more generally.<br />
References<br />
Bryan, C. and Clegg, K. (eds.) (2006) Innovative Assessment in Higher Education. London: Routledge.<br />
Hounsell, D., Blair, S., Falchikov, N., Hounsell, J., Huxham, M., Klampfleitner, M. and Thomson, K. (2007)<br />
Innovative Assessment Across the Disciplines: An Analytical Review of the Literature. York: Higher<br />
Education Academy<br />
James, R., McInnis, C. and Devlin, M. (2002) Assessing Learning in Australian Universities. Melbourne:<br />
University of Melbourne.<br />
Learning-oriented assessment: A critical review of foundational research<br />
Gordon Joughin, University of Wollongong, Australia<br />
The concept of ‘learning-oriented assessment’ draws attention to a range of conceptual,<br />
research and practice issues concerning the relationship between assessment and the<br />
process of learning in higher education. The conceptual framework for learning-oriented<br />
assessment proposed by Carless, Fun and Joughin (2006) provides a convenient<br />
framework for highlighting these issues. In this paper, one of the authors of that framework<br />
draws attention to, and challenges, two propositions that have become maxims in the<br />
literature of assessment and learning, namely that assessment drives learning and that<br />
feedback through formative assessment is critical to the learning process. A careful review<br />
of repeatedly cited research casts doubt on the first of these propositions. For example, the<br />
treatment of the research reported in frequently cited works such as Making the Grade<br />
(Becker, Geer, & Hughes, 1968) and The Hidden Curriculum (Snyder, 1971) has often<br />
oversimplified, and thus misrepresented, the research findings, leading to singular<br />
interpretations of complex, multi-faceted phenomena. Other research suggesting serious<br />
limitations to the capacity of assessment per se to improve students’ approaches to learning<br />
is often under-emphasized, leading to the risk of exaggerated claims for the capacity of<br />
‘alternative’ forms of assessment to foster effective learning processes in students. Finally,<br />
research on students’ experience of assessment contrasts with the prominence accorded to<br />
feedback in learning and assessment theory, highlighting a worrying gap between theory<br />
and practice.<br />
This paper provides a critical review of the empirical research basis of the above<br />
propositions regarding the roles of assessment and feedback in directing and forming<br />
students’ learning. On the basis of this review, the paper proposes an empirical research<br />
agenda that addresses what seem to be serious gaps in our understanding of fundamental<br />
aspects of the interactions between assessment and learning.<br />
Implementing standards-based assessment in Universities:<br />
Issues, Concerns and Recommendations<br />
Patrick Lai, The Hong Kong Polytechnic University, China<br />
Several studies have sought to identify how standards-based assessment is implemented in<br />
practice. Tan & Prosser (2004) conducted a phenomenographic study of academics’ conceptions<br />
of grade descriptors. This study illustrates that academic staff understand grade descriptors in<br />
markedly different ways, ranging from conceptualizing the descriptors as having nothing to<br />
do with standards to understanding them as directly related to standards. Sadler<br />
(2005) conducted another study to examine the grading practices of universities. None of the<br />
approaches identified delivered the aspirations of standards-based assessment; there is,<br />
however, a need to shift the focus from criteria to standards. Whilst these two representative<br />
studies emphasize only the final grading step, there is a need for a more thorough<br />
investigation into each of the implementation steps of standards-based assessment.<br />
A series of focus group interviews with 51 academic staff from 21 departments and two<br />
open forums were conducted. Participants were invited to comment on the issues and<br />
problems encountered at the preparation, marking and post-marking stages of<br />
standards-based assessment. This paper summarizes the issues and concerns identified in the<br />
discussions with various stakeholder groups.<br />
In developing criteria and performance standards for assessment tasks, often the<br />
assessment task is selected first and then matched to the learning outcomes. It is also difficult to<br />
set clear criteria that can be understood easily by assessors and students and that will<br />
discriminate effectively between stronger and weaker students. Setting up and<br />
grading students’ work based on a matrix, with descriptors for each criterion at different<br />
performance levels, is tedious and can become unmanageable.<br />
In making assessment criteria and standards explicit to assessors and students, the key<br />
concern is that colleagues are unclear about the depth of detail students require.<br />
If colleagues give too many detailed examples of the expected responses for different<br />
award levels it may become too much like a model answer and will not help students<br />
develop independent study habits.<br />
In ensuring consistency in marking and grading, the concern expressed by staff is the<br />
need to maintain consistency when there are multiple markers. When there is a large<br />
number of assignments to be marked, the marker’s perceptions of<br />
the criteria and standards may “drift” from start to finish.<br />
Finally, there are two issues that can affect consistency in the development of standards.<br />
One concerns the different perceptions of minimum passing standards held by<br />
academic colleagues. The other is the commonly-held perception that a skewed distribution of<br />
students’ grades in a particular subject is abnormal and should be “normalized”.<br />
Based on a series of feedback-collecting exercises, this paper will also present a number of<br />
strategies for setting assessment tasks, marking and post-marking mechanisms that can be<br />
utilized to address the concerns expressed by university academic staff about their<br />
endeavours to implement standards-based assessment. Recommendations on strategies<br />
and support to facilitate academics to implement standards-based assessment made in this<br />
paper certainly add to the literature in higher education.<br />
Test-based School Reform and the Quality of Performance Feedback:<br />
A comparative study of the relationship between mandatory testing policies and<br />
teacher perspectives in two German states<br />
Uwe Maier, University of Education Schwäbisch Gmünd, Germany<br />
Comparative research on test-based school reform has revealed that the positive impact of<br />
performance feedback information on school improvement depends on the accountability<br />
policy and testing system in the respective jurisdiction (Firestone, Winter & Fitz 2000;<br />
Cheng & Curtis 2004; Herman 2004). Test-based school reform has meanwhile become a<br />
prominent instrument of educational policy in Germany. But state-mandated testing systems<br />
vary from jurisdiction to jurisdiction, since the German federal constitution guarantees state<br />
autonomy in educational policy. The testing systems in the two German states of<br />
Baden-Württemberg and Thüringen, in particular, contrast sharply. State-mandated tests in<br />
Baden-Württemberg (Vergleichsarbeiten) are not based on competency models, provide little<br />
feedback information (raw data only), and leave teachers responsible for data analysis.<br />
State-mandated tests in Thüringen (Kompetenztests) are based on competency models,<br />
their performance feedback includes value-added data, and external support is high. Both<br />
obligatory tests were given at the end of Grade 6 in core subjects, including German<br />
language and mathematics.<br />
The hypothesis was that the elaborated, value-added feedback information in Thüringen<br />
would be better accepted by schools and more likely to prompt teachers to reflect upon<br />
professional improvement. Random samples of schools in both states were approached for data<br />
collection. A total of 1136 teachers completed the questionnaire (nBW=825; nThü=311).<br />
Measurement of the dependent variables is based on a quantitative survey instrument. An<br />
exploratory factor analysis revealed seven scales:<br />
• general acceptance of mandatory testing (6 items, alpha = .89)<br />
• mandatory testing as a burden for schools (4 items, alpha = .79)<br />
• curricular alignment of the test (4 items, alpha = .84)<br />
• performance feedback supports diagnostic activities (5 items, alpha = .89)<br />
• performance feedback supports grading (5 items, alpha = .80)<br />
• performance feedback indicates further revision (3 items, alpha = .90)<br />
• performance feedback indicates curricular changes (4 items, alpha = .77)<br />
The hypothesis<br />
proved to be correct. General test acceptance and the use of performance indicators for<br />
diagnostic activities and reflection upon teaching were substantially higher among teachers<br />
in Thüringen. By contrast, teachers in Baden-Württemberg show higher average scores on<br />
the scale "performance feedback supports grading". The results confirm that sound<br />
testing policies are a crucial precondition for standards-based school reforms.<br />
References<br />
Cheng, L./Curtis, A. (2004): Washback or Backwash: A Review of the Impact of Testing on Teaching and<br />
Learning. In: Cheng, L./Watanabe, Y./Curtis, A. (Eds.): Washback in Language Testing. Research<br />
Contexts and Methods. Mahwah/London: Lawrence Erlbaum, pp. 3-17.<br />
Firestone, W. A./Winter, J./Fitz, J. (2000): Different assessments, common practice? Mathematics testing and<br />
teaching in the USA and England and Wales. In: Assessment in Education, 7, 2000, 1, pp. 13-37.<br />
Herman, J. L. (2004): The Effects of Testing on Instruction. In: Fuhrman, S.H./Elmore, R.F. (Eds.):<br />
Redesigning Accountability Systems for Education. New York/London: Teachers College Press, pp.<br />
141-166.<br />
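The scale reliabilities reported above (Cronbach's alpha between .77 and .90) are computed from per-item survey responses; a minimal sketch of that computation follows. The Likert data here are a hypothetical illustration, not the survey data.

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale, given one list of scores per item
    (all lists cover the same respondents, in the same order)."""
    k = len(item_scores)          # number of items in the scale
    n = len(item_scores[0])       # number of respondents

    def sample_variance(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    # Variance of each item, and of each respondent's total score.
    item_variances = [sample_variance(scores) for scores in item_scores]
    totals = [sum(item[j] for item in item_scores) for j in range(n)]
    return (k / (k - 1)) * (1 - sum(item_variances) / sample_variance(totals))

# Hypothetical 4-item scale (1-5 Likert) answered by 5 teachers.
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
    [4, 5, 3, 4, 2],
]
print(round(cronbach_alpha(items), 2))  # → 0.95
```

High alpha values like those reported above indicate that the items within each scale co-vary strongly across respondents.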
Motivational aspects of complex item formats<br />
Thomas Martens, Frank Goldhammer<br />
German Institute for International Educational Research (DIPF), Germany<br />
Some benefits of Computer-Based Assessments (CBAs) are generally accepted, e.g.,<br />
shorter testing time or instant scoring. However, the question of whether CBA is more<br />
enjoyable for students is still a focus of research. For example, Computer-Adaptive<br />
Tests (CATs) certainly allow for shorter testing times, but CATs may also irritate testees by<br />
constantly administering items with a fixed solution probability of around 50%, regardless of<br />
the testees’ perceived test effort (see Frey, 2006). Another line of research (Björnsson, 2007)<br />
shows that 15-year-old students from Denmark, Iceland and Korea enjoyed CBA more than<br />
Paper-Based Assessment (PBA) and, if they could freely choose their personal test mode,<br />
would select the CBA mode only (53.2%) or a combination of CBA mode and PBA<br />
mode (37.6%). We assume that complex item formats like browser simulations are even<br />
more enjoyable for students than the items used by Björnsson (2007).<br />
In a first study with N=70 students the computer based assessment platform TAO (the<br />
French acronym for technology-based assessment) and the “Hypertext Builder” were used<br />
to develop and deliver complex electronic reading stimuli, covering all major text-types<br />
encountered in electronic reading such as websites, e-mail client environments, forums, or<br />
blogs. Furthermore motivational state and trait variables (see Rheinberg, 2003) as well as<br />
ICT literacy were assessed. First comparisons with older PBA studies that used the same<br />
motivational items revealed that the students liked the complex stimuli much more and<br />
therefore reported that they tried harder to solve the corresponding test items. This<br />
self-reported effort is partly an effect of general computer motivation and ICT literacy, but<br />
a direct positive effect of the stimuli used for electronic reading nevertheless remains.<br />
This first result, that test items which try to mimic real-world settings (like browsing a<br />
website) are more attractive for students, is not very surprising. However, it remains to be<br />
investigated whether influences that might spoil the test reliability of PBAs, like boredom or<br />
refusal, will be replaced by other disruptive random influences stemming from complex CBAs,<br />
like losing track of time or going into unimportant details. These side effects, which might be<br />
related to the testee’s interaction with complex stimulus material, are being investigated in<br />
ongoing research using thinking-aloud and eye-tracking techniques. Systematic results from<br />
this research (N=20) will also be presented at the conference.<br />
Remarkable Pedagogical Benefits of<br />
Reusable Assessment Objects for STEM Subjects<br />
Michael McCabe, University of Portsmouth, United Kingdom<br />
Reusable assessment objects (McCabe, 2007) have been used to transform the learning of<br />
STEM (Science, Technology, Engineering and Mathematics) subjects by making<br />
e-assessment more dynamic. The resulting “peer moderation” has improved student<br />
engagement through closer cooperation with the lecturer in the development of learning<br />
resources.<br />
The concept of peer moderation for summative examinations seems absurd. How can<br />
mathematics or science questions be released to students before an exam? If solutions are<br />
known in advance, the incentive to learn is lost. Traditional written exams are prepared in<br />
advance and moderated by academic staff under tight security. Traditional e-assessment<br />
exams require even more care over moderation and security, since they need to be<br />
checked both for their academic content and technical correctness.<br />
One possible approach is to use large e-assessment question banks. If hundreds or<br />
thousands of questions are available, then it may be possible to release them to students in<br />
advance of a formal exam involving a small subset of, say ten, questions. Students are<br />
motivated to try a large number of the formative questions and receive feedback on their<br />
progress. Lecturers can perform item analysis on the questions to derive, e.g., facility and<br />
discrimination indices, based upon the trials. They can also identify pedagogical or technical<br />
mistakes at an early stage and then modify questions accordingly. The end result is an<br />
improvement in the quality of the question bank, but at the expense of revealing the precise<br />
questions to students. Of course, a student willing to attempt the complete question bank<br />
might be regarded as worthy of success, regardless of their ability! I have used large<br />
question banks available from publishers and national projects in this way, e.g. for<br />
mathematics and astronomy. Unfortunately, individuals rarely have the time to develop<br />
sufficient questions for large banks.<br />
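The item analysis mentioned above can be sketched with two classical indices: facility (the proportion of students answering correctly) and an upper-lower discrimination index. This is an illustrative computation on hypothetical response data, not output from the question banks described.

```python
def facility(responses):
    """Facility index: proportion of students who answered the item
    correctly (responses are 1 for correct, 0 for incorrect)."""
    return sum(responses) / len(responses)

def discrimination(responses, total_scores):
    """Upper-lower discrimination index: facility among the top half of
    students (ranked by total test score) minus facility among the bottom
    half. Positive values mean stronger students do better on the item."""
    ranked = [r for _, r in sorted(zip(total_scores, responses), reverse=True)]
    half = len(ranked) // 2
    return sum(ranked[:half]) / half - sum(ranked[-half:]) / half

# Hypothetical responses of 6 students to one question, with their totals.
responses = [1, 1, 0, 1, 0, 0]
totals    = [9, 8, 7, 5, 4, 2]
print(facility(responses))                          # → 0.5
print(round(discrimination(responses, totals), 2))  # → 0.33
```

A low facility flags an overly hard question; a near-zero or negative discrimination flags a question that fails to separate stronger from weaker students, both candidates for early revision.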
Reusable assessment objects are different. They automatically generate computer-based<br />
questions, guidance, hints and feedback for students, through the use of random<br />
parameters and algorithms. The random parameters can include numbers, characters,<br />
words, text, diagrams, graphs, pictures, algebraic expressions, mathematical operators,<br />
equations, variables, functions and symbols. The algorithms specify how these random<br />
parameters interact and can include conditions applied to questions and answers. An<br />
interesting example of their use is in statistical hypothesis testing where several<br />
intermediate questions or steps lead up to a final decision. The final decision changes<br />
according to the data provided in the question. Algorithms are also useful for defining<br />
questions with open-ended answers, such as asking for an example which satisfies a set of<br />
criteria.<br />
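A minimal sketch of how such a reusable assessment object might work follows, assuming a simple linear-equation item. The function name and item format are illustrative inventions, not the MapleTA implementation.

```python
import random

def linear_equation_item(rng):
    """Generate one variant of a 'solve ax + b = c' question.

    Random parameters (a, b and the intended solution x) drive both the
    question text and the marking key, so every student can receive a
    different but equivalent item."""
    a = rng.randint(2, 9)
    x = rng.randint(-10, 10)      # the intended solution
    b = rng.randint(-20, 20)
    c = a * x + b                 # the algorithm ties parameters to the answer
    return {
        "question": f"Solve for x: {a}x + ({b}) = {c}",
        "hint": f"Subtract {b} from both sides, then divide by {a}.",
        "answer": x,
        "params": (a, b, c),      # kept so an answer checker can verify
    }

item = linear_equation_item(random.Random(2008))
print(item["question"])
```

Because the answer is derived from the same parameters as the question text, releasing the generator to students reveals the question family but not any particular summative instance.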
Lecturer benefits include a reduced need for large question banks and the remarkable<br />
opportunity for students to peer moderate their summative assessment questions. Student<br />
benefits include greater motivation to attempt formative test questions, better feedback<br />
(Nicol and Milligan, 2005), greater involvement in the assessment process itself, higher<br />
quality questions and opportunities to assess their own progress more accurately.<br />
Examples of reusable assessment objects generated using MapleTA<br />
(http://perch.mech.port.ac.uk/classes) will be used to illustrate these ideas.<br />
Demystifying the assessment process:<br />
using protocol analysis as a research tool in higher education<br />
Fiona Meddings, Christine Dearnley, Peter Hartley<br />
University of Bradford, United Kingdom<br />
Marking and assessing student submissions is a fundamental part of contemporary<br />
education. To the casual observer, undertaking assessment of student work may appear to<br />
be a simple process; after all, the student has done the hard part: engaging with the<br />
process by completing the required assessment task. On closer examination, however, it<br />
seems that very little is known about lecturers’ actual marking process. Limited<br />
literature exists to inform us about how lecturers come to the decisions they do, and what<br />
influences them in reaching those decisions. What we do know is that marks are provided<br />
for the student to give an indication of their success or otherwise at the assessment task. In<br />
some cases marks are accompanied by written feedback, in the guise of qualitative<br />
statements sitting alongside the quantitative (possibly alphanumeric) mark. Outcomes<br />
following the marking process depend upon the purpose it is seen to serve: lecturing<br />
staff may feel it reflects the quality of the educational process and student engagement,<br />
whereas the student may see the mark and feedback as giving them an idea of individual<br />
achievement and whether it accords with their own self-assessment of their abilities.<br />
Although the problem of assessment does feature in the literature, it is often concerned with<br />
what we do to assess students (Nicol 2007), i.e. the actual assessment approach or the tool<br />
to be used (e.g. portfolio or examination; coursework or seminar), with blurring between the<br />
two. What is known is that good assessment choices will consider the<br />
teaching methods as well as the subject matter. Less attention has focused on the impact of<br />
assessment feedback on students (Higgins et al 2001). Others identify how improved<br />
feedback can assist student learning (Nicol and Macfarlane-Dick 2006) or highlight<br />
feedback as an important feature (Gibbs and Simpson 2004). What remains unclear from<br />
the literature is how this feedback, which is of high importance to students (National Student<br />
Survey U.K. 2007), is constructed.<br />
This paper presentation will explore the potential of protocol analysis as a method of data<br />
collection used to uncover the thought processes involved in marking and assessing<br />
undertaken by lecturing staff at a higher education institution. This is a method of gathering<br />
concurrent verbal reports of lecturer judgements during the marking and assessing process,<br />
by recording their verbalised thoughts. Marking is almost always a solitary process, with<br />
limited interaction between markers and little known about the cognitive processes<br />
involved. In a study undertaken at this university, protocol analysis was used to uncover<br />
the thinking processes related to a marking and assessing task by asking participants to<br />
speak aloud and to verbalise their cognitive processes (Ericsson and Simon 1993). A study<br />
by Orrell (2006) uses a similar approach in higher education, supporting the use<br />
of this data collection method. The paper will examine the concepts of validity and reliability<br />
as well as provide some opportunity for discussion and examination of generalisability of<br />
findings from the utilisation of this data collection tool.<br />
Challenging the formality of assessment: a student view of<br />
‘Assessment for Learning’ in Higher Education<br />
Catherine Montgomery, Northumbria University, United Kingdom<br />
Kay Sambell, Northumbria University, United Kingdom<br />
This paper explores students’ understandings of ‘Assessment for Learning’ (Black et al,<br />
2003) or ‘Learning-oriented assessment’ (Carless, 2006) by outlining some of the findings of<br />
a systematic university-wide, cross-disciplinary study into student perceptions of this<br />
approach. Whilst the concept of ‘Assessment for Learning’ (AfL) has developed related<br />
theoretical bases over the last decade, there is little research that reveals students’<br />
conceptions of the meanings associated with the term. Previous research has focused on<br />
specific elements of AfL such as self and peer assessment (Dochy, Segers and Sluijsmans,<br />
1999), feedback (Higgins, Hartley and Skelton, 2002) or on specific approaches to<br />
assessment (Birenbaum, 1996). However, there has been relatively little research aiming to<br />
illuminate students’ understandings of the concepts of AfL. Early research has indicated that<br />
students construct different concepts about the meanings of assessment tasks and this<br />
‘hidden curriculum of assessment’ can sometimes be at odds with the ‘formal’ curriculum<br />
(Sambell and McDowell, 1998). This paper contributes to the rapidly growing literature that<br />
explores the socio-cultural context of learning through investigating students’ experiences of<br />
assessment for learning as ‘lived’ (Orr, 2007; Montgomery, 2007).<br />
The paper forms part of a wider study that employs a multi-site case study design with each<br />
case site across the disciplines of Engineering, Education and English, representing an<br />
implementation of AfL in a learning context. Multiple methods of data collection are used<br />
with interview, observation and focus groups generating data within an interpretive<br />
approach. The approach is predicated on the claim that the activities of learning and<br />
teaching are best understood if they are investigated as activities in their ‘natural’,<br />
socio-cultural context, rather than on the basis of ‘experimental’ interventions, or on the basis<br />
of actor-related variables, such as student characteristics or motivations (Haggis, 2007).<br />
The findings suggest that although tutors were likely to draw upon assessment-related<br />
discourse, such as ‘self-assessment’, ‘feedback’ and ‘peer-evaluation’ to refer to AfL, it was<br />
notable how far students did not. Students did, however, construct AfL as markedly different<br />
from more traditional teaching, learning and assessment experiences. Their heightened<br />
engagement with learning and assessment tasks invested their experiences with personal<br />
meanings, enabling them to see the ‘real world’ application of their learning. For the students,<br />
AfL meant they were no longer simply ‘jumping through hoops’. AfL was viewed by the students<br />
as being part of an informal, personal and social context for learning where conversation and<br />
the informal exchange of views were highly prized and student emphasis lay with ‘talk’,<br />
‘listening’ and ‘seeing’ in relation to informal dialogue. Students noted that their learning was<br />
often characterised by ‘informal chat’, ‘light-hearted banter’ and ‘story-telling’.<br />
This paper may stimulate discussion of Hawe’s (2007) point that in an aligned teaching,<br />
learning and assessment setting, staff and students should engage in meaningful dialogue<br />
about elements of teaching, learning and assessment. This may contribute to a shared<br />
language with which to engage in dialogue, because without it AfL risks constructing<br />
discrete and, to students, ‘foreign’ assessment systems.<br />
Secondary students’ motivation to complete written dance examinations<br />
Patrice O'Brien, The University of Auckland, New Zealand<br />
Mei Kuin Lai, The University of Auckland, New Zealand<br />
There is little research that focuses on students’ experience of national examinations in<br />
non-traditional curriculum areas such as dance. In this study, we examine why New Zealand<br />
students sitting national examinations (NCEA) in dance were not attempting one of two<br />
written standards of their assessment. A non-attempt by a student present at the<br />
examination is referred to as a void in New Zealand.<br />
The research used a case study approach in order to gain rich information directly from<br />
students. Initially, data were obtained from questionnaires completed by students (n=26)<br />
from a national cohort (n=516) in three secondary schools as they left the examination<br />
room. This enabled students to report immediately on what they had done in the<br />
examination and the reasons for their decisions. We conducted in-depth interviews with all<br />
students (n=4) who voided a standard and also interviewed a randomly selected<br />
comparison group representing a range of achievement levels (n=5) that did not void<br />
standards.<br />
Results showed that the greatest difference between these two groups involved the<br />
students’ belief in their ability to succeed. In line with attribution theory (Weiner, 1985),<br />
students who attempted both standards attributed their success to internal factors such as<br />
the effort they put into study. Students who voided a standard attributed their results to<br />
external factors such as the appeal of the topic they had studied, the difficulty of the<br />
questions, or the layout of the standard. Interestingly, some students who voided a standard<br />
could provide correct answers to the interviewer.<br />
The research also found that lack of success with school practice examinations, which are<br />
intended to assist students’ preparation for NCEA examinations, had the unintended effect<br />
of reinforcing the belief among students who voided standards that they were not capable of<br />
success. Initially, researchers and teachers assumed that factors such as a lack of literacy<br />
skills, or having already achieved sufficient credits for a certificate, affected students’<br />
motivation to attempt dance standards, but this research found that these factors did not<br />
significantly influence students to void standards.<br />
This research reinforces the importance of testing assumptions and of obtaining feedback<br />
from students as a way of supporting their learning. It also underlines the<br />
importance of contextualizing research results to each school’s unique situation. In a<br />
national study on student motivation to complete NCEA, Meyer and colleagues (2006)<br />
predicted that students would be influenced by the number of credits they had accumulated<br />
but credits did not affect students’ motivation in this case study. The study also revealed the<br />
negative effects of grade-focused assessment feedback on students with low expectations<br />
of success.<br />
References<br />
Meyer, L., McClure, J., Walkey, F., McKenzie, L., & Weir, K. (2006). The impact of NCEA on student<br />
motivation. Wellington: Victoria University of Wellington.<br />
Weiner, B. (1985). An attributional theory of achievement motivation and emotion. Psychological Review,<br />
92, 548-573.<br />
Mind the gap:<br />
assessment practices in the context of UK widening participation<br />
Michelle O'Doherty, Liverpool Hope University, United Kingdom<br />
This paper reports on findings from research funded by the UK Higher Education Academy;<br />
the study aimed to explore staff and student perceptions of quality feedback within the<br />
context of transition between educational sectors. Whilst seminal research has been<br />
conducted on the assessment experience of students in schools (Black and Wiliam, 1998;<br />
Black et al, 2003) and universities (Hounsell, 2003), there are relatively few studies that<br />
investigate the impact of the former on the latter. This qualitative study makes this cross<br />
sector connection, presenting data on perceptions of assessment collected across nine<br />
education institutions (three schools, three sixth forms and three universities). As a<br />
result, our findings address a gap in the current literature, positioning first year<br />
undergraduate expectations of quality feedback within the context of their prior experience<br />
of formative assessment.<br />
Current theory conceptualises assessment as a dialogic process (Higgins et al, 2001) in<br />
which quality feedback is the most powerful single influence on student achievement<br />
(Hattie, 1987); therefore, the provision of quality feedback is perceived as a key requirement<br />
of effective teaching in higher education (Ramsden, 2003). In practice, lecturers often<br />
believe their feedback to be more useful than students do (Carless, 2006; Maclellan, 2001)<br />
and feedback has consistently been identified as the least satisfactory aspect of the student<br />
experience in UK universities (National Student Survey, 2005, 2006, 2007). As a<br />
consequence of this mismatch in staff and student perceptions, assessment in UK higher<br />
education is being challenged.<br />
Frameworks for good practice in assessment have been developed, but attempts to<br />
conceptualise quality feedback within the context of higher education have been positioned<br />
within a formative rather than a summative process (Gibbs & Simpson, 2004-5; Nicol &<br />
Macfarlane-Dick, 2004, 2006). However, resource constraints coupled with a widening<br />
participation agenda of mass expansion in the UK have limited the opportunities for<br />
formative assessment to be practised (Yorke, 2003; Gibbs, 2007). At the same time, within<br />
the school sector a formative Assessment for Learning Culture (Assessment Reform Group,<br />
1999) has been developed which means students experience a significant cultural gap in<br />
feedback practices between educational sectors. In particular, our findings reveal students<br />
perceive quality feedback as part of a dialogic, guidance process rather than a summative<br />
event. Conversely, in higher education, concerns about the ‘dumbing down’ of<br />
independent learning through spoon-feeding (Haggis, 2006) are leading to increasing<br />
tensions between the theory of good practice and the practice of assessment.<br />
This longitudinal study reports on the consequences of these conflicting expectations of<br />
guidance and independent learning for first year undergraduates and their tutors in three<br />
subject disciplines. These findings have informed recent initiatives to scaffold students’<br />
autonomous learning through formative assessment and the presentation will provide an<br />
opportunity to discuss these interventions. Thus, the presentation of our cross sector<br />
findings aims not only to reframe the context of the debate challenging current assessment<br />
practices in UK higher education, but also to contribute to the re-conceptualisation of<br />
feedback practice for future learning (Hounsell, 2007; Boud & Falchikov, 2007).<br />
Measuring writing skills in large-scale assessment:<br />
Treatment of student non-responses for Multifaceted-Rasch-Modeling<br />
Raphaela Oehler, IQB, Humboldt University Berlin, Germany<br />
Alexander Robitzsch, IQB, Humboldt University Berlin, Germany<br />
Researchers in large-scale assessment need to make decisions about how to handle<br />
student non-responses in their analyses. Particularly when the sample consists of<br />
low-achieving students, a considerable number of responses may be missing by intention<br />
or may not be interpretable. Reducing costs by not giving raters such texts is a common<br />
procedure.<br />
However, if rater effects are to be analysed and item difficulties are to be obtained using<br />
multifaceted Rasch modeling, having the test administrators code such responses, and thus<br />
introducing a new hypothetical “rater”, is problematic for standard IRT programs.<br />
One central project of the IQB is the development of large item pools on assessing<br />
students’ foreign language skills, particularly the reading, writing, and listening<br />
comprehension skills. Item development is based on the Common European Framework for<br />
Languages (CEF) which proposes six proficiency levels (A1 to C2). Item development for<br />
testing writing skills at the IQB follows a uni-level approach, meaning that writing tasks<br />
are developed for each CEF level. The rating scales also require the rater to judge texts<br />
within one level; i.e., performing quite well on an A2 task means that the student’s writing<br />
skills are at least at level A2. The rating criteria are either dichotomous (e.g. text organisation) or<br />
polytomous (e.g. global impression).<br />
A sample of N = 2,700 students from five different school types in Germany, in grades eight<br />
to ten, was tested within a multi-matrix design in 2007. Along with reading and listening<br />
comprehension items, 17 writing tasks were distributed. The study to be presented was<br />
carried out to analyse whether the newly developed rating approach works, particularly<br />
whether the order of the item difficulties obtained for the different rating criteria corresponds<br />
to the CEF levels. In order to cope with the large number of student non-responses, especially<br />
for the A1 and A2 tasks, and to scale the data using multifaceted modeling, a first approach<br />
was to distribute the codings '8' (not interpretable texts) and '9' (empty pages) to the<br />
raters of one task (four ratings by six raters) percentwise, according to the number of texts<br />
they had to rate in total. First results of an ad-hoc IRT analysis in ConQuest, using data in<br />
which the codings of the blank and not interpretable responses were distributed among the<br />
raters and polytomous variables were recoded into binary codes, showed that the order of<br />
the item difficulties fit the intended CEF levels. We contrast<br />
this approach with an analysis in WinBUGS in which a blank (i.e. a '9') is not modelled as a<br />
true response contaminated by a rater effect, because rater discrepancies cannot occur by<br />
definition (apart from lapses in rater concentration). In addition, we extend the<br />
classical multifaceted Rasch analyses to include all rating criteria. A confirmatory factor<br />
analysis of these rating scales, including rater effects, will be presented.<br />
Along with a presentation of the rating system developed in the French project based on a<br />
uni-level approach, general implications for the treatment of student non-responses when testing<br />
writing skills in large-scale assessment contexts are discussed.<br />
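The percentwise distribution of non-response codings among raters described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors’ actual procedure; the function name and the rater workload figures are invented.<br />

```python
# Hypothetical sketch: assign missing-response codings ('8' / '9') to the
# raters of one task in proportion to the number of texts each rater scored.
# Rater names and counts below are invented for illustration.

def distribute_missing(missing_count, rater_workloads):
    """Split `missing_count` non-responses across raters proportionally to
    their workloads, using largest-remainder rounding so the shares sum
    exactly to `missing_count`."""
    total = sum(rater_workloads.values())
    exact = {r: missing_count * w / total for r, w in rater_workloads.items()}
    shares = {r: int(e) for r, e in exact.items()}  # truncate to integers
    remainder = missing_count - sum(shares.values())
    # hand out leftover units to the largest fractional remainders
    for r in sorted(exact, key=lambda r: exact[r] - shares[r], reverse=True)[:remainder]:
        shares[r] += 1
    return shares

workloads = {"rater_A": 300, "rater_B": 200, "rater_C": 100}  # texts rated in total
print(distribute_missing(60, workloads))  # proportional shares: 30, 20, 10
```

Largest-remainder rounding is one plausible way to realise “percentwise according to the number of texts they had to rate in total”; the abstract does not specify how fractional shares were resolved.<br />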
Collaborating or fighting for the marks?<br />
Students’ experiences of group assessment in the creative arts<br />
Susan Orr, York St John University, United Kingdom<br />
This paper reports on a research project on group assessment in creative disciplines in<br />
higher education that is funded by the university’s Centre of Excellence in Teaching and<br />
Learning (CETL). The central premise of this CETL is that creativity is enhanced through<br />
participation in collaborative activity.<br />
In the UK the National Student Survey identifies that students who study in arts-based<br />
subjects register lower levels of satisfaction in the areas of assessment and feedback. In<br />
addition, students and lecturers in the arts express concerns about the fairness of group<br />
assessment practices (Bryan 2004).<br />
Group assessment usually aims to measure the product created and the skills and efforts<br />
put in by members of the group (Bloxham and Boyd 2007). The effort and skills element can<br />
also be referred to as the process or the contribution. Cowdray and de Graaf (2005) point<br />
out that in arts education both process and product are valued; however, process is an elusive<br />
concept. For example, as Heathfield (1999) points out, the term ‘contribution’ might refer to<br />
a student’s contribution to the task, or their contribution to group dynamics.<br />
Taking the view that assessment is a socially situated practice informed by, and mediated<br />
through, the socio-political context within which it occurs (Layder 1997), this research takes<br />
the form of an ethnographic study employing semi-structured interviewing and<br />
semi-participant observation (Silverman 2004). I explore the ways that group assessment is<br />
experienced by students and lecturers in the subjects of dance, performance, music, and<br />
film.<br />
Across the disciplines studied I identified variation in the ways that marks were allocated to<br />
students for the process and product elements. These marking approaches represent local<br />
disciplinary and historical norms.<br />
Jaques and Salmon (2007) remind us that the process element of group work can take<br />
place out of the view of the lecturer. As a consequence, lecturers in this study have devised<br />
assessment strategies to help them assess process elements. For example, some students<br />
are asked to write about the process of the group work project in learning journals or<br />
production logs. However, students reported that they sometimes felt disadvantaged when<br />
they were asked to represent process in a text. As one student asked, ‘if it is a film, why<br />
write it?’. The written element is introduced by lecturers to help them disentangle individual<br />
contribution; however, by asking students to represent process in this way, lecturers may be<br />
creating unintended barriers to high achievement for some of our most creative visual<br />
students. As Smart and Dixon (2002:192) observe ‘those who are best able to articulate the<br />
collaborative […] process in a written form might gain an advantage even though their<br />
creative contribution may have been poor’.<br />
My analysis suggests that students recognise the importance of group work in terms of its<br />
vocational authenticity but that they are keen for lecturers to recognise and reward<br />
individual contribution fairly. This paper will initiate discussion about the role of process and<br />
how it might be assessed fairly and rigorously.<br />
Constructing a new assessment for learning questionnaire<br />
Ron Pat-El, M. Segers, P. Vedder, H. Tillema<br />
Leiden University, The Netherlands<br />
Aims/goals: In many countries, during the past decade, researchers and educationalists<br />
have put assessment on the agenda. More specifically, since the pivotal review study by<br />
Black and Wiliam (1998), the value of implementing assessment as a tool to support<br />
student learning has been stressed. Based on qualitative studies in secondary education in<br />
the UK, the Assessment Reform Group (2002) formulated 10 principles of assessment<br />
for learning (AfL), a constructivist assessment strategy in which assessment is made a part<br />
of learning and emphasis is taken away from grading.<br />
Although reports have been published on the increasing extent to which AfL is implemented<br />
in schools, it is argued by researchers such as Black and Wiliam (1998) and Maclellan<br />
(2001) that teachers tend to overestimate how well they use assessment as a tool to<br />
achieve learning gains in students. Moreover, questionnaires used in AfL research often<br />
suffer from methodological shortcomings such as low internal consistency of scales (e.g.<br />
Gibbs & Simpson, 2003), low factor loadings (e.g. James & Pedder, 2006) or inability to<br />
match student and teacher results (e.g. Maclellan, 2001). Researching congruency of<br />
perceptions of AfL practice requires a valid instrument that enables direct comparisons<br />
between teachers and their students, and is based on a widely recognized<br />
operationalization of AfL. Therefore, this study aims to evaluate a new questionnaire, based<br />
on the principles of AfL as put forward by the ARG (2002), in which perceptions of<br />
implemented AfL-practices between teachers and their students can be compared. Based<br />
on pilot-results on a prototype questionnaire, a 48-item questionnaire is proposed that<br />
operationalizes four of the ten principles of AfL, namely: assessment for learning should (1)<br />
be central to classroom practice; (2) promote understanding of clear goals and criteria; (3)<br />
help learners know how to improve; and (4) develop the capacity for self-assessment. It is<br />
the aim of this study to evaluate whether a four-factor model based on the four mentioned<br />
principles of AfL can be confirmed.<br />
Method<br />
Procedure: A prototype self-report questionnaire was constructed and piloted in conjunction<br />
with educational experts at the Department of Education and Child Studies at Leiden<br />
University. The initial 111-item questionnaire was administered as a semi-structured interview.<br />
Based on the pilot results, a prototype 48-item questionnaire was constructed and administered<br />
with the assistance of students participating in a bachelor-thesis project.<br />
Sample: The prototype self-report questionnaire was administered in 88 junior vocational<br />
high schools in the Netherlands to 1422 students (49% girls, 51% boys), who were on<br />
average 14.6 years old (SD = 1.52), and 237 teachers (43% females, 57% males), who<br />
were on average 42.3 years old (SD = 11.89).<br />
Results: Confirmatory factor analysis showed that the hypothesized four-factor model<br />
provided a good fit to the data for both the student questionnaire (RMSEA = .05) and the<br />
teacher questionnaire (RMSEA = .07). In the four-factor model all items corresponded to their<br />
intended factor. Cronbach’s alphas for the subscales in both teacher and student<br />
questionnaires were high.<br />
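As an illustration of the internal-consistency check reported above, the following is a minimal sketch of Cronbach’s alpha for one subscale, computed from the classical formula alpha = k/(k-1) · (1 − Σ item variances / variance of the sum score). The response matrix is fabricated for illustration and is not data from the study.<br />

```python
# Illustrative sketch (not the authors' code): Cronbach's alpha for a
# small Likert subscale, using population variances throughout (the
# variance ratio, and hence alpha, is unaffected by this choice).

def cronbach_alpha(items):
    """`items` is a list of item-score lists, one list per item,
    all of the same length (one score per respondent)."""
    k = len(items)                      # number of items
    n = len(items[0])                   # number of respondents

    def var(xs):                        # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = sum(var(it) for it in items)
    total_scores = [sum(it[p] for it in items) for p in range(n)]
    return k / (k - 1) * (1 - item_vars / var(total_scores))

# four 5-point Likert items answered by six students (fabricated data)
responses = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [5, 5, 2, 4, 3, 5],
    [4, 5, 3, 4, 2, 4],
]
print(round(cronbach_alpha(responses), 2))  # 0.93 for this fabricated data
```

A value this high simply reflects how strongly the fabricated items co-vary; the abstract reports only that the subscale alphas were “high”, not their values.<br />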
Assessing Professional Learning: the challenge of the<br />
UK Professional Standards Framework<br />
Ruth Pilkington, University of Central Lancashire, United Kingdom<br />
This paper addresses issues of assessment from the perspective of assessing academic<br />
professional learning.<br />
Within the UK since 2006 there has been a Professional Standards Framework with three<br />
standards descriptors which can be used to recognise the performance and professional<br />
standing of academic members of staff in HE. It is proposed that Institutions of Higher<br />
Education in the UK should adopt these standards descriptors when establishing continuing<br />
professional development (CPD) frameworks for academic staff. This initiative extends the<br />
existing range of postgraduate certificates widely used across the HE sector to structure<br />
initial professional development.<br />
Three issues emerge from this relating to the assessment themes in this conference:<br />
1. Most existing PG Certificates for recognising initial professional development for<br />
academics are accredited at M- level – as masters study. What does this mean for a<br />
professional context framed by professional values which embraces both formal and<br />
non-formal learning?<br />
2. How do you assess performance against standards in a way that is meaningful,<br />
developmental, acceptable to the academy and which is NOT competence-based?<br />
3. Professional performance and development is reliant on a notion of professional<br />
reflection and learning that is challenging for certain discipline cultures. Does the<br />
research provide sufficiently rigorous and flexible models that can adapt to more<br />
professionally appropriate tools of assessment than the current reliance on written<br />
reflective documents?<br />
The paper explores assessment at University of Central Lancashire (UCLan), UK, where an<br />
initial professional development award, the PG Certificate in Learning and Teaching in HE,<br />
has been in operation since the 1990s. The PG Certificate was originally graded using<br />
percentages but shifted to a simpler pass/refer system of grading against achievement of<br />
learning outcomes. This system has been refined over the years to provide a rigorous model that<br />
has also been adopted across a Masters in Education (Professional Practice in HE). This<br />
Masters award forms a formal component of the academic CPD framework currently being<br />
developed at UCLan. Assessment of professional development within the wider framework<br />
is based on a professional dialogue using outcomes designed around the UK Professional<br />
Standards Framework descriptor statements.<br />
Recent involvement in a literature review of reflective practice as part of a national project<br />
has prompted a number of questions about the assessment of professional academic<br />
practice (Kahn et al). Within the literature I identified valuable tools for assessing academic<br />
development which explored professional learning in relation to stages of teacher<br />
development (Bell, 2001; Manouchehri, 2002; Kreber, 2004). This complements models<br />
structuring levels of reflective engagement (Moon,2004; Van Manen,1991; Hatton & Smith,<br />
1995).<br />
Practice emphasises a particular level of reflective engagement, and engagement with the<br />
literature, to set assessment parameters appropriate to masters-level study. This applies even<br />
when marking against learning outcomes. The spoken word shifts the parameters for<br />
measurement and judgement, especially where it is part of a developmental process. What<br />
criteria will suit assessment of professional learning against standards and how will the<br />
criteria inform judgement within professional dialogues?<br />
Feedback – all that effort but what is the effect?<br />
Margaret Price, Karen Handley, Berry O’Donovan<br />
Oxford Brookes University, United Kingdom<br />
As resource constraints in higher education impact on the student experience, the<br />
importance of effectiveness of our practices is brought into sharp focus. This is particularly<br />
true for formative feedback which is arguably the most important part of the assessment<br />
process in its potential to affect student learning and achievement and develop deeper<br />
understanding of assessment standards. The process of giving and receiving feedback is<br />
considered limited in its effectiveness (Gibbs & Simpson, 2002; Lea & Street, 1998). This<br />
paper argues that measuring the effectiveness of feedback is fraught with difficulties, and<br />
draws on findings from a three-year project addressing student engagement with assessment<br />
feedback to illustrate staff and student views of effectiveness and engagement.<br />
Effectiveness can only be judged if the feedback’s purpose is clear and the outcomes (e.g.<br />
learning, or student engagement) are measurable. Our study reveals the difficulty of<br />
evaluating feedback, given the variability in staff views about the purpose of feedback and in<br />
student expectations about what feedback really ‘is’. Such diversity of views will rarely result<br />
in a perfect match between assessor and assessee which may explain the high levels of<br />
dissatisfaction and ineffectiveness (National Student Survey).<br />
If, as is widely accepted, feedback should primarily support future learning, its effectiveness<br />
should ideally demonstrate impact on learning. However the problem of isolating the effect<br />
of feedback within the multifaceted learning environment means that causal relationships<br />
are difficult if not impossible to prove (Salomon, 1992). Our study revealed that generally<br />
staff had no real expectation of measuring feedback’s effectiveness. There was extensive<br />
use of passive feedback methods which had no mechanisms to monitor engagement with or<br />
the effect of the feedback provided. In addition fragmented course structures limited the<br />
opportunity for the monitoring of future application of feedback.<br />
Our study confirmed the well documented and largely negative student view of feedback<br />
(Holmes & Smith 2003; Maclellan 2001; Hounsell 1987) but also revealed student<br />
disillusionment with passive methods which they saw as only justifying the grade and<br />
precluded the opportunity for dialogue. For many students this led to disengagement with all<br />
feedback, engendering an impossible situation for staff seeking to engage them in the future.<br />
Simple performance measures for the effectiveness of feedback are not obvious. We may<br />
have to settle for measures of engagement rather than effects on learning but even<br />
engagement is difficult to evaluate. However meeting the students’ strong desire for more<br />
opportunity for dialogue may offer a way forward. Dialogue offers staff the opportunity to<br />
check effectiveness of feedback provided as well as an indication of student engagement.<br />
Resource constraints will not allow the return to traditional approaches to engendering<br />
dialogue, but innovative ways can and must be found if feedback is to be effective and<br />
demonstrably useful. Discussion will address the pitfalls of some traditional feedback<br />
processes and suggest approaches which provide performance measures of feedback<br />
within the process of increasing engagement.<br />
Student teachers on assessment:<br />
First year conceptions<br />
Ana Remesal, Universidad de Barcelona, Spain<br />
In the last decade there has been a strong call for formative assessment, and important<br />
attempts to change assessment practices have been made at both national and<br />
international levels (Black & Wiliam, 2005; Coll et al., 2000). Nevertheless, in<br />
the author’s opinion, any attempt to change school practices confronts at least two big<br />
challenges. On the one hand, the evaluation practices at an institutional level often do not<br />
really support formative practices in the classroom; on the other hand, the teachers’ own<br />
conceptions of assessment often hinder the implementation of innovative practices<br />
(Remesal, 2007). Some studies have been carried out to investigate teachers’ and<br />
secondary students’ conceptions about assessment (Brown, 2005; Remesal, 2006).<br />
These previous studies point to the key importance of the transition from being a student to<br />
becoming a teacher. In this paper the author wants to<br />
present the results of applying a scale of teachers’ conceptions of assessment<br />
(Brown, 2006). 450 first-year student teachers from a European country were asked to<br />
respond to a Likert questionnaire. The instrument consists of 27 items with a six-option,<br />
positively packed response scale. The questionnaire is based on a four-conceptions model:<br />
assessment as a tool for improving teaching and learning, assessment as a certifying tool,<br />
assessment aimed at accountability functions, and assessment as having no use at all in education.<br />
Results show a wide diversity of conceptions among student teachers and some significant<br />
differences related to the students’ previous educational experience (whether they were in<br />
their first or second career). These results pose an important challenge to us as teacher<br />
educators if we aim at changing school practice from the root. As a future line of research,<br />
the author proposes a second administration of the questionnaire in three years’ time, when<br />
these 450 students finish their university studies, in order to identify changes that occur<br />
over the course of the teacher education programme.<br />
References<br />
Black, P. & Wiliam, D. (2005). Lessons from around the world: how policies, politics and cultures constrain<br />
and afford assessment practices. The Curriculum Journal, 16(2), 249-261.<br />
Coll, C., Barberà, E., & Onrubia, J. (2000). La atención a la diversidad en las prácticas de evaluación.<br />
Infancia y Aprendizaje, 90, 111-132.<br />
Brown, G. T. L. (2006). Teachers’ conceptions of assessment: Validation of an abridged instrument.<br />
Psychological Reports, 99, 166-170.<br />
Brown, G. T. L. (2005). Teachers' conceptions of assessment: Overview, lessons, & implications. Invited<br />
NQSF Literature Review for the Australian National Quality<br />
Remesal, A. (2006). Los problemas en la evaluación del aprendizaje matemático en la educación<br />
obligatoria: perspectiva de profesores y alumnos. Tesis Doctoral, Universidad de Barcelona.<br />
Remesal, A. (2007). Educational reform and primary and secondary teachers’ conceptions of assessment.<br />
The Spanish instance, building upon Black & Wiliam (2005). The Curriculum Journal, 18(1), 27-38.<br />
Testing our citizens.<br />
How effective are assessments of citizenship in England?<br />
Mary Richardson, Roehampton University, United Kingdom<br />
The idea that citizenship education might provide some kind of solution to social problems is<br />
nothing new (Greenwood and Robins, 2003; Faulks, 2000; 2006). Over a decade ago,<br />
following the publication of the White Paper Excellence in Schools (DfEE, 1997) and the ‘Crick’<br />
Report (QCA, 1998), citizenship became a statutory part of the National Curriculum for<br />
England. There appears to be no opposition to the idea of educating young people about<br />
citizenship, but there are issues that have arisen from the decision to make it a mandatory<br />
subject in maintained secondary schools (Kerr et al, 2003). The most significant of these is<br />
assessment.<br />
There is at present a paucity of literature that focuses on the assessment of citizenship<br />
education and an assessment ‘deficit’ within the subject is becoming apparent. In its 2006<br />
report, Ofsted found sparse evidence of coherent and effective assessment and Kerr et al<br />
(2007) claim that assessment of citizenship continues to be problematic. The challenge for<br />
citizenship educators identified by Tudor (2001) and Jerome (2002) amongst others,<br />
includes the need to construct meaningful assessments that relate to the beliefs and values<br />
under discussion. Teachers are presented with a framework for assessing citizenship, but<br />
citizenship is a new and different subject, and applying modes of assessment to content such<br />
as active participation and voluntary activities is not straightforward.<br />
This research study seeks to develop:<br />
• knowledge and understanding of the assessments of citizenship education in<br />
maintained English secondary schools;<br />
• an understanding of the general perceptions of assessments by their primary user<br />
groups – teachers and students; and<br />
• an evidence base for policy in regard to the citizenship curriculum and its assessment.<br />
This paper describes the current structure of assessment for citizenship in secondary<br />
education in England and discusses the rationale for the assessment of citizenship.<br />
Philosophical and sociological literatures inform the conceptual analysis of definitions of<br />
citizenship; curriculum theory underpins an evaluation of teaching materials, policy and<br />
curriculum development documentation; and the literature of assessment informs the<br />
interrogation and discussions around specifications, examination papers and assessment<br />
documentation from a range of sources.<br />
An empirical evaluation of citizenship assessment from the perspective of the key user<br />
groups, teachers and pupils, was central to this research. Pilot investigations found no<br />
uniform approach to assessment and this has a significant effect upon the status of the<br />
subject (Richardson, 2006). A mixed-method approach combined a questionnaire survey<br />
sent to teachers and pupils in secondary schools across England with interviews with pupils<br />
(Years 9-11) and teachers in 18 schools around England. The findings include a discussion<br />
of pupils’ attitudes towards end of key stage assessments and the current GCSE<br />
specifications offered for citizenship. Results suggest generally positive attitudes towards<br />
citizenship as a subject, but responses from teachers and pupils underline an educational<br />
ethos which values only the things that can be measured and graded. This attitude towards<br />
assessment appears to be affecting the perceived value of citizenship and teachers often<br />
struggle to develop methods of assessment which are appropriate for the subject.<br />
86 ENAC 2008
Standards in vocational education<br />
Andreas Saniter, University of Bremen, Germany<br />
Rainer Bremer, University of Bremen, Germany<br />
The main reason for the reliability and success of cross-OECD comparative studies in<br />
general education is not only the transnational agreement about educational standards but<br />
also the comparability of educational systems. This is not met in vocational education:<br />
Systemic differences between dual, modularized and school-based vocational education<br />
and training are obvious and generate serious obstacles to finding standards compatible with<br />
all national curricula (cf. Bremer 2005).<br />
The Leonardo pilot-project AERONET has pursued an approach that is independent from<br />
national curricula or systemic preferences. The first step was a survey of the Typical<br />
Professional Tasks (TPTs) of skilled work in the aeronautics industry (mechanics and<br />
electricians) in selected Airbus plants in France, Spain, Germany and the UK. Each TPT<br />
describes a cluster of related work processes, e.g. “Production of metallic components for<br />
aircraft or ground support equipment”. In each plant, skilled workers perform between 9 and<br />
12 TPTs (for each profession), with surprisingly small differences between the countries<br />
(details can be found at http://www.pilot-aero.net). To be proficient in these tasks is not<br />
only part of skilled work but also the aim of the apprenticeship – with the exception of Spain,<br />
where no apprenticeship in aeronautics exists and new workers are trained for one work<br />
process only. In our approach (Bremer/Saniter 2006) the professional work on each of<br />
these tasks is the vocational education standard and basis for evaluation – not set by<br />
trainers but by the community of practice. Obviously beginners and advanced apprentices<br />
are not yet able to fulfill all requirements of a complex task – we assessed their<br />
performance by analyzing their approaches to a holistic evaluation task in terms of<br />
understandability, practicability and usability. For each profession an evaluation task related<br />
to the assembly of equipment was chosen and was presented in a paper and pencil test to<br />
around 150 first, second, third year apprentices in France, Germany and the UK. The<br />
apprentices had 4 hours to work on the task.<br />
Surprisingly, the better solutions were quite similar regardless of the country and the<br />
years already spent in apprenticeship – it seems that different tracks lead to comparable<br />
results. More revealing is the analysis and comparison of the performance of the<br />
apprentices who (partly) failed: whereas in our sample the German apprentices with<br />
acceptable solutions tended to ignore some aspects of the task, many participants from the<br />
UK followed the processes they had learnt regardless of their applicability to the task, and the<br />
French apprentices developed inventive but unrealistic solutions.<br />
We will present detailed results and first hypotheses concerning the relation of competence<br />
development and systemic aspects of the respective national vocational education and training.<br />
References<br />
Bremer, R. (2005). Kernberufe – eine Perspektive für die europäische Berufsentwicklung? In: Grollmann,<br />
Philipp; Kruse, Wilfried; Rauner, Felix (eds.): Europäisierung der Berufsbildung, Reihe Bildung und<br />
Arbeitswelt, Bd. 14, Münster, pp. 45–62.<br />
Bremer, R.; Saniter, A. (2006). La recherche en matière de développement de compétences chez les jeunes en<br />
milieu professionnel. In: L’École Comparée – Regards croisés franco-allemands, Groux, Dominique;<br />
Helmchen, Jürgen; Flitner, Elisabeth (eds.), Paris.<br />
Why do some students stop showing progress on progress tests?<br />
Lydia Schaap, Erasmus University Rotterdam,The Netherlands<br />
H.G. Schmidt, Erasmus University Rotterdam,The Netherlands<br />
The Institute of Psychology at Erasmus University in Rotterdam has a problem-based (PBL)<br />
curriculum. Some of the goals of PBL are to promote a deeper understanding of the<br />
to-be-learned material and to train students as effective problem solvers and lifelong learners.<br />
Therefore, long-term retention of knowledge is a crucial aspect in this learning environment<br />
(Norman & Schmidt, 1992). The Institute of Psychology wishes to reflect these goals in its<br />
assessment policy. It was decided to implement progress testing as the main assessment<br />
tool in the bachelor programme, because this Progress Test (PT) focuses on long-term<br />
retention of knowledge and measures knowledge growth (Van der Vleuten, Verwijnen, &<br />
Wijnen, 1996). Moreover, the direct association between a specific course and its test is<br />
disconnected and endless resits of exams are prevented. By using the PT as the main<br />
assessment tool, it is hoped that students are challenged to study in a way that promotes<br />
long-term retention of knowledge and that students are motivated to follow (to some extent)<br />
their own interests when studying.<br />
The PT, as used in the psychology program, reflects the objectives of the first two years of<br />
the bachelor programme. In these two years the basic knowledge of several domains of<br />
psychology are studied in sequentially programmed, five-week courses. The PT is<br />
administered four times a year. To promote studying on a regular basis, every course ends<br />
with a ‘course test’. Course tests are not rewarded with credits; however, when students<br />
obtain an average score of 6.5 (on a ten-point scale) on these course tests, it is possible for<br />
them to compensate for insufficient achievement on the PT.<br />
Several analyses have been carried out on the assessment data. For instance, the<br />
relationship between course tests and progress tests has been studied, as well as students’<br />
knowledge growth. Analyses have shown that scores on course tests and progress tests<br />
correlate sufficiently (r = .70). From the analyses on the growth of student knowledge it<br />
appears that not all students show the same growth curves. In fact three groups can be<br />
distinguished (Bouwmeester & Van Onna, under revision), including a group that does not<br />
grow any more after the second year of study. Currently we are trying to explain why some<br />
students show more knowledge growth than other students. We do this by taking into<br />
account student variables (e.g. IQ, professional skills in tutorial groups and study behaviour,<br />
such as invested study time and processing strategies) as well as test variables (e.g. item<br />
characteristics and level of knowledge measured). Results will be presented at the<br />
conference to discuss the different ways of influencing student variables and test variables<br />
to promote long-term retention and knowledge growth.<br />
References<br />
Norman, G. R. & Schmidt, H. G. (1992). The psychological basis of problem-based learning: A review of<br />
the evidence. Academic Medicine, 67, 557-565.<br />
Van der Vleuten, C. P. M., Verwijnen, G. M., & Wijnen, W. H. F. W. (1996). Fifteen years of experience with<br />
progress testing in a problem-based learning curriculum. Medical Teacher, 18(2), 103-109.<br />
Contextualising Assessment: The Lecturer's Perspective<br />
Lee Shannon, Liverpool Hope University, United Kingdom<br />
Lin Norton, Liverpool Hope University, United Kingdom<br />
Bill Norton, Liverpool Hope University, United Kingdom<br />
Aim: In the research literature assessment is recognised as a fundamental driver of the<br />
learning process (Boud, 2007; Gibbs and Simpson, 2004-5; Ramsden, 2003; Rust et al.,<br />
2005), yet the relationship between lecturers’ pedagogical beliefs and choice of assessment<br />
is an under-explored area. The aim of this study, which builds on work by Harrington et<br />
al. (2006) and Norton et al. (2005), was to elicit lecturers’ perceptions of assessment within<br />
the broader context of their philosophy of learning and teaching. Further focus is on specific<br />
aspects of the marking process, feedback relationships and the relationship between past<br />
experiences and current practices.<br />
Methodology: Thirty in-depth semi-structured interviews were carried out with lecturers in 18<br />
disciplines at three higher education institutions in the UK. Participants ranged in<br />
assessment experience from 1-22 years and were drawn from a spectrum of academic<br />
backgrounds: some had followed a career purely in academia, while others had been experienced<br />
practitioners in their field before entering higher education. Thematic analysis, using the<br />
process developed by Braun and Clarke (2006), was selected as an appropriate interpretive<br />
tool in dealing with individual and shared meanings at the analysis stage. The analysis was<br />
carried out by three experienced researchers who worked on transcripts independently at<br />
first, then jointly in an iterative process, to arrive at an agreed thematic structure. Data found<br />
to be relevant to the overall research question were developed to form thematic strands for<br />
further analysis. At all times themes were checked against the original transcripts to ensure<br />
an accurate representation of the data.<br />
Findings: Themes highlighted include:<br />
• Practical and emotional entailments of assessment regimes.<br />
• Relationships between assessment and philosophies of learning and teaching.<br />
• Assessment for learning.<br />
• Feedback: the students’ response.<br />
• Perceptions of students’ understanding of assessment.<br />
• Features of a ‘good’ University education.<br />
• Lecturing experience and changes in self perception.<br />
• Words of wisdom for the uninitiated.<br />
Implications: This study uncovered a range of features of assessment practice in various<br />
contexts and disciplines and emphasises the participants’ collective view that there is a need<br />
for explicit training in assessment design, marking and the use of feedback, which<br />
resonates with the findings of Rust (2002). It was also found that participants’ approaches<br />
varied from those who held explicit philosophies of learning and teaching inextricably linked<br />
to their assessment practices, to those who made implicit assumptions about pedagogy that<br />
bore little relation to the choice of assessment.<br />
Discussion: The above issues are discussed in relation to current trends in assessment in<br />
higher education and the relationship between individual pedagogies and assessment<br />
practices. Practical guidelines for supporting staff development are suggested as are key<br />
areas for further research.<br />
Learning to read: Modeling and assessment of<br />
early reading comprehension of the 4-year-olds in Macao kindergartens<br />
Pou Seong Sit, University of Macau, China<br />
Kwok-cheung Cheung, University of Macau, China<br />
Learning to read is no easy task for 4-year-olds, and this is especially so in traditional<br />
kindergartens in Macao. Challenging assessment in the form of an integrated assessment system<br />
(IAS) is urgently needed (Birenbaum et al., 2006). The aim of the present study is to scaffold<br />
children to higher levels of reading comprehension through reciprocal teaching methods so that<br />
children learn to make use of the acquired reading strategies (i.e. questioning, clarifying,<br />
predicting, summarizing) to read predictable storybooks (Palincsar & Brown, 1984). Notions of<br />
assessment, an integration of “assessment of learning” and “assessment for learning”, are<br />
extended and elaborated to form an IAS to comprise: (i) whole-class storybook reading<br />
instruction using reciprocal teaching methods; (ii) individualized storybook assessment followed<br />
by storyboard assessment after whole-class reading instruction, and (iii) assessment-driven<br />
action research at the reading corners that seek to bring up children’s reading comprehension<br />
levels at their zone of proximal development (Sit, 2007).<br />
Central to the research design is a conceptual model of early reading comprehension the<br />
knowledge structure of which transits from the language world progressively to the human<br />
world, and at the same time depicts children’s minds developing through “recognizing words<br />
and grammar” to “situated understanding of the texts read” (Tse, et al. 2005). Capitalizing<br />
on the textbase and situation model of the story contexts children progress from “learn to<br />
read” to “read to learn” via four distinct developmental milestones: (i) extracting the surface<br />
meanings of the texts and recognizing the apparent features of pictures read, (ii) inferring<br />
the underlying meaning of texts and inner structure of pictures for situated understanding;<br />
(iii) making connections of the meanings constructed from texts and pictures for holistic<br />
understanding of the story read; and (iv) acquiring reading strategies through reciprocal<br />
teaching methods (Sit, 2008). Worthy of particular mention is that interpretation of the four<br />
progressive milestones is done in the light of Feldman’s (1994) ideas of “sequentiality” and<br />
“hierarchical integration”, alongside the simultaneous restricted use of “universality” and<br />
“spontaneousness” of cognitive development.<br />
Central to the implementation of IAS is the development of storybook and storyboard<br />
assessment compatible with the proposed conceptual model. In the individualized<br />
storybook assessment, target children are guided in reading the storybook and are<br />
questioned and rated using a 4-point Likert scale in accordance with the objectives of the<br />
whole-class reading instruction (e.g. whether they know the main characters after reading the<br />
front cover of the storybook). In the individualized storyboard assessment immediately<br />
following the storybook assessment, children make use of a storyboard to tell the story just<br />
read. They are rated using a 4-point Likert scale according to the situated understanding<br />
that has already emerged in their minds. Areas assessed include degree of participation, utilization<br />
of materials provided, teacher-student interactions, power of expression, completeness of<br />
the story structure exhibited, consistency of story themes, and signs of creativity.<br />
Using the assessment results as feedback for the design of action research, the present<br />
study was successful in scaffolding children of varying learning ability to progress along the<br />
milestones as envisaged in the assessment model.<br />
Assessment in action –<br />
Norwegian secondary-school teachers and their assessment activities<br />
Anne Kristin Sjo, Stord/Haugesund University College, Norway<br />
Knut Steinar Engelsen, Stord/Haugesund University College, Norway<br />
Kari Smith, University of Bergen, Norway<br />
This paper deals with the conference theme “Learning-oriented assessment”, and takes a<br />
closer look at formative assessment amongst teachers in lower-secondary school. The<br />
paper is research-related.<br />
One of the strongest criticisms against Norwegian teachers is their lack of formative<br />
assessment skills. Results from the 2003 PISA study indicate that Norwegian teachers<br />
spend less time on feedback and reinforcement strategies than teachers from other OECD<br />
countries (Grønmo et al., 2004). This is in spite of the fact that international research<br />
considers formative assessment strategies as extremely important determinants for<br />
students’ learning (Black & Wiliam, 1998; Coffield et al., 2004).<br />
The aim of this study is to find out what kind of formative assessment practices are<br />
identified amongst teachers in lower-secondary school, and more specifically, what kind of<br />
feedback processes can be detected and developed between teachers and their students.<br />
The research context is an action research project funded by the Norwegian Research<br />
Council which focuses on developing teachers’ assessment competence. The project is<br />
carried out at two schools involving nine teachers, each developing their own digital<br />
portfolio. During the course of the project the teachers will take note of experiences from<br />
their assessment practice and reflect upon these in relation to literature about assessment.<br />
The paper focuses on teachers’ feedback practices and the analysis is built on a<br />
multileveled ethno-methodological study. The intention behind the study is to gain insight<br />
into both the teachers’ real practices as seen in the classroom and how they themselves<br />
experience and explain their own feedback strategies.<br />
Initially, two teachers were observed over a period of five days. The teachers were<br />
videotaped to show their interactions with the students in an attempt to reveal how the<br />
feedback processes were carried out step by step. Three situations from the video data are<br />
used in an interaction analysis (Jordan & Henderson, 1995) to show the different steps<br />
taken in the feedback processes. The observation period is followed by seven open-ended<br />
teacher interviews (Kvale, 2001). The teachers are asked to comment on various findings<br />
from the interaction analysis and elaborate on their own experiences with different ways of<br />
giving feedback to the students. In addition to an analysis of the teachers’ portfolios, this will<br />
draw a multi-levelled picture of existing practice, and also indicate how it is developing.<br />
The preliminary findings indicate that the formative assessment processes performed in<br />
classrooms and the feedback situations are more complex and have even more layers than<br />
first assumed. The student’s perception of a feedback comment is to a large extent<br />
dependent on the context in which it is given, and the same comment can be understood by<br />
the student as either informative or non-informative, as constructive or destructive, as<br />
feedback or feed-forward, all depending on the context. The interaction analysis shows that<br />
the communication between teachers and students has a certain tacit dimension and one<br />
important factor to be considered in the analysis is whether the interaction involves an<br />
implicit shared inter-subjective (Rommetveit, 1974) understanding between the teacher and<br />
the student or not.<br />
How do student teachers and mentors assess the Practicum?<br />
Kari Smith, University of Bergen, Norway<br />
There is recognition of the importance of Practicum in teacher education (Korthagen et al.,<br />
2001; Smith & Lev Ari, 2005). Understanding of the need for student teachers to gain<br />
access to practitioners' tacit knowledge is expanding (Campbell & Kane, 1998). There is<br />
more to teaching than the direct product of theoretical knowledge and practical skills.<br />
Teaching is highly contextualized, which makes assessment of teaching a complex issue.<br />
A central focus for the Practicum is to help students develop independent reflective<br />
competence for future career-long professional development (Dewey, 1933; Schön, 1987;<br />
Korthagen, 2001; Day, 1999, 2004). Student teachers are expected to recognize when<br />
learning takes place and to recognize what is needed for future learning (Brodie & Irving,<br />
2007). These are internal self-assessment activities related to a specific learning context<br />
and are not easily articulated.<br />
Recently the role of mentors is rightfully receiving increased attention. Smith & Lev Ari (2005)<br />
show that mentors are the most significant contributors to students’ learning during Practicum.<br />
A key function of mentors is assessing students’ teaching competence. This is a difficult<br />
task as assessment serves multiple functions, to present student teachers with feedback<br />
and guidance and to serve summative and judgmental functions to protect the profession<br />
from incompetence (Smith, 2006).<br />
There is tension between the supporting role and the assessment role, especially in relation<br />
to summative assessment (grading). However, mentor opinion is essential to strengthening<br />
the validity of assessment. Mentors know the context of teaching and are able to assess the<br />
appropriateness of actions in that specific setting. Mentors accumulate practical and<br />
non-documented evidence of assessment dialogue.<br />
The focus of the current study is to examine the extent of agreement between students’ and<br />
mentors’ assessment of the Practicum.<br />
A random sample of 20 students and their 20 mentors will be selected after the spring<br />
Practicum, and asked if they agree to respond electronically to an open-ended, structured<br />
questionnaire with the following focus points:<br />
• What is a good Practicum?<br />
• Strong points exhibited by student<br />
• Issues that need to be strengthened<br />
• How to go about implementing alternatives for improvement<br />
• Overall assessment of the practice period (grade).<br />
The responses of the 20 pairs (student/ mentor) will be compared to each other internally:<br />
• Comparison of open responses to each question<br />
• Comparison of grades (final question)<br />
Finally, the responses of the students and the teachers will be analysed separately to look<br />
for group commonalities.<br />
To ensure the validity of the findings the author and an additional researcher will analyse<br />
the data separately. The presented results represent the outcome of a moderation process.<br />
Data collection takes place in March 2008.<br />
Significance: The quality of communication and shared understanding of goals between<br />
students and mentors seem to be a major criterion for a successful Practicum and for<br />
quality assessment of students’ achievements. Until today, research on assessment of the<br />
Practicum is meagre (Graham, 2006), and hopefully the current study will deepen our<br />
understanding of the extent of agreement between students and mentors.<br />
Assessment of competencies of apprentices<br />
Margit Stein, Catholic University of Eichstätt-Ingolstadt, Germany<br />
Within the project ‘LAnf – Leistungsstarke Auszubildende nachhaltig fördern’ (‘Assisting<br />
highly competent apprentices’) of the BIBB (Bundesinstitut für Berufsbildung / Federal<br />
Institute for Vocational Education and Training), instruments and assessments were tested<br />
for diagnosing highly competent apprentices and young professionals.<br />
With reference to the DeSeCo programme of the OECD (‘Definition and Selection of Competencies’),<br />
competencies within the project LAnf were not merely defined as cognitive<br />
competencies but also as achievement motivation with a high willingness for learning as<br />
well as social competencies and autonomy. This definition of competencies within LAnf<br />
adheres to the three aspects of competencies in the DeSeCo programme: the competence to act<br />
autonomously, the effective and interactive use of symbols like language or mathematical<br />
symbols and the effective interaction in various heterogeneous groups. Especially within the<br />
domain of professional education and training a concept that would define competencies<br />
merely as skills would be rather one-sided.<br />
Up to now, mainly theoretical models regarding the concept of “professional competencies<br />
and skills” have been developed, and these have rarely been tested and validated in practice.<br />
Within the project LAnf the assessment for professional competencies was developed on<br />
the assumption that within professional contexts even more than in the context of instruction<br />
within schools effective and competent action relies on factors besides mere cognitive<br />
competence like autonomy and the effective interaction in heterogeneous groups.<br />
Based on these theoretical assumptions within LAnf a psychometric assessment was<br />
developed and tested for assessing highly competent apprentices and young professionals.<br />
In a first step trainers of different enterprises and companies of varying size were asked to<br />
name apprentices and trainees who were outstanding concerning their competencies within<br />
daily work. This group of apprentices, highly competent according to their trainers, was then<br />
in a second step compared with a group of vocational school students that was matched<br />
concerning age, sex and apprenticeship training position. Both groups were confronted with<br />
an assessment based on the three dimensions of competencies of the DeSeCo approach.<br />
The data showed that the group of apprentices rated highly competent by their trainers<br />
(n=52) outmatched the group of vocational school students (n=61) in the domains of<br />
cognitive competencies and intelligence. The first group was even more significantly<br />
superior regarding achievement motivation and effective interaction in various<br />
heterogeneous groups. The matching between professional interest and professional<br />
demands was not significantly different between the two groups. The data show that not only<br />
cognitive aspects but also motivational and social aspects of competencies differ between<br />
groups that display different professional performance.<br />
Academics’ epistemic beliefs about their discipline and implications for their<br />
judgements about student performance in assessments<br />
Janet Strivens, The University of Liverpool, United Kingdom<br />
Cathal O'Siochru, Liverpool Hope University, United Kingdom<br />
In recent years there has been a growing focus within the debates on learning and teaching<br />
in higher education on the importance of the discipline. Academics’ primary professional<br />
allegiance is known to be to their subject and it is increasingly seen as ‘good practice’ to<br />
approach the development of their teaching skills through a disciplinary perspective. This<br />
brings into question what we really know about the nature of disciplines. Do academics<br />
within subjects share a consistent set of beliefs about their subject which do or should<br />
influence the way they teach and assess their students? If so, are these beliefs implicit or<br />
can they be clearly articulated, for the presumed greater benefit of students? Or are there in<br />
fact significant inconsistencies which may lead to different criteria applied to judgements<br />
about the quality of student performance, with the likely result of leaving students confused<br />
and uncertain?<br />
This paper reports on two studies with very different methodologies but a similar focus on<br />
making explicit the beliefs of academics about their subject and implications for their<br />
students. The first study explores ‘epistemic match’ between students and staff (faculty)<br />
using a pair of measures, both based on Hofer’s questionnaire on epistemological beliefs<br />
(Hofer, 2000); the second uses in-depth interviews to explore lecturers’ perceptions of how<br />
and why they make certain judgements about the quality of their students’ work when<br />
carrying out assessments, and what this means in terms of explicating their beliefs about<br />
‘knowledge’ and ‘learning’ in their subject area.<br />
Findings from both studies will be compared to attempt to establish what has already been<br />
learned about the significance of academics’ beliefs about their subject in relation to the<br />
learning of their students, and to draw out lessons for future research in this area.<br />
Techniques for Trustworthiness as a Way to Describe Teacher Educators’<br />
Assessment Processes<br />
Dineke Tigelaar, Jan van Tartwijk, Fred Janssen, Ietje Veldman, Nico Verloop<br />
ICLON-Leiden University Graduate School of Teaching, Netherlands<br />
Portfolios are increasingly being used in teacher education, both as a learning tool and as a<br />
tool for assessment. Since their introduction, portfolios have been expected to contribute to<br />
the learning and development of prospective teachers (Bird, 1990; Zeichner & Wray, 2001).<br />
Teaching portfolios should make prospective teachers think more carefully about their<br />
teaching and subject matter (Anderson & DeMeulle, 1998; Bartell, Kaye & Morin, 1998;<br />
Darling-Hammond & Snyder, 2000). However, portfolio use is often problematic. First, the<br />
potential benefits for student teacher learning often fail to materialize (Darling, 2001).<br />
Second, unambiguous portfolio rating is difficult to achieve, since information in portfolios is<br />
often non-standardized and derived from various contexts (Schutz & Moss, 2004). This implies<br />
that assessors have to interpret portfolio information and take account of context before they<br />
can derive judgments, which causes reliability problems. Therefore, a portfolio procedure is<br />
needed that promotes both student teachers’ learning processes and responsible interpretation<br />
in context. Applying Guba & Lincoln’s (1989) criteria for ‘trustworthiness’ seems promising in<br />
this respect (Tigelaar et al., 2005). Complying with these criteria means that trust must be built<br />
between assessors and student teachers, with assessors being aware of student teachers'<br />
concerns through extensive involvement in their learning processes (‘prolonged engagement’,<br />
‘persistent observation’). Assessors should discuss hypotheses with a peer and search for<br />
counterexamples (‘peer debriefing’, ‘progressive subjectivity’). Interpretations should be tested,<br />
accounting for all available evidence (‘negative case analysis’), and be ‘member checked’ with<br />
student teachers. Interpretations should be documented and conclusions should be supported<br />
by the original data (‘dependability’, ‘confirmability’). Finally, information about assessment<br />
conditions should be available (‘thick description’).<br />
In this study, eight teacher educators participated. Teacher educators acted both as<br />
supervisor and assessor for prospective teachers. We explored how teacher educators’<br />
formative and summative assessment of student teachers can be described using the<br />
framework that Guba and Lincoln provide. Research question: to what extent do teacher<br />
educators’ assessment activities relate to techniques for trustworthiness?<br />
Teacher educators were interviewed about the application of trustworthiness criteria when<br />
working with the portfolio. Questions focused on: (1) using the portfolio and/or other sources<br />
of information for formative and summative assessment; (2) criteria and procedures for<br />
formative and summative assessment; (3) measures for guaranteeing the quality of the<br />
assessment processes. Data were analysed, testing tentative categories derived from Guba<br />
and Lincoln, summarized in matrices, discussed among the first and second author, and<br />
checked with the original interview transcripts and the participants.<br />
‘Prolonged engagement’ and ‘persistent observation’ were applied most by teacher<br />
educators. ‘Negative case analysis’ and ‘member check’ were evident in most interviews.<br />
Documenting (‘dependability’), ‘peer debriefing’ and ‘progressive subjectivity’ were applied, but<br />
received more attention in cases of doubt. Tracing of interpretation processes was applied least<br />
(‘confirmability’, and ‘thick description’). The results suggest that teacher educators need to<br />
make better use of scoring rubrics and artefacts in the portfolio to underpin their<br />
interpretations and conclusions, including their feedback to student teachers. Furthermore,<br />
methods for responsible portfolio interpretation might need to be made less time-consuming<br />
and more practical.<br />
ENAC 2008 95
Peer Assessment for Learning:<br />
a State-of-the-art in Research and Future Directions<br />
Marjo van Zundert, Open University, The Netherlands<br />
Despite the popularity and advantages of peer assessment in education, a major problem has<br />
not yet been solved. An enormous variety of peer assessment practices exists, which<br />
makes it difficult to draw inferences in terms of cause and effect, all the more so since<br />
the literature generally describes peer assessment in a holistic fashion (i.e., without specifying<br />
all variables present). To date, it is unclear exactly under which circumstances peer<br />
assessment is beneficial for student learning, and it remains inconclusive precisely what<br />
produces satisfactory measurement qualities such as reliability and validity. Hence, this study<br />
investigated which variables foster peer assessment that is beneficial for student<br />
learning and yields satisfactory measurements.<br />
We tackled this problem through an inquiry into 26 experimental studies to map variety in peer<br />
assessment and to identify which strategies contribute to learning and measurements.<br />
Literature was selected on the basis of five criteria: (1) published between 1990 and 2007;<br />
(2) published journal article; (3) journal listed in Social Sciences Citation Index, domain<br />
Education & Educational Research; (4) empirical study; (5) main topic is peer assessment<br />
or related term.<br />
This literature inquiry resulted in a descriptive review, in which four outcome categories<br />
were distinguished. The first category concerned measurements of peer assessment.<br />
Measurement issues included, among others, agreement between multiple peer<br />
assessments, or agreement between student and staff assessment. For learning from peer<br />
assessment, three categories were distinguished: domain skill, peer assessment skill, and<br />
student attitudes. Learning of domain skill referred to improved quality of students’ work.<br />
Peer assessment skill concerned students’ competence in assessing peers. Student<br />
attitudes comprised their views on peer assessment. Measurements were enhanced by<br />
training and experience. Domain skill was fostered by providing students with the<br />
opportunity to revise their work on the basis of peer assessment. Peer assessment skill was<br />
ameliorated by training and dependent on student characteristics. Student attitudes were<br />
also positively influenced by training and experience.<br />
The multiplicity of peer assessment practices and the holistic way of reporting were<br />
underlined. Future research should strive for more transparency in peer assessment effects<br />
through true or quasi-experimental studies in which relations between variables are specified,<br />
so that strong inferences in terms of cause and effect can be drawn. Besides higher education,<br />
research can be broadened to vocational and secondary education, considering current<br />
developments there. Topics that need more scrutiny comprise long-term learning effects,<br />
feedback, the role of interpersonal variables, and the distinction between assessing and<br />
being assessed. Also, more clarity in measurement issues and greater uniformity of<br />
measurement instruments are desired.<br />
Investigating the Pedagogical Push and Technological Pull of<br />
Computer Assisted Formative Assessment<br />
Denise Whitelock, The Open University, United Kingdom<br />
Over the last ten years, learning and teaching in higher education have benefited from<br />
advances in social constructivist and situated learning research (Laurillard, 1993). In<br />
contrast, assessment has remained largely transmission orientated in both conception and<br />
in practice (see Knight & Yorke, 2003). This is especially true in higher education where the<br />
teachers’ role is usually to judge student work and to deliver feedback (as comments or<br />
marks) rather than to involve students as active participants in assessment processes.<br />
This paper reports on a project which set out to provide further insights into the role of<br />
electronic formative assessment in Higher Education and to point the way forward to new<br />
assessment practices, capitalising on a range of open source tools. The project built upon<br />
the premise that assessment and learning need to be properly linked. It explored the factors<br />
that influence assessment inputs, processes and outcomes by:<br />
a) Developing a suite of technological tools at different levels of support for collaborative<br />
and free text entry e-assessment<br />
b) Evaluating a series of formative assessments across a number of disciplines.<br />
An agile rather than a plan-driven methodological approach was adopted for<br />
the development of the software and the user evaluation, since the former supports<br />
adaptation rather than prediction. Student surveys and a case study methodology were<br />
employed to understand the pedagogical drivers and barriers associated with these types of<br />
assessment.<br />
Findings<br />
One of the more challenging aspects in the current e-assessment milieu is to provide a set<br />
of electronic interactive tasks that will allow students more free text entry and provide<br />
immediate feedback to them. Open Comment was a system that was built to accommodate<br />
free text entry for formative assessment for History and Philosophy students. It forms part of<br />
the pedagogical push from the Arts Faculty to construct systems that help students decode<br />
feedback, internalise it and become more self-regulated learners.<br />
Other tools developed in this project include a BuddySpace, BuddyFinder and SIMLINK<br />
combination, which helped students work remotely and collaboratively to make<br />
predictions with a science simulation; these predictions were embedded in a series of formative<br />
assessment tasks.<br />
One of the major findings from this project is the creativity of staff, both academic and<br />
technical, in creating formative e-assessments with feedback and collaborative online tasks<br />
that empower students to become more reflective learners. It might appear in the short term<br />
that the technological pull is currently overtaking the pedagogical push in the e-assessment<br />
arena, but this project has shown, with this collection of open source applications, that there<br />
is a way forward to redress the balance. The approach adopted here sits well within a<br />
constructivist paradigm which has often been less well served in the past through formal<br />
summative assessment which is not an integral part of the knowledge construction process.<br />
Strict Tests of Equivalence for, and Experimental Manipulations of,<br />
Tests for Student Achievement<br />
Oliver Wilhelm, Ulrich Schroeders, Maren Formazin, Nina Bucholtz<br />
IQB, Humboldt-University Berlin, Germany<br />
Much of the research on equivalence of measurement instruments across test media is<br />
easy to summarize: Unless a test of maximal behavior is strongly speeded, test media will<br />
be of negligible relevance for what the test measures. For many applied and scientific<br />
purposes, this statement is obviously too simplistic. For example, high disattenuated<br />
correlations across test media do not ascertain the irrelevance of test media. Similarly, two<br />
measures with exactly the same score distributions do not necessarily measure the same<br />
ability underlying observed maximal behavior. The issue of equivalence across<br />
manifestations of a measure is a nuisance because, due to the lack of generalizable results<br />
about the absence of determinants of divergence, the equivalence of two forms of a test has<br />
to be determined for each test, in each application population, and across soft- and hardware<br />
realizations. This scientifically not very intriguing problem is thus a psychometric<br />
Pandora’s box.<br />
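The point that two measures with identical score distributions need not measure the same ability can be illustrated with a minimal simulation (hypothetical data, not from the studies reported here): two forms whose scores share the same marginal distribution but correlate near zero clearly cannot reflect the same underlying construct.

```python
import random

# Hypothetical illustration: scores on form A reflect ability_1, scores on
# form B reflect a distinct ability_2. Both are standard normal, so the
# marginal score distributions match, while the correlation stays near zero.
random.seed(1)
n = 10_000
form_a = [random.gauss(0, 1) for _ in range(n)]  # driven by ability 1
form_b = [random.gauss(0, 1) for _ in range(n)]  # driven by ability 2

def mean(xs):
    return sum(xs) / len(xs)

def corr(xs, ys):
    # Pearson correlation, computed from first principles.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    sx = (sum((x - mx) ** 2 for x in xs) / len(xs)) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / len(ys)) ** 0.5
    return cov / (sx * sy)

print(round(mean(form_a), 2), round(mean(form_b), 2))  # both near 0
print(round(corr(form_a, form_b), 2))                  # near 0: distinct abilities
```

Equal means and variances here say nothing about equivalence of what is measured; that is exactly why equivalence must be checked at the level of covariances with latent variables, as the abstract's analyses do.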
However, a lack of equivalence does not necessarily indicate failure of converting a<br />
measure. Lack of equivalence can also indicate meaningful improvements of a measure.<br />
For example, conventional listening comprehension tasks have a variety of shortcomings<br />
that can be overcome in order to improve measurement quality in computerized testing: By<br />
using computers, it is easier to ensure the same audio stream in the same quality for all<br />
participants, there are more degrees of freedom in administering a task in a group setting<br />
(rewinding, forwarding, pausing), and the response alternatives can be included into the<br />
audio stream. Currently, the importance of such improvements is underinvestigated.<br />
In study one, we administered reading and listening comprehension tests of English<br />
as a foreign language in traditional and computerized versions to a larger sample of<br />
secondary students. Test version and sequence were varied between subjects. An attempt<br />
was made to keep the computerized versions of all measures as close as possible to the<br />
conventional test form even if that implied suboptimal operationalisations of a measure. In<br />
study two, we used modified versions of listening comprehension tests, implementing<br />
not only stimuli but also responses in audio format with a similar sample. Additionally,<br />
completely newly developed video comprehension tests were administered. In both<br />
studies, standard demographic questionnaires and a questionnaire assessing computer<br />
experiences allow for group comparisons of means and covariances in structural equation<br />
modeling. Fluid and crystallized intelligence measures serve as covariates.<br />
The focus of all analyses is on covariances between latent variables in a multi-group context.<br />
The discussion will consider a) advantages and disadvantages of computerized testing from<br />
the perspective of construct validity and b) opportunities for the assessment of hitherto<br />
unmeasurable aspects of student achievement.<br />
Roundtable Papers<br />
Why the moderate levels of inter-assessor reliability of student essays?<br />
Morten Asmyhr, Østfold University College, Norway<br />
Although some studies indicate that inter-assessor reliability is adequate when student<br />
papers in the essay format are considered (e.g. Jonsson & Svingby, 2007), other studies<br />
reveal serious shortcomings as to assessor reliability, both when examination papers and<br />
portfolios are concerned. A number of studies, some of them ancient, are revisited to search<br />
for regularities that might help identify factors that contribute to low marker reliability.<br />
On a recent occasion, all assessors agreed to submit their tentative mark prior to the final<br />
marking session at two separate examinations. The analysis of the results revealed<br />
differences between the two markers of more than 2 steps on a 7-point scale at one of the<br />
examinations. Results on the other examination were somewhat better as to marker<br />
reliability. A small number of the student papers were selected for the second part of the<br />
study. A number of assessors were recruited from the local pool of assessors for the<br />
examination in question to mark the papers individually and to record their practical<br />
procedure when marking. Their use of defined and specified assessment standards and<br />
criteria was made a significant area of concern in their reports. A sample of students sitting<br />
for the same examination was also recruited to mark the same papers. The results from the<br />
whole data set were compiled, analysed and fed back to the group of students for them to<br />
assess the assessment of the whole group of assessors.<br />
In the paper, a more comprehensive survey of studies on assessment reliability will be<br />
presented and the results from the present study will be given and discussed in relation to<br />
practical as well as theoretical concerns pertinent to examination and assessment<br />
procedures. Is it possible to maintain a satisfactory consistency across markers, or do<br />
students have to accept examination results that depend as much upon who the marker is<br />
as upon the quality of the students’ papers?<br />
References<br />
Jonsson, A. & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational<br />
consequences. Educational Research Review, 2(2): 130-144.<br />
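As a hypothetical illustration of the marker-consistency question raised above (the marks below are invented, not the study's data), agreement between two markers on a 7-point scale can be quantified by the share of papers more than 2 steps apart and by a chance-corrected index such as quadratically weighted kappa:

```python
# Sketch: quantifying inter-marker agreement on a 7-point scale.
# All marks below are hypothetical.

def big_discrepancies(marks_a, marks_b, threshold=2):
    """Proportion of papers where the two markers differ by more than
    `threshold` steps (the abstract flags gaps above 2 steps on a 7-point scale)."""
    diffs = [abs(a - b) for a, b in zip(marks_a, marks_b)]
    return sum(d > threshold for d in diffs) / len(diffs)

def quadratic_weighted_kappa(marks_a, marks_b, k=7):
    """Cohen's kappa with quadratic weights, a standard chance-corrected
    agreement index for ordinal marks on a 1..k scale."""
    n = len(marks_a)
    # Observed joint distribution of mark pairs.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(marks_a, marks_b):
        obs[a - 1][b - 1] += 1 / n
    pa = [sum(row) for row in obs]                              # marginal, marker A
    pb = [sum(obs[i][j] for i in range(k)) for j in range(k)]   # marginal, marker B
    w = lambda i, j: (i - j) ** 2 / (k - 1) ** 2                # disagreement weight
    observed = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(w(i, j) * pa[i] * pb[j] for i in range(k) for j in range(k))
    return 1 - observed / expected

marker_1 = [4, 5, 2, 6, 3, 4, 7, 1, 5, 4]
marker_2 = [6, 5, 3, 3, 3, 5, 7, 4, 5, 2]
print(big_discrepancies(marker_1, marker_2))                 # prints 0.2
print(round(quadratic_weighted_kappa(marker_1, marker_2), 3))  # prints 0.455
```

A kappa well below common benchmarks for acceptable agreement, alongside a non-trivial share of papers more than 2 steps apart, would be one concrete way to report the kind of marker inconsistency the abstract describes.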
Approaches to the assessment of graduate attributes in higher education<br />
Simon Barrie, C. Hughes, C. Smith<br />
The University of Sydney, Australia<br />
This paper draws on a literature review of the various approaches to the assessment of generic<br />
graduate attributes. The literature review was conducted as the first stage of a national research<br />
study exploring the integration of generic attributes in Australian universities’ assessment<br />
practices (Barrie, Hughes & Smith, 2007). The issue of graduate attributes (also referred to by<br />
some authors as generic, core or employability skills) has received considerable attention in<br />
recent years as universities seek to renew and articulate their purposes and demonstrate the<br />
efficient achievement of these, particularly in response to calls for accountability (Barrie, 2005).<br />
Graduate attributes have been widely taken up by universities in many parts of the world<br />
including Australia. While university policy claims commonly refer to the “integration” and<br />
“embedding” of graduate attributes, questions have been raised in relation to the alignment<br />
between what is espoused, what is enacted and what students experience and learn (Bath,<br />
Smith, Stein and Swan, 2004) in assuring their development. Australian Universities Quality<br />
Agency (AUQA) audits have revealed the need to address generic attributes in curricula more<br />
systematically and to provide stronger evidence in support of institutional claims<br />
than policy statements and relatively superficial mapping activities offer.<br />
It has been argued that the strongest evidence of graduate attribute policy implementation is their<br />
embedding in course and program assessment activities (Barrie, 2004). However, there are<br />
significant barriers to the achievement of this, including: lack of a conceptual basis and consistent,<br />
coherent operational definitions of the intended outcomes, the difficulty of meaningfully<br />
communicating assessment standards to students, the challenge of articulating the<br />
developmental progression, and the temptation to resolve problems by defining skills at an<br />
ever-increasing level of detail, which soon becomes unworkable for academics and students alike<br />
(Barrie, 2005; Washer, 2007). Despite these barriers, the literature contains numerous examples<br />
of approaches to the assessment of graduate attributes which demonstrate a wide diversity of<br />
methods, levels of student involvement and disciplinary contextualisation. These include:<br />
• non-traditional assessments such as moral assessments and exit interviews (Dunbar,<br />
Brooks, and Kubicka-Miller, 2006)<br />
• attempts to develop institutional grade descriptors based on generic attributes (Leask, 2002)<br />
• the development of resources such as templates to guide the design of assessment<br />
(Watson, 2002)<br />
• authentic outcomes based approaches using portfolio assessment (Hernon, 2004;<br />
Seybert, 1994)<br />
• standardised tests such as the Graduate Skills Assessment Test and the Collegiate<br />
Skills Assessment<br />
• self-rating scales (e.g. the CEQ)<br />
• integrated performance based assessment tasks (Hart, Bowden & Watters, 1999)<br />
• the use of postgraduate assessment strategies, e.g. oral presentation and defence, in<br />
undergraduate contexts.<br />
This paper presents a typology of assessment approaches based on an analysis of the types of<br />
assessment strategy. This typology is considered in relation to the barriers to integration<br />
identified in the literature on generic attributes and in relation to emerging theoretical and<br />
conceptual models of graduate attributes. In doing so the paper identifies the potential for<br />
different assessment approaches to overcome the key barriers to assessment of generic<br />
attributes and will stimulate discussion in relation to both theoretical and practical issues.<br />
Assessment for learning in and beyond courses:<br />
a national project to challenge university assessment practice<br />
David Boud, University of Technology, Sydney, Australia<br />
There has been considerable debate in recent years about how assessment can contribute<br />
to student learning. However, most of this has focused on learning within the framework of<br />
the course of study being undertaken. In a changing world, however, assessment also needs<br />
to foster the learning that will occur after course completion, as higher education needs<br />
to provide students with a foundation for a lifetime of professional practice in which they will<br />
be continually required to learn and to engage with new ideas that go beyond the content of<br />
their university course.<br />
As part of this, a critique has been building on the inadequacy of formative assessment<br />
practices that help students’ learning during their courses (e.g. Sadler, 1998; Yorke, 2003).<br />
There has also been substantial criticism of the role of summative assessment and its<br />
negative effects on student learning (e.g. Ecclestone, 1999; Knight, 2002; Knight & Yorke,<br />
2003). There is also concern that simply increasing feedback to students is not, in itself, a<br />
worthwhile practice unless it also builds students’ capacity to critique and improve their own<br />
work (Hounsell, 2003). There is a flourishing literature exploring assessment practices that<br />
have positive effects on learning and there have been important initiatives that look at the<br />
long-term consequences of university courses, including assessment, on subsequent<br />
learning in professional practice (Mentkowski, 2000).<br />
Boud (2000) discussed the needs of assessment in a learning society and introduced<br />
requirements for a new way of thinking about assessment. He suggested that current<br />
assessment practices in higher education did not equip students well for a lifetime of<br />
learning and the assessment challenges they would face in the future. He argued that<br />
assessment practices should be judged from the point of view of whether they effectively<br />
equip students for a lifetime of assessing their own learning. More recently Boud suggested<br />
(2007) that assessment needed to be reconceptualized as an activity of informing judgment,<br />
in particular informing judgements of learners about their own work.<br />
The Carrick Institute for Learning and Teaching in Higher Education, the main funding body<br />
for teaching and learning development for Australian universities, has established a one-year<br />
project to draw on international research to examine how assessment practices that focus<br />
on learning in and after courses can be developed, particularly in areas where there are<br />
large cohorts of students.<br />
The Roundtable discussion will take place at the very start of this project and it will seek to<br />
elicit international collaboration. It will focus on the questions: what assessment practices<br />
have shown potential for both meeting summative purposes but also informing student<br />
judgement in ways that carry beyond the end of courses? What evidence is there for the<br />
utility of such practices? How can they be extended beyond their initial sites of<br />
development? How can the uptake of new assessment practices within universities be<br />
influenced? The approach the project has adopted on these matters will be discussed and<br />
the views of participants on these canvassed.<br />
Electronic reading assessment: The PISA approach for the<br />
international comparison of reading comprehension<br />
Kwok-cheung Cheung, University of Macau, China<br />
Pou-seong Sit, University of Macau, China<br />
This paper seeks to document how the Macau-PISA Center prepares for electronic assessment<br />
of reading literacy for 15-year-old students in secondary schools in Macao. First,<br />
emerging concepts of reading literacy with regard to life-long learning for our next<br />
generation in the digital age will be explicated. Congruence of the proposed concepts of<br />
electronic reading literacy with existing curricular and instructional provisions in Macao is<br />
evaluated. Second, the Reading Literacy Assessment Framework, a response to the OECD<br />
“DeSeCo Project” (i.e. Definition and Selection of key Competences) to include the ICT<br />
(information and communication technology) components as key competences, is<br />
presented to highlight the constructs assessed and nourished in the classrooms. Third, the<br />
paper demonstrates how test items and tasks for electronic assessment of reading literacy<br />
can be designed, and subsequently developed into an individualized computerized testing<br />
platform.<br />
Central to the PISA approach is the definition of reading literacy. According to PISA, reading<br />
literacy is an individual’s capacity to understand, use and reflect on written texts, in order to<br />
achieve one’s goals, to develop one’s knowledge and potential and to participate in society<br />
(OECD, 2006). Reading literacy is assessed in relation to: (1) text format (i.e. continuous<br />
versus non-continuous texts of one of the following five types, i.e. description, narration,<br />
exposition, argumentation and instruction); (2) aspects of the reading processes (i.e.<br />
retrieving information, forming a broad general understanding, developing an interpretation,<br />
reflecting on the contents and formal qualities of a text); and (3) situations (i.e. reading for<br />
work, education, private and public use). This definition goes beyond the basic skills of word<br />
recognition, phonemic awareness, decoding and comprehension, and it requires the reader<br />
to be an active and reflective user of texts so as to expand one’s knowledge and potential,<br />
i.e. one has to understand, apply, integrate and synthesize texts to fulfill one’s life-long<br />
learning goals.<br />
In the electronic medium, the reading tasks generally require students to identify<br />
important questions, locate information in line with the access structure of the reading tasks,<br />
analyze the usefulness of the information retrieved, integrate information retrieved from<br />
multiple texts, and then communicate replies through electronic means. The<br />
electronic texts students come across are therefore dynamic, with blurred boundaries. In the<br />
print medium, reading tasks involve fixed texts with clearly defined boundaries, and what<br />
students do during reading is: (1) retrieve information; (2) interpret texts; and (3) reflect<br />
and evaluate. Delineation of an assessment framework for electronic reading literacy<br />
demands an incorporation and extension of concepts from the print to the electronic<br />
medium. The three distinctive aspects of electronic reading literacy that have implications in<br />
the design of assessment rubrics become: (1) accessing and retrieving appropriate<br />
information online via search engines and embedded hyperlinks; (2) constructing and<br />
integrating texts read recursively in accordance with access structures by clicking links, and<br />
searching for usable information until the reader judges synthesis has been done<br />
meaningfully; (3) reflecting and evaluating critically authorship, accuracy, as well as quality<br />
and credibility of information retrieved and conveyed in the electronic texts.<br />
Developing the autonomous lifelong learner:<br />
tools, tasks and taxonomies<br />
Wendy Clark, Northumbria University, United Kingdom<br />
Jackie Adamson, Northumbria University, United Kingdom<br />
This paper describes an action research project undertaken with undergraduate students at<br />
levels 4 and 5. Responding to the recent focus on lifelong learning and portfolio based<br />
personal development planning (PDP), this ongoing project encourages students to adopt a<br />
deep, active approach to learning, and thus take responsibility for their own learning.<br />
Assessment is widely recognised as an important influence on student learning. Recent<br />
conceptual shifts in thinking about assessment have highlighted the importance of<br />
developing students as autonomous learners by viewing assessment as a learning tool<br />
rather than a measurement of knowledge, and portfolios are mentioned as one of the<br />
modes appropriate for the new thinking about assessment (Havnes and McDowell, 2008).<br />
Therefore, the modules forming the basis of the project, in which the PDP concept was<br />
integrated into the curricular content and supported by the use of an ePortfolio, were<br />
designed following the precepts of Biggs’ theory of ‘constructive alignment’ (Entwistle 2003).<br />
This fits well with the PDP/ePortfolio philosophy for encouraging learner autonomy, as well<br />
as fulfilling the assessment for learning (AfL) requirements for formative feedback and<br />
low-stakes opportunities for practice before submission for rigorous summative assessment.<br />
Although there is still ongoing debate about the criteria to be used for the assessment of<br />
portfolios (Smith and Tillema, 2008), social scientists such as Baume (2002) and Biggs<br />
(1997) have shown that a qualitative view of validity and reliability can ensure adequate<br />
rigour for summative assessment. However, it is necessary to ensure inter-rater reliability as<br />
well as to make the learning goals and assessment criteria transparent for learners (Havnes<br />
and McDowell, 2008). A taxonomy for portfolio evaluation has therefore been developed<br />
which is easily understood and applied by tutors and students.<br />
In order to study the impact of this learning environment, a variety of data has been<br />
collected and analysed. This includes:<br />
• student achievement of the stated learning outcomes of the modules, assessed in<br />
accordance with our taxonomy for portfolio evaluation;<br />
• “added value” as indicated by a correlation of UCAS entry points with summative<br />
assessment results and a measurement of student engagement;<br />
• the quality of student reflection and self-evaluation demonstrated in the reflective<br />
commentaries.<br />
Results from these analyses show a positive impact.<br />
In order to provide more empirical evidence, students this year have completed the<br />
Effective Lifelong Learning Inventory (ELLI) questionnaire (details available at:<br />
https://secure.vlepower.com/nlst/core/main.htm). This profiling tool serves a double<br />
purpose: it provides students with a vocabulary to describe their own thought processes and<br />
to articulate their ideas, and it provides statistical data to tutors which indicate development<br />
of both cohort and individual student’s learning characteristics over time. Preliminary<br />
analysis of these data, together with student opinion obtained in written commentaries and<br />
in debriefing interviews, shows that the learning environment created has brought about<br />
positive change.<br />
We welcome discussion of ways of evaluating student progress towards learning autonomy,<br />
in particular of the effectiveness of the ELLI profiling tool as a measurement of learning<br />
power development.<br />
Assessing the Art of Diplomacy? Learners’ and Tutors’ perceptions<br />
of the use of Assessment for Learning (AfL) in non-vocational education<br />
Gillian Davison, Northumbria University, United Kingdom<br />
Craig McLean, Northumbria University, United Kingdom<br />
This paper will present findings from an authentic assessment project (Assessment for<br />
Learning) undertaken with a group of final-year undergraduate students taking a<br />
(non-vocational) Politics degree who elected to take a module called ‘Diplomacy’ at<br />
Northumbria University.<br />
Teaching on the module comprised not only a mixture of traditional lectures and seminars,<br />
but also a board-game exercise. It is this exercise – that is, students playing the Diplomacy<br />
board-game – that will be the focus of our paper.<br />
The research methodology took the form of non-participant observation and semi-structured<br />
interviews with learners. Data were gathered in relation to the tutor’s and students’<br />
experiences throughout the course of the module. Data were also taken from the learners’<br />
formative assessment activities and the final summative assessment which the learners<br />
were required to undertake.<br />
Students organised themselves into one of seven “teams” based upon the imperial map of<br />
Europe. Their objective was simple: to win the game by being the last “power” standing. The<br />
module is taught to a group of 24 students. This is an optimal number for the Diplomacy<br />
board-game, as it results in teams that are neither too small (i.e., one or two players) nor<br />
too large (i.e., more than five individuals per team).<br />
The Diplomacy board game is a vital learning resource because it allows students to<br />
develop skills in negotiation, bargaining and the agreement of Treaties. The board game<br />
also lets students consider questions such as whether it is ever permissible to lie, cheat or<br />
break promises. This approach to learning requires students to be active learners (anybody<br />
not paying attention is likely to be eliminated from the game!) and involves students<br />
focusing on values, building alliances, cultivating relationships and, most importantly, trust.<br />
Not only is this a vital aspect of standard diplomatic relations, but it also enables students to<br />
meet the module’s learning outcomes (the ability to: critically examine the role of diplomacy<br />
in today’s world order; apply diplomatic thought to real-world situations; and to examine<br />
critically whether current understandings of diplomacy can help to explain the business of<br />
interstate relations).<br />
Over the twelve week period students are required to compile formative assessment<br />
material in the form of seminar logs, detailing their experiences of the Diplomacy board<br />
game. This constitutes some 20% of their overall mark, and serves as a platform for the<br />
extended summative essay that students write at the end of the module.<br />
The paper aims to demonstrate that authentic assessment activities can be used effectively<br />
within non-vocational subject areas and do not necessarily need to be located in areas of<br />
professional practice.<br />
Assessment of oral presentation skills in higher education<br />
Luc De Grez, Martin Valcke, Irene Roozen<br />
University College Brussels, Belgium<br />
Research Problem: Underlying this research is the concept of self-regulated learning from a<br />
social cognitive perspective (Bandura, 1997). A learner acquires standards and must eventually<br />
be capable of comparing his or her actual performance with these standards and of trying to<br />
close the gap. This process generates internal feedback and is often supplemented by<br />
external feedback from teachers and peers. Both forms of feedback have to be accurate,<br />
because accurate calibration seems a necessary condition for productive self-regulated<br />
learning. This demand for accurate calibration can be recast as a reliability problem if we<br />
require the same assessment result whether performance is assessed by teachers, peers,<br />
or by the learner. An overview of the literature about self- and<br />
peer assessment in the domain of oral presentation skills generated some questions and<br />
remarks: Can the optimistic view be maintained that only a simple instruction is needed to<br />
generate peer and self-assessments that are in agreement with assessments by<br />
professionals? And what if there’s no such agreement? In that case, the generalizability<br />
analysis (Brennan, 2000) seems to be a good first step to analyse the error variance. An<br />
under-investigated element is the perceptions students hold of peer and self-assessment.<br />
Research Questions: (1) What is the agreement between peer and self-assessments and<br />
professional assessments? (2) What are the perceptions about peer assessments?<br />
Research Design: Research instruments<br />
Assessment instrument for ‘oral presentation performance’: A rubric was constructed<br />
containing three content-related items (introduction, structure, and conclusion), five<br />
delivery-related items (eye contact, vocal delivery, enthusiasm, contact with the public, and<br />
body language), and one overall item.<br />
Perception of ‘peer assessment’: An existing questionnaire was used and presented twice.<br />
Procedure: First year students (n=57) delivered three short oral presentations about<br />
prescribed topics and the presentations were videotaped. Participants assessed their own<br />
first (n=24) or second (n=54) presentation. Five professional assessors assessed in total<br />
209 recordings. A total of 29 presentations were assessed by six peers.<br />
Research results: Overall, we found a positive correlation between professional and<br />
peer assessment scores (significant for four criteria) and between professional and<br />
self-assessment scores (significant for five criteria). The total score of professional<br />
assessments is significantly lower than the self- and peer assessment scores. However,<br />
scores on eight of the nine items of the rubric differ significantly between professional and<br />
peer assessments.<br />
A two-facet generalizability study was conducted to obtain variance estimates and to<br />
determine the number of peers needed for reliable scores. The analysis of the variance<br />
components showed that the variance in scores related to the oral presentations is low and<br />
the variance component for peers is large. The generalizability coefficient indicates good<br />
reliability (.81), and the results suggest that four peers are sufficient when nine criteria are<br />
used. The perception of peer assessment is predominantly positive and becomes<br />
significantly more positive in the second questionnaire.<br />
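The decision-study logic used here (projecting reliability as the number of peer raters varies) can be sketched with the standard formula for a relative generalizability coefficient; the variance components below are hypothetical, chosen only so that the coefficient with four raters lands near the reported .81.<br />

```python
def g_coefficient(var_person, var_residual, n_raters):
    """Relative generalizability coefficient for a persons-by-raters design:
    true-score variance over (true-score variance + error averaged over raters)."""
    return var_person / (var_person + var_residual / n_raters)

# Hypothetical variance components (not the study's actual estimates):
var_person = 0.50    # variance due to presenters (the object of measurement)
var_residual = 0.47  # rater-by-person interaction plus residual error

for n in (1, 2, 4, 6):
    print(n, round(g_coefficient(var_person, var_residual, n), 2))
```

With these illustrative components, reliability climbs as raters are added and crosses .80 at four raters, which is the pattern the decision study reports.<br />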
References<br />
Bandura, A. (1997). Self-efficacy: The exercise of control. New York: Freeman.<br />
Brennan, R. (2000). (Mis)Conceptions about Generalizability theory. Educational Measurement: Issues and<br />
Practice, 5-10.<br />
Mobile Assessment of Practice Learning:<br />
An Evaluation from a Student Perspective<br />
Christine Dearnley, University of Bradford, United Kingdom<br />
Jill Taylor, Leeds Metropolitan University, United Kingdom<br />
Catherine Coates, Leeds Metropolitan University, United Kingdom<br />
The ALPS CETL* aims to develop and improve assessment, and thereby learning, in practice<br />
settings for health and social care students. The centre is working towards an<br />
interprofessional programme of assessment of common competencies such as<br />
communication, team working and ethical practice among health and social care students.<br />
The assessment tools will be delivered in electronic, mobile format. Between July and<br />
December 2007, ALPS issued nearly 900 mobile devices with unlimited data connectivity to<br />
students undertaking practice based learning and assessment across the ALPS partnership.<br />
ALPS is implementing the infrastructure to develop, deliver and manage learning content and<br />
assessments on mobile devices to students on a large scale across the five partner HEIs.**<br />
The study that forms the basis of this paper is being undertaken across all five partner sites. It<br />
incorporates students from sixteen professions and will investigate the impact of the ALPS<br />
mobile assessment processes on learning and assessment within practice settings over an<br />
eighteen month period. Early outcomes of the study will be reported with an emphasis on the<br />
extent to which assessment of core competencies for practice can be facilitated using the ALPS<br />
mobile assessment processes and the relationships between these processes and learning in<br />
practice settings. The ALPS mobile assessment processes have two further innovative<br />
components, which will be explored as part of this study: inter-professional<br />
assessment of common competencies and service-user involvement in practice assessment.<br />
Whilst there is considerable evidence of mobile devices being used in health and social<br />
care provision, their use for assessment of professional practice is a new and innovative<br />
development that has not been fully evaluated. This study builds on the ALPS IT Pilots,<br />
which explored the feasibility and key issues of using mobile technologies in the<br />
assessment of health and social care students in practice settings and were reported at the<br />
EARLI conference in 2006 (Dearnley &amp; Haigh 2006, Taylor et al 2006). Key benefits were<br />
identified; these included a reduction in paperwork and in the risks of handling paper copies<br />
of assessment data, enhanced communication between peers and tutors leading to increased<br />
professional interactions, and indications that in some cases mobile devices helped students<br />
to overcome barriers to writing and instilled pride in their work. The project team is committed<br />
to further exploring the full pedagogic potential of this initiative.<br />
*Assessment &amp; Learning in Practice Settings is a Centre for Excellence in Learning &amp;<br />
Teaching (CETL) funded by the Higher Education Funding Council for England<br />
http://www.alps-cetl.ac.uk/<br />
**Universities of Bradford, Leeds, Huddersfield, Leeds Metropolitan and York St John<br />
University College<br />
References<br />
Dearnley, C.A., &amp; Haigh, J. (2006). Using Mobile Technologies for Assessment and Learning in Practice<br />
Settings: A Case Study. Third Biennial Joint Northumbria/EARLI SIG Assessment Conference, 30th Aug-1st<br />
Sept, Co. Durham, UK.<br />
Taylor, J.D., Coates, C., Eastburn, S., &amp; Ellis, I. (2006). Using mobile phones for critical incident assessment<br />
in health placement practice settings. Third Biennial Northumbria/EARLI SIG Assessment Conference.<br />
How reliable is the assessment of practice, and what is its purpose?<br />
Student perceptions in Health and Social Work<br />
Margaret Fisher, University of Plymouth, United Kingdom<br />
Tracey Proctor-Childs, University of Plymouth, United Kingdom<br />
Introduction: The Centre for Excellence in Professional Placement Learning (Ceppl) is<br />
based at the University of Plymouth in Devon, England. This Centre seeks to share and<br />
develop excellent practice in collaboration with other disciplines which have a placement or<br />
practice component (QAA 2003).<br />
One research strand is evaluating practice assessment methods in Midwifery, Social Work<br />
and Post-registration Health Studies. A multi-disciplinary team representing all three of<br />
these professional groups and comprising students, service-users, practitioners and<br />
academics is currently working on this project.<br />
This paper reports on the findings of Years One and Two of a three-year longitudinal study,<br />
which commenced in June 2006. Staff focus groups are concurrently being undertaken, but<br />
results from these will be reported at a later date. The literature clearly suggests that validity<br />
and reliability are fundamental to the success of an assessment, but are difficult to achieve<br />
(Chambers 1998, Calman et al 2002, Crossley et al 2002, McMullan et al 2003).<br />
Assessment of competence in practice is crucial in determining whether or not a student<br />
meets the criteria required of their profession (Cowan et al 2005, Watkins 2000). Early<br />
findings of the study raise important issues in relation to this existing evidence, as well as<br />
identifying further avenues for investigation. Once the study is complete, generic guidelines<br />
and resources will be developed to inform cross-professional assessment of practice in<br />
placement settings which should be transferable internationally.<br />
Methodology: The aim of the project is to explore the student experience of the practice<br />
assessment process during a professional programme of study. Perceptions of validity and<br />
reliability of assessment methods as well as the impact of the process on the student<br />
learning experience are being explored. Multi-centre Research Ethics Committee approval<br />
was obtained for the study. An average of five students per professional group are<br />
participating in longitudinal case studies throughout their two to three-year programme.<br />
Semi-structured interviews are tape-recorded after submission of the practice assessment<br />
documents at the end of each year, and students are invited to add any further contributions<br />
during the year as they see fit. Single-case and cross-case analysis and synthesis is being<br />
conducted using the “Framework technique” (Ritchie and Spencer 1994).<br />
Findings: Analysis of transcripts from the first two years has resulted in the identification of<br />
key themes: Purpose, Process, and Guidance. Practicalities of the methods used and the students’<br />
perception of the purpose of assessment have been discussed. The role of the practice<br />
assessor and the placements themselves have been identified as key areas. An interesting<br />
sub-theme around honesty and integrity – “cheating the system” – has emerged as an issue<br />
of importance. This is being explored further in view of the future professional roles of the<br />
students. Information gained has already informed delivery and structure of some of the<br />
professional programmes and their practice assessment methods. Reports on the findings<br />
may be accessed on the Ceppl website at: www.placementlearningcetl.plymouth.ac.uk.<br />
Journal publication is in progress.<br />
Measuring variance and improving the reliability of<br />
criterion based assessment (CBA): towards the perfect OSCE<br />
Richard Fuller, Matthew Homer, Godfrey Pell<br />
University of Leeds, United Kingdom<br />
Background<br />
Assessment methodologies have increasingly come under the spotlight with respect to both<br />
reliability and validity. In healthcare settings, the traditional unstructured ‘long and short<br />
cases’ have given way to the OSCE (Objective Structured Clinical Examination) where<br />
students undertake a series of short clinical assessments which are objectively assessed<br />
against predetermined criteria. The OSCE is a prime example of CBA in health care<br />
programmes, allowing careful blueprinting, spread of domains, clarity of assessment mark<br />
sheets, standard setting and metrics to look thoughtfully at the performance of the<br />
assessment.<br />
CBA has a number of obvious weaknesses, typically in that:<br />
- item-based checklists can highly reward a scattergun approach by candidates;<br />
- it can be difficult to reward better performers;<br />
- there is strong reliance on assessor behaviour despite item-based checklists;<br />
- they are labour-intensive and costly;<br />
- tensions exist between (face) validity on the one hand and standardisation and<br />
reliability on the other.<br />
A variety of metrics can be used in the process of defining, exploring and correcting error<br />
variance (variance in marks due to factors other than student performance). This paper<br />
explores Leeds’ experience and research in this area, defining measures for error variance<br />
and methods of reducing variance whilst maintaining strong clinical validity.<br />
Summary of work<br />
This paper will provide a brief overview of the OSCE process, and analysis of final-year<br />
results from recent years will be presented. We have found that between-assessor variance<br />
(the proportion of checklist mark/grade variance attributable to assessors out of the total<br />
mark/grade variance) in many cases exceeded 25% and in some cases exceeded 40%.<br />
Interpretation of the raft of station metrics allowed us to identify causes of both random and<br />
systematic error.<br />
This paper looks at issues such as assessor training, gender interactions and checklist<br />
structure, and shows how these issues were addressed to reduce the mean station<br />
variance to below 20%.<br />
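The between-assessor metric described above can be illustrated with a naive variance split on a fully crossed marks table (students in rows, assessors in columns); the data are invented, and the calculation ignores the interaction terms and sampling corrections a full generalizability analysis would apply.<br />

```python
# Checklist marks for 6 students assessed by 3 assessors (hypothetical data,
# built so the assessor share comes out near the mid-range figures reported).
marks = [
    [12, 15, 13],
    [13, 16, 14],
    [14, 17, 15],
    [15, 18, 16],
    [16, 19, 17],
    [17, 20, 18],
]

n_students = len(marks)
n_assessors = len(marks[0])
all_marks = [m for row in marks for m in row]
grand_mean = sum(all_marks) / len(all_marks)

# Between-assessor variance: spread of the assessor (column) means.
assessor_means = [sum(row[j] for row in marks) / n_students
                  for j in range(n_assessors)]
between_assessor = sum((m - grand_mean) ** 2 for m in assessor_means) / n_assessors

# Total variance of all marks around the grand mean.
total = sum((m - grand_mean) ** 2 for m in all_marks) / len(all_marks)

print(f"assessor share of variance: {between_assessor / total:.0%}")
# → assessor share of variance: 35%
```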
Conclusions<br />
Tensions between reliability and validity continue to be important in complex CBA<br />
arrangements. This philosophical tension does have demonstrable effects – and we can<br />
use variance to examine this, and the impact of changes.<br />
Despite our best efforts, between assessor variance persists, perhaps as a result of varying<br />
perceptions of appropriate ‘standards’ for students at different stages of their courses,<br />
because of varying levels of assessor maturity and confidence in dealing with checklist<br />
items. Whilst we have made significant improvements by addressing specific issues<br />
detailed in this paper, it is important to recognise that error variance in complex, high stakes<br />
criterion based assessment remains an ongoing challenge.<br />
Learning through assessment and feedback:<br />
implications for adult beginner distance language learners<br />
Concha Furnborough, The Open University, United Kingdom<br />
Feedback on marked assignments is an important element in the learning process,<br />
especially in distance learning, where it can provide students not only with a measure of<br />
their progress but also with individualised tuition (Cole et al., 1986), and may be the sole<br />
channel for student-tutor communication (Ros i Solé & Truman, 2005: 88). Feedback also<br />
makes an important contribution to motivation (Walker & Symons, 1997: 16-17). This paper<br />
reports specifically on distance learner perceptions of positive tutor feedback, together with<br />
cognitive and affective responses they may generate.<br />
One of the challenges of studying a language at a distance is managing interpersonal and<br />
communicative aspects of language acquisition (Sussex, 1981: 180); in the Open University<br />
(UK) students are offered a supported distance course, which includes tutor feedback on<br />
assignments. In this model feedback has a dual function, being used for both formative and<br />
summative assessment purposes. Anecdotal evidence suggests that students attach far<br />
greater importance to the latter than to the former, although it can be argued that learning<br />
occurs when students perceive feedback not simply as a judgement on their level of<br />
achievement but as enabling learning (Maclellan, 2001, cited in Weaver, 2006: 380-381).<br />
Learning depends not only on the quality of the feedback but also on students’ responses to<br />
it, according to how they interpret it.<br />
The research presented here is part of a larger study on motivation that gathered data<br />
through questionnaires and interviews. These findings draw mainly on data obtained from<br />
56 telephone interviews with students of Spanish, French and German at the midpoint of<br />
their courses. The interviews covered themes associated with motivation, including<br />
approaches to distance language learning, support in language learning, confidence and<br />
progress; so they enabled us to situate learner perspectives on tutor feedback in the context<br />
of their views on other aspects of their learning.<br />
Our results suggest that this concept of feedback as a learning tool is especially important<br />
for beginner language learners in distance learning settings, and one that also acts as a<br />
vehicle for increasing their self-confidence – an important consideration in terms of<br />
motivation maintenance. We would also argue that some learners in this category need little<br />
help in discovering how to use feedback to these ends, whereas others require<br />
considerable support, guidance and encouragement.<br />
Although our target group was beginners in a distance learning context, the findings may<br />
also be applicable to other levels and learning contexts.<br />
Practice-related discussion<br />
Suggested areas for discussion are:<br />
• implications for raising learner awareness of the teaching and learning function of<br />
feedback;<br />
• training of tutors to be aware of students’ needs in terms of feedback;<br />
• the potential of feedback to engage students in active learning, and to enhance their<br />
self-confidence and motivation.<br />
Secret scores: Encouraging student engagement with useful feedback<br />
Stuart Hepplestone, Sheffield Hallam University, United Kingdom<br />
This short paper session will discuss the use of technology to provide useful feedback to<br />
students. It explores the development of, and presents initial findings from ongoing<br />
research into the practical experience and impact of, two separate yet complementary<br />
tools at Sheffield Hallam University (SHU) that enhance the way feedback can be provided<br />
to students and encourage students to engage with their feedback through the institution’s<br />
virtual learning environment, Blackboard.<br />
Students at SHU are increasingly expecting access to their feedback and marks online,<br />
often remarking on the usefulness of online feedback as a way to track their progress on<br />
different assessment tasks for their modules. To meet these rising expectations, the<br />
University undertook a project to enhance the way feedback can be provided through<br />
Blackboard (Hepplestone & Mather 2007). A key aspect of this project was the development<br />
of a customised assignment handler extension which supports effective online feedback<br />
through the Blackboard Gradebook by enabling tutors to batch upload feedback file<br />
attachments along with student marks, providing feedback on group assignments to each<br />
individual in the group, presenting student feedback all in one place and close to their<br />
learning, and encouraging students to engage with their feedback to trigger the release of<br />
their marks (after Black & Wiliam, 1998, who argued that the “effects of feedback was<br />
reduced if students had access to the answers before the feedback was conveyed”). (A<br />
poster presentation, Useful feedback and flexible submission: Designing and implementing<br />
innovative online assignment management, accompanies this short paper to explore the<br />
development process).<br />
Accompanying this development is an electronic feedback wizard. This tool allows tutors to<br />
quickly generate consistent individual feedback documents for an entire student cohort specific<br />
to each assignment created in Blackboard from a generic feedback template containing a<br />
matrix of assessment criteria and feedback comments (Hepplestone & Mather, 2007). This<br />
initiative stems from various systems developed and used by individual colleagues at SHU,<br />
paralleling the work of Denton (2001) who developed a technique using a combination of<br />
Microsoft Excel and Microsoft Word to generate personalised feedback sheets.<br />
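The matrix-driven generation that Denton's technique and the SHU wizard perform can be sketched in a few lines; the criteria, comment bank, and band scores below are entirely invented for illustration.<br />

```python
# Hypothetical comment bank: for each criterion, one comment per band (0=low, 2=high).
comment_bank = {
    "structure": ["Needs a clearer structure.",
                  "Generally well organised.",
                  "Excellent, logical structure."],
    "referencing": ["Referencing is incomplete.",
                    "Referencing is mostly accurate.",
                    "Referencing is exemplary."],
}

def feedback_sheet(name, bands):
    """Assemble a personalised feedback sheet by looking up each
    criterion's band score in the shared comment matrix."""
    lines = [f"Feedback for {name}:"]
    for criterion, band in bands.items():
        lines.append(f"- {criterion}: {comment_bank[criterion][band]}")
    return "\n".join(lines)

print(feedback_sheet("Student A", {"structure": 2, "referencing": 1}))
```

The appeal of the approach is consistency: every student's sheet draws on the same criteria matrix, while the per-student band scores individualise the result.<br />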
SHU is a large UK University with over 28,000 students, based across three campuses,<br />
offering a diverse range of undergraduate and postgraduate courses.<br />
References<br />
Black, P. and Wiliam, D. (1998) Assessment and classroom learning. Assessment in Education, 5 (1), pp.7-74.<br />
Denton, P. (2001) Generating Coursework Feedback for Large Groups of Students Using MS Excel and<br />
MS Word, [online]. University Chemistry Education, 5 (1), pp.1-8. Last accessed 12 February 2008<br />
at: http://www.rsc.org/pdf/uchemed/papers/2001/p1_denton.pdf<br />
Denton, P. (2001) Generating and e-Mailing Feedback to Students Using MS Office, [online] In: Proc. 5th<br />
International Computer Assisted Assessment Conference, Loughborough, 2-3 July 2001. Learning<br />
and Teaching Development, Loughborough University. Last accessed 12 February 2008 at:<br />
http://www.caaconference.com/pastConferences/2001/proceedings/j3.pdf<br />
Hepplestone, S. & Mather, R. (2007) Meeting Rising Student Expectations of Online Assignment<br />
Submission and Online Feedback, [online] In: Proc. 11th Computer-Assisted Assessment<br />
International Conference 2007, Loughborough, 10-11 July 2007. Learning and Teaching<br />
Development, Loughborough University. Last accessed 12 February 2007 at:<br />
http://www.caaconference.com/pastConferences/2007/proceedings/Hepplestone%20S%20Mather%<br />
20R%20n1_formatted.pdf<br />
Large-Scale Assessment and Learning-Oriented Assessment:<br />
Like Water and Oil or new Possibilities for Future Research Directions?<br />
Therese Nerheim Hopfenbeck, University of Oslo, Norway<br />
For better or worse, large-scale assessments seem to be here to stay. Surveys such as the<br />
Programme for International Student Assessment (PISA) have had a huge impact on<br />
national educational policy in several countries, and will probably continue to do so.<br />
The aim of the current work is to bridge the gap between the fields of educational<br />
psychology concerned with learning-orientated assessment and the field of large-scale<br />
assessment and the need for policy relevant data.<br />
The present paper makes two arguments. First, I will argue that, despite critiques, large-scale<br />
assessments offer valuable information to the field of educational research. They can<br />
play a valuable role in the development of comprehensive assessment systems that also<br />
include learning-oriented assessment.<br />
Secondly, questionnaires in large-scale assessments such as PISA can be used in<br />
combination with small-scale research, such as interviews, to investigate in depth<br />
some of the main findings from large-scale assessment. Bringing qualitative small-scale<br />
research together with large-scale assessment can lead to improvements in the research<br />
methods used for improving classroom assessment.<br />
Using a mixed method approach, combining quantitative findings from PISA 2006 with<br />
qualitative data from an interview study in Norway, descriptions of students’ self-beliefs of<br />
learning, achievement and assessment will be presented. I will show how such studies<br />
might be carried out and contribute to a deeper understanding of assessment. The<br />
relevance of the current research is based upon a review of large-scale assessment and<br />
its policy influence after 1970, together with the research-based principles from the<br />
Assessment Reform Group (2002).<br />
In addition to the quantitative material from the PISA 2006 test, the empirical base for the<br />
discussion includes comparisons between the low achieving students and high achieving<br />
students on the following factors:<br />
• How students experienced the PISA test<br />
• Task format<br />
• Schools preparation for the PISA test<br />
• Students’ test motivation<br />
• Different assessment cultures<br />
Together, these mixed-method approaches offer a thick description of students’ experience<br />
of large-scale assessment, its consequences and challenges.<br />
Finally, suggestions for combining large-scale assessment with classroom assessment are<br />
made, in an attempt to further empower students in their learning process and to help them<br />
develop as self-regulated learners, or what PISA calls “learners for tomorrow's world”, who<br />
are able to monitor their own learning.<br />
Online interactive assessment for open learning<br />
Sally Jordan, Philip Butcher, Arlëne Hunter<br />
The Open University, United Kingdom<br />
This paper describes recent developments in the formative, summative and diagnostic use<br />
of e-assessment at the UK Open University, in particular the development of interactive<br />
computer marked assignments (iCMAs). These are being introduced within a coordinated<br />
initiative that is extending the richness of e-assessment tasks within an integrated and<br />
supported pedagogical model.<br />
The iCMAs include many different question types, some of considerable complexity and<br />
involving elements of constructed learning. It is widely recognised that rapidly received<br />
feedback on assessment tasks has an important part to play in underpinning student<br />
learning, encouraging engagement and promoting retention (see for example Rust et al,<br />
2005, op cit; Yorke, 2001). Online assessment provides an opportunity to give virtually<br />
instantaneous feedback. However, providing automatically generated feedback which is<br />
targeted to an individual student’s specific misunderstandings is more of a challenge,<br />
especially in response to answers entered as free-text. Students are allowed three attempts<br />
at each iCMA question, with tailored and increasingly detailed prompts allowing them to act<br />
on the feedback whilst it is still fresh in their minds and so to learn from it (Gibbs and<br />
Simpson, 2004, op cit). Feedback can also be provided on the student’s demonstration of<br />
learning outcomes developed in the preceding period of study.<br />
Evaluation methodologies have included student observation, comparisons against human<br />
marking and a ‘success case method’ approach. Preliminary results indicate that the<br />
systems are robust and accurate in marking, that students enjoy the iCMAs (even when<br />
used summatively) and that they usually engage with the feedback provided. The system<br />
automatically collects information about student interactions, enabling the tracking of<br />
individual students’ progress (and if necessary the provision of additional support) as well<br />
as wider-ranging insights into students’ understanding of the course material.<br />
The formative capabilities of computer based assessment tasks such as those described<br />
are of particular importance in distance learning contexts, because of the ability to mimic a<br />
‘tutor at the students’ elbow’, irrespective of the geographical location of the learner and<br />
tutor (Ross, Jordan and Butcher, 2006). They enable dialogue about standards of<br />
achievement and act as a proxy for the immediately accessible learning community enjoyed<br />
by face-to-face students.<br />
Although the paper emphasises the open and distance learning context, we will encourage<br />
discussion of wider applicability. We share the view that e-assessment has the potential to<br />
‘significantly enhance the learning environment’ (Whitelock and Brasher, 2007), and will<br />
seek to challenge perceptions of e-assessment as being of limited validity and relevance. In<br />
so doing we will explore reasons for its relatively low uptake.<br />
References<br />
Ross, S.M., Jordan, S.E. and Butcher, P.G. (2006) Online instantaneous and targeted feedback for remote<br />
learners. In Innovative assessment in Higher Education ed. Bryan, C & Clegg, K.V., pp123-131<br />
London U.K., Routledge.<br />
Whitelock, D and Brasher, A (2006) Roadmap for e-assessment. JISC. At<br />
http://www.jisc.ac.uk/elp_assessment.html [accessed 1st February 2008].<br />
Yorke, M (2001) Formative assessment and its relevance to retention, Higher Education Research &<br />
Development, 20(2), 115-126.<br />
Can inter-assessor reliability be improved by deliberation?<br />
Per Lauvås, Østfold University College, Norway<br />
Gunnar Bjølseth, Østfold University College, Norway<br />
Anton Havnes, University of Bergen, Norway<br />
From previous studies it is evident that inter-assessor reliability varies from nearly zero to<br />
almost complete match. However, the reliability often seems to be lower than what is<br />
considered acceptable, at least when expressive assignments are considered. In one health-<br />
related study programme, several indications (e.g. from handling appeals) raised serious<br />
concerns as to marker reliability after the number of assessors had been cut back. Standard<br />
procedure is for the assessors to assign a mark and produce a written justification for easy<br />
processing when students use their legal right to receive feedback.<br />
The final summative assessment is an integrative, across-the-modules home examination<br />
where students are assigned a thematic field and required to choose perspectives and<br />
cases based on their own priorities and experience. The teaching is organised in theme-<br />
specific modules while the final assignment is integrative. All teachers are involved in the<br />
final assessment, and an assessor will mark assignments that are close to or further away<br />
from his or her field of expertise.<br />
The Department of Nursing Education decided to run six workshops (full or half day) for all<br />
academic staff involved in the bachelor programme over a one year period. Prior to each<br />
workshop a set of authentic, recent student papers were distributed to all teachers (i.e.<br />
internal assessors) for individual, independent marking and with the requirement to produce<br />
the written justification (feedback) to support the mark. Student papers covered all three<br />
years of the Bachelor programme, as well as the whole 6 step range of marks (A to Fail).<br />
The workshops had three parts: (a) thematic introduction, (b) deliberations in groups to<br />
arrive at a conclusion as to mark assigned to a specific student paper, and (c) recording of<br />
results from all groups and a subsequent plenary summary and discussion. Individual and<br />
group assessments (grades and justifications) were collected, analysed and fed back to the<br />
participants, also serving as background for selecting an assessment approach to be tested<br />
out in the next workshop.<br />
Emphasis was placed on the assessors’ interpretation and application of assessment<br />
criteria. The question to be scrutinised was whether a systematic process of collegial<br />
deliberations over the assessment of authentic student papers in relation to assessment<br />
criteria and feedback/justification would result in improved inter-assessor reliability. The<br />
‘assessment of the assessors’, conducted in the final workshop (Oct. 2007), showed that<br />
assessment reliability had improved, but only marginally; the variation between individual<br />
assessors’ grading and valuing of the quality of students’ assignments still fell short of the<br />
standards considered appropriate by the faculty. It does seem to be the case, however,<br />
that the written justifications (‘feedback’) of the given marks did change.<br />
In the paper, the background for the project will be elaborated, and the process and results<br />
will be analysed and discussed: Is it realistic to improve inter-assessor reliability to an<br />
acceptable level by deliberation among colleagues? What challenges does the assessment<br />
of integrative, expressive assessment tasks represent, and how could they be met?<br />
ENAC 2008 115
Sketchbooks and Journals: a tool for challenging assessment?<br />
Paulette Luff, Anglia Ruskin University, United Kingdom<br />
Gillian Robinson, Anglia Ruskin University, United Kingdom<br />
This paper highlights aspects of our experience as exploratory practitioners researching the<br />
use and the value of sketchbooks and learning journals as a form of assessment. We report<br />
our developing understandings of ways in which these can support and extend students’<br />
learning within the context of an Art, Design, Technology module and an Early Childhood<br />
Curriculum module, both for undergraduate students of education. Within our Early<br />
Childhood Studies (ECS) and Primary Education BA courses we emphasise approaches to<br />
young children’s education informed by socio-cultural theories. This promotes a view of<br />
learning which stresses the importance of shared meaning making and the co-construction<br />
of knowledge. Accordingly, we draw upon the Vygotskian concept of pedagogical tools,<br />
mediating and extending knowledge construction, and emphasise a close relationship<br />
between means of assessment and student learning. Sketchbook research journals have<br />
been used as part of the assessment for the Art, Design Technology and Control<br />
Technology module for several years. The module is delivered through lectures, practical<br />
workshops, ICT workshops, and self- and tutor-directed learning over a period of 12 weeks.<br />
Students are challenged to make and programme a 3D working model based on a work of<br />
art, to create a teaching aid that makes effective use of cross-curricular approaches. They<br />
use the sketchbook learning journal to maintain a record of thinking and decision making<br />
during the development of this project. The Early Childhood Curriculum module is studied<br />
over 24 weeks with sketchbook learning journals used to capture and explore<br />
understandings of this topic (from lectures, workshops, fieldwork and wider reading). Our<br />
project is, therefore, based upon a multiple case study design, apt for monitoring and<br />
explaining educational practices (Sanders, 1981; Merriam, 1998). Most data gathering is<br />
integrated into the module programmes with the sketchbook journals themselves forming<br />
important sources of qualitative data, together with staff and students’ reflection on the<br />
processes. Evidence, from our initial analysis of findings, indicates that sketchbook learning<br />
journals can provide a means for students to capture, synthesise, reflect upon and critique<br />
their learning. By making learning visible they also offer a rich source for assessing the<br />
processes of student learning and assisting our understandings and development as<br />
teachers. In considering sketchbooks as challenging assessment tools, we address the<br />
ways that using sketchbooks challenges traditional forms of summative assessment by<br />
requiring that students show their developing thinking and learning throughout a module,<br />
with built-in opportunities for formative tutor, peer and self-assessment. There is also the<br />
challenge of some clash of philosophy as, although we are advocating a constructivist<br />
approach to learning, in our current system all students must achieve pre-set module<br />
learning outcomes. It is also challenging for students, as material has to be synthesised and<br />
documented in ways that communicate their ideas and demonstrate higher-order thinking,<br />
and this must be sustained throughout a module. We anticipate that these points may prove<br />
fruitful for discussion.<br />
Evaluating the use of popular science articles for assessing<br />
high school students<br />
Michal Nachshon, Ministry of Education, Israel<br />
Amira Rom, The Open University, Israel<br />
Alternative assessment is a way to assess students' achievements whereby teachers<br />
assess students through authentic tasks in which the student is required to formulate the<br />
problem; tasks which allow different solutions and provide the opportunity to reflect on the<br />
learning process. The majority of students in Israel graduate from high school after completing at<br />
most one year of science studies. For these students, a new program, Science for All, is<br />
now being offered at the high-school level as an alternative to the traditional natural science<br />
courses. This program encourages the teaching of science in a more thematic way,<br />
integrating the different scientific disciplines and aspects of technology. The intention is to<br />
expose all students to scientific principles and, consequently, to extend their understanding<br />
of them.<br />
Popular science articles, published in a variety of newspapers and magazines, can be a<br />
powerful tool to help students connect what they learn in school to current scientific and<br />
technological advancements. Further, popular science articles can provide opportunities for<br />
students to read critically, discuss issues and reach decisions based on their knowledge of<br />
science.<br />
The purpose of the study was to evaluate the use of authentic tasks, based on popular<br />
science articles for assessing Science for All students. The results presented in this<br />
summary are part of an ongoing longitudinal study of the use of popular science articles in<br />
instruction and assessment.<br />
The sample consists of 57 teachers in 40 schools nationwide. At the end of the school year<br />
Science for All teachers were asked to choose a popular science article and use it for the<br />
development of an assignment including scoring rubrics. They were then asked to send us<br />
the assignment, the scoring rubrics, and a sample of three students’ work corresponding to<br />
excellent, medium and poor grades. In addition, teachers filled in a written questionnaire in<br />
which they were asked to characterize the assignment they developed and reflect on their<br />
experience.<br />
Specifically, teachers were asked to identify the learning goal assessed by the assignment,<br />
including both concepts and skills; to specify the cognitive levels required by each item in<br />
the assignment; and to indicate which abilities of multiple intelligences are represented in<br />
their assignment.<br />
Two independent expert teachers evaluated each assignment and sample student work.<br />
Next, assessment experts reviewed these evaluations and summarized the strengths and<br />
weaknesses of each assignment. At the end of the process, each teacher received written<br />
feedback and discussed this feedback with his or her assigned expert teacher.<br />
Our findings show that teachers include both low- and high-level cognitive questions in the<br />
assignments. With respect to multiple intelligences, teachers tend to include tasks that<br />
require linguistic and logical abilities but not other abilities. In addition, three main difficulties<br />
were identified: Teachers had trouble identifying valid learning goals; in some cases,<br />
teachers failed to recognize the skills that were assessed; and typically, the scoring rubrics<br />
did not match the learning goals the teachers intended to assess. We believe that this<br />
process can help teachers become more knowledgeable about the desired characteristics<br />
of assessment.<br />
Supporting student intellectual development through assessment design:<br />
debating ‘how’?<br />
Berry O'Donovan, Oxford Brookes University, United Kingdom<br />
Margaret Price, Oxford Brookes University, United Kingdom<br />
Prior research suggests that students move through stages of intellectual development<br />
(whilst in higher education) in which their beliefs about the nature of knowledge and learning<br />
change and grow in complexity. The best known of these models is probably Perry’s (1970),<br />
alongside the influential work of Belenky et al. (1986), King and<br />
Kitchener (1994) and Baxter Magolda (1992). However, the literature is less clear about<br />
how such intellectual development can be triggered and encouraged through assessment<br />
and learning activities.<br />
Vygotsky’s (1978) seminal work on social constructivism and ‘zones of proximal<br />
development’ conceptualises learning development in incremental terms: students<br />
advance to nearby learning positions that some of their peers already hold and share with<br />
them. Arguably, this suggests tutors and assessment designs should provide the cognitive<br />
scaffolding that would support collaborative, incremental and seemingly comfortable<br />
development.<br />
However, other perspectives on intellectual development such as Meyer and Land’s (2003)<br />
work on threshold concepts and the narratives within Baxter Magolda’s (1992) work on<br />
intellectual development paint a less comfortable picture. Meyer and Land posit that there<br />
are disciplinary concepts that once understood by students lead to new and previously<br />
inaccessible ways of thinking. Such intellectual movement involves students entering into a<br />
‘liminal space’ where they have moved out of familiar cognitive territory into a zone of<br />
disorientation where existing certainties are rendered problematic before they can cross the<br />
threshold into a new landscape of understandings. Baxter Magolda’s (1992) narratives also<br />
contain student reflections on critical incidents that seemingly thrust them uneasily up the<br />
intellectual development ladder, revealing the development as sometimes both erratic and<br />
disquieting.<br />
So what does this mean for assessment? Taking a Vygotskian approach may involve<br />
adopting an assessment design involving low stakes, scaffolded, collaborative assessment<br />
activity that allows for ‘slow learning’ (Claxton, 1998 cited in Knight and Yorke, 2002). In the<br />
initial stages of an undergraduate degree this may also involve designing assessment tasks<br />
that align with lower level epistemological beliefs, i.e. content focused assessment that<br />
reflects factual material verified by an authority.<br />
Alternatively, if we consider intellectual development as uneven and inconsistent, and take<br />
the stance that students need to confront ‘troublesome knowledge’ (Meyer and Land, 2003)<br />
and make disquieting intellectual leaps that cross learning thresholds, then what<br />
assessment designs would we choose? Arguably, such a stance may involve assessment<br />
designs that involve: an ‘unfreezing’ process (Lewin, 1951) to provoke students out of<br />
current comfortable orientations; assessment tasks that ‘problematise’ the subject (Grey et<br />
al., 1996); student discomfort; and tasks that provoke higher-order epistemological stances.<br />
The discussion will explore the nature of intellectual development, including diverse<br />
disciplinary epistemologies, and the implications for assessment design. To support<br />
participants unfamiliar with the literature, discussion will be seeded by illustrative quotes<br />
from the literature and practical examples taken from a large scale qualitative study of<br />
students’ epistemological beliefs undertaken at Oxford Brookes.<br />
Assessment contexts that underpin student achievement: demonstrating effect<br />
Berry O'Donovan, Oxford Brookes University, United Kingdom<br />
Margaret Price, Oxford Brookes University, United Kingdom<br />
A large scale study in the US that examined over 25,000 students and over 190<br />
environmental variables found that the key influence on student success is student<br />
involvement fostered by student/student and student/faculty interaction (Astin, 1997). Such<br />
findings have been corroborated by smaller scale unpublished studies in the UK (Holden,<br />
2008). Taking a social constructivist approach to the classroom and the use of interactive<br />
teaching strategies has been well documented in the literature (Vygotsky, 1978). Less well<br />
documented is the effect of intentionally increasing opportunities for student/student and<br />
staff/student interaction outside the classroom (O’Donovan et al., 2008).<br />
The ASKe (Assessment Standards Knowledge exchange) Centre for Excellence, based at<br />
Oxford Brookes University in the UK, has for the last two years been attempting to cultivate<br />
students’ and staff’s sense of community within one School situated on a satellite campus<br />
which attracts significant numbers of undergraduates, often taught in large classes. Within<br />
this learning context, described by many students as ‘impersonal’ (Price et al., 2007), the<br />
Centre has developed initiatives that intentionally involve students with the academic<br />
community outside the formal classroom. Initiatives include: peer-assisted learning in which<br />
more advanced students help others with their learning; modular leader assistantship in<br />
which students help academics with their teaching preparation and organisation; students<br />
as co-researchers; and students allowing staff insight into their experience through the<br />
medium of audio diaries.<br />
Whilst these initiatives have been evaluated as very successful from both student and staff<br />
perspectives, evidencing an effect on student learning through their assessed performance<br />
is proving very tricky. As Graham Gibbs (2002) states, there is a real absence in most<br />
pedagogic research of hard evidence of improvement to student learning. The roundtable<br />
discussion will commence with Gibbs’ fundamental question of whether qualitative<br />
evidence demonstrating student and staff appreciation of, and belief in, the effects of such<br />
initiatives is sufficient. After that, possible methodologies for evidencing cause and effect<br />
between such individual initiatives and students’ assessed performance, within a context of<br />
an ever-changing learning landscape, will be discussed and debated.<br />
References<br />
Astin, A. (1997). What Matters in College? Four Critical Years Revisited. San Francisco: Jossey-Bass.<br />
Gibbs, G. (2002). ‘Ten years of Improving Student Learning’. Improving Student Learning Theory and<br />
Practice 10 Years On. Improving Student Learning 10, Berlin, September.<br />
Holden, G. (2008). ‘The Importance of Feedback’. Assessment for Learning: How Does That Work?<br />
HEA/Northumbria Workshop, Newcastle, February.<br />
Price, M., O’Donovan, B. & Rust, C. (2007). Building community: engaging students within a disciplinary<br />
community of practice. ISSOTL ‘Locating Learning’, Sydney, July.<br />
O’Donovan, B., Price, M. & Rust, C. (2008). Developing student understanding of assessment standards.<br />
Teaching in Higher Education, vol. 13, no. 2, pp. 205-217.<br />
Vygotsky, L. S. (1978). Mind in Society: The Development of Higher Psychological Processes. Cambridge,<br />
MA: Harvard University Press.<br />
In-classroom use of mobile technologies to support formative assessment<br />
Ann Ooms, Timothy Linsey, Marion Webb<br />
Kingston University, United Kingdom<br />
The paper presents the findings of a research project on in-classroom use of mobile<br />
technologies to support diagnostic and formative assessment. The research project<br />
addressed the following questions:<br />
1. Under which conditions can each of the technologies be efficiently and effectively used<br />
for diagnostic / formative assessment in classroom settings?<br />
2. What is the impact of the in-classroom use of mobile technologies for<br />
diagnostic/formative assessment on students’ attitudes toward the module?<br />
3. What is the impact of the in-classroom use of mobile technologies for<br />
diagnostic/formative assessment on students’ conceptual understanding?<br />
4. What is the impact of the in-classroom use of mobile technologies for<br />
diagnostic/formative assessment on students’ test results?<br />
5. What is the impact of the project on teaching practices? How likely is it that that impact, if<br />
there is any, will be sustained?<br />
6. What is the impact of the project on assessment practices? How likely is it that that<br />
impact, if there is any, will be sustained?<br />
7. What is the impact of the project on attitudes towards in-classroom use of mobile<br />
technologies? How likely is it that that impact, if there is any, will be sustained?<br />
8. What indicators are there of institutional commitment to, and subsequent uptake of,<br />
in-classroom use of mobile technologies?<br />
Thirteen academic staff members from seven different faculties within one university used a<br />
range of mobile technologies, such as electronic voting systems, mobile phones, Tablet<br />
PCs, interactive tablets and iPods, to support rapid feedback. Two mentors supported and<br />
assisted the academic staff.<br />
A mixed-methods approach was used to collect data from academic staff<br />
(questionnaires, interviews, reflective journals), students (questionnaires, focus groups),<br />
and mentors (interviews, reflective journals). In addition, attendance records, assessment<br />
strategies, assessment tools and assessment records were compared with those from the<br />
previous year.<br />
The Devil's Triad:<br />
The symbiotic link between Assessment, Study Skills and Key Employability Skills<br />
Jon Robinson, Northumbria University, United Kingdom<br />
David Walker, Northumbria University, United Kingdom<br />
Student reaction to assessment, study skills and the idea of being taught graduate<br />
employability is typically negative within Higher Education Institutions. Yet, all three now<br />
have to be considered and included by those involved in the curriculum design of<br />
programmes and modules within the University of Northumbria. This is particularly<br />
problematic for non-vocational subjects, such as those typical of the Humanities. In the<br />
English Division at Northumbria we have redesigned the core first-year module for English<br />
students in a way that symbiotically links assessment, study skills and employability within a<br />
framework underpinned by the theory and practice of Assessment for Learning (AfL).<br />
This roundtable presentation will outline, and open up for in-depth discussion, the approach<br />
taken by the curriculum team, from both a theoretical and practical perspective, when<br />
designing the module assessment to link with study and employability skills. It will also<br />
present the initial findings of research into the effectiveness of the innovation in curriculum<br />
design. The overarching intention of the presentation will be to create dialogue and explore<br />
avenues for collaboration with participants at the conference, from different countries and<br />
who hold different perspectives, in order to facilitate further development of our work and<br />
create an opportunity for the exchange of ideas.<br />
Learning-oriented assessment and students’ experiences<br />
Ann Karin Sandal, Margrethe H. Syversen, Ragne Wangensteen<br />
Sogn and Fjordane University College, Norway<br />
Kari Smith, University of Bergen, Norway<br />
This presentation reports part of an ongoing project at the Sogn and Fjordane University<br />
College funded by the Norwegian Research Council. The aim of the project is to examine<br />
students’ experiences with the transition from primary to secondary school. An important<br />
issue is to investigate how portfolio assessment can be supportive for making choices and<br />
motivate for lifelong learning. The comprehensive research project focuses on how students<br />
are prepared to choose programmes in secondary schools and prepare for choosing a<br />
future profession through the subject “Elective programme subjects”. This new programme<br />
was introduced in primary schools together with the curricula reform “Knowledge Promotion”<br />
in 2006. Its main aim is to prevent mistaken choices and dropout, and to help the students<br />
make a good choice.<br />
In the current study we examine how formative assessment influences students’ beliefs and<br />
plans for further education, and to what extent assessment, through digital portfolios,<br />
enhances consciousness about further education (Klenowski, 2002; Black &amp; Wiliam, 2006;<br />
Harlen, 2006). We investigate how assessment in digital portfolios can support the learning<br />
process and the processes of decision making. We try to identify some consequences of<br />
teachers’ supportive assessment and the students’ experiences with assessment for<br />
learning. The study will follow students choosing vocational education programmes. We are<br />
using both qualitative and quantitative methods in the study.<br />
A questionnaire was sent to 90 students in 3 different schools in their last term in primary<br />
school (age 15). The preliminary findings show variations in the students’ interest,<br />
motivation and consciousness about the choices they are about to make. When asked what<br />
kind of assessment encourages further work and how this can stimulate the learning<br />
process, the students value written comments on their work highly. Together with<br />
oral response, this direct and personal feedback on their work gives the<br />
students improved self-esteem and belief in their capability of learning (Harlen, 2006; Gibbs<br />
&amp; Simpson, 2005). It seems that this type of assessment is important for motivating the<br />
students and creating interest in schoolwork (Hidi &amp; Renninger, 2006).<br />
However, even if the students seem to be motivated for vocational education and practical<br />
activities, some students put effort into the more theoretical subjects.<br />
In order to be admitted to vocational education, good marks are required in the theoretical<br />
subjects, in which the students are not particularly interested. The students spend most of<br />
their time on these subjects in their last year in primary school, whereas, for many of them,<br />
their interest is inspired by the subject preparing them for vocational studies.<br />
This indicates some interesting challenges concerning the students’ motivation and teachers’<br />
assessment practices. How can formative assessment help the students to develop<br />
self-esteem, knowledge, visions and intrinsic motivation for further education? And how are all<br />
these challenges dealt with while using portfolios in formative assessment?<br />
These questions will be followed up by action research in schools and a longitudinal study.<br />
Connecting Research Behaviours with Quality Enhancement of Assessment:<br />
Eliciting Developmental Case Studies by Appreciative Enquiry<br />
Mark Schofield, Edge Hill University, United Kingdom<br />
This paper describes the University’s commitment to systematic enhancement of the student<br />
experience of assessment. This extends beyond quality assurance, juxtaposing research<br />
and development behaviours allied to ‘thicker’ description (Geertz) of complex events in<br />
qualitative, interpretive research approaches with the traditionally ‘thinner’ evaluation<br />
tools characteristic of many university quality assurance systems.<br />
The paper describes the process of a developmental audit across the Faculties of<br />
Education, Health, and Arts and Sciences. Dialogues were conducted through focus<br />
groups to explore staff and student experiences of feedback on assessment, including<br />
those of students with disabilities and specific learning difficulties, and practices were<br />
scrutinised against the SENLEF Principles of Feedback (Student Enhanced Learning through Effective Formative<br />
Feedback). The process also included the elicitation of case studies from staff and students<br />
about their experience of effective feedback on assessment in the form of short writing<br />
activities. These focused on the context of effective feedback, an individual reflection on<br />
why it worked for them, and importantly ideas and guidance for others embarking on trying<br />
similar approaches. As such, this key element of the audit was conducted in the spirit of<br />
Appreciative Enquiry.<br />
Included are reflections on the similarities and differences in these two sets of staff and<br />
student voices and Tag Cloud representations (word frequency analyses) which reveal<br />
dominant and recessive themes in the sample groups. This offers some stark insights into<br />
affective issues related to assessment and feedback, congruence in attitudes and<br />
approaches, and some perhaps unexpectedly astute epistemological insights from students.<br />
The case studies will also be offered (in an abridged form), with commentary related to<br />
effective practices and alignment with the SENLEF principles, presented using the MS Word<br />
comments function and including key questions and challenges arising from the narrative texts.<br />
The full versions will be available via a URL/hyperlink in the paper.<br />
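The Tag Cloud representations described above are, in essence, word-frequency counts over the case-study texts. A minimal Python sketch of such an analysis; the response snippets and the stop-word list are invented for illustration only:

```python
import re
from collections import Counter

def tag_cloud_counts(texts, stop_words=frozenset({"the", "a", "of", "and", "to", "in"})):
    """Word frequencies across a set of free-text responses, minus stop words."""
    words = []
    for text in texts:
        # Lowercase, then keep alphabetic tokens (apostrophes allowed).
        words.extend(w for w in re.findall(r"[a-z']+", text.lower())
                     if w not in stop_words)
    return Counter(words)

# Invented snippets standing in for staff and student case-study texts.
responses = [
    "Timely feedback helped me understand the assessment criteria.",
    "Feedback on drafts made the criteria feel less impersonal.",
]
counts = tag_cloud_counts(responses)
print(counts.most_common(3))
```

The most frequent words become the dominant (largest) entries in the cloud, while low-frequency words surface the recessive themes the paper mentions.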
We argue that such developmental enquiry (research-based activity) has given sightlines<br />
into effective practices, highlighted the importance of perceptions of effective feedback, and<br />
emphasised that the processes embodied in this approach add enhancement layers to<br />
extant, historical, quality systems. These approaches are replicable for use in supporting<br />
other lines of enquiry related to assessment and other learning-related aspects of the<br />
student experience. This enrichment of quality processes is achieved by bringing research<br />
behaviours into close juxtaposition with quality assurance systems of intelligence gathering<br />
and by producing data artefacts that are both of developmental significance (for use with students<br />
in academic induction and in staff development) and influential in policy decision-making<br />
related to dissemination of good practice and systematic enhancement of assessment<br />
practices.<br />
Conceptions of assessment in higher education:<br />
A qualitative study of scholars as teachers and researchers<br />
Elias Schwieler, Stockholm University, Sweden<br />
Stefan Ekecrantz, Stockholm University, Sweden<br />
The researcher’s professional life world is based on explicit and well-reflected, subject-specific<br />
conceptions. These sophisticated conceptions are the result of an extensive formal<br />
education, followed by lifelong, advanced learning through conducting research. As a teacher,<br />
the same individual’s pedagogic life world is often exclusively a result of socialization and<br />
the reproduction of existing traditions. Consequently, the teacher in higher education is<br />
expected to develop knowledge about pedagogic work more or less intuitively, based on far<br />
less articulated and reflected conceptions. Thus, the academic profession of<br />
research/teaching can be said to be founded on two professional extremes. There is a need<br />
for an increased understanding of how such double roles and life worlds are constituted,<br />
and how they relate to each other. In the area of assessment, we argue, an individual<br />
researcher’s/teacher’s double belief systems are particularly visible, making it an important<br />
field of study.<br />
We will present preliminary results from an ongoing interview-based study about this<br />
phenomenon, from three different assessment-related themes:<br />
1) Assessment and personal theories of learning – Subject-specific and generic beliefs on<br />
how, when and why different aspects of a subject need to be learned and assessed are the<br />
foundation for a teacher’s professional world view. These beliefs are studied, in part, as<br />
implicit theories of threshold concepts (Meyer & Land 2006) and views on backwash effects<br />
of assessment.<br />
2) Assessment and normative values – Summative assessment and grading highlight<br />
underlying perceptions of assessment as a means to discipline, punish and reward (Filer,<br />
2000). Also, both students’ and teachers’ workloads peak during the assessment process,<br />
often leading to stress and tension. In such a climate (Biggs, 2007), reflected as well as tacit<br />
professional values are especially important.<br />
3) Methodological and epistemological foundations of assessment – Advanced<br />
epistemological beliefs on knowledge, evidence and scientific method are a vital part of all<br />
academics’ research. The same individuals’ tacit views on assessment epistemology are<br />
often in conflict with those upheld in research. A methodology that would not be considered in<br />
research is frequently used uncritically in the enquiry of student learning.<br />
Our aim is, specifically, to include individual inconsistencies, contextual issues, conceptual<br />
discrepancies and unreflected assumptions by focusing on each teacher’s conceptions of<br />
assessment. In previous research, with its explanatory focus on idealized models, such<br />
complexities are usually seen as residuals that must be excluded in order to maintain a<br />
manageable number of parameters (cf. Prosser et al., 2005). Furthermore, in order to grasp<br />
the intricacies of the interviewees' scientific as well as subject specific conceptions, we have<br />
chosen to study only two epistemic communities, History and English literature.<br />
Innovative Assessment Practice and Teachers’ Professional Development:<br />
Some Results of Austria’s IMST-Project<br />
Thomas Stern, University of Klagenfurt, Austria<br />
IMST (Innovations in Mathematics and Science Teaching) is a long term research and<br />
development project aimed at establishing an effective support system for Austrian schools.<br />
One of seven measures is the IMST-fund for the promotion of innovations in the teaching of<br />
maths, sciences and IT. About 160 teacher teams per year are encouraged to submit<br />
proposals for their classroom innovations, to evaluate both processes and results and to<br />
write reports that are published on the internet. In return they receive intensive individual<br />
counselling and some financial remuneration, and they are invited to several workshops. A<br />
remarkable number of these teachers decide to choose alternative assessment methods as<br />
their classroom innovation and as a field of investigation into their own practice.<br />
A cross-case examination of several school projects focuses on new ways of assessment<br />
that allow the students to some extent to choose their own topics and to keep track of their<br />
learning progress. Two high school teachers, for example, asked their 12-year-old students to record<br />
examples of encounters with mathematics in daily life, and then went about assessing the<br />
sophistication and originality of their reports. A physics teacher let her 16-year-old students<br />
choose their own fields of interest in astronomy and then draw and present posters, which<br />
she assessed in accordance with criteria she had worked out with her class. The study<br />
shows that self-regulated learning has a strong effect not only on the students’ motivation<br />
and interest but also on their proficiency and learning outcomes. Even more<br />
impressive is the repercussion of these teaching innovations on the attitudes of the<br />
teachers themselves. In the course of their projects on changes in their assessment<br />
practices, most of them embarked on a thorough reflection on their teaching priorities, their<br />
beliefs about learning, and their personal perspectives and ambitions as teachers. Both<br />
their autonomous school innovations and their action research studies can be shown to<br />
have boosted their professional development. Changes in their assessment routines turned<br />
out to have an especially strong impact on many aspects of their professional performance<br />
and were often accompanied by an additional commitment to school development and an<br />
overall increase in reflection about professional standards.<br />
Characteristics of an effective approach for formative assessment of teachers’<br />
competence development<br />
Dineke Tigelaar, Mirjam Bakker, Nico Verloop<br />
ICLON-Leiden University Graduate School of Teaching, The Netherlands<br />
Stimulating teachers’ professional development is an important function in assessment of<br />
teaching (Porter, Youngs & Odden, 2001). However, more research is needed into the effects<br />
of teacher assessments on teacher professional learning development (Lustick & Sykes, 2006).<br />
The research is part of a larger research project ‘Effects of different assessment<br />
approaches on teachers’ professional development’. The goal of this postdoctoral research<br />
project is to evaluate and compare the effects of three formative assessment approaches:<br />
(1) an expertise- and feedback-based approach, (2) an approach for self-assessment, and<br />
(3) a negotiated assessment approach. In this research project, the focus is on teachers’<br />
competences for promoting reflective skills of senior secondary vocational students in<br />
health care, i.e. in nursing. Central question: “What are the effects of different formative<br />
teacher assessment approaches on the development of secondary vocational education<br />
teachers’ competences for promoting and formatively assessing students’ reflection skills,<br />
and which combination of assessment design characteristics optimally promotes the<br />
teachers’ competence development?”<br />
Research questions:<br />
1. Which assessment criteria and standards are developed, formulated and used in the three<br />
PhD-projects and which set of (common) criteria and standards can be used for a<br />
representative overall measurement of the participating teachers’ competence development?<br />
2. How do the teachers perceive and value the characteristics of the assessment approaches<br />
(see the characteristics 1 – 3 above) in the projects they participated in, and what are the<br />
results of the overall measurement of the teachers’ competence development (see question 1)?<br />
3. What is the relation between the measured teachers’ overall competence development<br />
and a) the assessment approach characteristics as documented by the PhD-researchers as<br />
well as b) the participating teachers’ perceptions and evaluations of these characteristics?<br />
Tasks of the postdoc:<br />
1. Distillation of the common elements in the criteria and standards for teaching<br />
competences formulated in the PhD-projects using matrices (Miles & Huberman, 1994).<br />
2. Development of instruments for the repeated overall measurement of teachers’<br />
competences, and of the teachers’ (N=88) perceptions and evaluations of the assessment<br />
design characteristics, and more general conditions in the schools for teacher professional<br />
development. Video vignettes will be developed, and teachers will be asked to select<br />
samples of student work. Furthermore, questionnaires will be developed.<br />
3. Organization of the data gathering (in collaboration with the three PhD-researchers).<br />
4. Analyses of the relations between the measured teachers’ overall competence development<br />
and a) the assessment approach characteristics as documented by the PhD-researchers as well<br />
as b) the participating teachers’ perceptions and evaluations of these characteristics and of the<br />
more general relevant conditions. This will be done using qualitative analyses (matrices) and<br />
quantitative analyses (analysis of variance, multiple regression analysis, and multilevel analysis).<br />
5. Development of an optimal combination of design characteristics.<br />
We aim to stimulate both a research- and a practice-related discussion.<br />
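Task 1 above — distilling the common elements in the criteria across the three PhD projects — could be sketched as a simple cross-project intersection. This is an illustrative stand-in for the Miles & Huberman matrix analysis, and the criteria names are invented:

```python
def common_criteria(*project_criteria):
    """Return the assessment criteria shared by all projects,
    sorted for a stable, comparable overview."""
    shared = set.intersection(*(set(c) for c in project_criteria))
    return sorted(shared)

# Invented example criteria for three hypothetical PhD projects:
projects = [
    ["modelling reflection", "questioning", "giving feedback"],
    ["giving feedback", "questioning", "scaffolding"],
    ["questioning", "giving feedback", "stimulating dialogue"],
]
print(common_criteria(*projects))  # ['giving feedback', 'questioning']
```

The shared set would then seed the common instrument in Task 2; criteria unique to one project stay visible in the per-project matrices.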
Posters<br />
Predictive indicators of academic performance at degree level<br />
Andy Bell, Manchester Metropolitan University, United Kingdom<br />
Kevin Rowley, Manchester Metropolitan University, United Kingdom<br />
In the absence of A* grades at A Level, Cambridge University has designed an additional<br />
Admissions selection ‘tool’ – hence UCLES (University of Cambridge Local Examinations<br />
Syndicate) has produced the ‘Thinking Skills Assessment’ Test (TSA Test).<br />
The TSA Test is designed as a ‘knowledge-independent’ measure of the candidate’s ability<br />
to think effectively and critically. This test is composed of two types of questions:<br />
‘problem-solving’ questions, and questions which ‘tap into’ ‘critical thinking’ abilities.<br />
The extent to which the TSA Test is predictive of performance at degree level has yet to be<br />
established. Initial ‘in-house’ research by Cambridge suggests that there is indeed a<br />
significant predictive link between scores on the TSA Test and performance at degree level<br />
for students at Cambridge (Emery et al., 2006; Emery, 2006).<br />
Cambridge University, then, is currently involved in appraising the TSA Test as part of its<br />
Admissions process. If used extensively to select students for a place at Cambridge, such<br />
use would have to be seen as justified – otherwise, it would be unfairly discriminatory. The<br />
present research at the Manchester Metropolitan University (MMU) was designed to add to<br />
the knowledge base concerning the validity of the TSA Test as a predictor of performance<br />
at degree level and, ipso facto, its validity as an Admissions ‘tool’.<br />
Hence, this paper addresses research currently being conducted to examine the extent to<br />
which the TSA Test is predictive of success at degree level at a non-Oxbridge institution<br />
(Manchester Metropolitan University). Four cohorts of first-year Psychology undergraduates<br />
(total N = approx. 350) completed Test L (a research version of the TSA Test). With the<br />
students’ consent, their academic performance was tracked throughout the three years of<br />
their degree-level studies. It was thus possible to examine the extent to which students’<br />
scores on the TSA Test were predictive of degree level performance in examinations and<br />
assessed coursework (ACW).<br />
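The core of such a predictive-validity check is the correlation between an admissions score and later degree marks. A minimal, library-free sketch with invented scores (not the study's data):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented TSA-style scores and final-year marks for five students:
tsa = [52, 61, 47, 70, 58]
degree_marks = [55, 64, 50, 72, 60]
print(round(pearson_r(tsa, degree_marks), 3))
```

In practice the study's question is whether such a coefficient remains sizeable once A Level grades and the other predictors are controlled for, which calls for multiple regression rather than a single bivariate correlation.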
Factors other than the TSA Test – such as personality traits as measured by Quintax<br />
(Stuart Robertson & Associates, 1999); performance at A Level and participants’ scores on<br />
an IQ-type test (Raven’s Progressive Matrices Plus) – were also examined as possible<br />
predictors of students’ success at degree level. Students’ scores on the three sub-scales of<br />
the Approaches & Study Skills Inventory for Students (ASSIST / Entwistle, 2000) were also<br />
established and possible links with academic performance were examined.<br />
In addition to the above, an adaptation of the ‘Big Five’ scale provided on the website of the<br />
International Personality Item Pool (IPIP) is currently being developed (Bell & Rowley,<br />
2008). This is the ‘Big Five for Students’ scale. This will undergo factor analysis and item<br />
analysis. Then students’ scores on this scale will be correlated with their academic<br />
performance at degree level. As this scale is a recent development, this aspect will only<br />
include the 2007-2008 First Year cohort of students (N= 101).<br />
This research will be completed and all data analysis will be conducted in time for<br />
presentation to the EARLI (2008) conference in Berlin.<br />
Online Formative Assessment for Algebra<br />
Christian Bokhove, Utrecht University, Netherlands<br />
Rationale: In the Netherlands – as in many other EU-countries – universities complain about<br />
the algebraic skill level of students coming from secondary school. It is unclear whether<br />
these complaints have to do with basic skills or “actual conceptual understanding”, which<br />
we refer to as symbol sense (Arcavi, 1994). In this research we want to find out how ICT<br />
tools can help with formatively assessing algebraic skills.<br />
Key concepts: Three key topics come together in this poster session, forming the<br />
conceptual framework for our observations: tool use, assessment and algebraic skills.<br />
The first topic concerns acquiring algebraic skills. Here we discern basic skills, for example<br />
solving an equation, but in particular conceptual understanding. Arcavi (1994) calls this<br />
“symbol sense”.<br />
In this case, formative assessment would be appropriate; aimed at assessment for learning.<br />
Assessment contributes to learning and understanding of concepts (Black & Wiliam, 1998).<br />
Feedback plays an important role in formative assessment. On the other hand, it also<br />
remains important to record a student’s progress through scores and results:<br />
summative assessment, assessment of learning. Using both gives ‘the best of both worlds’.<br />
In assessment for learning using ICT tools can be beneficial. ICT tools can help with giving<br />
users feedback, may focus on process rather than result, track results or scores and<br />
provide several ‘modes’, ranging from practice to exam. Thus, assessment is for learning.<br />
Method: In this poster session experiments with an ICT tool called Digital Mathematical<br />
Environment (Bokhove, Koolstra, Heck, & Boon, 2006) are described. Through expert<br />
reviews, one-to-ones and small group experiments we provide a framework on how<br />
formative assessment can support learning for mathematics. We mention:<br />
Using several ‘modes’ of assessment during a sequence of lessons: first practice with more<br />
feedback, gradually more exam-like assessment without feedback.<br />
Emphasis on the process: how does a student reach his/her correct or incorrect answer? This<br />
information can be used in a subsequent lesson.<br />
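The two points above can be made concrete with a toy item checker (an illustrative sketch, not the actual Digital Mathematical Environment): a linear equation a·x + b = 0 is checked exactly, with explanatory feedback in a ‘practice’ mode and only a score in an ‘exam’ mode.

```python
from fractions import Fraction

def check_linear(a, b, student_x, mode="practice"):
    """Is student_x a solution of a*x + b = 0?
    'practice' mode explains the error; 'exam' mode only scores."""
    x = Fraction(student_x)          # exact arithmetic, no rounding issues
    correct = a * x + b == 0
    if mode == "exam":
        return {"score": int(correct)}
    if correct:
        feedback = "Correct!"
    else:
        feedback = (f"Substituting x = {student_x} gives "
                    f"{a * x + b}, not 0. Try isolating x again.")
    return {"score": int(correct), "feedback": feedback}

print(check_linear(2, -6, 3))               # practice mode: score plus feedback
print(check_linear(2, -6, 4, mode="exam"))  # exam mode: score only
```

Shifting a lesson sequence from `mode="practice"` to `mode="exam"` mirrors the gradual withdrawal of feedback described above.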
Results: Embedding use of an algebra tool in a didactical scenario, where self-assessment<br />
and classroom feedback make up a balanced curriculum for attaining sufficient algebraic<br />
skills, is an important part of formative assessment. We will describe the preliminary results<br />
of possible didactical scenarios with the Digital Mathematical Environment. These<br />
scenarios will be used for further research on the subject.<br />
Discussion: We would like to discuss the implications for classroom practice when using ICT<br />
tools for (formative and summative) assessment, and what didactical scenarios are best<br />
suited for acquiring algebraic skills, by using ICT tools.<br />
References<br />
Arcavi, A. (1994). Symbol Sense: Informal Sense-Making in Formal Mathematics. For the Learning of<br />
Mathematics, 14(3), 24-35.<br />
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles,<br />
Policy & Practice, 5(1), 7-74.<br />
Bokhove, C., Koolstra, G., Heck, A., & Boon, P. (2006). Using SCORM to Monitor Student Performance:<br />
Experiences from Secondary School Practice. Math CAA series.<br />
Investigating the use of short answer free-text e-assessment questions<br />
with instantaneous tailored feedback<br />
Barbara Brockbank, The Open University, United Kingdom<br />
Sally Jordan, The Open University, United Kingdom<br />
Tom Mitchell, Intelligent Assessment Technologies Ltd., United Kingdom<br />
Warburton and Conole (2005) argue that ‘It seems likely that the drive towards emergent<br />
technologies such as simulations and free-text marking will result in increasingly strong<br />
competitive pressures against the more traditional ‘standardised testing’, purely objective<br />
types of CAA system.’ This paper describes the application of such an emergent<br />
technology, grounded in a desire to improve the student learning experience.<br />
The UK Open University’s OpenMark assessment system enables students to be provided with<br />
immediate and tailored feedback on their responses to questions of a range of types, including<br />
those requiring free-text entry of numbers, symbols and single words (Ross, Jordan and<br />
Butcher, 2006). This study is an investigation into the viability and effectiveness of adding<br />
questions which require free-text responses of up to about 20 words in length. Answer matching<br />
is provided by an authoring tool supplied by Intelligent Assessment Technologies Ltd. (IAT)<br />
which is able to perform an intelligent match between free-text answers and predefined<br />
computerised model answers. Thus an answer such as ‘the Earth orbits the Sun’ can be<br />
differentiated from ‘the Sun orbits the Earth’ and an answer of ‘The forces are balanced’ is<br />
marked as correct whereas an answer of ‘The forces are not balanced’ is not. The tool looks for<br />
understanding without unduly penalising errors of spelling, grammar or semantics.<br />
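A toy illustration of this kind of matching (a deliberately simplified sketch, not IAT's actual engine): comparing content-word order distinguishes ‘the Earth orbits the Sun’ from its reversal, a polarity check catches ‘not balanced’, and stdlib `difflib` forgives small spelling slips.

```python
import difflib

STOP_WORDS = {"the", "a", "an", "is", "are"}

def _normalise(words, vocab):
    """Lower-case and snap near-misspellings onto the model-answer vocabulary."""
    out = []
    for w in words:
        match = difflib.get_close_matches(w.lower(), vocab, n=1, cutoff=0.8)
        out.append(match[0] if match else w.lower())
    return out

def matches(response, model_answer):
    """Toy match: same content words in the same order, same polarity."""
    vocab = model_answer.lower().split()
    resp = _normalise(response.split(), vocab)
    content = lambda ws: [w for w in ws if w not in STOP_WORDS and w != "not"]
    return (content(resp) == content(vocab)
            and ("not" in resp) == ("not" in vocab))

print(matches("the Earth orbits the Sun", "the Earth orbits the Sun"))    # True
print(matches("the Sun orbits the Earth", "the Earth orbits the Sun"))    # False
print(matches("the forces are not balanced", "the forces are balanced"))  # False
print(matches("the erth orbits the sun", "the Earth orbits the Sun"))     # True
```

A real engine must of course cope with paraphrase and synonymy, which is exactly what the iterative refinement against early student responses (described below) addresses.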
The questions are delivered to students online and instantaneous targeted feedback is<br />
provided on both specifically incorrect and incomplete answers. Another novel feature of the<br />
project has been the use of student responses to early developmental versions of the<br />
questions – themselves delivered online – to improve the answer matching.<br />
Students have been observed performing the assessment tasks. Most claim that they wrote<br />
their responses as if for a human marker. However a few were conscious that they were<br />
being marked by a computer and anticipating (incorrectly) that only keywords were required,<br />
entered answers either in note form or in very long sentences. Most students enjoyed the<br />
assessment tasks and seemed comfortable with the concept of a computer marking free-text<br />
responses. Where the initial response was incorrect, most students were observed to<br />
use the advice provided by the feedback and many reached the correct answer.<br />
A human-computer marking comparison has indicated that the computer’s marking is typically<br />
indistinguishable from that of six subject-specialist human markers. The computer’s marking<br />
was generally accurate, showing greater than 95% concordance with the question author. A<br />
small number of these questions have been incorporated into regular summative interactive<br />
computer marked assignments on a new distance-learning interdisciplinary science course.<br />
We will encourage discussion of our evaluation findings and of the technological, financial,<br />
cultural and pedagogical issues which appear to limit take-up of assessment of this type.<br />
References<br />
Ross, S.M., Jordan, S.E. & Butcher, P.G. (2006). Online instantaneous and targeted feedback for remote<br />
learners. In C. Bryan & K.V. Clegg (Eds.), Innovative Assessment in Higher Education (pp. 123-131).<br />
London, U.K.: Routledge.<br />
Warburton, W. & Conole, G. (2005). Whither e-assessment? Proceedings of the 2005 CAA Conference.<br />
http://www.caaconference.com/pastConferences/2005/proceedings/index.asp [accessed 1st<br />
February 2008]<br />
Contextualized reasoning with written and audiovisual material:<br />
Same or different?<br />
Nina Bucholtz, Maren Formazin, Oliver Wilhelm<br />
IQB, Humboldt University Berlin, Germany<br />
The ability to arrive at valid conclusions from given information and to comprehend given material<br />
of non-trivial complexity is important for many aspects of life, e.g., for learning and acquiring<br />
knowledge. Fluid intelligence (gf) or more specifically reasoning can be regarded as the main<br />
prerequisite for contextualized reasoning. In addition, relevant domain specific knowledge –<br />
supposedly a content-specific component of crystallized intelligence (gc) - can aid in solving<br />
contextualized reasoning tasks. We have developed an innovative measure of contextualized<br />
reasoning in order to further investigate the distinction between decontextualized reasoning tasks<br />
as included in intelligence tests and contextualized reasoning tasks as a relevant aspect of<br />
student achievement. The new measure is expected to tap both abstract reasoning ability (gf) and<br />
the recall of acquired information (gc). Contextualized reasoning measures differ from traditional<br />
comprehension tests – as for example included in the PISA studies – because they focus on<br />
specific content as opposed to rather general and somewhat arbitrary topics.<br />
One aim of our efforts is to bridge the gap between intelligence and student achievement.<br />
Additionally, we want to overcome the shortcoming of focusing on written material in<br />
contextualized measures by also considering audiovisual material. With this design we hope<br />
to encompass contextualized reasoning in a wider sense, including typical learning<br />
situations encountered by students.<br />
In order to reach the above research aims we have designed two studies. The most critical<br />
research questions were:<br />
1. Can short video sequences presented via PDA or notebook be embedded into<br />
contextualized reasoning tasks that meet all requirements of standardized measures of<br />
maximal behaviour?<br />
2. Are such audiovisual contextualized reasoning tasks equivalent to paper-and-pencil-based<br />
contextualized reasoning tasks?<br />
3. Is performance in audiovisual and traditional contextualized reasoning tasks a linear<br />
function of decontextualized reasoning and relevant (i.e. natural sciences) domain<br />
specific knowledge?<br />
In study one, a newly developed audiovisual test that comprises video sequences of 3 to 5<br />
min length was piloted with N = 86 high school students. The videos include real life scenes<br />
or animated simulations from biology, chemistry, physics, and geography. Participants<br />
watch every video once and then have to answer several comprehension questions on the<br />
basis of the video. A paper-and-pencil-based contextualized reasoning test was matched in<br />
content and comprises texts, tables and figures. All questions cover the circumscribed<br />
domain of natural sciences. A g-factor-model on the basis of testlets was established for<br />
both contextualized reasoning tests separately. Both models fit the data well. The relation<br />
between these two latent factors in a SEM was r = .94; the fit of this model was not noticeably<br />
better than the fit of a model with a single latent factor across both tests.<br />
In a second study that is currently running with about 200 participants, an effort is made to<br />
replicate the results from study one and to address research question number three from<br />
the above list. Results of this study will be presented and discussed with regard to<br />
implications for further test development and educational implications.<br />
Effects of Large Scale Assessments in Schools:<br />
How Standard-Based School Reform Works<br />
Tobias Diemer, Freie Universität Berlin, Germany<br />
Harm Kuper, Freie Universität Berlin, Germany<br />
Comparative large scale assessments form a centerpiece of recent standard-based reforms<br />
of the educational systems of the Federal Republic of Germany. The formulation of standards<br />
in education and the implementation of corresponding large-scale standard tests in many States<br />
(Länder) of Germany mark a considerable shift away from an input-oriented towards an<br />
output-oriented account of governance. By providing and feeding back standardized and<br />
comparative data about pupils’ achievement, standard tests intend to make teachers and<br />
schools accountable and to help them to control and improve the outcomes of the pupils’<br />
work by means of evidence-based decision-making processes concerning the profession of<br />
teaching as well as the task of organizing schooling.<br />
The proposed poster deals with the question of whether and how the results from<br />
comparative large scale assessments are utilized by teachers as professionals and schools<br />
as organizations. It therefore will examine the profession- and organization-related processes<br />
and effects that are produced in line with large scale standardized testing and the feeding<br />
back of comparative results to teachers and schools. The paper will present typological<br />
descriptions of effects and process-related patterns of individual as well as collective data-based<br />
decision-making that is based on large scale assessment test results in schools.<br />
Special attention will be drawn to noticeable consequences and changes regarding the<br />
conceptualisation and the design of teaching and learning processes by teachers.<br />
Exploration and analysis of the outlined effects and processes are carried out on two levels<br />
of abstraction. On a first, comparatively concrete level, the observable phenomena are<br />
described within the framework of a heuristic model suggested by Helmke (2004).<br />
According to this model the process of utilization of test results conceptually subdivides into<br />
four cyclically iterative stages: (1) reception, (2) reflection, (3) action, and (4) evaluation.<br />
Subsequently, the findings described within these categories are further aggregated as well<br />
as re-aggregated on a more abstract level. On this level theories of professions and<br />
organizational theories, particularly new institutionalism, sensemaking theory and system<br />
theory are analysed in reference to the results found on the more concrete level.<br />
Within these conceptual frameworks, empirical evidence will be presented that provides<br />
systematic as well as exemplary insight into the ways standard-based school reform works<br />
in schools. Furthermore, by reason of the integration of profession- and organization-related<br />
models, the study contributes to the development of a general theory of the functioning of<br />
school development in the context of the present standard-based and outcome-orientated<br />
paradigm of governance within the educational system.<br />
To capture the processes of decision-making, a longitudinal case study approach is used,<br />
based on semi-structured qualitative problem-centered interviews with headmasters and<br />
teachers. The material is analyzed according to procedures of qualitative content analyses<br />
and grounded theory. The data basis consists of about 120 interviews and 8 observations<br />
conducted in 4 schools across 4 data-collection phases spanning a period of 2 years.<br />
Due to its longitudinal design, the study gives information on the process-related<br />
conditions and effects of standard-based reforms in schools.<br />
Support in Self-assessment in Secondary Vocational Education<br />
Greet Fastré, Marcel van der Klink, Dominique Sluijsmans, Jeroen van Merriënboer<br />
Open University, The Netherlands<br />
Despite the importance placed on students’ self-assessment in current education, it appears<br />
that students are not always able to assess themselves accurately, because they are<br />
insufficiently able to decide on which criteria they should assess themselves.<br />
In current assessment practices, students are often asked to come up with self-generated<br />
criteria and standards on which they want to assess themselves. However, it appears that<br />
students at the beginning of their study are not able to identify the standards and criteria<br />
themselves because they do not have a clear view on what is expected of them when it<br />
comes to their learning outcomes. It is thus questionable whether novice students should be asked<br />
to self-generate the assessment criteria.<br />
Even if students are given assessment criteria, most of the time only a few assessment<br />
criteria are relevant for a certain task. When students need to become competent self-<br />
assessors, they should not only be able to make an accurate assessment, but they should<br />
also be capable of making a good decision on which criteria are relevant and which criteria<br />
are not relevant for assessing a task (Sadler, 1989). This is certainly true in the case of<br />
assessing real-life whole tasks. In real-life tasks, resembling professional life, a large<br />
database of potential performance criteria could reasonably be considered. The whole set<br />
of criteria can be split up in two parts: relevant and irrelevant criteria. In today’s educational<br />
practices, often, no information on the relevance of the criteria is available for the students<br />
in advance. The question arises whether students are capable of selecting the relevant criteria<br />
from the whole set of criteria. Regehr and Eva (2006) state that when students get the<br />
freedom of choosing on which criteria they want to assess themselves, there is a risk that<br />
they will only highlight the criteria on which they perform well or which they like because<br />
people naturally strive to create a positive feeling. The risk is that students will thus not<br />
recognize exactly those learning needs that are really necessary.<br />
In this study, it is hypothesized that students who receive information on the relevance of<br />
the criteria can produce a more accurate self-assessment than students who do not receive<br />
information on the relevance of the criteria. Furthermore, we expect that students with a<br />
high accuracy of self-assessment are more competent in selecting points of improvement<br />
than students with a low accuracy of self-assessment. In the end, we expect there to be a<br />
positive relation between the accuracy of students’ self-assessment skills and students’ task<br />
performance.<br />
One hundred and six first-year students in Secondary Vocational Education in Nursing<br />
participated in this study. The experimental design was a 2x2 factorial pre-test - post-test<br />
design in which the effects of ‘information on the relevance of the criteria’ (Relevant criteria<br />
vs. All criteria) and ‘variability in learning trajectory’ (School-based vs. Practice-based) were<br />
studied. Data are currently being collected, and results will be available by the time of the<br />
conference.<br />
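One way to operationalise ‘accuracy of self-assessment’ — an assumption for illustration, not the study's actual instrument — is the agreement between self-ratings and teacher ratings, computed over the relevant criteria only. The criterion names and ratings below are invented:

```python
def self_assessment_accuracy(self_scores, teacher_scores, relevant_criteria,
                             scale_max=5, scale_min=1):
    """Accuracy in [0, 1]: 1 means perfect agreement with the teacher
    on every relevant criterion; ratings are assumed on a 1-5 scale."""
    span = scale_max - scale_min
    gaps = [abs(self_scores[c] - teacher_scores[c]) for c in relevant_criteria]
    return 1 - sum(gaps) / (span * len(gaps))

# Invented nursing-task ratings; 'charting' is treated as irrelevant here,
# so the large disagreement on it does not lower the accuracy score:
self_rated = {"hygiene": 4, "empathy": 3, "charting": 5}
teacher_rated = {"hygiene": 4, "empathy": 5, "charting": 2}
print(self_assessment_accuracy(self_rated, teacher_rated,
                               ["hygiene", "empathy"]))  # 0.75
```

Restricting the comparison to relevant criteria is exactly what separates the ‘Relevant criteria’ condition from the ‘All criteria’ condition in the design above.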
The confidence levels of course/subject coordinators in undertaking<br />
aspects of their assessment responsibilities<br />
Merrilyn Goos, Clair Hughes, Ann Webster-Wright<br />
The University of Queensland, Australia<br />
This paper reports the findings of an investigation of the confidence levels of course/subject<br />
coordinators in undertaking aspects of their assessment responsibilities at a large<br />
metropolitan university. Like universities in many other parts of the world, the Australian<br />
institution in which this investigation was undertaken is experiencing “a period of rapid<br />
change and innovation in relation to assessment policies and practice” (Havnes &<br />
McDowell, 2008, p. 3). The pressures for change and innovation range from developing<br />
pedagogical advances that call into question many traditional assessment practices to the<br />
challenges presented by the increasing student diversity, class sizes and casualisation of<br />
teaching, which, along with diminishing resources, characterise contemporary educational<br />
contexts (Anderson et al, 2002).<br />
The investigation was one element of a situational analysis which formed the first phase of<br />
a broader project aimed at supporting the leadership capacities of course/subject<br />
coordinators as assessment innovators. This group was targeted because, though<br />
significant in the implementation of institutional assessment policy, the role is scarcely<br />
researched despite it being highly likely that improved performance would benefit student<br />
learning (Blackmore et al, 2007). Confidence is considered central to the ability to learn<br />
about and master new practices (Graven, 2004) and was identified as an issue for this group<br />
through an earlier pilot conducted by one of the project team.<br />
The investigation took the form of an online survey of all course coordinators (response rate<br />
33%). Survey items were developed from the responsibilities and expectations either<br />
explicated or implied in institutional policies and rules. The survey identified areas of<br />
particularly high (e.g. making and defending summative judgements) and low (e.g. dealing<br />
with plagiarism and locating support when needed) levels of confidence. The paper will<br />
report survey findings in relation to individual items as well as the influential factors that<br />
emerged from analysis and the correlation of particular factors with demographic data such<br />
as years of experience and gender. In addition, coordinators provided open-ended<br />
comment, the analysis of which is used to elaborate on or clarify particular findings in<br />
relation to their positive or negative impact on confidence.<br />
The project was funded through the Fellowship scheme of the (Australian) Carrick Institute<br />
for Learning and Teaching in Higher Education.<br />
References<br />
Anderson, D., Johnson, R., & Saha, L. (2002). Changes in Academic Work: Implications for Universities of<br />
the Changing Age Distribution and Work Roles of Academic Staff. Canberra: DEST.<br />
Blackmore, P., Law, S., & Dales, R. (2007). Investigating the capabilities of course and module leaders in<br />
departments. Paper presented at the Higher Education Academy Annual Conference, Harrogate.<br />
Graven, M. (2004). Investigating mathematics teacher learning within an in-service community of practice:<br />
The centrality of confidence. Educational Studies in Mathematics, 57, 177-211.<br />
Havnes, A., & McDowell, L. (2008). Assessment dilemmas in contemporary learning cultures. In A. Havnes<br />
& L. McDowell (Eds.), Balancing Dilemmas in Assessment and Learning in Contemporary Education.<br />
New York: Routledge.<br />
Useful feedback and flexible submission:<br />
Designing and implementing innovative online assignment management<br />
Stuart Hepplestone, Sheffield Hallam University, United Kingdom<br />
Specific functionality has been added to the Blackboard virtual learning environment at<br />
Sheffield Hallam University (SHU) to enhance the way in which feedback can be provided to<br />
students and to improve the way student assignments are processed. This poster will<br />
explore the practical experience of designing and implementing a customised assignment<br />
handler tool in response to rising student expectations of online feedback and online<br />
assignment submission. (This poster presentation accompanies the short paper session,<br />
Secret scores: Encouraging student engagement with useful feedback, which discusses the<br />
use of technology in providing useful feedback to students).<br />
The design of this innovative assignment handler tool was achieved by mapping out the<br />
lifecycle of a student assignment and highlighting key functional areas for development.<br />
These areas have been developed into a tool which:<br />
1. Supports the online delivery of useful feedback through the Blackboard Gradebook by:<br />
• allowing batch upload of individual file attachments providing detailed feedback along with<br />
student marks (whether the original work is submitted through Blackboard, or in a non-electronic<br />
format such as hard copy, portfolio or presentation)<br />
• allowing partial cohort feedback to be uploaded by each member of the marking team<br />
• providing feedback on group assignments to each individual in the group, rather than<br />
one per group<br />
• giving students access to their feedback all in one place and presented as close to their<br />
learning as possible<br />
• encouraging students to engage with and reflect on their feedback in order to activate the<br />
release of their marks (after Black & Wiliam, 1998, who argued that the “effects of feedback<br />
was reduced if students had access to the answers before the feedback was conveyed”).<br />
2. Supports the online submission of student work through Blackboard by providing students with<br />
a detailed electronic receipt of their assignment submission.<br />
The poster will present a visual representation of the lifecycle of a student assignment, clearly<br />
indicating where students have responsibilities in the course of completing and submitting<br />
assignments, and reflecting and acting upon feedback (Hepplestone & Mather, 2007). Information<br />
about an accompanying electronic feedback wizard development will also be displayed.<br />
SHU is a large regional university with over 28,000 students. It is based on three campuses<br />
and offers courses in a diverse range of academic subjects at both undergraduate and<br />
postgraduate levels.<br />
References<br />
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.<br />
Hepplestone, S. & Mather, R. (2007) Meeting Rising Student Expectations of Online Assignment<br />
Submission and Online Feedback, [online] In: Proc. 11th Computer-Assisted Assessment<br />
International Conference 2007, Loughborough, 10-11 July 2007. Learning and Teaching<br />
Development, Loughborough University. Last accessed 12 February 2007 at:<br />
http://www.caaconference.com/pastConferences/2007/proceedings/Hepplestone%20S%20Mather%<br />
20R%20n1_formatted.pdf<br />
The challenge of engaging students with feedback<br />
Rosario Hernandez, University College Dublin, Ireland<br />
Effective and high quality feedback is often regarded as a key element of excellence in<br />
teaching that supports student learning (Ramsden, 2003; Black and Wiliam, 1998; Sadler,<br />
1989). Despite this, feedback is often regarded by teachers as a labour-intensive activity<br />
that frequently makes little impact on student learning. Similarly, students have stressed<br />
that sometimes they do not understand the feedback they receive, that the feedback is too<br />
vague or that it does not provide them with suggestions on how to improve their work.<br />
These comments are particularly relevant in the teaching of modern languages in higher<br />
education where the feedback provided by teachers often focuses on the correction of<br />
grammatical mistakes and the provision of correct answers. Adding to that pressure are<br />
large class sizes, which have made the practice of offering the “traditional” timely written<br />
feedback to students a struggle for many teachers.<br />
After an initial study of the issues concerning academics and students in the provision of<br />
effective feedback, an action-research project was undertaken with a group of<br />
undergraduate students of Hispanic Studies at University College Dublin. Throughout a<br />
semester, the duration of the module chosen for the study, students were provided with a<br />
variety of learning tasks whose main aim was to engage students with feedback. This<br />
approach to teaching and assessment required the involvement of students in a variety of<br />
learning activities, among others their participation in dialogue about the assessment criteria<br />
adopted, the use of assessment sheets with comments to act on them, the reading and<br />
critiquing of their work and that of their classmates (self- and peer-assessment) or the<br />
provision of feedback, by the teacher, with no grades. Written and oral data were collected<br />
by the teacher of this module at different moments during the semester in order to explore<br />
the experiences of the students with regard to this approach to feedback. Furthermore, a<br />
focus-group session was conducted in class at the end of the semester. This paper reports<br />
on the outcomes of the data collected throughout the semester, on the focus-group session<br />
and on the challenges that this approach to the provision of feedback to students entailed.<br />
Towards More Integrative Assessment<br />
Dai Hounsell, University of Edinburgh, United Kingdom<br />
Chun Ming Tai, University of Edinburgh, United Kingdom<br />
Rui Xu, Ningbo University, China<br />
Assessment in higher education is typified by competing tensions between multiple<br />
purposes, functions and stakeholders; wide diversity in practices within and across subject<br />
areas, courses and institutions; and diffuse responsibilities for the oversight and<br />
management of different aspects of assessment. Achieving coherence and integration in<br />
assessment practices, processes and policies is therefore a formidable challenge.<br />
This poster summarises the outcomes of a project which drew extensively upon the<br />
international literature on assessment in higher education to examine how a more<br />
integrative approach might be pursued. The project was undertaken as part of a sector-wide<br />
initiative in Scottish higher education on quality enhancement.<br />
The main outcomes of the project were a workshop programme and four guides, each of<br />
which focused on a key aspect of Integrative Assessment:<br />
• Monitoring Students’ Experiences of Assessment. This guide examines strategies to ascertain<br />
how well assessment in its various manifestations is working, so as to build on strengths and<br />
take prompt remedial action where helpful. It explores why it is important to monitor assessment<br />
practices systematically, what aspects of assessment are currently well-monitored in Scottish<br />
universities, and how the monitoring of assessment could be improved.<br />
• Balancing Assessment of and Assessment for Learning. This guide discusses ways of striking<br />
an optimal balance between the twin central functions of assessment, i.e. to evaluate and certify<br />
students’ performance or achievement, and to assist students in fulfilling their fullest potential as<br />
learners. It highlights some undesirable side-effects of imbalances and explores four strategies<br />
to rebalance assessment: feed-forward assessments, cumulative coursework,<br />
better-understood expectations and standards, and speedier feedback. Each strategy is illustrated<br />
with case-examples from a range of subjects and settings.<br />
• Blending Assignments and Assessments for High-Quality Learning. The starting-point for<br />
this guide is why it might be important not only to assess students' progress and<br />
performance by a variety of means, but also to consider what combination or blend of<br />
assignments and assessments in a course or programme of study might be optimal. The<br />
guide goes on to explore four important considerations that can shape how assignments<br />
and assessments are blended: blending for alignment of assessment and learning; blending<br />
for student inclusivity; blending to support progression in students’ understanding and skills,<br />
and blending for economy and quality. Examples and case reports are outlined from a<br />
cross-section of subject areas and course settings.<br />
• Managing Assessment Practices and Procedures. This guide argues that while most<br />
dimensions of assessment are generally well-managed, there are also aspects which have<br />
often not received the weight of attention they seem to warrant in the contemporary<br />
university. These aspects are: managing assessment for as well as assessment of learning;<br />
enabling evolutionary change in assessment; and wider sharing of responsibilities for<br />
managing assessment practices and processes.<br />
The four guides are freely downloadable from the Scottish Universities’ Enhancement<br />
Themes website (http://www.enhancementthemes.ac.uk/publications/) and a web-based<br />
version of the guides is being launched in spring 2008.<br />
Using a framework adapted from Systemic Functional Linguistics<br />
to enhance the understanding and design of assessment tasks<br />
Clair Hughes, The University of Queensland, Australia<br />
The plentiful and steadily increasing literature on teaching and learning in higher education<br />
has produced a number of helpful frameworks and guidelines that can be applied to the<br />
development and communication of assessment practice. As an educational developer I<br />
regularly deploy a core group of appropriate resources in the selection and staging<br />
(adaptation of McAlpine, 2004) of assessment tasks that target specific cognitive levels<br />
(Krathwohl, 2002), the planning of feedback (Gibbs and Simpson, 2004; Price and<br />
O’Donovan, 2006) and the making of assessment judgements (Biggs and Collis, 1982).<br />
The literature, however, is surprisingly light on material to support the analysis and<br />
purposeful design of individual assessment tasks. This gap initially became an issue for me<br />
when working with academics in adjusting assessment tasks to minimize opportunities for<br />
plagiarism. Our work was limited by several factors including a failure to acknowledge the<br />
wide variations in both task type and level of demand that can distinguish assessments<br />
within and between such categories as ‘orals’, ‘examinations’ or ‘assignments’; the<br />
identification of tasks by reference to subject matter and activity only; and a belief that<br />
assessment tasks are restricted to the traditional or ‘signature’ forms of assessment<br />
associated with particular disciplines (Bond, 2007).<br />
This paper reports the outcome of my efforts to locate a framework that would provide the<br />
shared concepts and terminology required as a basis for productive and meaningful<br />
discussions of assessment tasks with academics. In broadening my search beyond the<br />
assessment literature, I investigated systemic functional linguistics (SFL) (Eggins, 2004;<br />
Knapp and Watkins, 2005). The resulting framework, described here, has proved a useful<br />
resource for explicating the components of assessment tasks including many that were<br />
previously overlooked or inferred – audience, student perspective, mode of presentation<br />
and so on. The paper outlines the application of the framework to the original purpose of<br />
‘designing out’ opportunities for plagiarism and concludes that the framework has significant<br />
further potential to introduce academic teachers to a vast but generally unfamiliar literature<br />
on the systematic development of academic communication skills (see for example Swales<br />
and Feak, 2004) and as a basis for the critique of assessment as cultural practice.<br />
References<br />
Biggs, J., & Collis, K. (1982). Evaluating the Quality of Learning - the SOLO Taxonomy. New York: Academic Press.<br />
Bond, L. (2007). Toward a Signature Assessment for Liberal Education. Retrieved January 23, 2008, from<br />
http://bondessays.carnegiefoundation.org/?p=8<br />
Eggins, S. (2004). An Introduction to Systemic Functional Linguistics. London & New York: Continuum.<br />
Gibbs, G., & Simpson, C. (2004). Conditions under which assessment supports students' learning.<br />
Learning and Teaching in Higher Education Retrieved 19 April, 2005, from<br />
http://www.glos.ac.uk/shareddata/dms/2B70988BBCD42A03949CB4F3CB78A516.pdf<br />
Knapp, P., & Watkins, M. (2005). Genre, text, grammar: Technologies for teaching and assessing writing.<br />
Sydney: UNSW Press.<br />
Krathwohl, D. (2002). A revision of Bloom's taxonomy: An overview. Theory into Practice, 41(4), 212-218.<br />
McAlpine, L. (2004). Designing learning as well as teaching. Active Learning in Higher Education, 5(2), 119-134.<br />
Swales, J., & Feak, C. (2004). Academic Writing for Graduate Students: Essential Tasks and Skills<br />
(Second ed.). Ann Arbor: The University of Michigan Press.<br />
The use of transparency in the "Interactive examination" for student teachers<br />
Anders Jonsson, Malmö University, Sweden<br />
If the aim of education is for all students to learn and improve, then the expectations must<br />
be transparent to the students. In this study, three aspects of transparency are investigated<br />
in relation to an examination methodology for assessing student teachers' skills in analyzing<br />
classroom situations and in self-assessing their answers: self-assessment criteria, a scoring<br />
rubric, and exemplars. The examinations studied were carried out in 2004, 2005, and 2006<br />
respectively, all with a cohort of first-year student teachers (n = 170, 154, and 138). There<br />
was a large difference in scores between the 2004 and 2005 cohorts (effect size, d = 3.21),<br />
when changes in the examination were implemented in order to increase the transparency.<br />
The comparison between 2005 and 2006, when no further changes were made, does not<br />
show a corresponding difference (d = .27). These results suggest that, by making the<br />
assessment more transparent, students’ performances could be greatly improved.<br />
School monitoring in Luxembourg:<br />
computerized tests and automated results reporting<br />
Ulrich Keller, Monique Reichert, Gilbert Busana, Romain Martin<br />
University of Luxembourg, Luxembourg<br />
This presentation will introduce the Luxembourgish school monitoring project, focusing on<br />
the various tools that were developed and used, especially the more innovative tools<br />
developed for internet-based computer assisted testing and automatic report generation.<br />
Luxembourg’s school system faces a transition encountered in many countries throughout<br />
the world: a transition towards more autonomy for individual schools. This necessitates the<br />
establishment of a school monitoring program, regularly assessing the progress of students<br />
in a variety of areas including, but not limited to, academic achievement.<br />
Apart from the development of valid, reliable and objective measures, two other<br />
requirements for the success and usefulness of such a project are the economical<br />
administration of tests and comprehensive reporting of relevant results. In this presentation,<br />
we will introduce the internet-based testing platform TAO and tools used for automatic<br />
report generation. Though developed in a country with a small population, these tools scale<br />
very well to other contexts and to countries with much larger populations than Luxembourg’s.<br />
We will also outline further possible developments of these tools in order to respond more<br />
fully to the demands of evidence based decision making in an educational context.<br />
Mathematical power of special needs students<br />
Marjolijn Peltenburg, FIsme, Utrecht University, The Netherlands<br />
Marja van den Heuvel-Panhuizen, FIsme, Utrecht University, The Netherlands/<br />
IQB, Humboldt University Berlin, Germany<br />
The poster will inform the conference participants about a small-scale study that forms the<br />
start of the IMPulSE project. This is a large project aimed at revealing the undisclosed<br />
mathematical power of special needs students (Van den Heuvel-Panhuizen & Peltenburg,<br />
2007). The purpose of the small-scale study is to pilot a set of test items which differ from<br />
regular grade-level achievement tests used to determine students’ mathematics<br />
understanding. The items in the pilot have been designed with the intention of offering<br />
children optimal possibilities to show what they are capable of. An important characteristic<br />
of these items is their ‘elasticity’. Elasticity in items allows different levels of strategy use,<br />
which makes it possible for students to pass the limits of their assumed capacities. This<br />
reduces the ‘all-or-nothing’ character of assessment (Van den Heuvel-Panhuizen, 1996).<br />
To reveal the undisclosed mathematical power of weak students we chose a topic that is<br />
recognized as difficult for weak students: subtraction with “borrowing”, that is,<br />
subtraction problems in which the 1’s digit of the subtrahend is larger than the 1’s digit of<br />
the minuend (e.g., 52–17 = ...). A frequently made mistake in these problems is reversing<br />
the digits (in this case, subtracting 2 from 7 instead of 7 from 2).<br />
The set of items that is presented to the children includes fourteen subtraction problems in<br />
the number domain up to 100. The items are taken from the Cito LOVS Test for Mid Grade<br />
6, but re-designed and placed in an ICT environment in which the children are offered a<br />
dynamic visual tool to find the answers. We expect that this tool will help students to<br />
overcome the obstacles, as mentioned above, in solving these subtraction problems which<br />
require “borrowing”.<br />
The data-collection takes place in two schools for primary special education. In total, the set<br />
of items is piloted with 20 children. While working on the computer, the children’s steps<br />
through the program are recorded by the Camtasia Studio software. The analysis of the<br />
data focuses on the correct scores in the two conditions – regular Cito LOVS Test and ICT<br />
version with the dynamic tool – and on tool use in the ICT version.<br />
The poster shows a sample of the problems used in the study and a summary of the<br />
findings. In addition to the results presented on the poster, Camtasia clips will be shown on<br />
a laptop. During the poster presentation, we would like to share with the audience our<br />
experiences with using an ICT-based dynamic assessment format to reveal weak students’<br />
learning potential. In connection with this, we would also like to discuss ways to continue<br />
this research.<br />
References<br />
Van den Heuvel-Panhuizen, M. (1996). Assessment and realistic mathematics education. Utrecht: CD-β<br />
Press/Freudenthal Institute, Utrecht University.<br />
Van den Heuvel-Panhuizen, M., & Peltenburg, M. (2007). Unused learning potential of special-ed students in<br />
mathematics. Research proposal. Utrecht, the Netherlands: Freudenthal Institute for Science and<br />
Mathematics Education.<br />
Quality Assurance review of clinical assessment:<br />
How does one close the loop?<br />
Glynis Pickworth, M. van Rooyen, T.J. Avenant<br />
University of Pretoria, South Africa<br />
The MBChB Undergraduate Programme Committee (UPC) of the School of Medicine,<br />
University of Pretoria mandated the Assessment sub-committee (AC) to review assessment<br />
practices in the student internship rotations. These rotations take place during the last 18<br />
months of the six-year programme. Students no longer have any class activities and work<br />
the whole day in a clinic or hospital. There are five seven-week rotations and eight three- or<br />
three-and-a-half-week rotations through various departments such as Family Medicine,<br />
Obstetrics, Gynaecology, etc. On the whole, the staff supervising and assessing students<br />
are clinicians with little or no training in education and assessment practices. The university<br />
provides such courses for staff but the clinicians’ workload mostly precludes them from<br />
attending such courses. They hold joint appointments with the state and the university and find it<br />
difficult to get time off due to service delivery commitments.<br />
The relevant departments were informed of the review process and criteria, after these had<br />
been approved by the UPC. They were also supplied with a resource guide outlining best<br />
practice in clinical assessment. The AC made an appointment for a group meeting with the<br />
staff responsible for assessment for a particular rotation. The group consisted of rotation<br />
heads and representatives from other departments, members of the AC and members of<br />
the department responsible for the rotation. During the group meeting the assessment<br />
practices would be described and discussed according to the review criteria. The AC would<br />
then compile a report describing the assessment practices. Good practice would be<br />
acknowledged and recommendations for improvement made. The report would be sent to<br />
the person responsible for the rotation to make sure the information on assessment<br />
practices was correct, after which it would be tabled at a meeting of the UPC.<br />
A comparison across rotations revealed that a wide diversity of assessment methods is<br />
used. Compared with the four levels of Miller’s pyramid, too much assessment is still related<br />
to the lower levels of the model rather than to the apex. The review sensitised a<br />
number of staff to good assessment practice through the resource guide and discussion<br />
about their assessment practice.<br />
The question is ‘How does one close the loop in quality assurance?’ Are the<br />
recommendations for improvement actually implemented? A follow-up study still needs to<br />
be done.<br />
Feedback: What’s in it for me?<br />
Margaret Price, Karen Handley, Berry O’Donovan<br />
Oxford Brookes University, United Kingdom<br />
Hattie and Timperley (2007) conceptualize feedback broadly as ‘information provided by an<br />
agent…regarding aspects of one’s performance or understanding' and are very clear that it<br />
must be ‘a “consequence” of performance’. This view is not contentious. Students want a<br />
response to their effort and staff need to provide information on the gap between<br />
performance and aim (Sadler, 1989). However we know that the process of providing and<br />
receiving feedback is fraught with difficulty arising from the multiple purposes of feedback,<br />
communication problems and emotional responses to name but a few.<br />
This paper seeks to examine the relational dimension of feedback and argues that it is a central<br />
but often a missing dimension of feedback. The role of feedback in creating the relational<br />
footings between student and tutor provides the foundations for a successful learning process<br />
and, in particular, for on-going student engagement (Black and Wiliam, 1998).<br />
If we want students to engage with their assessment feedback, we must pay attention to the<br />
relational dimension of feedback. Students are free to accept, partially accept or reject feedback<br />
(Chinn and Brewer, 1993) and we would encourage them to exercise their own judgement in<br />
evaluating feedback as they progress as independent learners. Students make judgements on<br />
the basis not only of the 'content' but also of their perceptions of the credibility and intentions of<br />
the author. In addition, there is a temporal dimension because if students are initially confused<br />
(and negatively evaluate the feedback) but can then engage in dialogue with receptive tutors,<br />
students may come to understand and therefore value the feedback. Therefore our feedback<br />
must be convincing, but not necessarily positive ‘feel-good’ feedback which does not link with<br />
performance (Dweck, 2000). However, to be persuaded of the feedback’s worth, students must<br />
recognise the feedback as valuable through the reciprocity of the assessor. That reciprocity will<br />
be demonstrated through the feedback communication process. Is this process seen as<br />
unidirectional or dialogic, active or passive? Are the participants in this together or separately?<br />
A three-year study on student engagement with assessment feedback involving 35 interviews<br />
with students and staff, 12 case studies, and questionnaire data from 3 institutions will be<br />
presented and the findings used to provide a framework for analysing the factors that impact<br />
on the relational dimension of feedback including:<br />
• effectiveness of communication process<br />
• timeliness of response<br />
• match between staff and student expectations of the process<br />
• trust in the assessor<br />
• media of communication – what sort of knowledge can it carry?<br />
• dialogue opportunity<br />
• context in which it can be acted upon.<br />
The findings confirm that what students are looking for in feedback is not unrealistic, but often<br />
not provided, and this leads to disillusionment and a cycle of disengagement. Therefore the<br />
implications for practice will be considered and discussed, including the need to prepare staff<br />
and students to give and receive feedback and establish a relational footing; the opportunity for<br />
dialogue in resource-constrained environments; and opportunities to use the feedback once<br />
received and understood.<br />
From students’ to teachers’ collaboration:<br />
a case study of the challenges of e-teaching and assessing as co-responsibility<br />
Ana Remesal, Universidad de Barcelona, Spain<br />
Manuel Juárez, José Luis Ramírez<br />
Centro Nacional de Investigación y Desarrollo Tecnológico, Mexico<br />
The introduction of new technologies into education demands that teachers handle tools that<br />
allow them to use techno-pedagogical environments for e-learning (Mauri et al. 2007). These<br />
environments pose a challenge for teachers when it comes to transforming their practice and<br />
to using the new tools in an efficient way that eventually will transform and optimize the<br />
teaching and learning processes. We report about a case study in Higher Education as the<br />
first part of a two-round project. The “Foundations of Computing Science” preliminary distance<br />
course tackled basic subjects in discrete mathematics and their applications to computing; its<br />
aim was to develop common knowledge among the students accepted for the Master’s in<br />
Computer Science in the Centro Nacional de Investigación y Desarrollo Tecnológico (National<br />
Center for Research and Technological Development) (CENIDET), an institution that<br />
belongs to Mexico’s Sistema Nacional de Educación Superior Tecnológica (National System<br />
of Higher Technological Education) (SNEST). The Claroline (V. 1.1) distance learning<br />
platform was used. The purpose of this poster is to describe some of the experiences we had<br />
as designers and teachers of this first Web-based distance course on discrete mathematics.<br />
Particularly, this poster describes the difficulties encountered and the solutions proposed by<br />
the group of professors that designed and developed the course using Claroline as a tool in<br />
order to prepare the second edition of the course.<br />
The course was developed over five weeks in 2007, with a group of 18 students. The<br />
course was structured around five units, one per week. The students’ work consisted of<br />
learning activities done first individually, then contrasted in pairs and then discussed in the<br />
whole group. These activities could be carried out either in an asynchronous or a<br />
synchronous manner. The platform used, despite important deficiencies, allowed for the<br />
organization, administration and follow-up at the individual level, at the student-pair level and<br />
at the whole-group level.<br />
This experience showed us how the distance learning tools introduce conditions that are<br />
different from face-to-face courses. In order for the design of contents, materials and<br />
dynamics to be adequate for this new environment, greater reflection by the teacher is<br />
necessary (Coll, 2004). Big challenges are set for the second implementation of the course:<br />
especially challenges concerning the assessment of students’ learning from a<br />
co-responsibility perspective. The second implementation of the course will be carried out by<br />
two teachers simultaneously. This poses particular challenges as to students’ assessment,<br />
since both teachers will need to clarify and share teaching goals and assessment purposes<br />
and instruments. Thus, the teachers’ conceptions about assessment are expected to play<br />
an important role in this new course.<br />
References<br />
Coll, C. (2004). Psicología de la educación y prácticas educativas mediadas por las tecnologías de la<br />
información y la comunicación. Sinéctica , 25 , 1-24.<br />
Mauri, T., Colomina, R., & De Gispert, I. (in press). Diseño de propuestas docentes con TIC en la<br />
enseñanza superior: nuevos retos y principios de calidad desde una perspectiva<br />
socioconstructivista. Revista de Educación. MEC. (7-3-2007).<br />
Symbiotic relationships:<br />
Assessment for Learning (AfL), study skills and key employability skills<br />
Jon Robinson, Northumbria University, United Kingdom<br />
David Walker, Northumbria University, United Kingdom<br />
This poster links directly to a detailed aspect of the area to be covered in a roundtable<br />
presentation proposal that has been submitted, but it can also stand alone as a<br />
representation of the development of the particular teaching practice.<br />
In the English Division at Northumbria University we have redesigned the core first-year<br />
module for English students in a way that symbiotically links assessment, study skills and<br />
employability within a framework underpinned by the theory and practice of Assessment for<br />
Learning (AfL). This poster provides a textual and graphical presentation of the introduction<br />
and evaluation of the first element of summative assessment on the core first-year<br />
undergraduate module in English Studies at Northumbria University.<br />
This assessment covers the topic of plagiarism, a contentious study skills topic not only<br />
within Northumbria but also in the sector as a whole. The assessment practice is based on<br />
the principles of Assessment for Learning and designed in a way that also provides an<br />
opportunity to begin introducing students to practices that relate directly to key employability<br />
skills highlighted by the English Subject Centre as deficient in the typical English Studies<br />
graduate.<br />
Assessing low achievers’ understanding of place value –<br />
consequences for learning and instruction<br />
Petra Scherer, University of Bielefeld, Germany<br />
Introduction<br />
Understanding place value is necessary for understanding our decimal number system. As<br />
a consequence, this understanding is relevant to different fields of school<br />
mathematics. Having a place-value concept is crucial for developing effective calculation<br />
strategies (e.g. to replace one-by-one finger counting), for understanding the written<br />
algorithms or for moving from integers to fractions. Research shows that low achievers in<br />
particular have great difficulties, even in higher grades, with understanding place value.<br />
The paper describes a small case study in which an assessment tool was developed and<br />
piloted. The tool is intended to give teachers a better understanding of (1) low achievers’<br />
difficulties and (2) the consequences for teaching and learning processes.<br />
Assessment tool development<br />
The existing instruments for assessing the understanding of place value mainly focus on<br />
applying the concept in standard calculations. The tool developed here, in contrast, includes<br />
test items that cover different levels of representation and the main building blocks of<br />
calculation strategies. Moreover, not only standard items were chosen but also unfamiliar<br />
formats and challenging items that have not yet been treated in the classroom, that cannot<br />
be solved in a mechanistic way, and that address the specific role of zero. The assessment<br />
tool comprises tasks with numbers up to 1000 and covers the following topics: counting in<br />
steps, splitting numbers into place values, composing numbers, interpreting iconic<br />
representations of numbers, identifying the place values of digits in three-digit numbers, and<br />
solving simple additions and subtractions. The items can be used for paper-and-pencil tests<br />
as well as for interviews, yielding information on both the oral and written competences of<br />
the children.<br />
Results<br />
The assessment tool was piloted with 12 low-achieving students (4 girls, 8 boys) from 5th<br />
and 6th grade who attend a special school for students with learning disabilities. A first<br />
analysis of the results showed a certain understanding of place value in all students but also<br />
revealed a variety of difficulties, especially with the non-standard items (e.g., composing a<br />
number out of 70+200+3 led to the incorrect number 723, whereas a more or less standard<br />
item like 300+50+4 resulted in a correct solution). Moreover, simple addition and subtraction<br />
tasks were in many cases worked out in a rather mechanistic way, by manipulating the digits<br />
without thinking about the numbers (e.g., students did not consider all place values when<br />
adding 314+314 and came to the result 328). Beyond this, problems with zero became<br />
obvious (e.g., 624–203 led to the result 401).<br />
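The contrast between the standard and non-standard composition items can be sketched as a toy model of the misconception (hypothetical Python, not part of the study): a student who ignores place value and simply strings the leading digits together in the order given still gets the standard item right, so only the non-standard item exposes the error.<br />

```python
def misread_as_digit_string(parts):
    """Hypothetical model of the observed misconception: instead of adding,
    the student concatenates the leading (non-zero) digit of each round
    number in the order given, ignoring place value entirely."""
    # str(p).rstrip("0") keeps only the leading digit of round numbers
    # such as 70 or 200; a bare digit like 3 is kept as-is.
    return int("".join(str(p).rstrip("0") or "0" for p in parts))

# Non-standard item 70+200+3: the misconception is exposed
# (the student writes 723, but the correct sum is 273).
print(misread_as_digit_string([70, 200, 3]))  # 723
# Standard item 300+50+4: the very same misconception still
# produces the correct answer, 354, so the error stays hidden.
print(misread_as_digit_string([300, 50, 4]))  # 354
```

This illustrates why non-standard items are diagnostic: only they separate genuine place-value understanding from mechanistic digit manipulation.<br />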
Discussion<br />
The analysis also shows that test results cannot be viewed in isolation: one has to take into<br />
account that children may interpret the tasks individually, and the analysis of the whole<br />
solution process is just as important. After presenting a selection of results, consequences<br />
for teaching and learning will be discussed. Both the assessment of low achievers’<br />
competences and classroom practice require more than a focus on correct results; they<br />
should also take the students’ solution strategies (including their explanations and<br />
reasoning) into account.<br />
Using a course forum to promote learning and assessment<br />
for learning in environmental education<br />
Revital (Tali) Tal, Technion – Israel Institute of Technology, Israel<br />
In continuation of a previous study, in which a complex assessment framework was<br />
implemented in an environmental education course in a science education department (Tal,<br />
2005), this study focused on one component of the assessment – the discussions in the<br />
course forum. The online asynchronous forum served as a sociocultural arena for raising<br />
questions, leading and participating in socio-environmental debates, uploading the<br />
students’ projects, and carrying out peer assessment. The participants were 15 minority<br />
pre-service teachers from various science education disciplines who had very little prior<br />
knowledge about or awareness of the environment. Within the assessment for learning<br />
framework that directed the learning, the students were required to read and critique<br />
newspaper articles, investigate an environmental problem in their home community,<br />
participate in a field trip and discuss a variety of environmental topics in the course forum<br />
that was managed by the author (the course instructor) and a teaching assistant. The main<br />
goal was to use the course forum to improve and assess learning and engagement in<br />
environmental discourse. As the students were pre-service teachers, an additional goal was<br />
to expose the students to multimodal learning and assessment in environmental education,<br />
which is in line with the basic principles of environmental education. The research questions<br />
were: (a) To what extent did the course forum enable participation in an environmental<br />
education learning community? (b) In what ways did students express engagement and<br />
concern for environmental issues? (c) To what extent did the course forum allow the students<br />
to express diverse learning outcomes? Three levels of participation in the forum were<br />
identified: obligatory – very little participation, limited to the requested course tasks;<br />
occasional – characterized by random activity and limited to responding to others; and<br />
active – expressed by continuous activity either as initiators who brought up new topics for<br />
discussions or respondents who continued to develop the discussion. There was good<br />
alignment between the activity in the forum and the students’ final score. The students who<br />
actively participated in the forum expanded their learning far beyond class and the course<br />
assignments. In interviews carried out a year after the course had ended, these students<br />
referred to the forum as both a learning and an assessment instrument, one that also<br />
contributed to their environmental awareness and commitment. Finally, the course forum<br />
enabled deep discussions that elevated the class-based learning. The students discussed<br />
local problems typical of their communities and provided rich evidence of meaningful<br />
learning. In the follow-up interviews, they pointed to the forum’s contribution to their freedom<br />
of expression and to their success in overcoming the in-class language barrier.<br />
Drawing on sociocultural theory and the idea of communities of practice, the course forum<br />
enhanced learning through intensive interaction among the students, where three levels of<br />
practice were identified: peripheral, occasional and experienced. This study contributes to<br />
the field of teaching and assessment in higher education, and to the field of environmental<br />
education in multicultural societies.<br />
Learning-oriented feedback: a challenge to assessment practice<br />
Mirabelle Walker, The Open University, United Kingdom<br />
The paper starts by presenting research into feedback carried out in the Technology Faculty<br />
of the UK’s Open University. A coding tool introduced by Brown and Glover (2006) was<br />
used to analyse over 3000 comments made on 106 assignments in three undergraduate<br />
course modules. One dimension of this code was used to determine the categories of<br />
comments being made: relating to the content of the answer; relating to skills development;<br />
offering motivation; etc. The other dimension was used to determine the ‘depth’ of the<br />
comments: whether they were indicative, corrective or explanatory. Students’ responses to<br />
these comments were obtained through individual interviews with 43 of the students whose<br />
commented assignments had been examined. The students were asked to indicate, if<br />
possible, an example of a comment on their assignment that they had been able to use in<br />
later assignments in the module. They were also asked how they had responded to some of<br />
the specific comments written on their assignment. In the latter case, a thematic analysis of<br />
these responses was carried out, followed by a matching of response themes to categories<br />
and depths of comment.<br />
Two key results emerged from this work: the most effective comments for the students’<br />
future work are those that relate to skills development; the most effective comments for<br />
helping students to understand inadequacies in their work are those that are explanatory.<br />
The paper shows that these findings are consistent with a conceptualisation of effective<br />
feedback on assignments that, drawing on Sadler (1989) and Black & Wiliam (1998), sees it<br />
as offering students a means whereby they can reduce or close the gap between their own<br />
knowledge, skills and understanding and the desired knowledge, skills and understanding.<br />
Feedback of this type is learning-oriented feedback, and assessment which is designed to<br />
offer adequate opportunities for feedback of this type is learning-oriented assessment.<br />
The paper concludes by highlighting the ways in which these findings challenge both<br />
feedback practice, which is often insufficiently learning-oriented, and assessment practice,<br />
where skills development tends to be undervalued by those who set and mark the<br />
questions, and attention is seldom paid to skills development through the sequence of the<br />
assignments in a module or programme of study.<br />
It is intended that discussion will centre on the challenges to feedback and assessment<br />
practice that arise from this research, as outlined in the conclusion to the paper. It is hoped<br />
that participants will be able to share experiences of, or suggestions for, responding to<br />
these challenges.<br />
References<br />
Black, P. & Wiliam, D. (1998) Assessment and classroom learning, Assessment in Education: Principles,<br />
Policy and Practice, 5(1), 7–74.<br />
Brown, E. & Glover, C. (2006) ‘Evaluating written feedback’ in Bryan, C. & Clegg, K. (eds.) Innovative<br />
Assessment in Higher Education, Abingdon: Routledge, 81–91.<br />
Sadler, D. R. (1989) Formative assessment and the design of instructional systems, Instructional Science,<br />
18, 119–144.<br />
Progressive Formalization as an Interpretive Lens for<br />
Increasing the Learning Potentials of Classroom Assessment<br />
David Webb, University of Colorado at Boulder, The United States of America<br />
Education researchers have repeatedly asserted that to improve student learning, teachers<br />
need to give greater attention to their use of formative assessment. To effectively guide<br />
student learning, teachers must develop greater confidence in their own decision making<br />
and expertise in classroom assessment. To appropriately interpret student responses to<br />
instructional activities, teachers need to understand how the mathematical content<br />
demonstrated in students’ representations relates to the development of student learning and<br />
expectations for mathematical literacy.<br />
The didactical design construct of progressive formalization, and many examples thereof,<br />
draws from decades of developmental research using the principles of Realistic Mathematics<br />
Education. Instructional sequences in RME are conceived as “learning lines” in which problem<br />
contexts serve as starting points to elicit students’ informal representations. When<br />
appropriate, the teacher builds upon students’ representations and either draws upon student<br />
strategies that are progressively more formal or introduces students to new strategies and<br />
models. Students are encouraged to refer back to less formal representations to deepen their<br />
understanding of the abstract-symbolic. Essentially, progressive formalization is a<br />
design-oriented mathematical instantiation of cognitive/constructivist learning theories. Through<br />
careful attention to students' prior knowledge and guided support from the teacher, students’<br />
conceptions are related to other pre-formal mathematical representations. The teacher<br />
facilitates student learning and a sense of ownership by selecting appropriate problems,<br />
interpreting student responses, posing clarifying questions, and using counterexamples to<br />
support the development of students’ mathematical understanding.<br />
This paper reports the underlying design theory and results from a research-based,<br />
professional development program designed to improve teacher confidence, expertise and<br />
use of classroom assessment. Over the past three years, the program has involved 32<br />
middle grades mathematics teachers across six middle schools (serving 12- to 14-year-old<br />
students) in a moderately sized U.S. public school district.<br />
From prior assessment design studies involving mathematics teachers, we recognized that<br />
limitations in the content knowledge of some teachers had a profound influence on their<br />
ability to select or design tasks accessible to students’ informal and pre-formal<br />
representations. As a way to deepen their understanding of mathematics, teachers<br />
completed mathematical tasks that illustrated progressive formalization in rational number<br />
and algebra. In design and analysis activities, teachers continuously used progressive<br />
formalization as a lens to adapt or create assessment tasks, review their instructional<br />
materials, design scoring guides and rubrics, interpret student responses, and discuss<br />
instructional responses based on examples of student work.<br />
The analysis of teachers’ assessment portfolios (i.e., collections of all paper-and-pencil<br />
assessments) suggests that this PD model provided teachers with a more principled basis for<br />
assessing student understanding and resulted in a conceptualization of assessment that<br />
was generative. That is, teachers applied principles of progressive formalization by<br />
increasing the accessibility of the assessment tasks they used and in the ways they<br />
interpreted and responded to student work. The full paper and presentation will include<br />
examples of the classroom assessments teachers designed, how they used student work to<br />
inform the revision and redesign of assessments, and a preliminary analysis of the program’s<br />
impact on the achievement of participating teachers’ students.<br />
Author Index<br />
Adamson 105<br />
Admiraal 69<br />
Allin 57<br />
Asghar 58<br />
Asmyhr 101<br />
Avenant 143<br />
Bakker 126<br />
Barrie 102<br />
Bayrhuber 34<br />
Bell 129<br />
Berenst 52<br />
Bjølseth 115<br />
Black<br />
Beth 59<br />
Paul 66<br />
Blackmore 42<br />
Bloxham 60<br />
Bohemia 65<br />
Bokhove 130<br />
Boud 103<br />
Boursicot 45<br />
Bremer 87<br />
Brockbank 131<br />
Bruder 34<br />
Bucholtz 98, 132<br />
Busana 141<br />
Butcher 114<br />
Campbell 60<br />
Cheung 90, 104<br />
Clark 105<br />
Coates 108<br />
Contreras Palma 61<br />
Cowie 62<br />
Crossouard 14<br />
Dacre 45<br />
Davison 106<br />
de Glopper 50, 52<br />
De Grez 107<br />
Dearnley 76, 108<br />
Diemer 133<br />
Dysthe 25, 27<br />
Ebert 63<br />
Ecclestone 13<br />
Eckes 16<br />
Eggen 64<br />
Ekecrantz 124<br />
Engelsen 27, 91<br />
Entwistle 32<br />
Fastré 134<br />
Fisher 109<br />
Fishwick 57<br />
Foreman-Peck 29<br />
Formazin 98, 132<br />
Fuller 44, 110<br />
Furnborough 111<br />
Goldhammer 39, 40, 74<br />
Goos 135<br />
Handley 84, 144<br />
Harman 65<br />
Harrison 66<br />
Harsch 17<br />
Hartig 17, 33, 36<br />
Hartley 76<br />
Hartnell-Young 26<br />
Havnes 25, 67, 68, 115<br />
Hepplestone 112, 136<br />
Hernandez 137<br />
Higham 45<br />
Hodgen 66<br />
Hoeksma 69<br />
Höhler 36<br />
Homer 110<br />
Hopfenbeck 113<br />
Hounsell<br />
Dai 70, 138<br />
Jenny 70<br />
Hughes<br />
C. 102<br />
Clair 135, 139<br />
Hunter 114<br />
Jadoul 38<br />
James 12<br />
Janssen<br />
Fred 95<br />
Judith 69<br />
Jones<br />
Alister 62<br />
Julie 31<br />
Jonsson 140<br />
Jordan 22, 114, 131<br />
Joughin 71<br />
Juárez 145<br />
Jude 40<br />
Karius 18<br />
Keller 141<br />
Klieme 5, 40<br />
Köller 15<br />
Kunina 35<br />
Kuper 133<br />
Kwant 52<br />
Lai<br />
Mei Kuin 78<br />
Patrick 72<br />
Latour 38<br />
Lauvas 115<br />
Lecaque 38<br />
Leitch 6<br />
Leuders 34<br />
Linsey 120<br />
Luff 116<br />
Maier 73<br />
Marshall 66<br />
Martens 37, 39, 40, 74<br />
Martin 141<br />
McCabe 75<br />
McCusker 21<br />
McDowell 11<br />
McLean 106<br />
Meddings 76<br />
Meeus 28<br />
Mellor 32<br />
Mitchell 131<br />
Montgomery 77<br />
Moreland 62<br />
Nachshon 117<br />
Narciss 49<br />
Naumann 39, 40<br />
Neumann 18<br />
Nicholson 21<br />
Norton<br />
Bill 89<br />
Lin 89<br />
O'Brien 78<br />
O'Doherty 79<br />
O'Donovan 84, 118, 119, 144<br />
Oehler 80<br />
Ooms 120<br />
Orr 81<br />
O'Siochru 94<br />
Otrel-Cass 62<br />
Pat-El 82<br />
Pell 41, 44, 110<br />
Peltenburg 142<br />
Pickworth 143<br />
Pilkington 83<br />
Plichart 38<br />
Price 84, 118, 119, 144<br />
Proctor-Childs 109<br />
Pryor 14<br />
Ramírez 145<br />
Reichert 141<br />
Reimann 25<br />
Remesal 85, 145<br />
Renault 38<br />
Richardson 86<br />
Ridgway 21<br />
Roberts 41, 45<br />
Robinson<br />
Gilian 116<br />
Jon 121, 146<br />
Robitzsch 15, 18, 80<br />
Rölke 39, 40<br />
Rom 117<br />
Roozen 107<br />
Rowley 129<br />
Ruedel 20<br />
Rupp 35<br />
Sambell 77<br />
Sandal 122<br />
Saniter 87<br />
Schaap 88<br />
Scharaf 39<br />
Scherer 147<br />
Schmidt 88<br />
Schofield 123<br />
Schroeders 98<br />
Schwieler 124<br />
Segers 49, 82<br />
Serret 66<br />
Shannon 89<br />
Sit 90, 104<br />
Sjo 91<br />
Sluijsmans 46, 47, 48, 134<br />
Smee 43<br />
Smith 19<br />
Andrew 31<br />
C. 102<br />
Kari 91, 92, 122<br />
Stein 93<br />
Stern 125<br />
Strijbos 48, 49<br />
Strivens 94<br />
Swietlik-Simon 38<br />
Syversen 122<br />
Tai 138<br />
Tal 148<br />
Taylor 108<br />
Tigelaar 95, 126<br />
Tillema 82<br />
Valcke 107<br />
Van de Watering 48<br />
van den Boogaard 53<br />
van den Heuvel-Panhuizen 50, 53, 142<br />
van der Klink 134<br />
van der Pol 51<br />
van Lierop-Debrauwer 51<br />
van Merriënboer 47, 134<br />
Van Petegem 28<br />
van Rooyen 143<br />
van Tartwijk 95<br />
van Zundert 47, 96<br />
Vedder 82<br />
Veldman 95<br />
Verloop 95, 126<br />
Vernon 30<br />
Walker<br />
David 121, 146<br />
Mirabelle 149<br />
Wangensteen 122<br />
Webb<br />
David 150<br />
Marion 120<br />
Webster-Wright 135<br />
Whitelock 19, 97<br />
Wilhelm 35, 98, 132<br />
Wiliam 7<br />
Wirtz 34<br />
Xu 138<br />
Address List of<br />
presenters<br />
Allin, Linda<br />
Northumbria University<br />
CETL<br />
NE1 8ST Newcastle Upon Tyne<br />
UNITED KINGDOM<br />
linda.allin@unn.ac.uk<br />
Barrie, Simon<br />
The University of Sydney<br />
AUSTRALIA<br />
S.Barrie@itl.usyd.edu.au<br />
Blackmore, David<br />
Medical Council of Canada<br />
2283 St. Laurent Boulevard<br />
K1G 5A2 Ottawa<br />
CANADA<br />
dblackmore@mcc.ca<br />
Boud, David<br />
University of Technology, Sydney<br />
PO Box 123<br />
NSW 2007 Broadway<br />
AUSTRALIA<br />
David.Boud@uts.edu.au<br />
Bucholtz, Nina<br />
Humboldt-Universität zu Berlin<br />
IQB<br />
Unter den Linden 6<br />
10099 Berlin<br />
GERMANY<br />
bucholtn@iqb.hu-berlin.de<br />
Asghar, Mandy<br />
Leeds Metropolitan University<br />
7 Chelwood Ave<br />
LS8 2BA Leeds<br />
UNITED KINGDOM<br />
a.asghar@leedsmet.ac.uk<br />
Bell, Andy<br />
Manchester Metropolitan University<br />
24 Park Row<br />
SK4 3DY Heaton Mersey /<br />
Stockport<br />
UNITED KINGDOM<br />
A.Bell@mmu.ac.uk<br />
Bloxham, Sue<br />
University of Cumbria<br />
Bowerham Rd<br />
LA1 3JD Lancaster<br />
UNITED KINGDOM<br />
susan.bloxham@cumbria.ac.uk<br />
Boursicot, Katharine<br />
St George's, University of London<br />
46-47 Compton Road<br />
N1 2PB London<br />
UNITED KINGDOM<br />
kboursic@sgul.ac.uk<br />
Cheung, Kwok Cheung<br />
Faculty of Education<br />
University of Macau<br />
11A, Block 2<br />
Taipa Macao<br />
CHINA<br />
kccheung@umac.mo<br />
Asmyhr, Morten<br />
Østfold University College<br />
Sagmesterveien 49<br />
1414 Trollåsen<br />
NORWAY<br />
morten.asmyhr@hiof.no<br />
Black, Beth<br />
Cambridge Assessment<br />
8 Marshall Road<br />
CB1 7TY Cambridge<br />
UNITED KINGDOM<br />
Black.B@cambridgeassessment.org.uk<br />
Bokhove, Christian<br />
FIsme<br />
Utrecht University<br />
Aidadreef 12<br />
3561 GE Utrecht<br />
NETHERLANDS<br />
cbokhove@gmail.com<br />
Brockbank, Barbara<br />
The Open University<br />
19 Woodfield Road<br />
TN9 2LG Tonbridge<br />
UNITED KINGDOM<br />
bsb3@tutor.open.ac.uk<br />
Clark, Wendy<br />
Northumbria University<br />
CETL<br />
NE1 8ST Newcastle Upon Tyne<br />
UNITED KINGDOM<br />
wendy.clark@unn.ac.uk<br />
Contreras Palma, Saul Alejandro<br />
CHILE<br />
saul2674@hotmail.com<br />
de Glopper, Kees<br />
Center for Language and<br />
Cognition, Faculty of Arts,<br />
University of Groningen<br />
PO Box 716<br />
9700 AS Groningen<br />
NETHERLANDS<br />
c.m.de.glopper@rug.nl<br />
Diemer, Tobias<br />
Freie Universität Berlin<br />
Arnimallee 12<br />
14195 Berlin<br />
GERMANY<br />
diemer@zedat.fu-berlin.de<br />
Ecclestone, Kathryn<br />
Oxford Brookes University<br />
Westminster Institute of Education<br />
Harcourt Hill OX Oxford<br />
UNITED KINGDOM<br />
kecclestone@brookes.ac.uk<br />
Fastré, Greet<br />
Open Universiteit Nederland<br />
Valkenburgerweg 177<br />
6419 AT Heerlen<br />
NETHERLANDS<br />
greet.fastre@ou.nl<br />
Cowie, Bronwen<br />
University of Waikato<br />
Hillcrest Rd<br />
2001 Hamilton<br />
NEW ZEALAND<br />
bcowie@waikato.ac.nz<br />
De Grez, Luc<br />
University College Brussels<br />
Koningsstraat 336<br />
1030 Brussels<br />
BELGIUM<br />
luc.degrez@hubrussel.be<br />
Dysthe, Olga<br />
University of Bergen<br />
Beiteveien 9<br />
5019 Bergen<br />
NORWAY<br />
Olga.Dysthe@iuh.uib.no<br />
Eckes, Thomas<br />
TestDaF Institute<br />
Feithstr. 188<br />
58084 Hagen<br />
GERMANY<br />
thomas.eckes@testdaf.de<br />
Fisher, Margaret<br />
University of Plymouth<br />
Drake Circus<br />
PL4 8AA Plymouth<br />
UNITED KINGDOM<br />
m.fisher@plymouth.ac.uk<br />
Davison, Gillian<br />
Northumbria University<br />
CETL AfL<br />
NE1 8ST Newcastle upon Tyne<br />
UNITED KINGDOM<br />
gillian.davison@unn.ac.uk<br />
Dearnley, Christine<br />
University of Bradford<br />
Ashgrove Barn, Broad Lane<br />
HD9 1LS Huddersfield<br />
UNITED KINGDOM<br />
c.a.dearnley1@bradford.ac.uk<br />
Ebert, Julian<br />
University of Zurich<br />
Binzmühlestr. 14<br />
8050 Zürich<br />
SWITZERLAND<br />
ebert@ifi.uzh.ch<br />
Eggen, Astrid Birgitte<br />
University of Oslo<br />
Kapellveien 17c<br />
487 Oslo<br />
NORWAY<br />
astrid.eggen@ils.uio.no<br />
Foreman-Peck, Lorraine<br />
The University of Northampton<br />
53 Portland Road<br />
0X2 7EZ Oxford<br />
UNITED KINGDOM<br />
lorraine.foremanpeck@northampton.ac.uk<br />
Fuller, Richard<br />
School of Medicine<br />
University of Leeds<br />
LS2 9JT Leeds<br />
UNITED KINGDOM<br />
R.Fuller@leeds.ac.uk<br />
Harman, Kerry<br />
Northumbria University<br />
CETL, Ellison Building<br />
NE1 8ST Newcastle Upon Tyne<br />
UNITED KINGDOM<br />
m.newson@unn.ac.uk<br />
Hartnell-Young, Elizabeth<br />
Learning Science Research Institute<br />
The University of Nottingham<br />
NG8 1BB Nottingham<br />
UNITED KINGDOM<br />
elizabeth.hartnellyoung@nottingham.ac.uk<br />
Hernandez, Rosario<br />
University College Dublin<br />
School of Languages and Literatures<br />
Newman Building, Belfield 4<br />
Dublin<br />
IRELAND<br />
charo.hernandez@ucd.ie<br />
Hopfenbeck, Therese Nerheim<br />
University of Oslo<br />
Faculty of Education<br />
Sem Seland vei 24, P.O. Box 1099<br />
Blindern<br />
NO-0317 Oslo<br />
NORWAY<br />
t.n.hopfenbeck@ils.uio.no<br />
Furnborough, Concha<br />
The Open University<br />
Walton Hall<br />
MK7 6AA Milton Keynes<br />
UNITED KINGDOM<br />
c.furnborough@open.ac.uk<br />
Harrison, Christine<br />
King's College London<br />
Franklin-Wilkins-Building WBW,<br />
150 Stamford Street<br />
SE1 9NN London<br />
UNITED KINGDOM<br />
christine.harrison@kcl.ac.uk<br />
Havnes, Anton<br />
University of Bergen<br />
Øvreveien 36<br />
N-1450 Nesoddtangen<br />
NORWAY<br />
anton.havnes@hio.no<br />
Hoeksma, Mark<br />
University of Amsterdam Graduate<br />
School for Teaching and Learning<br />
P.E. Tegelbergplein 4<br />
1019 TA Amsterdam<br />
NETHERLANDS<br />
m.hoeksma@uva.nl<br />
Hounsell, Dai<br />
University of Edinburgh<br />
Paterson's Land, Holyrood Road<br />
EH8 8AQ Edinburgh<br />
UNITED KINGDOM<br />
Dai.Hounsell@ed.ac.uk<br />
Goldhammer, Frank<br />
German Institute for International<br />
Educational Research (DIPF)<br />
Schlossstr. 29<br />
60486 Frankfurt/Main<br />
GERMANY<br />
goldhammer@dipf.de<br />
Hartig, Johannes<br />
German Institute for International<br />
Educational Research (DIPF)<br />
Schloßstraße 29<br />
60486 Frankfurt am Main<br />
GERMANY<br />
hartig@dipf.de<br />
Hepplestone, Stuart<br />
Sheffield Hallam University<br />
Howard Street<br />
S1 1WB Sheffield<br />
UNITED KINGDOM<br />
s.j.hepplestone@shu.ac.uk<br />
Höhler, Jana<br />
German Institute for International<br />
Educational Research (DIPF)<br />
Schloßstraße 29<br />
60486 Frankfurt am Main<br />
GERMANY<br />
hoehler@dipf.de<br />
Hounsell, Jenny<br />
University of Edinburgh<br />
Paterson's Land, Holyrood Road<br />
EH8 8AQ Edinburgh<br />
UNITED KINGDOM<br />
Jenny.Hounsell@ed.ac.uk<br />
Hughes, Clair<br />
The University of Queensland<br />
147 Swann Rd<br />
4068 Brisbane<br />
AUSTRALIA<br />
clair.hughes@uq.edu.au<br />
Jonsson, Anders<br />
Malmö University<br />
School of Teacher Education<br />
SE-205 06 Malmö<br />
SWEDEN<br />
anders.jonsson@mah.se<br />
Keller, Ulrich<br />
University of Luxembourg<br />
Route de Diekirch<br />
L-7220 Walferdange<br />
LUXEMBOURG<br />
ulrich.keller@uni.lu<br />
Kunina, Olga<br />
Humboldt-Universität zu Berlin<br />
IQB<br />
Unter den Linden 6<br />
10099 Berlin<br />
GERMANY<br />
Olga.Kunina@iqb.hu-berlin.de<br />
Lauvas, Per<br />
Østfold University College<br />
NORWAY<br />
per.lauvas@hiof.no<br />
James, David<br />
University of the West of England<br />
Coldharbour Lane<br />
BS16 1QY Bristol<br />
UNITED KINGDOM<br />
david.james@uwe.ac.uk<br />
Jordan, Sally<br />
The Open University<br />
COLMSCT<br />
95 Sluice Road<br />
PE38 0DZ Downham Market<br />
UNITED KINGDOM<br />
s.e.jordan@open.ac.uk<br />
Klieme, Eckhard<br />
German Institute for International<br />
Educational Research (DIPF)<br />
Schloßstraße 29<br />
60486 Frankfurt<br />
GERMANY<br />
klieme@dipf.de<br />
Kwant, Aletta<br />
Center for Language and<br />
Cognition, Faculty of Arts<br />
University of Groningen<br />
PO Box 716<br />
9700 AS Groningen<br />
NETHERLANDS<br />
l.p.kwant@rug.nl<br />
Leitch, Ruth<br />
Queen`s University Belfast<br />
69-71 University Street<br />
BT7 1HL Belfast<br />
UNITED KINGDOM<br />
r.leitch@qub.ac.uk<br />
Jones, Julie<br />
The University of Northampton<br />
11 Cardinal Close<br />
NN4 0RP Northampton<br />
UNITED KINGDOM<br />
julie.jones@northampton.ac.uk<br />
Joughin, Gordon<br />
University of Wollongong<br />
CEDIR, University of Wollongong<br />
2522 Wollongong<br />
AUSTRALIA<br />
gordonj@uow.edu.au<br />
Köller, Olaf<br />
Humboldt-Universität zu Berlin<br />
IQB<br />
Unter den Linden 6<br />
10099 Berlin<br />
GERMANY<br />
iqboffice@iqb.hu-berlin.de<br />
Lai, Patrick<br />
The Hong Kong Polytechnic University<br />
Educational Development Centre<br />
Room TU607<br />
Hung Hom, Kowloon<br />
Hong Kong<br />
CHINA<br />
etktlai@netvigator.com<br />
Luff, Paulette<br />
Anglia Ruskin University<br />
Bishop Hall Lane<br />
CM1 1SQ Chelmsford<br />
UNITED KINGDOM<br />
paulette.luff@anglia.ac.uk<br />
Maier, Uwe<br />
University of Education Schw. Gmünd<br />
Ostalbstrasse 8<br />
73529 Schwäbisch Gmünd<br />
GERMANY<br />
uwe.maier@ph-gmuend.de<br />
McDowell, Liz<br />
University of Northumbria<br />
CETL Hub D121, Ellison Building,<br />
Ellison Place<br />
NE1 8ST Newcastle Upon Tyne<br />
UNITED KINGDOM<br />
liz.mcdowell@unn.ac.uk<br />
Mellor, Antony<br />
Northumbria University<br />
School of Applied Sciences<br />
NE1 8ST Newcastle Upon Tyne<br />
UNITED KINGDOM<br />
antony.mellor@unn.ac.uk<br />
Naumann, Johannes<br />
German Institute for International<br />
Educational Research (DIPF)<br />
Schloßstraße 29<br />
60486 Frankfurt am Main<br />
GERMANY<br />
naumann@dipf.de<br />
O'Donovan, Berry<br />
Oxford Brookes University<br />
150 Marlborough Road<br />
OX1 4LS Oxford<br />
UNITED KINGDOM<br />
bodonovan@brookes.ac.uk<br />
Martens, Thomas<br />
German Institute for International<br />
Educational Research (DIPF)<br />
Postfach 900270<br />
60442 Frankfurt am Main<br />
GERMANY<br />
m@rtens.net<br />
Meddings, Fiona<br />
Division of Midwifery &<br />
Reproductive Health<br />
University of Bradford<br />
55 Crowther Avenue<br />
LS28 5SA Leeds<br />
UNITED KINGDOM<br />
f.s.meddings@bradford.ac.uk<br />
Montgomery, Catherine<br />
Northumbria University<br />
CETL AfL, Ellison Building<br />
Ellison Place, Newcastle<br />
NE1 8ST Newcastle<br />
UNITED KINGDOM<br />
c.montgomery@unn.ac.uk<br />
O'Brien, Patrice<br />
Faculty of Education<br />
University of Auckland<br />
111 Blockhouse Bay Rd, Avondale<br />
1026 Auckland<br />
NEW ZEALAND<br />
pa.obrien@auckland.ac.nz<br />
Oehler, Raphaela<br />
Humboldt-Universität zu Berlin<br />
IQB<br />
Unter den Linden 6<br />
10099 Berlin<br />
GERMANY<br />
raphaela.oehler@iqb.hu-berlin.de<br />
McCabe, Michael<br />
University of Portsmouth<br />
Lion Terrace<br />
PO1 3HF Portsmouth<br />
UNITED KINGDOM<br />
michael.mccabe@port.ac.uk<br />
Meeus, Wil<br />
Universiteit Antwerpen<br />
Venusstraat 35<br />
2000 Antwerpen<br />
BELGIUM<br />
wil.meeus@ua.ac.be<br />
Nachshon, Michal<br />
Ministry of Education<br />
Vardiya st. 24<br />
34657 Haifa<br />
ISRAEL<br />
michaln@tx.technion.ac.il<br />
O'Doherty, Michelle<br />
Liverpool Hope University<br />
Hope Park<br />
L16 9JD Liverpool<br />
UNITED KINGDOM<br />
odoherm@hope.ac.uk<br />
Ooms, Ann<br />
Kingston University<br />
19 Woodlands - 4 South Bank<br />
KT6 6DB Surbiton<br />
UNITED KINGDOM<br />
a.ooms@kingston.ac.uk<br />
Orr, Susan<br />
York St John University<br />
Lord Mayor's Walk<br />
YO317EX York<br />
UNITED KINGDOM<br />
s.orr@yorksj.ac.uk<br />
Peltenburg, Marjolijn<br />
FIsme<br />
Utrecht University<br />
Aidadreef 12<br />
3561 GE Utrecht<br />
NETHERLANDS<br />
M.Peltenburg@fi.uu.nl<br />
Plichart, Patrick<br />
CRP Henri Tudor<br />
Avenue John F. Kennedy L, 29<br />
1855 Luxembourg - Kirchberg<br />
LUXEMBOURG<br />
patrick.plichart@tudor.lu<br />
Remesal, Ana<br />
Universidad de Barcelona<br />
Paseo del Valle Hebrón, 171<br />
E-08035 Barcelona<br />
SPAIN<br />
aremesal@ub.edu<br />
Roberts, Trudie<br />
University of Leeds<br />
Level 7, Worsley Building,<br />
Clarendon Way<br />
LS2 9NL Leeds<br />
UNITED KINGDOM<br />
t.e.roberts@leeds.ac.uk<br />
Pat-El, Ron<br />
Leiden University<br />
Catharinaland 69<br />
2591 CG Den Haag<br />
NETHERLANDS<br />
rpatel@fsw.leidenuniv.nl<br />
Pickworth, Glynis<br />
University of Pretoria<br />
90 Wenning Street<br />
181 Pretoria<br />
SOUTH AFRICA<br />
glynis.pickworth@up.ac.za<br />
Price, Margaret<br />
Oxford Brookes University<br />
2 Hearne Road<br />
W4 3NJ London<br />
UNITED KINGDOM<br />
meprice@brookes.ac.uk<br />
Richardson, Mary<br />
Roehampton University<br />
Froebel College<br />
Roehampton Lane<br />
SW15 5PJ London<br />
UNITED KINGDOM<br />
mary.richardson@roehampton.ac.uk<br />
Robinson, Jon<br />
Northumbria University<br />
CETL AfL<br />
NE1 8ST Newcastle Upon Tyne<br />
UNITED KINGDOM<br />
john.robinson@unn.ac.uk<br />
Pell, Godfrey<br />
University of Leeds<br />
CSSME, EC Stoner Building<br />
LS2 9JT Leeds<br />
UNITED KINGDOM<br />
G.Pell@leeds.ac.uk<br />
Pilkington, Ruth<br />
University of Central Lancashire<br />
67 Lower Bank Road<br />
PR2 8NU Preston<br />
UNITED KINGDOM<br />
RMHPilkington@uclan.ac.uk<br />
Pryor, John<br />
University of Sussex<br />
9 Wellington Road<br />
BN2 3AB Brighton<br />
UNITED KINGDOM<br />
j.b.pryor@sussex.ac.uk<br />
Ridgway, Jim<br />
University of Durham<br />
School of Education<br />
Leazes Road<br />
DH1 1TA Durham<br />
UNITED KINGDOM<br />
Jim.Ridgway@durham.ac.uk<br />
Robitzsch, Alexander<br />
Humboldt-Universität zu Berlin<br />
IQB<br />
Unter den Linden 6<br />
10099 Berlin<br />
GERMANY<br />
alexander.robitzsch@iqb.hu-berlin.de<br />
Ruedel, Cornelia<br />
University of Zurich<br />
E-Learning Center<br />
Hirschengraben 84<br />
8001 Zurich<br />
SWITZERLAND<br />
Cornelia.Ruedel@access.uzh.ch<br />
Schaap, Lydia<br />
Erasmus University Rotterdam<br />
Institute of Psychology<br />
Haagdijk 51A<br />
4811 TP Breda<br />
NETHERLANDS<br />
l.schaap@fsw.eur.nl<br />
Schwieler, Elias<br />
Stockholm University<br />
UPC Frescativ. 28<br />
106 91 Stockholm<br />
SWEDEN<br />
elias.schwieler@upc.su.se<br />
Sjo, Anne Kristin<br />
Stord/Haugesund University College<br />
PB 5000<br />
5409 Stord<br />
NORWAY<br />
aks@hsh.no<br />
Smith, Kari<br />
University of Bergen<br />
Post box 7800<br />
5120 Bergen<br />
NORWAY<br />
kari.smith@iuh.uib.no<br />
Sandal, Ann Karin<br />
Sogn and Fjordane University College<br />
Stedjeåsen 24<br />
6856 Sogndal<br />
NORWAY<br />
ann.karin.sandal@hisf.no<br />
Scherer, Petra<br />
University of Bielefeld<br />
Faculty of Mathematics<br />
Athener Weg 9<br />
44269 Dortmund<br />
GERMANY<br />
petra.scherer@uni-bielefeld.de<br />
Shannon, Lee<br />
Liverpool Hope University<br />
6 Leda Grove<br />
L17 8XL Liverpool<br />
UNITED KINGDOM<br />
leeroyshannon@hotmail.co.uk<br />
Sluijsmans, Dominique<br />
Open Universiteit Nederland<br />
PO Box 2960<br />
6401 DL Heerlen<br />
NETHERLANDS<br />
dominique.sluijsmans@ou.nl<br />
Stein, Margit<br />
Lehrstuhl für Sozialpädagogik und<br />
Gesundheitspädagogik<br />
Kath. Universität Eichstätt-Ingolstadt<br />
Schießstättberg 5<br />
85072 Eichstätt<br />
GERMANY<br />
margit.stein@gmx.net<br />
Saniter, Andreas<br />
ITB Uni Bremen<br />
Am Fallturm 1 Pf. 330440<br />
28334 Bremen<br />
GERMANY<br />
asaniter@uni-bremen.de<br />
Schofield, Mark<br />
Edge Hill University<br />
St Helens Road<br />
L39 4QP Lancashire<br />
UNITED KINGDOM<br />
schom@edgehill.ac.uk<br />
Sit, Pou Seong<br />
Faculty of Education<br />
University of Macau<br />
J520<br />
Taipa Macao<br />
CHINA<br />
pssit@umac.mo<br />
Smee, Sydney<br />
Medical Council of Canada<br />
2283 St. Laurent Blvd<br />
K1G 5A2 Ottawa<br />
CANADA<br />
sydney@mcc.ca<br />
Stern, Thomas<br />
University of Klagenfurt<br />
Schottenfeldg. 29<br />
1070 Wien<br />
AUSTRIA<br />
thomas.stern@uni-klu.ac.at<br />
Strijbos, Jan-Willem<br />
Universiteit Leiden<br />
Fac. Sociale Wetenschappen<br />
Postbus 9555<br />
2300 RB Leiden<br />
NETHERLANDS<br />
jwstrijbos@fsw.leidenuniv.nl<br />
Tigelaar, Dineke<br />
ICLON-Leiden University<br />
Graduate School of Teaching<br />
PO Box 9555<br />
2300 RB Leiden<br />
NETHERLANDS<br />
DTigelaar@iclon.leidenuniv.nl<br />
van der Pol, Coosje<br />
Tilburg University<br />
Retiesheike 16<br />
2460 Kasterlee<br />
NETHERLANDS<br />
j.a.vdrpol@uvt.nl<br />
Walker, Mirabelle<br />
The Open University<br />
Communication & Systems Dept.<br />
MCT Faculty<br />
Walton Hall<br />
MK7 6AA Milton Keynes<br />
UNITED KINGDOM<br />
c.m.walker@open.ac.uk<br />
Wilhelm, Oliver<br />
Humboldt-Universität zu Berlin<br />
IQB<br />
Unter den Linden 6<br />
10099 Berlin<br />
GERMANY<br />
oliver.wilhelm@rz.hu-berlin.de<br />
Strivens, Janet<br />
The University of Liverpool<br />
Y Graig, Llandegla<br />
LL11 3BG Wrexham<br />
UNITED KINGDOM<br />
strivens@liv.ac.uk<br />
van den Boogaard, Sylvia<br />
FIsme<br />
Utrecht University<br />
Aidadreef 12<br />
3561 GE Utrecht<br />
NETHERLANDS<br />
s.vandenboogaard@fi.uu.nl<br />
van Zundert, Marjo<br />
Open Universiteit Nederland<br />
Postbus 2960<br />
6401 DL Heerlen<br />
NETHERLANDS<br />
marjo.vanzundert@ou.nl<br />
Webb, David<br />
University of Colorado at Boulder<br />
249 UCB<br />
80309 Boulder<br />
UNITED STATES<br />
dcwebb@colorado.edu<br />
Wiliam, Dylan<br />
University of London<br />
20 Bedford Way<br />
WC1H 0AL London<br />
UNITED KINGDOM<br />
d.wiliam@ioe.ac.uk<br />
Tal, Tali<br />
Technion –<br />
Israel Institute of Technology<br />
30 Ella st<br />
25147 Kefar Veradim<br />
ISRAEL<br />
rtal@technion.ac.il<br />
van den Heuvel-Panhuizen, Marja<br />
FIsme, Utrecht University<br />
Aidadreef 12, 3561 GE Utrecht<br />
NETHERLANDS<br />
m.vandenheuvel@fi.uu.nl<br />
IQB, Humboldt-Universität zu Berlin<br />
Unter den Linden 6, 10099 Berlin<br />
GERMANY<br />
heuvelpm@IQB.hu-berlin.de<br />
Vernon, Julia<br />
The University of Northampton<br />
Park Campus, Boughton Green Rd<br />
Northampton NN2 7AL<br />
UNITED KINGDOM<br />
julia.vernon@northampton.ac.uk<br />
Whitelock, Denise<br />
The Open University<br />
Institute of Educational Technology<br />
Walton Hall<br />
MK7 6AA Milton Keynes<br />
UNITED KINGDOM<br />
d.m.whitelock@open.ac.uk<br />
Wirtz, Markus Antonius<br />
University of Education<br />
Department of Psychology<br />
Kunzenweg 21<br />
79117 Freiburg<br />
GERMANY<br />
markus.wirtz@ph-freiburg.de<br />