Comparative judgement could help examiners assess students using real-world activities

Education technology is often thought of in terms of classroom interventions such as interactive whiteboards and tablet-equipped classrooms. But new technology also has the potential to transform the way we think about assessment – and indirectly to improve student outcomes in the process. In particular, comparative judgement, a technologically enhanced form of assessment, could help to solve one of the trickiest dilemmas in assessing student work.

Many teachers and examiners would like to assess students using more authentic and real-world types of activities, such as essays and projects. However, such tasks are fiendishly hard to mark reliably. It is surprisingly tricky to ensure that each essay in a set on Romeo and Juliet, for example, will be awarded the same grade by different markers. One way of trying to ensure reliability is to create very detailed rubrics and mark schemes which aim to standardise the judgements of different markers.

The problem with such rubrics is that they can end up stereotyping pupils’ responses to the task. Genuinely brilliant and original responses to the task fail because they don’t meet the rubric, while responses that have been heavily coached achieve top grades because they tick all the boxes. Relying on rubrics helps to address the problem of unreliable markers, but it does so at the expense of meaning. We get more reliable scores, but they don’t allow us to make valid inferences about the things we really care about.

Is there a way of solving this dilemma – of creating original assessments that can be marked reliably without compromising their authenticity? Comparative judgement offers an elegant solution to this problem. It simply asks a marker to make a series of judgements about pairs of responses. Take the example of an essay on Romeo and Juliet: with comparative judgement, the examiner looks at two essays, and decides which one is better. Then they look at another pair, and decide which one is better. And so on. It is relatively quick and easy to make such judgements – much easier and quicker than assigning a mark to an individual essay. All of these decisions can then be combined to create a score for each student.
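
For readers curious about how a pile of "this one is better" decisions can become a score, here is a minimal sketch in Python. It assumes a Bradley-Terry model, one common statistical approach to paired comparisons; this article does not specify which model any particular comparative judgement tool uses. The function name `bradley_terry_scores`, the `prior` parameter and the essay labels are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def bradley_terry_scores(judgements, iterations=200, prior=0.5):
    """Estimate a score per essay from (winner, loser) pairs.

    Assumption: a Bradley-Terry model fitted with a simple iterative
    update. `prior` adds a small imaginary win and loss against a
    "typical" essay, so an essay that wins (or loses) every comparison
    still gets a finite score.
    """
    essays = sorted({essay for pair in judgements for essay in pair})
    wins = {essay: prior for essay in essays}
    comparisons = defaultdict(int)  # (essay_a, essay_b) -> times compared
    for winner, loser in judgements:
        if winner == loser:
            raise ValueError("an essay cannot be compared with itself")
        wins[winner] += 1
        comparisons[tuple(sorted((winner, loser)))] += 1

    strength = {essay: 1.0 for essay in essays}
    for _ in range(iterations):
        # Start each denominator with the imaginary comparisons
        # against a typical essay whose strength is fixed at 1.
        denominator = {e: 2 * prior / (strength[e] + 1.0) for e in essays}
        for (a, b), count in comparisons.items():
            shared = count / (strength[a] + strength[b])
            denominator[a] += shared
            denominator[b] += shared
        strength = {e: wins[e] / denominator[e] for e in essays}

    # Report scores on a log scale: higher means judged better more often.
    return {essay: math.log(strength[essay]) for essay in essays}


# Three hypothetical essays and three judgements, each (winner, loser).
judgements = [
    ("essay_A", "essay_B"),
    ("essay_B", "essay_C"),
    ("essay_A", "essay_C"),
]
scores = bradley_terry_scores(judgements)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy example the essay that wins both of its comparisons ends up at the top of the ranking and the essay that loses both ends up at the bottom. A real exercise would involve many more essays, and several judgements per essay spread across different markers.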

The comparative judgement engine designed by No More Marking allows for this process to happen quickly and efficiently – far more quickly than the typical moderation session, and with greater reliability too. Even better than that, this process allows teachers to focus on genuine quality when they are teaching, rather than tick-box mark schemes.


Author: Daisy Christodoulou
Head of Assessment at Ark Schools
Author of Seven Myths about Education
@daisychristo | Blog