Article
Inter-rater variability as mutual disagreement: identifying raters’ divergent points of view
Advances in Health Sciences Education (2016)
  • A. Gingerich
  • Susan E. Ramlo
Abstract
Whenever multiple observers provide ratings, even of the same performance,
inter-rater variation is prevalent. The resulting ‘idiosyncratic rater variance’ is considered
to be unusable error of measurement in psychometric models and is a threat to the
defensibility of our assessments. Prior studies of inter-rater variation in clinical assessments
have used open response formats to gather raters’ comments and justifications. This
design choice allows participants to use idiosyncratic response styles that could result in a
distorted representation of the underlying rater cognition and skew subsequent analyses. In
this study we explored rater variability using the structured response format of Q
methodology. Physician raters viewed video-recorded clinical performances and provided
Mini Clinical Evaluation Exercise (Mini-CEX) assessment ratings through a web-based
system. They then shared their assessment impressions by sorting statements that described
the most salient aspects of the clinical performance onto a forced quasi-normal distribution
ranging from "most consistent with my impression" to "most contrary to my impression".
Analysis of the resulting Q-sorts revealed, for each performance, distinct points of view
shared by multiple physicians. The points of view corresponded with the ratings physicians
assigned to the performance. Each point of view emphasized different aspects of the
performance, with rapport-building and/or medical expertise skills being most salient.
It was rare for the points of view to diverge based on disagreements regarding the
interpretation of a specific aspect of the performance. As a result, physicians’ divergent
points of view on a given clinical performance cannot easily be reconciled into a single
coherent assessment judgment that is merely distorted by measurement error. If inter-rater variability
does not wholly reflect measurement error, it challenges our current measurement models and raises the question of how performance assessment ratings can be adequately analyzed.
Keywords
  • inter-rater reliability
  • Q methodology
  • workplace assessment
  • rater-based assessment
Publication Date
2016
DOI
10.1007/s10459-016-9711-8
Citation Information
Gingerich, A., Ramlo, S., van der Vleuten, C. P. M., Eva, K. W., & Regehr, G. (2017). Inter-rater variability as mutual disagreement: Identifying raters' divergent points of view. Advances in Health Sciences Education, 22, 819–838. DOI: 10.1007/s10459-016-9711-8. Available as view-only at http://rdcu.be/ks51.