Scientific reasoning and writing are ubiquitous processes in science and are therefore common goals of science curricula, particularly in higher education. Providing the individualized feedback necessary for the development of these skills is often costly in terms of faculty time, particularly in the large science courses common at research universities. Past educational research suggests that the use of peer review may accelerate the development of students’ scientific reasoning skills without a concurrent increase in demand on faculty time per student. Peer review contains many elements of effective pedagogy, such as peer-to-peer collaboration, repeated practice at evaluation and critical thinking, formative feedback, multiple contrasting examples, and extensive writing. All of these elements may contribute to improvement in students’ scientific reasoning.
The effect of peer review on scientific reasoning was assessed using three major data sources: student performance on written lab reports, student performance on an objective Scientific Reasoning Test (Lawson, 1978), and student perceptions of the process of peer review in the scientific community as well as the classroom. In addition, the need to measure student performance across multiple science classes resulted in the development of a Universal Rubric for Laboratory Reports. The reliability of this instrument and its effect on the grading consistency of graduate teaching assistants were also tested. Application of the Universal Rubric to student laboratory reports across multiple biology classes revealed that the Rubric is also useful as a programmatic assessment tool. The Rubric highlighted curricular gaps and strengths and measured student achievement over time.
This study demonstrated that even university freshmen were effective and consistent peer reviewers and produced feedback that resulted in meaningful improvement in their science writing. Use of peer review accelerated the development of students’ scientific reasoning abilities as measured both by laboratory reports (n = 142) and by the Scientific Reasoning Test (n = 389 biology majors), and this effect was stronger than the impact of several years of university coursework. However, the structure of the peer review process and the structure of the assignments used to generate the science laboratory reports had a notable influence on student performance. Improvements in laboratory reports were greatest when the peer review process emphasized the generation of concrete and evaluative written feedback and when assignments explicitly incorporated the rubric criteria. The rubric was found to be reliable in the hands of graduate student teaching assistants (using generalizability analysis, g = 0.85) regardless of biological course content (three biology courses, total n = 142 student papers). Reliability increased as the number of criteria incorporated into the assignment increased. Consistent use of Universal Rubric criteria in undergraduate courses taught by graduate teaching assistants produced laboratory report scores with reliability values similar to those reported for other published rubrics and well above the reliabilities reported for professional peer review.
Lastly, students were overwhelmingly positive about peer review (83% average positive response, n = 1,026), reporting that it improved their writing, editing, researching and critical thinking skills. Interestingly, students reported that the act of giving feedback was as useful as receiving feedback. Students connected the use of peer review in the classroom to its role in the scientific community and characterized peer review as a valuable skill they wished to acquire in their development as scientists.
Peer review is thus an effective pedagogical strategy for improving student scientific reasoning skills. Specific recommendations for classroom implementation and use of the Universal Rubric are provided. Use of laboratory reports for assessing student scientific reasoning and application of the Universal Rubric across multiple courses, especially for programmatic assessment, are also recommended.